Parsing rules for chats

General usage

Pass the parse argument to the Chat.complete method to have the result parsed according to a specific rule.

from mllm import Chat
chat = Chat()
chat += "Output a JSON dict with keys 'a' and 'b' and values 1 and 2"
res = chat.complete(parse="dict")
print(res['a'])

Advantages of using the built-in parsing rules:

  • Automated retry when parsing fails.
  • Robust parsing rules that tolerate common variations in model output, such as a code fence around the answer.
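To make the retry idea concrete, here is a minimal sketch of a parse-and-retry loop. It is not mllm's implementation: complete_with_retry, generate, and max_attempts are hypothetical names, and the "model" is a stand-in that fails once before returning valid JSON.

```python
import json
from typing import Callable


def complete_with_retry(
    generate: Callable[[], str],
    parse: Callable[[str], object],
    max_attempts: int = 3,
) -> object:
    """Hypothetical helper: call generate() until parse() accepts the output."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            return parse(raw)
        except ValueError as err:  # json.JSONDecodeError is a ValueError subclass
            last_error = err
    raise RuntimeError(f"parsing failed after {max_attempts} attempts") from last_error


# Stand-in for a model: the first reply is unparseable, the second is valid JSON.
replies = iter(["not json", '{"a": 1, "b": 2}'])
result = complete_with_retry(lambda: next(replies), json.loads)
print(result["a"])
```

With Chat.complete(parse=...), this kind of loop is handled for you, so the caller only sees the parsed value.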

dict

Parse the result as a JSON dictionary using json.loads. This rule ignores a ```json fence surrounding the output.
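As a rough illustration of this behavior, the sketch below strips an optional code fence and then calls json.loads. The helper parse_dict_sketch is hypothetical and only approximates what the rule might do internally.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built here to avoid nesting fences


def parse_dict_sketch(raw: str) -> dict:
    """Hypothetical helper: drop a surrounding code fence, then json.loads."""
    text = raw.strip()
    # Remove an opening fence with an optional language tag (e.g. json).
    text = re.sub(r"^" + re.escape(FENCE) + r"[a-zA-Z]*\s*", "", text)
    # Remove a closing fence at the very end of the text.
    text = re.sub(r"\s*" + re.escape(FENCE) + r"$", "", text)
    result = json.loads(text)
    if not isinstance(result, dict):
        raise ValueError("expected a JSON object")
    return result


raw_output = FENCE + 'json\n{"a": 1, "b": 2}\n' + FENCE
parsed = parse_dict_sketch(raw_output)
print(parsed["a"])
```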

list

Similar to dict, but parses the result as a Python list. The parser finds the first [ and the last ] in the output and tries to parse the content between them.
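The bracket search described above can be sketched as follows. parse_list_sketch is a hypothetical name, and using json.loads for the extracted span is an assumption based on the rule being "similar to dict".

```python
import json


def parse_list_sketch(raw: str) -> list:
    """Hypothetical helper: parse the span from the first [ to the last ]."""
    start = raw.find("[")
    end = raw.rfind("]")
    if start == -1 or end == -1 or end < start:
        raise ValueError("no list found in output")
    # Assumption: the extracted span is parsed as JSON.
    return json.loads(raw[start:end + 1])


raw_output = "Sure, here is the list: [1, 2, [3, 4]] Hope this helps."
items = parse_list_sketch(raw_output)
print(items)
```

Searching for the last ] rather than the next one keeps nested lists intact, and any chatter before or after the list is discarded.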

obj

Similar to dict, but parses using ast.literal_eval. This rule is useful when the output is a Python literal rather than strict JSON, for example a dict with single-quoted keys or a tuple.
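The difference from the dict rule shows up with output that is valid Python but not valid JSON. This short example uses only the standard library; the sample string is made up for illustration.

```python
import ast
import json

# Valid Python literal, but not valid JSON: single quotes, a tuple, True, None.
raw_output = "{'a': (1, 2), 'ok': True, 'missing': None}"

try:
    json.loads(raw_output)
    json_ok = True
except json.JSONDecodeError:
    json_ok = False

# ast.literal_eval accepts Python literals without executing arbitrary code.
value = ast.literal_eval(raw_output)
print(json_ok, value["a"])
```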

quotes

Parse the result as a string. This rule ignores the ```xxx fence surrounding the output. It is useful when you want the LLM to output code.

from mllm import Chat
chat = Chat()
chat += "Output a python code for quicksort. Start your answer with ```python"
res = chat.complete(parse="quotes")
print(res)
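For intuition, fence removal for an arbitrary language tag might look like the line-based sketch below. parse_quotes_sketch is a hypothetical helper, not the library's code.

```python
FENCE = "`" * 3  # three backticks, built here to avoid nesting fences


def parse_quotes_sketch(raw: str) -> str:
    """Hypothetical helper: return the text between the opening and closing fence."""
    lines = raw.strip().splitlines()
    # Drop the opening fence line, whatever language tag follows it.
    if lines and lines[0].startswith(FENCE):
        lines = lines[1:]
    # Drop the closing fence line if present.
    if lines and lines[-1].strip() == FENCE:
        lines = lines[:-1]
    return "\n".join(lines)


raw_output = FENCE + "python\ndef f(x):\n    return x + 1\n" + FENCE
code = parse_quotes_sketch(raw_output)
print(code)
```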

colon

Capture everything after the first colon (:) in the output. This rule is useful when you ask the model to begin its answer with a fixed label, such as Summary:, and want only the text that follows.

from mllm import Chat
chat = Chat()
chat += "Summarize the following text:<text> This is a test.</text>"
chat += "Start your answer with Summary:"
res = chat.complete(parse="colon")
print(res)
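The effect of this rule can be approximated with str.partition. parse_colon_sketch is a hypothetical helper, and stripping the surrounding whitespace is an assumption.

```python
def parse_colon_sketch(raw: str) -> str:
    """Hypothetical helper: return the text after the first colon."""
    _, sep, tail = raw.partition(":")
    if not sep:
        raise ValueError("no colon found in output")
    # Assumption: surrounding whitespace is not part of the answer.
    return tail.strip()


raw_output = "Summary: The text is a test. Note: it is short."
summary = parse_colon_sketch(raw_output)
print(summary)
```

Only the first colon acts as the separator, so colons inside the answer itself are preserved.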

Automated correction

Some capable LLMs, such as the Claude models, do not support a JSON mode, and they often output JSON with small syntax errors. We designed an auto-correction rule that fixes these errors by passing the malformed JSON to a cheap LLM that does support JSON mode.

You have to turn on this feature by setting parse_options.correct_json_by_model = True.

from mllm.config import parse_options
# You have to enable this option before using the `correct_json_by_model` rule
parse_options.correct_json_by_model = True
# parse_options.cheap_model = "gpt-4o-mini" # The default model is gpt-4o-mini
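Conceptually, the correction step is a fallback that runs only when normal parsing fails. The sketch below shows that control flow under stated assumptions: loads_with_correction and fix_with_model are hypothetical names, and fake_cheap_model is a stand-in that does no real model call.

```python
import json
from typing import Callable


def loads_with_correction(raw: str, fix_with_model: Callable[[str], str]) -> object:
    """Hypothetical helper: try json.loads, then retry on a model-corrected string."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Fallback: ask the (cheap) model to rewrite the text as valid JSON.
        return json.loads(fix_with_model(raw))


def fake_cheap_model(bad_json: str) -> str:
    """Stand-in for a JSON-mode model; it only swaps single quotes for double quotes."""
    return bad_json.replace("'", '"')


fixed = loads_with_correction("{'a': 1, 'b': 2}", fake_cheap_model)
print(fixed)
```

In the library itself, the fallback model is whatever parse_options.cheap_model is set to.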
Contributors: Zijian Zhang