The current wave of LLMs defaults to conversational natural language, the kind of language humans communicate in, such as English. Parsing natural language is extremely difficult, no matter how much you pamper a prompt with rules like "respond in the form of a bulleted list". Natural language might have structure, but it's hard for typical software to reconstruct that structure from raw text.
Surprisingly, we can ask LLMs to respond in the form of JSON, and they generally respond with something sensible!
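To make the contrast with free-form text concrete, here is a minimal sketch of how software might consume a JSON reply. The `Order` and `OrderItem` shapes, the `isOrder` and `parseOrder` helpers, and the sample reply string are all hypothetical and made up for illustration; none of it is real model output or any particular library's API.

```typescript
// Hypothetical shapes for a parsed order; not taken from the original text.
interface OrderItem {
  name: string;
  quantity: number;
}

interface Order {
  items: OrderItem[];
}

// JSON.parse returns untyped data, so check the shape before trusting it.
function isOrder(value: unknown): value is Order {
  if (typeof value !== "object" || value === null) return false;
  const items = (value as { items?: unknown }).items;
  return (
    Array.isArray(items) &&
    items.every(
      (item) =>
        typeof item === "object" &&
        item !== null &&
        typeof (item as { name?: unknown }).name === "string" &&
        typeof (item as { quantity?: unknown }).quantity === "number"
    )
  );
}

// Returns the order, or undefined when the reply is not JSON of the expected shape.
function parseOrder(reply: string): Order | undefined {
  try {
    const parsed: unknown = JSON.parse(reply);
    return isOrder(parsed) ? parsed : undefined;
  } catch {
    return undefined; // the reply was prose or otherwise not valid JSON
  }
}

// Made-up reply text standing in for model output (an assumption, not a real log).
const sampleReply =
  '{"items":[{"name":"blueberry muffin","quantity":1},{"name":"grande latte","quantity":1}]}';
const order = parseOrder(sampleReply);
```

A conversational reply such as "Sure, one muffin coming up" makes `parseOrder` return `undefined`, so the calling code can detect the failure instead of guessing at the structure.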
Translate the following request into JSON.
Could I get a blueberry muffin and a grande latte?
Respond only in JSON that satisfies the Response type: