Structured Outputs
is a feature that ensures that the model will output a valid JSON object that adheres to a provided JSON Schema.
Initially everyone was amazed of using LLMs to generate text outputs. But as these llm generations became natively integrated with code within more complex AI systems — developers needed more structured outputs to enable upstream tasks without complex data transformations.
JSON mode was the first feature that enabled valid JSON objects as outputs… but they didn’t adhere to specific schema, so the models still made mistakes and hallucinated.
Enter Structured Outputs.
Now, developers can include a JSON schema in addition to the prompt, guiding the model to generate a JSON object that precisely matches the specified schema.
You can force the model to follow a given JSON schema with the Gemini 1.5 Pro, and Gemini 1.5 Flash models JSON mode feature. To do that simply specify the schema for the JSON response in the response_schema
property of your model configuration.
You can find more information here.
Structured Outputs are available in the latest OpenAI models, starting with GPT-4o
, including gpt-4o-mini-2024-07-18
and later, and gpt-4o-2024-08-06
and later, while older models like gpt-4-turbo
use JSON mode instead.
Structured Outputs is available in two forms in the OpenAI API: When using function calling
and when using a json_schema
response format.
But when should you use one or the other? To put it simply:
In this guide we’ll focus on using Structured Outputs as response_format with the OpenAI response_format parameter, and you can find more examples of using it with function calling on this link.
To use Structured Outputs in the response_format follow these steps:
1. Define your JSON schema in the prompt
2. Add your JSON schema in the API call like so:
response_format: { "type": "json_schema", "json_schema": … , "strict": true }
3. Parse the generated structured data into your app
The following types are supported for Structured Outputs:
required
.anyOf
.#
to indicate root recursion.Let’s look at some real-world examples where using Structured Outputs as a final response can be useful.
One really interesting way to use Structured Outputs is to generate valid HTML code.
Wait, how is this possible?
Well, you can create valid HTML code by defining the schema as a recursive data structure, which breaks down the HTML elements into a tree-like format.
In this structure, each element (like <div>, <p>, etc.) can contain nested elements, and you apply constraints (like enums) to limit the values of specific attributes or elements. For example, you could use an enum to restrict the values for an HTML tag like <button> to only allow valid button types like “submit,” “reset,” or “button.”
Here’s an example output:
{
"type": "form",
"label": "User Profile Form",
"children": [
{
"type": "field",
"label": "First Name",
"attributes": [
{ "name": "type", "value": "text" },
{ "name": "name", "value": "firstName" },
{ "name": "placeholder", "value": "Enter your first name" }
]
},
{
"type": "field",
"label": "Last Name",
"attributes": [
{ "name": "type", "value": "text" },
{ "name": "name", "value": "lastName" },
{ "name": "placeholder", "value": "Enter your last name" }
]
},
{
"type": "button",
"label": "Submit",
"attributes": [
{ "name": "type", "value": "submit" }
]
}
],
"attributes": [
{ "name": "method", "value": "post" },
{ "name": "action", "value": "/submit-profile" }
]
}
By using Structured Inputs, the model generates this JSON object, which can then be directly parsed and displayed in your UI for the user.
Another great use of structured outputs is when you want the model to work through a problem and output each step as parameters in a valid JSON object. Let’s imagine that you want the model to guide a user to solve a specific math problem.
With Structured Outputs and a defined JSON schema, the model can output intermediate steps, allowing you to parse them in your UI and guide users through solving the problem with clear, structured data.
The output can look something like this:
{
"steps": [
{
"explanation": "Start with the equation 8x + 7 = -23.",
"output": "8x + 7 = -23"
},
{
"explanation": "Subtract 7 from both sides to isolate the term with the variable.",
"output": "8x = -23 - 7"
},
{
"explanation": "Simplify the right side of the equation.",
"output": "8x = -30"
},
{
"explanation": "Divide both sides by 8 to solve for x.",
"output": "x = -30 / 8"
},
{
"explanation": "Simplify the fraction.",
"output": "x = -15 / 4"
}
],
"final_answer": "x = -15 / 4"
}
One very frequent use of JSON outputs is when you want to extract data from your files. With Structured Data extraction the model should be more reliable to follow your schema and extract the right data from your PDFs.
Let’s say you want to extract key information from a contract or invoice stored in a PDF. You provide this JSON Schema:
{
"type": "object",
"properties": {
"invoice_number": { "type": "string", "description": "The invoice number." },
"date": { "type": "string", "description": "The date of the invoice." },
"total_amount": { "type": "number", "description": "The total amount to be paid." },
"billing_address": { "type": "string", "description": "The billing address for the invoice." },
"due_date": { "type": "string", "description": "The due date of the invoice." }
},
"required": ["invoice_number", "date", "total_amount", "billing_address", "due_date"]
}
And the model will then output:
{
"invoice_number": "INV-123456",
"date": "2024-09-01",
"total_amount": 1500.75,
"billing_address": "123 Main St, Cityville, NY, 10001",
"due_date": "2024-10-01"
}
You can define a JSON schema to classify inputs into multiple categories that you’re moderating for. Let’s say you are moderating for offensive language and spam. You might use the following schema:
{
"type": "object",
"properties": {
"offensive": { "type": "boolean", "description": "Indicates if the content is offensive" },
"spam": { "type": "boolean", "description": "Indicates if the content is spam" },
"contains_personal_info": { "type": "boolean", "description": "Indicates if the content contains personal information" }
},
"required": ["offensive", "spam", "contains_personal_info"]
}
The model will output a valid JSON object based on the schema, like this:
{
"offensive": true,
"spam": false,
"contains_personal_info": true
}
To improve model generations, here are some tips for good JSON schema definitions: