
Structured Output from LLMs: JSON Mode, Pydantic, and the Instructor Library in Python

Intermediate · 90 min · 3 exercises · 70 XP

Get reliable, typed data from any LLM — from fragile [JSON](/python/python-json/) prompts to schema-constrained generation with automatic validation and retries.

You ask an LLM to extract product information from a customer review, and it returns a nicely worded paragraph. Helpful for a human, useless for your database. You need a JSON object with product_name, rating, and issues — not prose. And you need it in that exact shape every single time. The code downstream will crash the moment a field is missing or the rating shows up as "four out of five" instead of 4.

This tutorial walks you through three levels of forcing LLMs to return structured data: prompt-based JSON extraction (fragile), OpenAI's built-in JSON mode and structured output (reliable), and the Instructor library for complex schemas with automatic retry on validation failure.

The Problem with Free-Text LLM Output

LLMs generate text. Your application needs data. That mismatch is the root cause of most integration headaches I've seen in production AI systems. The model returns "The sentiment is positive" when your code expects {"sentiment": "positive", "confidence": 0.92}. You end up writing brittle regex parsers or string-splitting hacks that break the moment the model rephrases its answer.

There are three approaches to solving this, each more robust than the last:

| Approach | Reliability | Complexity | Best For |
|---|---|---|---|
| Prompt engineering ("reply in JSON") | Low — model may add commentary | Minimal | Quick prototypes |
| response_format with JSON mode | Medium — guaranteed valid JSON | Low | Simple flat schemas |
| response_format with strict schemas | High — exact field names and types | Medium | Production systems |
| Instructor + Pydantic | Highest — validation + retries | Medium | Complex nested schemas |

We will build up from the weakest approach to the strongest, so you understand exactly what each layer adds.

Setup — OpenAI Client and API Key

Install packages and create the client

Approach 1: Prompt-Based JSON Extraction

The simplest approach: just ask the model to respond in JSON. No special parameters, no libraries. I still use this for throwaway scripts where I don't care if it occasionally fails.

Prompt-based JSON extraction — no guarantees
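A sketch of the prompt-only approach; the model name and prompt wording are illustrative assumptions. The API call sits in a function, and the module-level lines show the happy path where a cooperative model returns bare JSON:

```python
import json

PROMPT = (
    "Extract product_name (string), rating (integer 1-5), and issues "
    "(list of strings) from this review. Reply with ONLY a JSON object, "
    "no commentary.\n\nReview: {review}"
)

def extract(client, review: str) -> dict:
    # No structural guarantees: the reply may still include code fences
    # or a preamble, in which case json.loads() raises.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": PROMPT.format(review=review)}],
    )
    return json.loads(resp.choices[0].message.content)  # may raise JSONDecodeError

# Happy path: a cooperative model returns bare JSON that parses cleanly.
raw = '{"product_name": "Acme Blender", "rating": 2, "issues": ["lid cracks"]}'
data = json.loads(raw)
```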

This works most of the time. The problem is that "most of the time" is not good enough for production. The model might wrap the JSON in markdown code fences (```json ... ```), add a preamble like "Here is the extracted data:", or occasionally return malformed JSON. I've had a model return trailing commas that break json.loads() — valid in JavaScript, invalid in Python.

Approach 2: JSON Mode with response_format

OpenAI provides a response_format parameter that forces the model to return valid JSON. No code fences, no preamble, no trailing commas. The output is guaranteed to be parseable by json.loads(). This is the minimum you should use for any code that runs unattended.

JSON mode — guaranteed valid JSON
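A sketch of the same extraction with JSON mode; the model name is an illustrative assumption. Note that the system message mentions JSON, which the API requires in this mode:

```python
import json

SYSTEM = "Extract product_name, rating, and issues from the review. Respond in JSON."

def extract_json_mode(client, review: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},  # output is guaranteed valid JSON
        messages=[
            {"role": "system", "content": SYSTEM},
            {"role": "user", "content": review},
        ],
    )
    # No fences, no preamble: this parse always succeeds.
    return json.loads(resp.choices[0].message.content)
```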

Two things to notice. First, response_format={"type": "json_object"} guarantees the output is valid JSON — no parsing errors. Second, you still need to mention "JSON" in the system message. OpenAI requires this; if you set json_object mode without mentioning JSON in the prompt, the API returns an error.

Here is a comparison of what can go wrong with each approach so far:

Prompt-only — multiple failure modes
# All of these can happen:
# 1. "Here is the JSON: {...}"  (preamble)
# 2. ```json\n{...}\n```    (code fences)
# 3. {"rating": "7/10"}        (wrong type)
# 4. {"product": "..."}        (wrong field name)
# 5. {trailing_comma: true,}   (invalid JSON)

try:
    data = json.loads(raw)
except json.JSONDecodeError:
    # Now what? Retry? Regex? Give up?
    pass
JSON mode — only schema issues remain
# Eliminated: preamble, code fences, invalid JSON
# Still possible:
# 1. {"rating": "7/10"}     (wrong type)
# 2. {"product": "..."}     (wrong field name)
# 3. Missing fields

# But json.loads() always succeeds
data = json.loads(response.choices[0].message.content)
# You still need to validate the shape

Approach 3: Strict Structured Output with JSON Schema

This is the big upgrade. Instead of just asking for "some valid JSON," you hand OpenAI a JSON Schema that defines the exact fields, types, and structure. The model is constrained at the token-generation level — it literally cannot produce output that violates your schema. Field names, types, required fields, enum values — all enforced.

Strict structured output — schema-constrained generation
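A sketch of a strict-mode call; the model name and field choices are illustrative assumptions. The schema is plain data, so it lives at module level:

```python
import json

REVIEW_SCHEMA = {
    "type": "object",
    "properties": {
        "product_name": {"type": "string"},
        "rating": {"type": "integer"},
        "pros": {"type": "array", "items": {"type": "string"}},
        "cons": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["product_name", "rating", "pros", "cons"],  # all fields
    "additionalProperties": False,                           # mandatory in strict mode
}

def extract_strict(client, review: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={
            "type": "json_schema",
            "json_schema": {
                "name": "product_review",
                "strict": True,  # enables constrained decoding
                "schema": REVIEW_SCHEMA,
            },
        },
        messages=[{"role": "user", "content": f"Extract the review data: {review}"}],
    )
    return json.loads(resp.choices[0].message.content)
```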

The response is guaranteed to match your schema. product_name will always be a string. rating will always be an integer. pros and cons will always be arrays of strings. The "strict": True flag enables constrained decoding — the model cannot deviate from the schema even if it "wants" to.

There are a few constraints when using strict mode. All fields listed in properties must appear in required. You must set "additionalProperties": false. Optional fields use {"type": ["string", "null"]} instead of just omitting them from required. These constraints exist because the grammar-based approach needs a fully deterministic schema.

Nested and Complex Schemas

Real-world data is rarely flat. You might need an array of objects, nested structures, or enum-constrained fields. Structured output handles all of these. The example below demonstrates three patterns at once: an array of objects (segments), a nullable field (growth_pct), and an enum constraint (sentiment).

Nested schema — objects within arrays, enums, nullable fields
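A sketch of the nested schema just described; the field names follow the text but the exact shape is an illustrative assumption:

```python
MARKET_SCHEMA = {
    "type": "object",
    "properties": {
        "segments": {
            "type": "array",                  # array of objects
            "items": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "growth_pct": {"type": ["number", "null"]},  # nullable field
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "neutral", "negative"],  # enum constraint
                    },
                },
                "required": ["name", "growth_pct", "sentiment"],
                "additionalProperties": False,
            },
        }
    },
    "required": ["segments"],
    "additionalProperties": False,
}
```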

Notice "growth_pct": {"type": ["number", "null"]}. This is how you make a field optional in strict mode — it must still be present in the output, but its value can be null. The automotive segment has no growth percentage mentioned in the article, so the model correctly returns null for that field.

Exercise: Build a JSON Schema for a Recipe
Write Code

Write a function called build_recipe_schema that returns a dictionary representing a JSON Schema for a recipe. The schema must define an object with these fields:

  • name (string, required)
  • cuisine (string, must be one of: "italian", "mexican", "indian", "chinese", "japanese", "french", "other") (required)
  • prep_time_minutes (integer, required)
  • ingredients (array of objects, each with item (string) and quantity (string), both required, no additional properties) (required)
  • vegetarian (boolean, required)
  • All fields are required. Set additionalProperties to false at both the top level and inside the ingredient objects.

Return only the `"schema"` portion — the object containing "type", "properties", "required", and "additionalProperties".

Pydantic — Validating LLM Output in Python

JSON Schema works at the API level, but once the data lands in your Python code, you want a proper Python object — not a raw dictionary. Pydantic gives you exactly that: define a class with typed fields, pass in a dictionary, and get back a validated object with autocomplete, type checking, and clear error messages when something is wrong.

If you have used dataclasses, Pydantic will feel familiar. The key difference: Pydantic validates and coerces data on creation. A dataclass with rating: int silently accepts "7" as a string. A Pydantic model with rating: int either converts "7" to 7 or raises a validation error — your choice.

Pydantic BaseModel — typed Python objects from dictionaries
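A runnable sketch of such a model, reusing the review fields from earlier sections (the exact field set is an assumption):

```python
from pydantic import BaseModel, Field

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=10)  # must be between 1 and 10 inclusive
    pros: list[str]
    cons: list[str]

# Pydantic's default lax mode coerces the string "7" to the int 7.
review = ProductReview(
    product_name="Acme Blender", rating="7", pros=["cheap"], cons=["loud"]
)
```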

The Field(ge=1, le=10) adds a constraint: rating must be between 1 and 10 inclusive. If the LLM returns rating: 15, Pydantic raises a ValidationError with a clear message. This is your safety net — even if the LLM returns valid JSON with valid types, the values themselves might be nonsensical.

Catching Validation Errors

Pydantic catches invalid LLM output
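A sketch of catching bad LLM output, reusing the assumed ProductReview fields; the sample data deliberately violates the rating bound and omits cons:

```python
from pydantic import BaseModel, Field, ValidationError

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=10)
    pros: list[str]
    cons: list[str]

bad = {"product_name": "Acme Blender", "rating": 15, "pros": ["cheap"]}  # no cons

try:
    ProductReview(**bad)
    errors = []
except ValidationError as exc:
    # Each entry carries the field name ("loc"), the violated constraint
    # ("type"/"msg"), and the offending input value.
    errors = exc.errors()

failed_fields = {e["loc"][0] for e in errors}
```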

Two errors caught in one pass: rating exceeds the maximum, and cons is missing entirely. In a production pipeline, you'd catch this ValidationError, log it, and either retry the LLM call or fall back to a default. The error messages are specific enough for automated handling — you get the field name, the constraint that was violated, and the invalid value.

Combining Pydantic Models with OpenAI Structured Output

Here is where the pieces snap together. You define a Pydantic model, convert it to a JSON Schema for the API call, then validate the response with the same model. One source of truth for both the LLM constraint and the Python validation.

Pydantic generates the JSON Schema for you
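A sketch of schema generation from a model class; the field descriptions are illustrative:

```python
from pydantic import BaseModel, Field

class ProductReview(BaseModel):
    product_name: str = Field(description="The product being reviewed")
    rating: int = Field(ge=1, le=10, description="Overall score, 1-10")

# Type annotations and Field metadata become JSON Schema entries.
schema = ProductReview.model_json_schema()
```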

Pydantic's .model_json_schema() generates the JSON Schema automatically from your type annotations. Field descriptions become "description" entries in the schema, which helps the model understand what each field should contain. No hand-writing JSON Schema — the Python class is the single source of truth.

To use this with OpenAI's strict structured output, you need to adapt the schema slightly. Strict mode requires "additionalProperties": false and all properties in "required". Here is a helper that does the conversion:

Full pipeline — Pydantic model to API call to validated object
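One way to sketch that helper: recursively set additionalProperties to false and mark every property required. This is an assumption to adapt, not the one canonical conversion; nested models referenced via "$defs" would need extra handling:

```python
from pydantic import BaseModel, Field

def to_strict(schema: dict) -> dict:
    """Adapt a Pydantic-generated schema in place for OpenAI strict mode."""
    if schema.get("type") == "object":
        schema["additionalProperties"] = False
        props = schema.get("properties", {})
        schema["required"] = list(props)  # strict mode: every field required
        for sub in props.values():
            to_strict(sub)
    elif schema.get("type") == "array":
        to_strict(schema.get("items", {}))
    return schema

class ProductReview(BaseModel):
    product_name: str
    rating: int = Field(ge=1, le=10)

strict_schema = to_strict(ProductReview.model_json_schema())
```

You would then pass strict_schema inside response_format and validate the parsed reply with ProductReview.model_validate(), so the one class drives both ends.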
Exercise: Build a Pydantic Model for Job Postings
Write Code

Create a Pydantic model called JobPosting that validates job posting data extracted by an LLM. The model should have:

  • title (string, required)
  • company (string, required)
  • salary_min (integer, must be >= 0, required)
  • salary_max (integer, must be >= 0, required)
  • remote (boolean, required)
  • required_skills (list of strings, required)
Then write a function parse_job_posting(data: dict) that:

1. Tries to create a JobPosting from the dictionary

2. Returns the JobPosting object if validation succeeds

3. Returns None if validation fails

Also add a validation check: if salary_max < salary_min, it should fail validation. Use Pydantic's model_validator for this.

The Instructor Library — Pydantic-Native Structured Output

So far we have been building the pipeline ourselves: define a Pydantic model, convert it to JSON Schema, make the API call, parse the JSON, validate with Pydantic. The Instructor library wraps all of this into a single function call. You pass a Pydantic model as response_model, and Instructor handles the schema generation, API call, parsing, validation, and — crucially — automatic retries when validation fails.

Instructor — one call, validated Pydantic object
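A sketch of the Instructor call; the model name and profile fields are illustrative assumptions. The library imports sit inside the function so the Pydantic model stands on its own:

```python
from pydantic import BaseModel

class UserProfile(BaseModel):
    name: str
    age: int

def extract_profile(text: str) -> UserProfile:
    import instructor
    from openai import OpenAI

    client = instructor.from_openai(OpenAI())
    # Instructor generates the schema, makes the call, parses the JSON,
    # and validates the result against UserProfile, all in one step.
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=UserProfile,
        messages=[{"role": "user", "content": f"Extract the user profile: {text}"}],
    )
```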

That's it. No json.loads(), no manual schema conversion, no validation step. The return value is a typed Pydantic object. If you hover over profile.name in your IDE, you get str. If you try profile.age + "hello", your type checker flags it immediately.

Automatic Retries on Validation Failure

This is the feature that made me switch to Instructor for production work. When the LLM returns data that fails Pydantic validation, Instructor automatically retries the request with the validation error appended to the messages. The model sees what went wrong and corrects itself. You set max_retries and let it handle the loop.

Automatic retry on validation failure
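A sketch matching the ticker example discussed here; the field set and max_retries value are illustrative. The validator runs locally, so only the extraction function assumes an Instructor-patched client:

```python
from pydantic import BaseModel, field_validator

class StockMention(BaseModel):
    ticker: str
    sentiment: str

    @field_validator("ticker")
    @classmethod
    def ticker_must_be_uppercase(cls, v: str) -> str:
        if v != v.upper():
            raise ValueError(f"Ticker must be uppercase, got {v!r}")
        return v

def extract_mention(client, text: str) -> StockMention:
    # `client` is an instructor-patched client. On a ValidationError,
    # Instructor resends the request with the error message appended.
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=StockMention,
        max_retries=3,
        messages=[{"role": "user", "content": text}],
    )
```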

If the model returns ticker: "aapl" (lowercase), Pydantic's field_validator rejects it. Instructor catches the ValidationError, appends a message like "Validation error: Ticker must be uppercase, got 'aapl'" to the conversation, and resends. On the second attempt the model usually gets it right. Three retries covers virtually every edge case I've encountered.

Extracting Lists of Objects

A common pattern: extract multiple structured items from a single text. Instructor handles this cleanly with a wrapper model containing a list field.

Extracting a list of structured objects
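A sketch of the wrapper-model pattern; the contact fields and sample data are illustrative assumptions, and only the local validation step is exercised here:

```python
from typing import Optional

from pydantic import BaseModel

class Contact(BaseModel):
    name: str
    email: Optional[str] = None  # missing info stays None, no KeyError
    phone: Optional[str] = None

class ContactList(BaseModel):
    contacts: list[Contact]

# With Instructor you would pass response_model=ContactList; here the
# validation runs against data shaped like a model response.
result = ContactList.model_validate(
    {"contacts": [
        {"name": "Ada Lovelace", "email": "ada@example.com"},
        {"name": "Charles Babbage", "phone": "+44 20 0000 0000"},
    ]}
)
```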

The Optional[str] = None pattern handles missing information gracefully. Not every contact has both email and phone. The model fills in what it finds and leaves the rest as None. Your code checks with a simple if contact.email — no KeyError, no missing-field surprises.

Choosing the Right Approach for Your Project

After working with all three approaches in production systems, here is my decision framework:

| Situation | Recommended Approach |
|---|---|
| Quick prototype, non-critical | Prompt-based JSON |
| Simple schema, one-off extraction | JSON mode ("type": "json_object") |
| Production system, flat schemas | Strict structured output ("type": "json_schema") |
| Complex/nested schemas, needs validation | Pydantic + strict structured output |
| Multi-provider, needs retries, complex validation | Instructor library |

For most projects, I start with strict structured output and Pydantic validation. If I find myself writing retry logic or fighting with edge cases, I upgrade to Instructor. The prompt-only approach is fine for Jupyter notebooks and one-off scripts — just don't deploy it.

Real-World Example: Extracting Structured Data from Emails

Let me put everything together with a realistic example. You have incoming customer emails and need to route them to the right department with structured metadata. This is exactly the kind of task I've built multiple times in production.

Production email analysis pipeline
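A sketch of the validation layer of such a pipeline; the department list and fields are illustrative assumptions:

```python
from typing import Literal

from pydantic import BaseModel, Field

class EmailAnalysis(BaseModel):
    department: Literal["billing", "technical", "sales", "other"]
    urgency: int = Field(ge=1, le=5)  # 1 = low, 5 = drop everything
    summary: str
    customer_sentiment: Literal["positive", "neutral", "negative"]

def analyze_email(client, body: str) -> EmailAnalysis:
    # `client` is an instructor-patched client; the same model constrains
    # the API output and validates the parsed result.
    return client.chat.completions.create(
        model="gpt-4o-mini",
        response_model=EmailAnalysis,
        max_retries=2,
        messages=[{"role": "user", "content": f"Analyze this customer email:\n{body}"}],
    )

# Local check that a well-formed analysis validates cleanly.
sample = EmailAnalysis(
    department="billing",
    urgency=4,
    summary="Customer double-charged for May invoice",
    customer_sentiment="negative",
)
```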

This pattern handles hundreds of emails per minute. The structured output guarantees that your routing logic downstream never gets a surprise field name or missing value. The Pydantic validation catches any edge cases where the model might produce technically valid JSON that doesn't make business sense.

Exercise: Validate and Clean Structured LLM Output
Write Code

Write a function called validate_and_clean that takes a list of dictionaries (simulating batch LLM output) and returns a tuple of (valid_items, error_count).

Each dictionary represents a book extracted by an LLM. Create a Pydantic model called Book with:

  • title (string)
  • author (string)
  • year (integer, between 1000 and 2030 inclusive)
  • genre (string)
  • page_count (integer, must be > 0)
The function should:

1. Try to validate each dictionary as a Book

2. Collect all valid Book objects

3. Return (valid_books, error_count) where valid_books is a list of validated Book objects and error_count is the number of items that failed validation.

Common Mistakes and How to Fix Them

Mistake 1: Forgetting "JSON" in the System Message

Mistake 1 — missing JSON mention with json_object mode
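A sketch of the failing and the fixed message lists; the helper function is hypothetical, for illustration:

```python
# Fails with "type": "json_object": no message mentions JSON,
# so the API rejects the request with an error.
bad_messages = [{"role": "user", "content": "Extract the product data."}]

# Works: the system message names JSON explicitly.
good_messages = [
    {"role": "system", "content": "Extract the product data. Respond in JSON."},
    {"role": "user", "content": "Great blender, but the lid cracks. 2/5."},
]

def mentions_json(messages: list[dict]) -> bool:
    """Pre-flight check you can run before enabling json_object mode."""
    return any("json" in m["content"].lower() for m in messages)
```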

When using "type": "json_object", OpenAI requires the word "JSON" to appear somewhere in the messages. This is a safety check — it prevents accidental activation of JSON mode. With "type": "json_schema" (strict mode), this requirement does not apply because the schema itself makes the intent clear.

Mistake 2: Not Setting additionalProperties to False

Mistake 2 — missing additionalProperties in strict mode
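Strict mode rejects a schema that leaves additionalProperties unset; a sketch of the broken schema and its fix:

```python
# Rejected by strict mode: additionalProperties is not explicitly false.
broken = {
    "type": "object",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
}

# Accepted: identical schema with additionalProperties set to false.
fixed = {**broken, "additionalProperties": False}
```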

Mistake 3: Using Optional Fields Wrong in Strict Mode

In strict mode, every property must be in the required list. You cannot make a field optional by omitting it from required. Instead, use a union type with null:

Wrong — omitting from required
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": "string"}
    },
    "required": ["name"],  # nickname optional
    "additionalProperties": false
}
# Error: strict mode requires all
# properties in "required"

Correct — nullable type
{
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "nickname": {"type": ["string", "null"]}
    },
    "required": ["name", "nickname"],
    "additionalProperties": false
}
# Works: nickname is always present
# but can be null

Frequently Asked Questions

Does structured output cost more tokens?

Structured output uses the same pricing as regular completions — you pay per input and output token. The JSON Schema is included in the system prompt tokens, so very complex schemas add a small cost. In practice, the schema overhead is negligible compared to the actual content. OpenAI caches schemas across requests, so repeated calls with the same schema don't reprocess it.

Can I use structured output with streaming?

Yes. Both json_object and json_schema modes work with streaming. You receive JSON tokens incrementally and parse the complete object when the stream finishes. Instructor also supports streaming with partial validation — you can process fields as they arrive.

Streaming with JSON mode
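A sketch of streaming with JSON mode: collect the content deltas, then parse once the stream completes. The model name is illustrative, and only the assembly step runs here:

```python
import json

def assemble(chunks: list[str]) -> dict:
    """Join streamed content fragments and parse the completed JSON object."""
    return json.loads("".join(chunks))

def stream_extract(client, review: str) -> dict:
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        response_format={"type": "json_object"},
        stream=True,
        messages=[{"role": "user", "content": f"Extract review data as JSON: {review}"}],
    )
    # The first chunk's delta may carry no content, hence the `or ""`.
    parts = [c.choices[0].delta.content or "" for c in stream]
    return assemble(parts)

# The assembly step works on any fragment list:
data = assemble(['{"rating"', ": 4, ", '"issues": []}'])
```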

What happens if the LLM cannot fill a required field?

In strict mode, the model must provide a value for every required field. If the source text does not contain enough information, the model will generate its best guess or a placeholder (like an empty string or null for nullable fields). To handle this cleanly, make fields that might be missing nullable ({"type": ["string", "null"]} in JSON Schema, or Optional[str] = None in Pydantic) and check for None in your application code.

Is Instructor worth the extra dependency?

For simple schemas with a single LLM provider, Pydantic plus OpenAI's built-in structured output is sufficient. Instructor becomes valuable when you need automatic retries on validation failure, multi-provider support (switch between OpenAI, Anthropic, and Gemini without changing your extraction code), or complex nested schemas where the manual schema conversion gets tedious. If you are building a system that extracts structured data as a core feature, Instructor pays for itself in reduced boilerplate.

Summary

Structured output transforms LLMs from text generators into data extraction engines. The progression is clear: prompt-based JSON is quick but fragile, JSON mode guarantees valid syntax, strict schemas guarantee correct structure, and Pydantic adds Python-level type safety with meaningful error messages. The Instructor library ties everything together with automatic retries and multi-provider support.

The pattern you will use most often: define a Pydantic model with Field constraints, convert it to a strict JSON Schema for the API call, and validate the response with the same model. One class, three jobs — schema generation, API constraint, and data validation.

References

  • OpenAI documentation — Structured Outputs
  • Pydantic documentation — Models
  • Pydantic documentation — Field validation
  • Instructor library documentation
  • OpenAI API reference — Chat Completions
  • JSON Schema specification