LangChain Output Parsers: Extract Structured Data from LLM Responses
Your LCEL chain works. The LLM responds. And then you stare at a raw string wondering how to pull the product name, price, and rating out of a paragraph that changes shape every time you run it. I spent a frustrating afternoon writing regex to parse LLM output before discovering that LangChain already solved this problem — with output parsers that slot right into the pipe operator.
LangChain output parsers plug into the end of your chain with the same pipe operator you already know. They handle the format instructions, the parsing, and the validation — so your downstream code gets a typed Python object instead of a string you have to disassemble yourself.
Why You Need LangChain Output Parsers
Without a parser, every LCEL chain returns an AIMessage object. You can grab .content from it, but that gives you a string — and strings are where bugs hide. Consider a chain that extracts three fields from a customer review. Sometimes the model returns valid JSON. Sometimes it wraps it in markdown code fences. Sometimes it adds a polite preamble before the JSON. A parser absorbs that variability and hands your code the same structured object every time.
StrOutputParser — The Simplest Output Parser
The simplest parser strips the AIMessage wrapper and hands you back a plain Python string. I use StrOutputParser on almost every chain because, without it, you get an AIMessage object instead of the text your application actually needs.
Running that prints something like <class 'langchain_core.messages.ai.AIMessage'> followed by the full object with content, response_metadata, and token usage data. Not what you want to pass to the next function.
The result is now <class 'str'> and just the sentence itself. The pipe operator makes parsing feel natural: prompt produces a formatted message, the LLM produces an AIMessage, and the parser extracts the content string.
JsonOutputParser — Getting Dictionaries from LLMs
Strings are fine for chatbots. But the moment you need to store data, route decisions, or feed output into another function, you need a dictionary. JsonOutputParser handles the messy middle ground — it strips markdown fences, parses the JSON, and returns a Python dict.
That prints an instruction telling the LLM to return valid JSON. The key insight: you do not write these instructions yourself. The parser generates them, and you inject them into your prompt template using a {format_instructions} variable.
The next block builds a complete extraction chain. The prompt template lists four fields we want — product_name, rating, pros, and cons — and appends the parser's format instructions. When we invoke the chain with a headphone review, the parser strips any markdown fences from the response and returns a plain Python dict.
The result is a plain Python dictionary with keys like product_name, rating, pros, and cons. You access values with result["product_name"] or iterate over result["pros"] directly — no string parsing. The model might occasionally vary the exact wording inside lists, but the structure stays consistent because the format instructions told it what to produce.
PydanticOutputParser — Validated, Typed Output
This is the parser I reach for in any production chain. PydanticOutputParser takes a Pydantic model class, generates detailed format instructions from the field names, types, and descriptions, and then validates the LLM output against that model. If a field is missing or has the wrong type, you get a clear ValidationError instead of a silent bug three functions downstream.
Start by defining a ProductReview Pydantic model with five fields: product_name (string), rating (integer with 1-5 constraints via ge/le), pros and cons (lists of strings), and a summary. Each field uses Field(description=...) so that the parser can tell the LLM exactly what to extract.
The Field(description=...) matters here. LangChain includes those descriptions in the format instructions it sends to the LLM, so descriptive field docs produce better extractions. I always write descriptions even for obvious fields like product_name — the LLM uses them as disambiguation hints.
The format instructions include the full JSON schema with field names, types, descriptions, and constraints. The LLM receives a detailed specification it can follow.
The chain below sends a Kindle Paperwhite review through a prompt template, the LLM, and the PydanticOutputParser. The parser validates the response against ProductReview and returns a typed model instance. You access fields with dot notation — result.product_name, result.rating — instead of dictionary keys.
The result is a ProductReview instance, not a dictionary. You get IDE autocompletion and type checking for free. If the LLM returns a rating of 6, Pydantic raises a ValidationError because of the le=5 constraint — the bug surfaces immediately instead of propagating downstream.
Injecting Format Instructions with partial
Passing format_instructions manually every time you call invoke() gets tedious fast. The cleaner pattern is to use prompt.partial() to bake the instructions into the template once:
The partial() call fills in format_instructions ahead of time. From this point on, every call to chain.invoke() only needs to provide the dynamic variables — review in this case. This is the pattern I use in every production chain that has a parser.
When the Parser Fails — Handling Malformed Output
LLMs are stochastic. Even with format instructions, a model might occasionally return malformed JSON — an unclosed bracket, a trailing comma, or a conversational preamble before the JSON block. Here is what happens and how to handle it.
The OutputParserException includes llm_output — the raw string the model returned before parsing failed. This is invaluable for debugging. You can log it, send it to an error tracker, or feed it into LangSmith for tracing. Use LangChain callbacks to capture parser failures automatically across all your chains.
Auto-Recovery with OutputFixingParser
Catching exceptions is fine for logging. But in a production pipeline, you want automatic recovery. OutputFixingParser wraps any parser and, on failure, sends the error message plus the malformed output back to the LLM with a "fix this" prompt. It costs one extra API call but recovers from most formatting errors.
RetryOutputParser — Retry with the Original Prompt
OutputFixingParser only shows the LLM the broken output and the error. RetryOutputParser goes further — it also includes the original prompt, giving the model full context to produce a correct response. I reach for this when the parser failure stems from the model misunderstanding the task, not just the format.
CommaSeparatedListOutputParser — Quick Lists
This prints an instruction along the lines of "Your response should be a list of comma separated values, eg: foo, bar, baz" (exact wording varies by version). Simple and effective when you need a flat list without the overhead of JSON schemas.
The result is a Python list of strings. I find this parser useful for brainstorming chains, tag extraction, and anywhere you need a flat enumeration without the ceremony of a Pydantic model.
Real-World Pattern: Multi-Step Chain with Parsers
Output parsers shine when you chain multiple steps. Here is a practical pattern I use often: a two-step support pipeline. The first chain classifies a customer message into a SupportTicket with four fields — category (billing/technical/account), urgency (low/medium/high), a one-sentence summary, and a boolean requires_escalation flag. The second chain reads those typed fields and generates an appropriate response.
The first chain parses a raw customer message into a SupportTicket with category, urgency, and escalation flag. The second chain uses those fields to generate an appropriate response:
The structured intermediate step — the SupportTicket — is what makes this pattern powerful. Without it, you would feed raw text into the response generator and hope the LLM figures out the urgency and category on its own. With the parsed ticket, you control exactly what information the response chain sees. This same pattern applies to RAG pipelines where you parse retrieved context into structured objects before generating answers.
Advanced Pydantic Patterns: Enums, Optional Fields, and Nested Models
Pydantic models can express much richer schemas than flat key-value pairs. For broader output formatting strategies, see our dedicated guide. Here are three patterns I use regularly that make LLM output significantly more reliable.
Constraining Values with Enums
When a field should be one of a fixed set of values — like a sentiment label or a priority level — use a Python Enum. The parser includes the allowed values in the format instructions, and Pydantic rejects anything outside that set.
The format instructions now embed the allowed values (positive, negative, neutral, mixed) as an enum in the JSON schema. The LLM knows its options, and Pydantic enforces them. No more "Somewhat Positive" or "5/10" sneaking through.
Nested Models for Complex Structures
When a single flat model cannot represent your data, nest one Pydantic model inside another. The example below defines an Address model with street, city, state, and country fields, then embeds it inside a Person model. The Optional[str] annotation on email means the LLM can return null when no email is mentioned in the source text, avoiding hallucinated values.
The Optional[str] with default=None handles fields that might not appear in the input text. If the email is not mentioned, the model returns null for that field instead of hallucinating one. Nested models like Address inside Person let you represent hierarchical data without flattening everything into one level.
Create a Pydantic model called JobPosting that represents a parsed job listing. It should have these fields:
- title (str): The job title
- company (str): The company name
- salary_min (int): Minimum salary
- salary_max (int): Maximum salary
- remote (bool): Whether the position is remote
- skills (List[str]): Required skills

All fields must have Field(description=...) for good format instructions. Then create a PydanticOutputParser from your model and print the format instructions.
Building a Custom Output Parser
Sometimes the built-in parsers do not fit your use case. Maybe you want to extract key-value pairs from a custom format, or parse markdown tables, or split a response at a specific delimiter. LangChain lets you build custom parsers by subclassing BaseOutputParser.
Three methods make a custom parser: parse() does the actual conversion, get_format_instructions() tells the LLM what format to use, and _type provides a string identifier. The generic type parameter BaseOutputParser[List[str]] declares what parse() returns.
Custom parsers plug into LCEL chains exactly like built-in ones — the pipe operator does not care whether you wrote the parser or LangChain did. I have built custom parsers for XML extraction, markdown table parsing, and splitting multi-part LLM responses at --- delimiters.
Common Output Parser Mistakes and How to Fix Them
After debugging parser failures across many projects, these are the three mistakes I see most often.
Mistake 1: Forgetting Format Instructions in the Prompt
```python
# The prompt never tells the LLM to return JSON
prompt = ChatPromptTemplate.from_template(
    "Extract the product name and price from: {text}"
)
# Parser expects JSON but LLM returns prose
chain = prompt | llm | parser  # Crashes at parse time
```

The fix is to inject the parser's format instructions:

```python
prompt = ChatPromptTemplate.from_template(
    "Extract the product name and price from: {text}\n"
    "{format_instructions}"
).partial(
    format_instructions=parser.get_format_instructions()
)
chain = prompt | llm | parser  # Works reliably
```

Mistake 2: Pydantic Fields Without Descriptions
Field descriptions are not just documentation — they are part of the format instructions sent to the LLM. Think of them as prompt engineering for your schema. A field named qty with no description leaves the model guessing. A field with Field(description="Quantity ordered, as an integer") removes all ambiguity.
Mistake 3: Using the Wrong Parser for the Job
| If you need... | Use this parser |
|---|---|
| Raw text string | StrOutputParser |
| Simple flat JSON (no validation) | JsonOutputParser |
| Validated, typed objects | PydanticOutputParser |
| A Python list | CommaSeparatedListOutputParser |
| Something else entirely | Custom BaseOutputParser subclass |
Create a custom output parser called KeyValueParser that inherits from BaseOutputParser[dict]. It should parse text in the format:
Name: Alice
Age: 30
City: New York

into a Python dictionary {"Name": "Alice", "Age": "30", "City": "New York"}.
Implement the parse() method that splits on newlines, then splits each line at the first : to get key-value pairs. Strip whitespace from both keys and values. Skip empty lines.
Create an Enum called Priority with values LOW, MEDIUM, HIGH, and CRITICAL. Then create a Pydantic model called BugReport with these fields:
- title (str): Bug title
- priority (Priority): The priority enum
- steps_to_reproduce (List[str]): Steps to reproduce the bug
- is_regression (bool): Whether this is a regression, default False

Then create a sample BugReport instance and print its priority value.
Choosing the Right Parser — Quick Reference
| Parser | Returns | Validates Schema? | Best For |
|---|---|---|---|
| StrOutputParser | str | No | Chat responses, text generation |
| JsonOutputParser | dict | No (valid JSON only) | Quick prototyping, flexible schemas |
| PydanticOutputParser | Pydantic model | Yes (types + constraints) | Production chains, APIs |
| CommaSeparatedListOutputParser | List[str] | No | Tag extraction, brainstorming |
| OutputFixingParser | Wraps any parser | Wraps inner parser | Auto-retry on parse failures |
| Custom BaseOutputParser | Anything | Your logic | Non-standard formats |
Summary
Output parsers transform raw LLM text into structured Python objects. StrOutputParser strips the AIMessage wrapper. JsonOutputParser returns dictionaries. PydanticOutputParser adds type validation and schema constraints. And custom parsers handle everything else.
Three things to remember: always include format instructions in your prompt, always add descriptions to Pydantic fields, and always wrap parser calls in try/except for production code.
Where to Go Next
Explore LangChain chains to combine parsers with routing and branching. Dive into LangChain tools to let your LLM call external functions with parsed arguments. If you are building retrieval pipelines, see RAG with LangChain for combining parsers with document retrieval. For tracing and debugging parser failures in production, LangSmith provides full chain visibility.
Frequently Asked Questions
Can I use output parsers with streaming?
StrOutputParser supports streaming natively — tokens flow through as they arrive. JsonOutputParser also supports streaming via chain.astream(), yielding partial parsed objects as JSON accumulates. PydanticOutputParser does not support streaming because it needs the complete JSON before validation.
If you need streaming with validated output, use JsonOutputParser for the stream and validate the final result with Pydantic separately.
Do output parsers work with non-OpenAI models?
Yes. Output parsers are model-agnostic — they operate on the text content of any AIMessage, regardless of which LLM produced it. The format instructions go into the prompt, and any model that can follow instructions will produce parseable output. I have used PydanticOutputParser successfully with Claude, Gemini, Llama, and Mistral through LangChain model switching.
Should I use output parsers or with_structured_output()?
LangChain provides llm.with_structured_output(MyModel), which uses provider-native structured output (like OpenAI's response_format). If your provider supports it, with_structured_output() is more reliable because the constraint is enforced at the model level, not just through prompt instructions.
Use with_structured_output() when you are locked to a single provider that supports it. Use output parsers when you need provider-agnostic chains, custom parsing logic beyond JSON schemas, or when your provider does not support native structured output.
What about DatetimeOutputParser and XMLOutputParser?
LangChain includes several specialized parsers beyond the ones covered above. DatetimeOutputParser extracts dates from LLM responses and returns Python datetime objects. XMLOutputParser parses XML-formatted responses into dictionaries. Both follow the same create-inject-pipe pattern.
I rarely use these specialized parsers in production because PydanticOutputParser with a datetime field or a nested model handles most real-world scenarios more flexibly. But DatetimeOutputParser is handy for one-off date extraction tasks where defining a full Pydantic model feels like overkill.