LCEL Mastery: Compose LLM Chains with the Pipe Operator in Python
You have a prompt template, an LLM, and an output parser. You could glue them together with five lines of imperative code — variable here, function call there. Or you could write prompt | model | parser and be done. That pipe operator is LCEL, and once you see how it works, you won't go back.
What Is LCEL and Why Does LangChain Use It?
LCEL stands for LangChain Expression Language. It is a declarative way to compose LangChain components into pipelines using the | (pipe) operator. Prompts, models, parsers, retrievers, custom functions — chain them together, and data flows from left to right.
I think of it as Unix pipes for LLM applications. In a shell, you write cat file.txt | grep error | wc -l and data flows left to right through each command. LCEL does the same thing: data flows through each component, and you can read the entire pipeline at a glance.
The key insight: every component in LangChain implements a Runnable interface. A Runnable has three core methods — .invoke(), .stream(), and .batch(). When you pipe Runnables together, the resulting chain is itself a Runnable. You get invoke/stream/batch on the entire pipeline for free.
Your First LCEL Chain — Prompt, Model, Parser
The most common LCEL pattern connects three components. `ChatPromptTemplate` formats your input variables into messages. `ChatOpenAI` sends those messages to the OpenAI API and returns an `AIMessage`. `StrOutputParser` extracts the plain text content from that `AIMessage`.
I reach for this three-piece chain constantly. The code below creates each component separately, then composes them with |. When you call chain.invoke({...}), your dictionary flows through the prompt, the resulting messages hit the model, and the parser hands you back a clean string.
What makes this more than syntactic sugar is what happens behind the scenes. Call chain.stream({"concept": "generators"}) and tokens stream through the parser in real time — zero streaming logic on your part. Call chain.batch([...]) and inputs run in parallel. The composition is doing real work.
The Runnable Interface — invoke, stream, batch
Every LCEL chain — no matter how complex — exposes the same three methods. .invoke(input) runs the chain once with a single input and returns the result. .batch([input_1, input_2, ...]) runs the chain on multiple inputs concurrently, returning a list of results. .stream(input) yields output tokens one at a time as they arrive from the model.
The code below builds the same prompt | model | parser chain and demonstrates all three calling methods. The chain definition is identical each time — only the method changes. I find this uniform interface the core design win of LCEL: build once, run any way you need.
Every LCEL chain supports all three methods, no matter how complex. Building an API? Use stream so users see tokens arrive live. Processing a batch of documents? Use batch to parallelize. Simple script? Plain invoke.
RunnableSequence — Building Multi-Step Pipelines
When you write a | b | c, LangChain creates a RunnableSequence. Each step takes the output of the previous step as input. The sequence is itself a Runnable, so you can nest sequences inside other sequences.
The example below chains two separate pipelines: a joke generator that outputs a string and a joke analyzer that expects a {"joke": ...} dictionary. Because the types don't match, a RunnableLambda in the middle reshapes the output. Watch for this shape-bridging pattern — you will use it constantly in multi-chain LCEL compositions.
Notice the RunnableLambda in the middle. The joke chain outputs a plain string, but the analysis chain expects a dictionary with a "joke" key. The lambda reshapes the data to bridge that gap. LCEL is strict about input/output types, and lambdas are the glue.
RunnableParallel — Running Steps Side by Side
Sometimes you need to run multiple operations on the same input simultaneously. Maybe you want to translate text into three languages at once, or generate a summary and a list of key points in parallel. That's what RunnableParallel is for.
The code below creates three independent analysis chains (summary, keywords, sentiment) and wraps them in a RunnableParallel. When invoked, all three chains receive the same {text} input, run concurrently, and return their results as a dictionary with keys matching the names you chose.
I use this pattern heavily when building real applications. A common case: you have a user query and you need to both retrieve documents from a vector database AND classify the query intent before deciding how to respond. Running those in parallel instead of sequentially cuts your latency roughly in half.
LCEL uses the pipe operator to chain functions together. Implement a pipe function that takes a value and a list of functions, and applies each function in sequence — just like the | operator chains Runnables.
Write a function pipe(value, *functions) that:
1. Takes an initial value and any number of functions
2. Applies the first function to the value
3. Passes the result to the second function, and so on
4. Returns the final result
Then create three simple transformation functions and pipe them together.
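One possible solution, using `functools.reduce` to thread the value through each function in turn:

```python
from functools import reduce

def pipe(value, *functions):
    """Apply each function in sequence, feeding each result to the next."""
    return reduce(lambda acc, fn: fn(acc), functions, value)

# Three simple transformation functions to pipe together.
def strip_text(s):
    return s.strip()

def to_lower(s):
    return s.lower()

def word_count(s):
    return len(s.split())

result = pipe("  Hello LCEL World  ", strip_text, to_lower, word_count)
print(result)  # 3
```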
RunnableLambda — Injecting Custom Python Logic
Not every step in a chain is a prompt or a model call. Sometimes you need to clean text, extract a field, log intermediate results, or run arbitrary Python logic. RunnableLambda wraps any Python function into a Runnable so it slots into an LCEL chain.
The chain below uses two custom functions: clean_and_prepare at the front strips whitespace, lowercases the input, counts words, and returns a dictionary with text, word_count, and instruction keys for the prompt template. format_output at the end wraps the raw model response in a structured display with header lines. Both are plain Python functions that take one argument and return one value.
RunnablePassthrough — Forwarding and Augmenting Data
Here is a problem you will hit quickly: your chain needs some input data to pass through unchanged while also computing new fields. For example, in a RAG pipeline you want to pass the user's original question to the prompt while also retrieving relevant documents. RunnablePassthrough solves this.
The code below simulates a RAG pipeline. RunnablePassthrough.assign(context=...) keeps the original question key intact and adds a new context key by calling fake_retriever. The downstream prompt template receives both {question} and {context}, which is exactly the pattern every RAG chain in LangChain uses.
# Must manually construct the full dict
chain = (
    RunnableLambda(lambda x: {
        "question": x["question"],
        "context": fake_retriever(x["question"])
    })
    | prompt
    | model
    | parser
)

# Passthrough keeps existing keys, adds new ones
chain = (
    RunnablePassthrough.assign(
        context=lambda x: fake_retriever(x["question"])
    )
    | prompt
    | model
    | parser
)

RunnableBranch — Conditional Routing in Chains
Not every input should follow the same path. A chatbot might route billing questions to one chain and technical questions to another. RunnableBranch adds if/elif/else logic to LCEL pipelines — you define condition-chain pairs, and the first matching condition routes the input to its corresponding chain.
The example below classifies user questions by topic and routes them to specialized chains. RunnableBranch takes a list of (condition_function, runnable) tuples followed by a default runnable. Each condition receives the input and returns True or False. The first True condition wins.
The last argument to RunnableBranch is always the default — no condition function, just a Runnable. If none of the conditions match, the input goes here. This mirrors Python's if/elif/else structure.
Fallbacks and Retries — Building Resilient Chains
LLM APIs fail. Rate limits, timeouts, server errors — these are not edge cases, they are Tuesday. In my experience, any production LLM application that does not handle failures will crash within the first week. LCEL has built-in support for both retries and fallbacks.
Retries — Try Again on Transient Failures
.with_retry() wraps any Runnable with automatic retry logic. The code below configures the model to retry up to 3 times with exponential backoff and jitter. Exponential backoff means wait times grow (1s, 2s, 4s...), and jitter adds randomness so multiple failing clients don't hammer the API simultaneously.
Fallbacks — Switch to a Backup Model
When retries aren't enough — maybe the entire provider is down — you need a fallback to a different model. .with_fallbacks() takes a list of alternative Runnables. If the primary raises an exception, LangChain tries each fallback in order until one succeeds.
In production, I typically chain gpt-4o-mini to gpt-4o to a third provider so there is always a model available. The model switching tutorial covers multi-provider setups in depth.
RunnableParallel runs multiple functions on the same input and collects results into a dictionary. Implement a parallel_run function that mimics this behavior.
Write a function parallel_run(input_data, **functions) that:
1. Takes an input value and keyword arguments where each value is a function
2. Calls every function with the same input
3. Returns a dictionary mapping each keyword name to its function's result
Example: parallel_run(10, doubled=lambda x: x*2, squared=lambda x: x**2) returns {"doubled": 20, "squared": 100}
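One possible solution, a dict comprehension over the keyword arguments:

```python
def parallel_run(input_data, **functions):
    """Call every function with the same input; collect results by keyword name."""
    return {name: fn(input_data) for name, fn in functions.items()}

result = parallel_run(10, doubled=lambda x: x * 2, squared=lambda x: x ** 2)
print(result)  # {"doubled": 20, "squared": 100}
```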
Real-World Example: A Multi-Step Document Analyzer
This is the kind of pipeline I build most often at work. You have raw text — maybe loaded via document loaders — and you need a structured analysis report with summary, entities, and difficulty rating, all in one pass.
The pipeline below combines every LCEL concept from this tutorial into a single chain. It works in three stages: (1) a RunnableLambda preprocesses the raw text — stripping whitespace, counting words and characters; (2) RunnablePassthrough.assign forwards the cleaned data while running three parallel LLM analyses (summary, entity extraction, difficulty rating) via an implicit RunnableParallel; (3) a final RunnableLambda formats all results into a readable report.
Common Mistakes and How to Fix Them
Mistake 1: Input/Output Type Mismatch
This is the single most frequent LCEL error, and I see it in almost every first-time LCEL project. A prompt template expects a dictionary, but the previous step outputs a string. Or a model outputs an AIMessage, but you pipe it into a function expecting a string.
# StrOutputParser returns a string
chain_1 = prompt | model | StrOutputParser()
# ChatPromptTemplate expects a dict with keys
chain_2 = another_prompt | model | StrOutputParser()
# This fails: chain_1 output is a string, chain_2 input needs a dict
broken = chain_1 | chain_2  # TypeError!

chain_1 = prompt | model | StrOutputParser()
chain_2 = another_prompt | model | StrOutputParser()
# Bridge the gap with a lambda
fixed = (
    chain_1
    | RunnableLambda(lambda text: {"input": text})
    | chain_2
)

Mistake 2: Forgetting That StrOutputParser Discards Metadata
StrOutputParser extracts just the text content from an AIMessage. Token usage, finish reason, and all other metadata are gone after that point. If you need metadata downstream, skip StrOutputParser and extract what you need with a custom lambda.
Mistake 3: Using Lambda Expressions That Are Hard to Debug
Inline lambdas are convenient but invisible in stack traces. When a chain with five lambdas fails, the error says "error in <lambda>" — no indication of which one broke. For anything beyond trivial transforms, use named functions.
chain = (
    RunnableLambda(lambda x: x["text"].strip())
    | RunnableLambda(lambda x: {"query": x, "k": 5})
    | RunnableLambda(lambda x: retriever(x))  # which lambda failed?
)

def clean_text(x):
    return x["text"].strip()

def prepare_query(text):
    return {"query": text, "k": 5}

def retrieve_docs(params):
    return retriever(params)

chain = (
    RunnableLambda(clean_text)
    | RunnableLambda(prepare_query)
    | RunnableLambda(retrieve_docs)
)

Performance Tips and How LCEL Works Internally
I have never seen LCEL overhead be the bottleneck — the pipe operator creates a RunnableSequence at definition time, and .invoke() just loops through steps. The real performance wins come from how you structure your chains.
You can also fine-tune model behavior per-chain using .bind(). This attaches fixed kwargs — like stop sequences or response format — to a model without changing the model object itself. Useful when the same model appears in multiple chains with different constraints.
Implement a with_fallback function that wraps a primary function with fallback behavior — just like LCEL's .with_fallbacks() method.
Write a function with_fallback(primary, *fallbacks) that:
1. Returns a new function
2. When called, the new function tries the primary function first
3. If the primary raises any exception, it tries each fallback function in order
4. Returns the result of the first function that succeeds
5. If all functions fail, raises the last exception
Also implement with_retry(func, max_attempts=3) that:
1. Returns a new function that retries func up to max_attempts times
2. If all attempts fail, raises the last exception
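One possible solution for both helpers:

```python
def with_fallback(primary, *fallbacks):
    """Return a function that tries primary, then each fallback in order."""
    def wrapped(*args, **kwargs):
        last_exc = None
        for fn in (primary, *fallbacks):
            try:
                return fn(*args, **kwargs)
            except Exception as exc:
                last_exc = exc
        raise last_exc  # every function failed
    return wrapped

def with_retry(func, max_attempts=3):
    """Return a function that retries func up to max_attempts times."""
    def wrapped(*args, **kwargs):
        last_exc = None
        for _ in range(max_attempts):
            try:
                return func(*args, **kwargs)
            except Exception as exc:
                last_exc = exc
        raise last_exc  # all attempts failed
    return wrapped

def always_fails(x):
    raise RuntimeError("primary down")

safe = with_fallback(always_fails, lambda x: f"fallback: {x}")
print(safe("hi"))  # fallback: hi
```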
Frequently Asked Questions
Can I use LCEL with non-LangChain functions?
Yes. Wrap any Python callable in RunnableLambda and it becomes a full LCEL citizen with .invoke(), .batch(), and .stream() support. You can also use the @chain decorator for the same effect with cleaner syntax.
Is LCEL faster than calling components manually?
For a single sequential chain, LCEL and manual calls have nearly identical performance. The bottleneck is the LLM API call, not chain overhead. Where LCEL wins is RunnableParallel and .batch(), which parallelize work you would otherwise thread manually. LCEL also handles streaming propagation across multi-step chains automatically.
What is the difference between RunnableSequence and the old LLMChain?
LLMChain was the original LangChain API for combining a prompt and a model. It has been deprecated in favor of LCEL. The old LLMChain(prompt=prompt, llm=model) is now simply prompt | model. LCEL is more composable, more transparent (no hidden state), and supports streaming and batching natively.
How do I debug an LCEL chain that returns unexpected results?
Insert a RunnableLambda that prints intermediate values at the point where you suspect the issue. This is the chain equivalent of adding print statements.
For production debugging, use LangSmith. It captures the full trace of every chain execution — inputs, outputs, latency, and token usage at each step. It is the dedicated observability tool for LangChain apps.
When should I use RunnableBranch vs. a simple if/else?
Use RunnableBranch when the routing logic is part of a larger LCEL pipeline and you want streaming, batching, and tracing to work automatically. Use a plain if/else inside a RunnableLambda when the branching is simple and you prefer readability over LCEL integration.
Where to Go Next
You now have the full LCEL toolkit: pipe composition, parallel fan-out, custom lambdas, passthrough augmentation, conditional branching, and resilient fallbacks. Here are the natural next steps depending on what you are building.