LangChain Quickstart: Install, Configure, and Build Your First Chain in Python
You have already called the OpenAI API directly. It worked. But the moment you wanted to swap to Claude, add a prompt template, or chain two steps together, you ended up rewriting half your code. LangChain solves that problem — it gives every LLM the same interface, so switching providers is one line, not one afternoon.
What Is LangChain and Why Should You Use It?
LangChain is a Python framework that wraps LLM providers behind a single, consistent interface. Instead of learning OpenAI's client, Anthropic's client, and Google's client separately, you learn one pattern — model.invoke(messages) — and LangChain handles the rest.
I started using LangChain when a project required testing the same prompt across three providers. Without it, I had three completely different code paths. With LangChain, the only thing that changed was the model class name.
The framework is split into several packages. langchain-core provides the base abstractions — message types, the Runnable interface, prompt templates. Provider packages like langchain-openai and langchain-anthropic implement those abstractions for specific APIs. The main langchain package ties everything together with higher-level utilities.
The key benefit is the Runnable interface — every LLM, prompt template, and output parser implements the same three methods: .invoke(), .stream(), and .batch(). Once you learn this pattern, you know how to call anything in LangChain.
Installation and API Key Configuration
I recommend starting with OpenAI since it is the most widely used, but the setup pattern is identical for all providers. The entire installation is one pip command.
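Assuming a recent Python environment, the install might look like this (add other provider packages, such as langchain-anthropic or langchain-ollama, as you need them):

```shell
pip install langchain langchain-openai
```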
Every LLM provider needs an API key. LangChain reads these from environment variables by default, so you set them once and every model instance picks them up automatically.
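One way to set the key from inside Python, using a placeholder value (substitute your real key from the provider dashboard; in production, prefer exporting the variable in your shell or using a .env file):

```python
import os

# Placeholder key for illustration only -- replace with your real key.
# Every LangChain OpenAI model instance will pick this up automatically.
os.environ["OPENAI_API_KEY"] = "sk-your-key-here"
```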
With the key set, your first LangChain call takes three lines. ChatOpenAI wraps the OpenAI chat completions API, temperature=0 makes responses deterministic, and .invoke() sends the prompt and returns an AIMessage object containing the model's reply.
The model returns an AIMessage object, not a raw string. The .content attribute holds the text. This message object also carries metadata — token counts, model name, and response headers — which matters when you start tracking costs.
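A minimal sketch of that first call, assuming langchain-openai is installed and OPENAI_API_KEY is set (the model name is just an example):

```python
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
response = model.invoke("What is LangChain in one sentence?")

print(response.content)         # the reply text
print(response.usage_metadata)  # token counts, on recent langchain-core versions
```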
The Runnable Interface: invoke, stream, and batch
This is the single most important concept in LangChain. Every component — models, prompt templates, output parsers, chains — implements the Runnable interface with three methods: .invoke() for single calls, .batch() for parallel processing, and .stream() for token-by-token output.
The next block demonstrates invoke() for a single question and batch() for sending three questions concurrently. With batch(), LangChain fires all requests at once instead of waiting for each to finish, cutting total time to roughly a single call's latency.
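A sketch of both call styles, under the same setup assumptions as above (model name and questions are illustrative):

```python
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Single call
answer = model.invoke("What is the capital of France?")
print(answer.content)

# Three questions sent concurrently -- total time is roughly one call's latency
questions = [
    "What is the capital of France?",
    "What is the capital of Japan?",
    "What is the capital of Brazil?",
]
answers = model.batch(questions)
for a in answers:
    print(a.content)
```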
I reach for batch() any time I need to process more than two or three inputs. The speed difference is dramatic — fifty summarization requests finish in the time of one sequential call.
Streaming is where LangChain shines for user-facing applications. Instead of waiting for the full response, .stream() yields AIMessageChunk objects as tokens arrive from the provider. The first chunk typically shows up within 200-400ms, making your app feel responsive even when the full response takes seconds.
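A streaming sketch under the same assumptions (flush=True keeps the output appearing immediately in the terminal):

```python
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

for chunk in model.stream("Explain recursion in two sentences."):
    # Each chunk is an AIMessageChunk; .content carries the newly arrived text
    print(chunk.content, end="", flush=True)
print()
```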
Messages: How LangChain Structures LLM Input
When you passed a plain string to model.invoke() above, LangChain silently wrapped it in a HumanMessage. For anything beyond a single prompt — system instructions, multi-turn conversations, tool results — you build message lists explicitly.
The next block creates a two-message list: a SystemMessage that sets the model's persona and a HumanMessage with the user's question. LangChain sends both to the provider in a single API call. If you have used the OpenAI SDK directly, these map to the role field in chat completions — but as typed Python objects instead of raw dictionaries.
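The two-message pattern might look like this (persona and question are illustrative):

```python
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

messages = [
    SystemMessage(content="You are a patient Python tutor."),   # role: system
    HumanMessage(content="What is a list comprehension?"),      # role: user
]
response = model.invoke(messages)  # both messages go in one API call
print(response.content)
```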
For multi-turn conversations, you include the full history as a message list. Each AIMessage represents a previous model response. The model does not remember prior calls — you feed the entire conversation back every time, and the model uses that context to generate a coherent reply.
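A multi-turn sketch; the AIMessage content here is a stand-in for whatever the model actually replied on the previous turn:

```python
from langchain_core.messages import AIMessage, HumanMessage, SystemMessage
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

history = [
    SystemMessage(content="You are a patient Python tutor."),
    HumanMessage(content="What is a list comprehension?"),
    AIMessage(content="A list comprehension builds a list in a single "
                      "expression, e.g. [x * 2 for x in range(5)]."),
    HumanMessage(content="Can you add a condition to that example?"),
]
# The full history goes back on every call; the model has no memory of its own
response = model.invoke(history)
print(response.content)
```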
The Model Abstraction: Swap Providers in One Line
This is where LangChain earns its keep. The block below creates three model instances — OpenAI, Claude, and Ollama — then runs the exact same prompt through all three using .invoke(). Every model returns an AIMessage with a .content attribute. Your application code never touches provider-specific logic.
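A sketch of the three-provider comparison; the model names are examples, and each provider needs its package installed (and, for the cloud providers, an API key set):

```python
from langchain_anthropic import ChatAnthropic
from langchain_ollama import ChatOllama
from langchain_openai import ChatOpenAI

models = {
    "openai": ChatOpenAI(model="gpt-4o-mini"),
    "anthropic": ChatAnthropic(model="claude-3-5-sonnet-20241022"),
    "ollama": ChatOllama(model="llama3.2"),
}

prompt = "Summarize what an LLM is in one sentence."
for name, model in models.items():
    response = model.invoke(prompt)       # identical call for every provider
    print(f"{name}: {response.content}")  # identical AIMessage shape back
```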
In my projects, I keep the model class in a configuration file. When a client asks to switch from GPT-4o to Claude, I change one config value. The rest of the application — prompts, chains, output parsing — stays untouched.
Ollama runs entirely on your machine — no API key, no network calls, no usage costs. The block below shows how to configure it: pull a model with the ollama pull command, then create a ChatOllama instance pointing at the default local URL. The returned AIMessage has the same structure as any cloud provider's response.
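An Ollama sketch; the model name is an example, and it assumes the Ollama server is running locally:

```python
# First, in a terminal: ollama pull llama3.2
from langchain_ollama import ChatOllama

local_model = ChatOllama(
    model="llama3.2",                   # any model you have pulled locally
    base_url="http://localhost:11434",  # Ollama's default local URL
)
response = local_model.invoke("Why is the sky blue?")
print(response.content)  # same AIMessage structure as any cloud provider
```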
Practice exercise: LangChain uses message objects to structure LLM input. Write a pure Python function format_messages(system_prompt, user_message) that returns a list of dictionaries in the OpenAI chat format. Each dictionary must have "role" and "content" keys.
The function should return a list with exactly two dictionaries:
1. A system message with role set to "system"
2. A user message with role set to "user"
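One possible solution to the exercise, in pure Python:

```python
def format_messages(system_prompt, user_message):
    """Build an OpenAI-style chat message list from two strings."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message},
    ]

messages = format_messages("You are a helpful assistant.", "What is Python?")
print(messages)
```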
Prompt Templates: Reusable, Dynamic Prompts
The template below defines a system message with a {topic} variable and a human message with a {question} variable. Calling .invoke() on the template fills in both placeholders and returns a ChatPromptValue — formatted messages ready for a model. The template never calls an LLM itself; it only prepares the input.
For simpler cases where you do not need a system message, from_template() creates a single human message template from one string.
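Both template styles might look like this; the topic and question values are illustrative, and only langchain-core is required since no model is called:

```python
from langchain_core.prompts import ChatPromptTemplate

# System + human template with two variables
template = ChatPromptTemplate.from_messages([
    ("system", "You are an expert on {topic}."),
    ("human", "{question}"),
])
prompt_value = template.invoke({
    "topic": "Python packaging",
    "question": "What does pip install -e do?",
})
print(prompt_value.to_messages())  # formatted messages, ready for a model

# Simpler case: a single human message, no system prompt
simple = ChatPromptTemplate.from_template("Summarize this: {text}")
```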
Build Your First Chain with the Pipe Operator
A chain connects multiple Runnables together. In LangChain, you compose chains using the pipe operator | — pronounced "pipe into." The output of the left side becomes the input of the right side, exactly like Unix pipes.
The block below builds a three-step chain: a ChatPromptTemplate that formats the input, a ChatOpenAI model that generates a response, and a StrOutputParser that extracts the .content string from the AIMessage. The dictionary {"concept": "list comprehensions"} enters the prompt, flows through the model, and comes out as a plain Python string.
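A sketch of that three-step chain, under the same setup assumptions as earlier (prompt wording is illustrative):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

prompt = ChatPromptTemplate.from_template("Explain {concept} in one paragraph.")
model = ChatOpenAI(model="gpt-4o-mini")
parser = StrOutputParser()

chain = prompt | model | parser  # each step's output feeds the next

result = chain.invoke({"concept": "list comprehensions"})
print(result)  # a plain Python string, not an AIMessage
```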
That prompt | model | parser line is LCEL — the LangChain Expression Language. I find this the most elegant part of the framework. Because every component is a Runnable, the chain itself is also a Runnable, which means it automatically supports .invoke(), .stream(), and .batch(). To learn more about advanced chain patterns, see our LCEL deep-dive tutorial.
The next block streams the chain's output token by token (each chunk is a string fragment thanks to the StrOutputParser), then batch-processes three concepts in parallel. You get streaming and concurrency for free — no extra configuration needed.
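A self-contained sketch of streaming and batching the same chain (concepts chosen for illustration):

```python
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

chain = (
    ChatPromptTemplate.from_template("Explain {concept} in one paragraph.")
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)

# Streaming: each chunk is already a string fragment thanks to StrOutputParser
for chunk in chain.stream({"concept": "decorators"}):
    print(chunk, end="", flush=True)
print()

# Batching: three concepts processed in parallel, no extra configuration
results = chain.batch([
    {"concept": "generators"},
    {"concept": "context managers"},
    {"concept": "dataclasses"},
])
```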
Real-World Example: A Translation Chain That Works with Any Provider
Here is a practical scenario: you are building a translation feature for an app. You want a clean function that takes text, a source language, and a target language — and returns the translation. The function should work with any LLM provider without changing the translation logic.
The chain below uses a system message to enforce translator behavior, a human message template with three variables, and a StrOutputParser to return clean text. The prompt tells the model to output only the translation — no commentary, no explanation.
To switch the translation chain to Claude, you replace the model assignment with ChatAnthropic. The translate_prompt and parser variables are reused as-is. The chain is reassembled with the new model in the middle, and the calling code stays identical — same .invoke(), same dictionary of variables, same string output.
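The translation chain and the provider swap might look like this; prompt wording, function signature, and model names are illustrative:

```python
from langchain_anthropic import ChatAnthropic
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

translate_prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a professional translator. "
               "Output only the translation, with no commentary."),
    ("human", "Translate this text from {source_lang} to {target_lang}:\n{text}"),
])
parser = StrOutputParser()

model = ChatOpenAI(model="gpt-4o-mini", temperature=0)
translation_chain = translate_prompt | model | parser

def translate(text, source_lang, target_lang):
    return translation_chain.invoke({
        "text": text,
        "source_lang": source_lang,
        "target_lang": target_lang,
    })

print(translate("Good morning", "English", "Spanish"))

# Swapping providers: only the model changes, the rest is reused as-is
model = ChatAnthropic(model="claude-3-5-sonnet-20241022")
translation_chain = translate_prompt | model | parser
```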
This pattern — define the chain once, swap the model as needed — is how production LangChain applications are structured. The business logic lives in the chain definition. The model choice lives in configuration.
Response Metadata and Token Tracking
When you skip the StrOutputParser and work with the raw AIMessage, you get access to metadata that matters for production — token counts, model information, and timing. I always check token counts during development because costs can surprise you. For a deeper look at managing API spend, see our LLM API costs guide.
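A metadata sketch; the attribute names below exist on recent langchain-core versions, and the exact fields reported vary by provider:

```python
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")
response = model.invoke("Define recursion in one sentence.")

# Standardized token counts, when the provider reports them, e.g.
# {'input_tokens': ..., 'output_tokens': ..., 'total_tokens': ...}
print(response.usage_metadata)

# Provider-specific details such as the exact model name used
print(response.response_metadata.get("model_name"))
```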
Practice exercise: LangChain chains pass data through a sequence of steps. Simulate this pattern in pure Python.
Write a function run_chain(steps, initial_input) that takes a list of functions and an initial input value. It should pass the input through each function in order, where each function's output becomes the next function's input. Return the final result.
For example, if steps = [step_a, step_b] and initial_input = "hello", the function should compute step_b(step_a("hello")).
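One possible solution, mirroring how a chain threads each step's output into the next step's input:

```python
def run_chain(steps, initial_input):
    """Pass the input through each step; each output feeds the next step."""
    result = initial_input
    for step in steps:
        result = step(result)
    return result

# Example: uppercase the text, then wrap it in brackets
step_a = str.upper
step_b = lambda s: f"[{s}]"
print(run_chain([step_a, step_b], "hello"))  # [HELLO]
```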
Practice exercise: LangChain prompt templates fill placeholder variables into strings at runtime. Write a pure Python function render_template(template_str, variables) that replaces {variable_name} placeholders in a template string with values from a dictionary.
The function should:
1. Take a template string containing {placeholder} markers
2. Take a dictionary mapping placeholder names to values
3. Return the filled-in string
4. Raise a KeyError if a placeholder in the template has no matching key in the dictionary
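One possible solution using a regular expression; looking up each matched name in the dictionary raises KeyError naturally when a key is missing:

```python
import re

def render_template(template_str, variables):
    """Replace each {name} placeholder with variables[name].

    Raises KeyError if a placeholder has no matching key.
    """
    def substitute(match):
        return str(variables[match.group(1)])  # KeyError if the key is absent
    return re.sub(r"\{(\w+)\}", substitute, template_str)

print(render_template(
    "Translate {text} to {language}",
    {"text": "Hello", "language": "Spanish"},
))  # Translate Hello to Spanish
```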
Common Mistakes and How to Fix Them
These are the mistakes I see most often when developers pick up LangChain for the first time. Each one wastes at least an hour if you do not know the fix.
Mistake 1: Forgetting the API Key Environment Variable
```python
from langchain_openai import ChatOpenAI

# Forgot to set OPENAI_API_KEY
model = ChatOpenAI(model="gpt-4o-mini")
response = model.invoke("Hello")
# Error: AuthenticationError: No API key provided
```

The fix:

```python
import os
os.environ["OPENAI_API_KEY"] = "sk-your-key"

from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")
response = model.invoke("Hello")
print(response.content)  # Works
```

Mistake 2: Printing the AIMessage Instead of .content
```python
response = model.invoke("What is Python?")
print(response)
# content='Python is a...' additional_kwargs={} ...
# Messy output with metadata you don't want
```

The fix:

```python
response = model.invoke("What is Python?")
print(response.content)
# Python is a high-level programming language...
```

Mistake 3: Using the Wrong Import Path
LangChain restructured its packages when moving to v0.2. Older tutorials and Stack Overflow answers use import paths that are deprecated or broken. Here is the mapping you need.
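The most common remappings look roughly like this, with the deprecated pre-0.2 path shown as a comment above each current import (verify against the current docs if a path has moved again in a newer release):

```python
# Old: from langchain.chat_models import ChatOpenAI
from langchain_openai import ChatOpenAI

# Old: from langchain.prompts import ChatPromptTemplate
from langchain_core.prompts import ChatPromptTemplate

# Old: from langchain.schema import HumanMessage, SystemMessage
from langchain_core.messages import HumanMessage, SystemMessage

# Old: from langchain.schema.output_parser import StrOutputParser
from langchain_core.output_parsers import StrOutputParser
```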
Mistake 4: Missing Template Variables in invoke()
```python
template = ChatPromptTemplate.from_template(
    "Translate {text} to {language}"
)

# Forgot to include "language"
result = template.invoke({"text": "Hello"})
# KeyError: 'language'
```

The fix:

```python
template = ChatPromptTemplate.from_template(
    "Translate {text} to {language}"
)

result = template.invoke({
    "text": "Hello",
    "language": "Spanish",
})
print(result)  # Works
```

Quick Error Reference
These four error messages cover most first-week frustrations. Bookmark this section.
1. AuthenticationError: No API key provided. Fix: set OPENAI_API_KEY before creating the model (Mistake 1).
2. A messy dump full of additional_kwargs and metadata instead of plain text. Fix: print response.content, not the whole AIMessage (Mistake 2).
3. ModuleNotFoundError or deprecation warnings on import. Fix: update to the v0.2 import paths (Mistake 3).
4. KeyError: 'variable_name'. Fix: pass every template variable in the dictionary you give .invoke() (Mistake 4).
LangChain vs Direct SDK: When to Use What
The most common question I hear is whether LangChain is worth the extra dependency. The answer depends on what you are building. A single-provider script with one API call does not need LangChain. A multi-step pipeline with prompt templates and provider switching absolutely does.
```python
# Single provider, simple call, no chaining
from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello"}],
)
print(response.choices[0].message.content)
# Simpler, fewer dependencies, full control
```

```python
# Multi-provider, templates, chains, output parsing
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

chain = (
    ChatPromptTemplate.from_template("Summarize: {text}")
    | ChatOpenAI(model="gpt-4o-mini")
    | StrOutputParser()
)
# Swap model, add steps, stream, batch — all free
```

Frequently Asked Questions
What is the difference between langchain and langchain-core?
langchain-core contains the base abstractions — the Runnable interface, message types, prompt templates, and output parsers. It has minimal dependencies. langchain builds on top of langchain-core and adds higher-level constructs like chains, agents, and retrieval utilities. For the patterns in this tutorial, langchain-core plus a provider package is enough.
Can I use async/await with LangChain?
Yes. Every Runnable provides async versions of all methods: ainvoke(), astream(), and abatch(). Use them in async frameworks like FastAPI or when you need concurrent LLM calls without thread pools.
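A sketch of two concurrent calls with ainvoke(), assuming the same setup as earlier:

```python
import asyncio

from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")

async def main():
    # Two LLM calls running concurrently, no thread pool needed
    first, second = await asyncio.gather(
        model.ainvoke("Define 'latency' in one sentence."),
        model.ainvoke("Define 'throughput' in one sentence."),
    )
    print(first.content)
    print(second.content)

asyncio.run(main())
```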
How do I handle rate limits and retries?
LangChain's ChatOpenAI has built-in retry logic with exponential backoff for transient errors. You can configure it with max_retries (default is 2). For more control, use .with_retry() available on any Runnable.
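Both options might look like this (retry counts are illustrative):

```python
from langchain_openai import ChatOpenAI

# Built-in retries with exponential backoff on the client itself
model = ChatOpenAI(model="gpt-4o-mini", max_retries=3)

# Or wrap any Runnable with retry behavior
retrying_model = model.with_retry(stop_after_attempt=3)
response = retrying_model.invoke("Hello")
```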
Should I use LangChain or LangGraph for agents?
LangGraph. The LangChain team now recommends LangGraph for all agent workflows because it gives you explicit control over state, branching, and loops. LangChain itself is best for the linear chain patterns covered in this tutorial — prompt templates, model calls, and output parsing.
What to Learn Next
You now have LangChain's core pattern: create a Runnable, call .invoke(), chain with |. Every advanced feature builds on this foundation. Here is where to go depending on what you are building.
Complete Code
Here is a single script combining all the core concepts — model creation, message formatting, prompt templates, chain composition, streaming, and token tracking. Copy this as your starting point and modify it.
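A sketch of such a script, assuming langchain and langchain-openai are installed and OPENAI_API_KEY is set; the model name and prompts are illustrative:

```python
"""LangChain quickstart -- the core concepts in one script."""
from langchain_core.messages import HumanMessage, SystemMessage
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

# 1. Model creation
model = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# 2. Message formatting
messages = [
    SystemMessage(content="You are a concise technical writer."),
    HumanMessage(content="What is LangChain?"),
]
response = model.invoke(messages)
print(response.content)

# 3. Prompt template + chain composition (LCEL)
prompt = ChatPromptTemplate.from_template("Explain {concept} in one paragraph.")
chain = prompt | model | StrOutputParser()
print(chain.invoke({"concept": "list comprehensions"}))

# 4. Streaming, token by token
for chunk in chain.stream({"concept": "decorators"}):
    print(chunk, end="", flush=True)
print()

# 5. Batch: two concepts in parallel
for result in chain.batch([{"concept": "generators"}, {"concept": "iterators"}]):
    print(result[:80], "...")

# 6. Token tracking (skip the parser to keep the raw AIMessage)
raw = model.invoke("Define recursion in one sentence.")
print(raw.usage_metadata)
```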