
LangChain Prompt Templates: Dynamic Prompts with Variables and Chat History

Intermediate · 60 min · 3 exercises · 55 XP

Every real LLM application sends the same prompt structure over and over — a system instruction, some context, maybe a few examples — with only the specifics changing each time. Hard-coding those prompts as f-strings works for a demo. It falls apart the moment you need to swap models, track history, or share prompt logic across a team. LangChain prompt templates fix this by turning prompts into composable, reusable objects.

What Are Prompt Templates and Why Use Them?

A prompt template is a reusable blueprint for generating prompts. You define it once with placeholder variables, then fill in those variables at call time.

I spent a lot of time early on building LLM apps with raw f-strings. It worked until I had three endpoints using slightly different versions of the same prompt. A single wording change meant editing code in three places. Templates eliminate that problem entirely.

Here is the core idea — a hard-coded prompt vs. a template:

Hard-coded f-string
topic = "recursion"
difficulty = "beginner"

prompt = f"""You are a Python tutor.
Explain {topic} at a {difficulty} level.
Use one concrete example."""

print(prompt)
LangChain template
from langchain_core.prompts import ChatPromptTemplate

template = ChatPromptTemplate.from_messages([
    ("system", "You are a Python tutor."),
    ("human", "Explain {topic} at a {difficulty} level. Use one concrete example."),
])

prompt = template.invoke({"topic": "recursion", "difficulty": "beginner"})
print(prompt.to_string())

Both produce the same text. The difference shows up when your application grows. The template version validates its inputs, composes with other LangChain components via the pipe operator, and separates prompt structure from data.

ChatPromptTemplate — The Core Building Block

Creating a ChatPromptTemplate
Input variables: ['language', 'question']

LangChain scans the template strings and extracts every {variable_name} it finds. When you call .invoke(), it requires exactly those keys. Pass fewer and you get a KeyError; pass extra and they are silently ignored.

To produce the final messages, call .invoke() with a dictionary:

[system] You are an expert Python developer. Answer concisely.
[human] What is the difference between a list and a tuple?

Each message carries a .type property ("system", "human", or "ai") and a .content string. This maps directly to the message format every LLM provider expects.

Message Types and Role Tuples

You have already seen "system" and "human". The third role — "ai" — unlocks few-shot examples by embedding previous model responses directly into the template.

Few-shot examples with the "ai" role
[system] You are a helpful coding assistant.
[human] What does len() do?
[ai] len() returns the number of items in a container.
[human] What does enumerate() do?

The "ai" messages act as few-shot examples. The model sees a previous Q&A exchange and matches that style for the new question. This is one of the simplest ways to steer output format without writing elaborate system instructions.


Exercise 1: Build a Translation Prompt
Write Code

Create a function called build_translation_prompt that takes source_lang (str), target_lang (str), and text (str) as arguments.

The function should simulate what ChatPromptTemplate does: format and return the system message string "You are a professional translator from {source_lang} to {target_lang}.".

Since LangChain cannot run in the browser, use pure Python string formatting.

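If you want to check your work, one pure-Python solution that satisfies the spec:

```python
def build_translation_prompt(source_lang: str, target_lang: str, text: str) -> str:
    # `text` mirrors the template's inputs; the system message itself
    # only uses the two language variables.
    return f"You are a professional translator from {source_lang} to {target_lang}."
```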

Partial Variables and Default Values

Sometimes you know one variable at deploy time but another only at call time. You might fix the output language when you configure the app but let users provide the question. LangChain's .partial() method handles this cleanly.

Partial template binding
Always respond in English.

After calling .partial(), the returned template no longer requires the bound variable. You get a new template object — the original stays unchanged. I use this constantly for multi-tenant apps where each tenant has different system instructions but the same prompt skeleton.

You can also bind a callable that runs at invoke time. A common pattern is injecting the current date:


The date updates automatically on every call. No need to thread it through your application code.

MessagesPlaceholder — Injecting Chat History

This is the feature that makes prompt templates genuinely powerful for conversational apps. A MessagesPlaceholder reserves a slot where a list of messages gets injected at runtime.

Without it, you would need to manually concatenate history strings and track role labels yourself. With MessagesPlaceholder, you pass typed message objects and LangChain inserts them in the right position.

Injecting conversation history
[system] You are a helpful assistant. Be concise.
[human] What is Python?
[ai] A high-level programming language.
[human] What is it used for?
[ai] Web dev, data science, AI, and scripting.
[human] Which area has the most job openings?

The history messages slot in between the system message and the new user message. That is exactly where every LLM provider expects conversation context. The template stays the same whether the history has 0 or 50 messages.


Exercise 2: Simulate MessagesPlaceholder
Write Code

Create a function called build_chat_messages that takes:

  • system_msg (str) — the system instruction
  • history (list of tuples) — each tuple is (role, content)
  • user_input (str) — the new user message
Return a list of (role, content) tuples in this order: system message, then all history messages, then the new user message.

This simulates what ChatPromptTemplate with MessagesPlaceholder produces.
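One pure-Python solution, if you want to compare against your own:

```python
def build_chat_messages(system_msg, history, user_input):
    # System first, then the prior turns, then the new user message —
    # the same ordering MessagesPlaceholder produces.
    return [("system", system_msg)] + list(history) + [("human", user_input)]

messages = build_chat_messages(
    "Be concise.",
    [("human", "What is Python?"), ("ai", "A programming language.")],
    "What is it used for?",
)
print(messages)
```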

Composing Templates — Piping into an LLM

A prompt template on its own just produces message objects. The real payoff comes when you pipe it into a model and an output parser using LCEL.

Template | Model | Parser chain

The | operator connects each step. Data flows left to right: the template produces messages, the model generates a response, and the parser extracts the text. The entire chain is itself a Runnable — you can invoke, stream, or batch it.

This is where templates earn their keep. Want to swap gpt-4o-mini for Claude or Gemini? The prompt template stays identical. You only change the model object. The template does not care which provider consumes the messages.

Building a Reusable Prompt Library

In any serious LLM project, you end up with dozens of prompts for different tasks. Scattering template definitions across your codebase makes them hard to find and easy to accidentally duplicate. A clean pattern is to centralize all templates in one module.

Centralized prompt library module

Every developer on the team imports from the same prompts.py. Changing a system instruction means editing one file, not hunting through endpoint handlers. I have seen teams with 40+ endpoints where each had its own inline prompt — maintaining that was a nightmare.

Real-World Example: Multi-Language Code Reviewer

Here is a practical application combining templates, variables, history, and chains. We will build a code review assistant that accepts a programming language, a code snippet, and optional conversation history for follow-up questions.

Code reviewer template

The template uses {language} in both the system message and the code fence. The MessagesPlaceholder with optional=True means the first review works without history.

First review — no history

The model spots the division-by-zero risk and likely suggests a dictionary dispatch pattern.

For a follow-up, pass the previous exchange as history:

Follow-up with history

The model has the full context of the original review. It generates the refactored version without needing the code repeated. The template handles conversation continuity while your application code stays clean.

Common Mistakes and How to Fix Them

1. Forgetting to Escape Literal Braces

LangChain uses {variable} syntax for placeholders. If your prompt contains literal curly braces — JSON examples, dictionary literals — you must double them.

Breaks — LangChain sees {name} as a variable
template = ChatPromptTemplate.from_messages([
    ("human", 'Return JSON like: {"name": "value"}'),
])
# KeyError: 'name'
Works — doubled braces become literals
template = ChatPromptTemplate.from_messages([
    ("human", 'Return JSON like: {{"name": "value"}}'),
])
# Produces: Return JSON like: {"name": "value"}

2. Passing a String to MessagesPlaceholder

The MessagesPlaceholder variable expects a list of message objects. Passing a raw string gives a confusing type error.

3. Variable Name Typos

If your template uses {user_question} but you pass {"question": "..."}, LangChain raises a KeyError. Always check template.input_variables to see the exact keys it expects.

4. Putting MessagesPlaceholder in the Wrong Position

History should go between the system message and the current human message. Placing it after the human message means the model sees the question before the context.

Exercise 3: Partial Variable Binding
Write Code

Create a function called make_partial_prompt that takes base_template (str with {style} and {topic} placeholders) and style (str).

Return a new function that accepts only topic (str) and returns the fully formatted string.

This simulates LangChain's .partial() method: binding one variable early and the other later.
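One pure-Python solution using a closure, if you want to compare:

```python
def make_partial_prompt(base_template: str, style: str):
    # Bind `style` now, like LangChain's .partial(); `topic` arrives later.
    def fill(topic: str) -> str:
        return base_template.format(style=style, topic=topic)
    return fill

haiku = make_partial_prompt("Write a {style} poem about {topic}.", "haiku")
print(haiku("autumn"))  # Write a haiku poem about autumn.
```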

Frequently Asked Questions

Can I load prompt templates from YAML or JSON files?

LangChain supports loading templates from files using load_prompt() from langchain_core.prompts. However, the YAML schema for ChatPromptTemplate can be brittle across versions. I prefer defining templates in Python modules — you get IDE autocompletion, type checking, and readable version control diffs.

How do I use Jinja2 syntax instead of f-string syntax?

Pass template_format="jinja2" when creating a PromptTemplate. This gives you {{ variable }} syntax plus conditionals and loops. Note that ChatPromptTemplate.from_messages() does not support Jinja2 directly — you need ChatPromptTemplate.from_template() for individual message strings.
List 3 facts about Python.

Is there a limit on how many messages a template can hold?

LangChain itself has no hard limit. The constraint comes from the model's context window. A template with few-shot examples and a MessagesPlaceholder can easily produce hundreds of messages. The template will not complain, but the model will reject anything beyond its token limit.

Always pair your templates with a history-trimming strategy in production.

Do prompt templates work with non-OpenAI models?

Yes. Prompt templates are model-agnostic. ChatPromptTemplate produces BaseMessage objects that every LangChain chat model accepts — ChatOpenAI, ChatAnthropic, ChatGoogleGenerativeAI, ChatOllama, and others. Write the prompt once, pipe it into any model.

References

  • LangChain documentation — Prompt Templates
  • LangChain documentation — ChatPromptTemplate API Reference
  • LangChain documentation — MessagesPlaceholder
  • LangChain documentation — LCEL (LangChain Expression Language)
  • Harrison Chase — "Building LLM Applications with LangChain" (LangChain Blog, 2024)
  • OpenAI documentation — Chat Completions API