LangChain Tools: Build Custom Tools and Connect LLMs to External Services
An LLM can write a beautiful explanation of today's weather — but it has no idea what the actual temperature is. It can describe how to query a database — but it cannot run the query. The gap between knowing how and actually doing is what LangChain tools close. By the end of this tutorial, you'll know how to give an LLM the ability to call any Python function, hit any API, and return structured results — all with type safety and error handling baked in.
What Are LangChain Tools and Why Do They Matter?
A LangChain tool is a Python function wrapped in metadata that tells the LLM what the function does, what inputs it expects, and when to use it. The LLM reads this metadata, decides whether to call the tool, generates the correct arguments, and your code executes the function with those arguments.
To follow along, you'll need Python 3.10+, langchain 0.3+, and langchain-openai (pip install langchain langchain-openai langchain-community).
I think of it this way: you're giving the LLM a menu of capabilities. Each tool is a menu item with a name, a description, and an order form (the input schema). The LLM reads the menu, picks what it needs, fills out the order form, and your code does the actual work.
Here's a minimal example. We create a tool that multiplies two numbers, then bind it to a model so the LLM can decide to use it:
Running this prints something like:
Notice what happened: the LLM did not compute 17 * 28 itself. It recognized that a multiply tool exists, extracted the arguments from the natural language question, and returned a structured tool call. Your code then executes multiply(17, 28) to get the actual answer.
The power here is composition. You can give the LLM ten tools — a calculator, a weather API, a database query function, a web search — and the LLM picks the right one based on the user's question. That is the foundation of AI agents.
The @tool Decorator — The Fast Way to Create Tools
The @tool decorator is the quickest way to turn any Python function into a LangChain tool. I reach for it 90% of the time because it requires zero boilerplate — you just write a normal function with type hints and a docstring.
LangChain reads three things from your decorated function: the function name becomes the tool name, the docstring becomes the description the LLM sees, and the type annotations become the input schema. All three matter — if the docstring is vague, the LLM won't know when to use the tool.
Which produces:
The schema was generated automatically from the text: str type hint. The LLM receives this schema and knows it must provide a string argument called text.
Tools with Multiple Parameters and Defaults
Tools can accept any number of typed parameters. Optional parameters with defaults work exactly as you'd expect:
The LLM sees category as optional in the schema. If the user says "find me some books about Python," the LLM might call search_products(query="Python", category="books"). If the user just says "find me something about cooking," it might omit the category entirely.
StructuredTool — Full Control with Pydantic Schemas
The @tool decorator is convenient, but sometimes you need more control. Maybe you want to validate inputs before the function runs, add field-level descriptions that are richer than what docstrings allow, or define the tool dynamically at runtime. That's where StructuredTool comes in.
StructuredTool lets you define the input schema as a Pydantic model, giving you full validation and type coercion. Here is the same multiply tool from earlier, rebuilt with StructuredTool:
The result:
The key difference: each field in MultiplyInput has its own description, which goes straight into the JSON schema the LLM sees. For complex tools with many parameters, these per-field descriptions dramatically improve the LLM's accuracy in generating correct arguments.
When to Use @tool vs StructuredTool
With @tool, everything is inferred from the function itself:

@tool
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: 22°C, sunny"
# Name, description, schema all auto-generated

With StructuredTool, you spell out each piece explicitly:

class WeatherInput(BaseModel):
    city: str = Field(description="City name, e.g. 'London'")
    units: str = Field(default="celsius", description="Temperature unit")

weather_tool = StructuredTool.from_function(
    func=get_weather_func,
    name="get_weather",
    description="Get weather for a city.",
    args_schema=WeatherInput,
)

My rule of thumb: start with @tool. Move to StructuredTool when you need input validation, per-field descriptions, or dynamic tool creation (e.g., generating tools from a config file at startup).
Binding Tools to Models and Executing Tool Calls
Creating a tool is only half the story. You need to bind it to a chat model so the LLM knows the tool exists, and then execute the tool call when the LLM requests it. This two-step dance is where most beginners get tripped up.
Running this gives:
The LLM chose get_word_count over multiply because the question was about counting words, not multiplication. It parsed the quoted text as the argument. This is the core loop of tool-augmented LLMs: ask the LLM, execute the tool, and optionally feed the result back for a final answer.
Feeding Tool Results Back to the LLM
In most real applications, you want the LLM to incorporate the tool result into a natural-language answer. To do this, you send the tool result back as a ToolMessage and invoke the model again:
The model responds with something like:
This round-trip pattern — human message, AI tool call, tool result, AI final answer — is the same loop that powers ChatGPT plugins, Claude's tool use, and every LangChain agent.
Practice exercise: Write a function dispatch_tool(tool_name, tool_map, args) that looks up a tool by name in a dictionary, calls it with the given args dictionary, and returns the result as a string. If the tool name is not found, return "Error: Tool 'X' not found" where X is the tool name.
This simulates the core of a tool execution loop in a LangChain agent.
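One possible solution, using plain callables as stand-ins for LangChain tools:

```python
def dispatch_tool(tool_name: str, tool_map: dict, args: dict) -> str:
    """Look up a tool by name and call it with the given keyword arguments."""
    if tool_name not in tool_map:
        return f"Error: Tool '{tool_name}' not found"
    result = tool_map[tool_name](**args)
    return str(result)

def multiply(a: int, b: int) -> int:
    return a * b

tools = {"multiply": multiply}
print(dispatch_tool("multiply", tools, {"a": 6, "b": 7}))  # 42
print(dispatch_tool("divide", tools, {"a": 6, "b": 7}))    # Error: Tool 'divide' not found
```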
Built-in Tools — Tavily Search, Wikipedia, and More
LangChain ships with dozens of pre-built tools so you don't have to wrap every API from scratch. The most commonly used ones connect your LLM to the internet, knowledge bases, and system utilities.
Tavily Search — Web Search for LLMs
Tavily is a search API built specifically for LLM applications. It returns clean, structured results instead of raw HTML, which means the LLM gets better context with fewer tokens. You need a Tavily API key (free tier available at tavily.com):
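A sketch using langchain-community's TavilySearchResults wrapper; the format_results helper and the query are additions here, and the search runs only when a key is set:

```python
import os

def format_results(results: list) -> str:
    """Flatten Tavily result dicts into a compact context string."""
    return "\n\n".join(f"{r['title']} ({r['url']})\n{r['content']}" for r in results)

if os.environ.get("TAVILY_API_KEY"):
    from langchain_community.tools.tavily_search import TavilySearchResults

    search = TavilySearchResults(max_results=3)
    results = search.invoke("What is the latest stable Python release?")
    print(format_results(results))
```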
Each result comes back as a dictionary with title, url, and content fields. The content is already cleaned — no HTML tags, no navigation menus.
Wikipedia — Knowledge Base Lookups
The Wikipedia tool is useful for factual lookups where you want the LLM to ground its answers in encyclopedia-quality content:
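A sketch of the standard setup (requires pip install wikipedia and network access; the query is illustrative):

```python
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# top_k_results limits how many articles come back;
# doc_content_chars_max truncates each article to control token usage.
wikipedia = WikipediaQueryRun(
    api_wrapper=WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=1000)
)

print(wikipedia.invoke("Alan Turing"))  # roughly the first 1000 characters of the article
```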
The doc_content_chars_max parameter is crucial for controlling token costs. Without it, Wikipedia articles can eat thousands of tokens per query.
Combining Multiple Tools
The real power shows when you bind several tools to a single model. The LLM picks the right tool for each question:
The LLM routes math questions to multiply, text questions to get_word_count, current-events questions to search, and factual questions to wikipedia. No routing logic on your side — the model handles it based on tool descriptions alone.
Building Real-World Tools — API Wrappers and Database Queries
The toy examples above are useful for understanding the mechanics, but production tools connect to real services. Let me walk through two patterns I use constantly: wrapping a REST API and querying a database.
Wrapping a REST API
Suppose you want your LLM to look up current exchange rates. Here's a tool that wraps a free currency API:
Three things to notice. First, the tool returns a string — even for errors. This is important because the LLM needs to read the result and incorporate it into its response. If you raise an exception, the agent loop crashes. Second, the timeout=10 prevents hanging on slow APIs. Third, the descriptive error messages help the LLM explain the failure to the user.
Querying a SQLite Database
Database tools let the LLM answer questions about your data without the user needing to know SQL. Here's a read-only database query tool:
In the database tool description, I included the exact table schema. This is not optional — the LLM needs to know the column names and types to write correct SQL. Without it, the LLM guesses column names and the queries fail.
Practice exercise: Write a function validate_tool_args(schema, args) that validates a dictionary of arguments against a schema dictionary. The schema maps parameter names to their expected Python types. The function should return a tuple of (is_valid, errors) where is_valid is a boolean and errors is a list of error strings.
Check two things: (1) all required schema keys must be present in args, and (2) each provided value must be an instance of the expected type.
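One possible solution:

```python
def validate_tool_args(schema: dict, args: dict) -> tuple:
    """Validate args against a {name: type} schema dictionary."""
    errors = []
    for name, expected_type in schema.items():
        if name not in args:
            errors.append(f"Missing required argument: '{name}'")
        elif not isinstance(args[name], expected_type):
            errors.append(
                f"Argument '{name}' expected {expected_type.__name__}, "
                f"got {type(args[name]).__name__}"
            )
    return (len(errors) == 0, errors)

schema = {"a": int, "b": int}
print(validate_tool_args(schema, {"a": 1, "b": 2}))    # (True, [])
print(validate_tool_args(schema, {"a": 1, "b": "x"}))  # (False, ["Argument 'b' expected int, got str"])
```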
Tool Error Handling and Retry Logic
Tools fail. APIs time out, databases go down, rate limits get hit. The question isn't if your tools will error — it's whether your application recovers gracefully or crashes in front of the user. I've spent more time debugging error handling in tool-based systems than writing the tools themselves.
Returning Errors as Strings
The simplest error-handling pattern: catch exceptions inside your tool and return the error as a string. The LLM reads the error message and can explain the problem or try a different approach:
When the LLM calls divide(10, 0), it gets back "Error: Cannot divide by zero." instead of a Python traceback. The LLM can then tell the user: "I can't divide by zero — could you provide a different denominator?"
Using handle_tool_error
LangChain also provides a built-in handle_tool_error parameter on tools. When set to True, any ToolException your tool raises is caught automatically and its message is returned to the LLM as a string instead of crashing the run:
If eval raises a SyntaxError or NameError, the tool wraps it in a ToolException; LangChain catches that and returns the error message to the LLM rather than crashing the agent loop.
You can also pass a custom error handler function for more control:
Retry Logic with Fallbacks
For transient errors (timeouts, rate limits), you often want to retry before giving up. Here's a pattern using Python's tenacity library that works well with LangChain tools:
The wait_exponential strategy roughly doubles the delay between attempts, giving the external service time to recover. After three failed attempts, the final exception propagates out of the retry wrapper, and the tool catches it and returns an error message.
Using Tools in LCEL Chains
Tools slot naturally into LCEL chains. A common pattern is to build a chain that takes a user question, calls the LLM with bound tools, executes any tool calls, and returns the final answer — all in a single composable pipeline.
This outputs:
The pipe operator connects the LLM (which generates tool calls) to the executor (which runs them). You get a single callable chain that handles the full tool-use flow. This is a simpler alternative to using a full agent when you only need one round of tool calls.
Common Mistakes and How to Fix Them
After building dozens of tool-based systems, these are the mistakes I see most often — and I've made every one of them myself.
Mistake 1: Vague Tool Descriptions
The vague version gives the LLM nothing to route on:

@tool
def process(data: str) -> str:
    """Process the data."""
    # What does "process" mean? The LLM has no idea
    return data.upper()

A specific name and description fix it:

@tool
def uppercase_text(text: str) -> str:
    """Convert text to uppercase letters.

    Use when the user asks to capitalize, uppercase,
    or make text ALL CAPS.

    Args:
        text: The text to convert to uppercase.
    """
    return text.upper()

Mistake 2: Missing Type Annotations
Without type annotations, LangChain cannot generate an input schema. The tool either fails to register or generates a wildcard schema that accepts anything:
Without annotations:

@tool
def add(a, b):
    """Add two numbers."""
    return a + b
# Schema is empty — LLM doesn't know what to pass

With annotations:

@tool
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b
# Schema: {"a": int, "b": int}

Mistake 3: Raising Exceptions Instead of Returning Errors
If your tool raises an unhandled exception, the entire agent loop crashes. Always catch exceptions and return error messages as strings:
The fragile version crashes the agent loop on any HTTP error:

@tool
def fetch_data(url: str) -> str:
    """Fetch data from a URL."""
    response = requests.get(url)
    response.raise_for_status()  # Raises on 4xx/5xx
    return response.text

The robust version returns errors as strings:

import requests

@tool
def fetch_data(url: str) -> str:
    """Fetch data from a URL."""
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()
        return response.text
    except requests.RequestException as e:
        return f"Failed to fetch {url}: {e}"

Mistake 4: Tools That Return Too Much Data
Returning a 10,000-word Wikipedia article or 500 database rows as a tool result burns through tokens and can exceed context limits. Always truncate, summarize, or paginate tool output:
Practice exercise: Write a function safe_tool_call(func, args, max_retries=2) that calls a function with the given args dict and returns the result as a string. If the function raises an exception, retry up to max_retries times. If all retries fail, return "Error after N retries: <error message>" where N is max_retries.
The function should track how many attempts were made.
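One possible solution (reading the spec as one initial attempt plus max_retries retries):

```python
def safe_tool_call(func, args: dict, max_retries: int = 2) -> str:
    """Call func(**args), retrying on failure; always return a string."""
    last_error = None
    for attempt in range(1 + max_retries):  # one initial try + max_retries retries
        try:
            return str(func(**args))
        except Exception as e:
            last_error = e
    return f"Error after {max_retries} retries: {last_error}"

calls = {"count": 0}

def flaky(x: int) -> int:
    """Fails on the first call, succeeds afterwards."""
    calls["count"] += 1
    if calls["count"] < 2:
        raise ConnectionError("transient failure")
    return x * 2

print(safe_tool_call(flaky, {"x": 21}))   # 42
print(safe_tool_call(lambda: 1 / 0, {}))  # Error after 2 retries: division by zero
```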
Performance Tips and Best Practices
Tool-based LLM applications have a unique performance profile: the bottleneck is almost never your Python code — it's the LLM calls and external API requests. Here's what actually matters for speed and cost.
Keep tool descriptions short but precise. Every character in your tool descriptions and schemas is a token that gets sent with every LLM call. If you have 10 tools with 200-word descriptions each, that's 2,000 words of system-prompt overhead on every request. Aim for 2-3 sentences per tool.
Limit the number of tools bound to a single model. In my experience, GPT-4o handles 10-15 tools well. Beyond 20, the model starts making routing mistakes. If you have 50 tools, group them into categories and use a two-stage approach: the first LLM picks the category, the second LLM (bound to just that category's tools) picks the specific tool.
Cache tool results aggressively. If the exchange rate for USD/EUR was fetched 30 seconds ago, don't hit the API again. Use Python's functools.lru_cache or a Redis cache with a TTL:
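Note that functools.lru_cache alone has no expiry, so a TTL needs a small wrapper. A minimal in-process sketch (for multi-process deployments, a Redis cache with a TTL plays the same role; the fixed rate is placeholder data):

```python
import time
from functools import wraps

def ttl_cache(seconds: float):
    """Cache a function's results for a limited time."""
    def decorator(func):
        cache: dict = {}

        @wraps(func)
        def wrapper(*args):
            now = time.monotonic()
            if args in cache:
                value, stored_at = cache[args]
                if now - stored_at < seconds:
                    return value  # fresh enough, skip the API call
            value = func(*args)
            cache[args] = (value, now)
            return value
        return wrapper
    return decorator

calls = {"n": 0}

@ttl_cache(seconds=60)
def get_rate(base: str, target: str) -> float:
    calls["n"] += 1  # stand-in for a real API request
    return 0.92

get_rate("USD", "EUR")
get_rate("USD", "EUR")  # served from cache, no second "API call"
print(calls["n"])       # 1
```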
Use async tools for I/O-bound operations. When your chain calls multiple tools in parallel (via RunnableParallel), async tools prevent one slow API call from blocking the others:
Frequently Asked Questions
Can I use tools with models other than OpenAI?
Yes. bind_tools() works with any LangChain chat model that supports tool calling — including Anthropic Claude, Google Gemini, Mistral, and local models via Ollama. The interface is identical:
What is the difference between tools and function calling?
They refer to the same concept. "Function calling" is OpenAI's original term for the feature. "Tool use" is the broader term used by LangChain and Anthropic. In LangChain, both are accessed through the same bind_tools() API regardless of the underlying provider.
How do I make a tool return structured data instead of strings?
Tools can return any serializable type (dicts, lists, Pydantic models). However, when the result is passed back to the LLM as a ToolMessage, it gets serialized to a string. For inter-tool communication within an agent, return dicts. For LLM consumption, format the output as a readable string:
How many tools can I bind to one model?
There is no hard limit, but practical limits exist. OpenAI supports up to 128 tools per call. Anthropic Claude supports up to 64. However, accuracy degrades well before those limits. In practice, 5-15 well-described tools work reliably. Beyond that, consider a two-stage routing approach.
Summary
LangChain tools bridge the gap between what LLMs know and what they can do. You learned how to create tools with the @tool decorator and StructuredTool, bind them to models, execute tool calls, feed results back, and handle errors with retry logic. The key takeaway: tools are just Python functions with metadata. The LLM reads the metadata, decides when to use the function, and generates the arguments. Your code handles everything else.
From here, the natural next step is agents — systems where the LLM can call tools in a loop, making multiple tool calls to answer a single question. That is covered in the LangChain Chains tutorial.