
OpenAI API in Python: Build Your First AI App in 5 Minutes

Beginner · 8 min · 2 exercises · 30 XP

You're about to write a Python script that thinks. Not a rules engine, not a regex hack — an actual AI that reads your question and writes back a thoughtful answer. The whole thing takes three lines of real code.

Your First OpenAI API Call in Python

Code first, explanation after. Install the library, set your API key, and hit Run:

Your first AI call

The exact wording will differ every time — the model generates fresh text on each call. Three lines of actual logic (create client, make request, print result) and you have a working AI assistant.

Anatomy of an API Call

That code block has four moving parts. Understanding them now saves you hours of confused debugging later.

  • Client — openai.AsyncOpenAI() creates a connection to OpenAI's servers. We use the async version because it works natively in browser environments like this one.
  • Model — "gpt-4o-mini" is fast and cheap (about $0.15 per million input tokens). Perfect for learning.
  • Messages — A list of dictionaries. Each has a "role" and "content". Right now we're sending one user message.
  • Response — The AI's answer lives at response.choices[0].message.content.

The response object also contains token usage — that is how OpenAI measures and bills you. The code below prints the model name, finish reason, and a cost breakdown. Each token is roughly 4 characters of English text:

    Inspecting the response object

    System Messages — Programming AI Behavior

    Here is a problem you will hit immediately: you ask the AI to explain a for loop, and it writes a 500-word essay with advanced examples. You wanted a beginner-friendly answer. How do you control the style without rewriting your question every time?

    The messages list supports three roles:

  • "system" — Background instructions that shape every response. The end user never sees this, but it controls tone, format, and scope.
  • "user" — The question or input from the person using your app.
  • "assistant" — Previous AI responses (for multi-turn conversations — covered in the chatbot tutorial).
Watch how a system message transforms the output — same question, completely different response. The first call has no system message. The second tells the AI to be a beginner-friendly Python tutor with a 150-word limit:

    Without a system message
    With a system message

    The system message did not change what the AI knows — it changed how it communicates. Think of it as programming the AI's behaviour without writing any extra logic.

    Exercise: Make the AI Talk Like a Pirate
    Write Code

    Write a system message that makes the AI respond like a pirate. Then ask it: "What is a Python list?"

    Print the AI's response followed by "DONE" on a new line.

    Your system message should include the word "pirate" so the AI knows what style to use. The client variable is already set up.


    Temperature — Creativity vs. Precision

    Ask the AI the same question three times and you get three different answers. Sometimes that is what you want — brainstorming, creative writing. Other times you need the same reliable answer every time — code generation, data extraction. The temperature parameter controls this tradeoff.

Temperature | Behaviour                                                | Best for
----------- | -------------------------------------------------------- | --------
0.0         | Deterministic — same input gives nearly identical output | Code generation, data extraction, factual Q&A
0.5         | Balanced — some variation, mostly predictable            | Tutoring, summaries
1.0         | Creative — varied responses, may take unexpected angles  | Brainstorming, creative writing
1.5+        | Very creative — occasionally wild or incoherent          | Rarely useful in practice

    The code below asks the same factual question three times at temperature 0, then three times at temperature 1. At temperature 0, all three answers converge to nearly the same text. At temperature 1, each response takes a different angle — different facts, different phrasing:

    Temperature 0 vs. temperature 1

    My rule of thumb: start with temperature=0 for anything where correctness matters, and only raise it when you actually want variety.

    Building Real Tools — Tutor, Debugger, Translator

    Now that you understand the three building blocks (model, messages, temperature), let us combine them into tools you would actually use. All three follow the exact same pattern — the only thing that changes is the system message.

    Tool 1: Python tutor. This function wraps a single API call with a system message that enforces a beginner-friendly format: plain English answer first, then one minimal code example, then a line-by-line walkthrough. Temperature is set to 0.3 for consistency:

    Tool 1: A Python tutor

    Tool 2: Code debugger. Paste broken code, get a diagnosis and a fix. Temperature 0 because you want the correct fix, not a creative one. The system message structures the output into three parts — identify the bug, show corrected code, explain why the original failed:

    Tool 2: A code debugger

    That is a genuinely tricky bug — modifying a list while iterating by index causes skipped elements. Most beginners would not catch it. The AI identifies the root cause and suggests a list comprehension as the fix.

    Tool 3: Code translator. Translates Python to any language. The target_language parameter slots directly into the system message — showing how flexible this pattern really is:

    Tool 3: A code translator

    All three tools use the exact same API call. The only difference is the system message. That is the central insight: the system message is your primary programming interface for AI behaviour.

    Exercise: Build a Code Explainer
    Write Code

    Create a function called explain_code that takes a Python code snippet and returns a plain-English explanation. Use a system message that tells the AI to:

    1. Explain what the code does in one paragraph

    2. Walk through it line by line

    3. Keep the explanation under 200 words

    Then test it on the code: sorted(set("hello world")) and print the result followed by "DONE" on a new line.

    The client variable is already set up.


    Streaming — See the AI Think in Real Time

    By default, the API waits until the entire response is generated before returning anything. For short answers that is fine. But for longer outputs, the user stares at a blank screen for seconds. Streaming fixes this — tokens arrive one at a time, just like ChatGPT's typing effect.

    Add stream=True to your API call and iterate over the response chunks. Each chunk contains a small piece of the answer (often a single word or punctuation mark). The code below prints each chunk as it arrives, building the response incrementally:

    Streaming API response

    Common Mistakes and How to Fix Them

    These are the mistakes that trip up most beginners. Knowing them upfront saves you from puzzling debugging sessions.

    Mistake 1: Hardcoding Your API Key

    ❌ Key visible in your code
    client = openai.AsyncOpenAI(api_key="sk-abc123...")
    ✅ Key stored in environment variable
    import os
    client = openai.AsyncOpenAI(api_key=os.environ["OPENAI_API_KEY"])

    Set it in your terminal: export OPENAI_API_KEY="sk-your-key". Never commit API keys to Git — even in private repos. If your key leaks, anyone can run up charges on your account.

    Mistake 2: Ignoring the finish_reason

    If the model hits its token limit, it stops mid-sentence — and your code happily processes a chopped-off response. Always check response.choices[0].finish_reason:

  • "stop" — Completed normally. Good.
  • "length" — Ran out of tokens. Your response got cut off. Increase max_tokens in your API call.
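
A small helper makes that check routine. safe_text is a hypothetical name, not part of the openai library:

```python
def safe_text(response) -> str:
    """Return the reply text, refusing to pass along a truncated answer."""
    choice = response.choices[0]
    if choice.finish_reason == "length":
        # The model ran out of tokens mid-answer.
        raise RuntimeError("Response was cut off — increase max_tokens and retry.")
    return choice.message.content
```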

Mistake 3: Vague Prompts

    ❌ Vague — generic boilerplate output
    {"role": "user", "content": "Write code"}
    ✅ Specific — production-quality output
    {"role": "user", "content": "Write a Python function that takes a list of integers and returns the second largest, handling duplicates and empty lists"}

    Specificity is free and dramatically improves output quality. Tell the AI exactly what inputs to handle, what format to return, and what edge cases to cover.

    Mistake 4: Using the Wrong Model

    Start with gpt-4o-mini for everything. It handles most tasks perfectly and costs a fraction of the full GPT-4o model. Only upgrade when the output quality genuinely is not good enough — which for learning purposes is almost never.

    Handling API Errors Gracefully

    API calls fail. Your key expires, you hit a rate limit, or OpenAI has an outage. The openai library raises specific exceptions for each case. Wrapping your calls in a try/except block with these three exceptions covers 95% of production errors:

    Error handling pattern

    Frequently Asked Questions

    Can I use Claude or Gemini instead of OpenAI?

    Absolutely. The concepts — messages, system prompts, temperature — transfer directly. The API syntax differs slightly. We cover Claude and Gemini in dedicated tutorials.

    Do I need a GPU?

    No. The AI runs on OpenAI's servers. Your Python script sends text over the internet and gets text back. Any computer with Python and internet access works.

    What is the difference between GPT-4o and GPT-4o-mini?

    GPT-4o is more capable at complex reasoning, math, and nuanced instructions — but costs about 17x more. GPT-4o-mini is faster and handles most straightforward tasks equally well. Start with mini, upgrade only if needed.

    Summary and What Comes Next

    You built three working AI tools in this tutorial — a Python tutor, a code debugger, and a code translator. Every one of them uses the same code pattern:

    The core pattern

    Every AI app you will build — chatbots, RAG systems, AI agents — extends this pattern with additional messages, tools, or retrieval layers. The foundation does not change.

    Next steps in the GenAI learning path:

  • Build a Chatbot with Memory — where the AI remembers your entire conversation using the assistant role
  • Prompt Engineering Fundamentals — systematic techniques for getting better outputs
  • Build with Claude and Gemini — the same concepts applied to Anthropic and Google's APIs
References

  • OpenAI Chat Completions Guide — official guide to text generation
  • OpenAI API Reference — chat.completions.create — full parameter documentation
  • OpenAI Pricing — current model pricing
  • OpenAI Tokenizer Tool — interactive token counter
  • OpenAI Prompt Engineering Guide — best practices for effective prompts

Versions used in this tutorial: Python 3.12, openai library 1.x, model gpt-4o-mini.
