Getting Your API Key
The OpenAI API requires an API key for authentication. Here's how to get one:
- Go to platform.openai.com and create an account
- Navigate to API Keys in the top navigation
- Click Create new secret key
- Give it a descriptive name (e.g., "Development" or "My App")
- Copy the key immediately – it starts with sk-...
Security rule: Never paste your API key in public code, GitHub repos, or shared documents. Use environment variables. If a key is ever exposed, rotate it immediately in the dashboard.
Add a spending limit before you start (platform.openai.com → Settings → Billing → Usage limits). Set a hard monthly limit of $10–$20 while learning.
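Since the key must never be hardcoded, a common pattern is to read it from an environment variable and fail loudly when it is missing. A minimal sketch (the helper name load_api_key is illustrative, not part of the OpenAI SDK):

```python
import os

def load_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Fetch the API key from the environment; fail loudly if it is missing."""
    key = os.environ.get(var)
    if not key:
        raise RuntimeError(f"Set the {var} environment variable first")
    return key

# The OpenAI client also picks up OPENAI_API_KEY automatically,
# so client = OpenAI() works once the variable is exported.
```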
Understanding OpenAI Models in 2026
OpenAI has simplified its model lineup significantly. Here's what you need to know:
- GPT-4o – Flagship multimodal model. Best quality, vision support, $2.50/M input tokens
- GPT-4o-mini – Fast and cheap. Best for most production uses. $0.15/M input tokens
- o3 – Advanced reasoning model for math, science, and complex logic. Slower and pricier.
- o3-mini – Efficient reasoning model. Good balance for coding and analysis tasks.
- DALL-E 3 – Image generation. $0.04–$0.12 per image.
- Whisper – Speech-to-text transcription. $0.006/minute.
- text-embedding-3-small – Embeddings for RAG. $0.02/M tokens.
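Those per-token prices make back-of-the-envelope budgeting easy. A rough sketch using only the input prices listed above (output tokens are billed separately, usually at a higher rate):

```python
# Input-token prices from the list above, in dollars per million tokens.
INPUT_PRICE_PER_M = {
    "gpt-4o": 2.50,
    "gpt-4o-mini": 0.15,
    "text-embedding-3-small": 0.02,
}

def estimate_input_cost(model: str, tokens: int) -> float:
    """Rough input-side cost in dollars for a given token count."""
    return INPUT_PRICE_PER_M[model] / 1_000_000 * tokens

# 500 requests of ~2,000 input tokens each on gpt-4o-mini:
print(f"${estimate_input_cost('gpt-4o-mini', 500 * 2_000):.2f}")  # $0.15
```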
Chat Completions API: The Core
The Chat Completions API is the foundation of most OpenAI applications. Here's the simplest possible Python example:
from openai import OpenAI

# Reads the API key from the OPENAI_API_KEY environment variable
client = OpenAI()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
# Output: "The capital of France is Paris."

Understanding Roles
The messages array contains messages with three possible roles:
- system – Instructions for the model. Defines behavior, personality, and constraints.
- user – The user's input message.
- assistant – Previous model responses (for multi-turn conversations).
# Multi-turn conversation example
messages = [
    {"role": "system", "content": "You are a friendly cooking assistant."},
    {"role": "user", "content": "I have chicken, garlic, and lemon."},
    {"role": "assistant", "content": "Those are great ingredients! I'd recommend lemon garlic chicken..."},
    {"role": "user", "content": "How long should I cook it?"}
]
# GPT will use the full conversation history for context

System Prompts: The Most Important Skill
Writing effective system prompts is what separates useful AI applications from useless ones. A good system prompt defines:
- Role and personality – who the AI is and how it speaks
- Context – what the AI knows about the business or use case
- Constraints – what it should and shouldn't do
- Output format – how responses should be structured
# Example: Customer support system prompt
SYSTEM = """You are a customer support agent for TechStore.
Your tone is friendly, professional, and concise.
You can help with:
- Order status and tracking
- Product questions
- Return and refund requests
Company info:
- Returns accepted within 30 days
- Free shipping on orders over €50
- Support hours: Mon-Fri 9am-6pm
If a question is outside your knowledge, say:
"I'll need to check with our team on that.
Can I get your email to follow up?"
Always respond in the same language as the user.
Keep responses under 150 words."""

# Use it as the system message:
# messages=[{"role": "system", "content": SYSTEM}, ...]

Streaming Responses
Streaming sends tokens as they're generated rather than waiting for the complete response. This dramatically improves perceived performance for chatbot applications.
# Streaming example
stream = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Write a haiku about Python"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)

Function Calling (Tool Use)
Function calling lets the model request the execution of specific functions when it needs information it doesn't have. This is how you build AI agents that can look up data, perform calculations, or interact with APIs.
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get current weather for a city",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {
                        "type": "string",
                        "description": "The city name"
                    }
                },
                "required": ["city"]
            }
        }
    }
]
messages = [{"role": "user", "content": "What's the weather in Rome?"}]

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    tools=tools,
    tool_choice="auto"
)
import json

# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    function_name = tool_call.function.name               # "get_weather"
    arguments = json.loads(tool_call.function.arguments)  # {"city": "Rome"}

    # Execute the function and send the result back
    weather_data = get_weather(arguments["city"])

    # Continue the conversation with the tool result
    messages.append(response.choices[0].message)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": str(weather_data)
    })

    # Second call: the model turns the tool result into a final answer
    final = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages
    )
    print(final.choices[0].message.content)

Embeddings for Semantic Search
Embeddings convert text into numerical vectors that capture semantic meaning. Similar texts produce similar vectors, enabling semantic search, clustering, and RAG applications.
# Generate embeddings
response = client.embeddings.create(
    model="text-embedding-3-small",
    input="The quick brown fox jumps over the lazy dog"
)

embedding = response.data[0].embedding
# Returns a list of 1536 floating point numbers

# Compare two texts with cosine similarity
from numpy import dot
from numpy.linalg import norm

def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))

def get_embedding(text):
    result = client.embeddings.create(model="text-embedding-3-small", input=text)
    return result.data[0].embedding

emb1 = get_embedding("What is machine learning?")
emb2 = get_embedding("How does AI work?")
emb3 = get_embedding("Best pasta recipe")

# emb1 and emb2 will be ~0.85 similar
# emb1 and emb3 will be ~0.3 similar

Assistants API with File Uploads
The Assistants API is OpenAI's higher-level abstraction for building AI agents with persistent threads, file search, and code execution.
# Create an assistant with file search enabled
assistant = client.beta.assistants.create(
    name="Document Analyzer",
    instructions="You help users understand technical documentation.",
    tools=[{"type": "file_search"}],
    model="gpt-4o"
)

# Upload a file
with open("product_manual.pdf", "rb") as f:
    file = client.files.create(file=f, purpose="assistants")

# Create a thread and ask a question
thread = client.beta.threads.create()
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What are the safety warnings for Model X?",
    attachments=[{"file_id": file.id, "tools": [{"type": "file_search"}]}]
)

# Run the assistant and wait for it to finish
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=assistant.id
)

# Read the assistant's reply (messages are returned newest first)
reply = client.beta.threads.messages.list(thread_id=thread.id)
print(reply.data[0].content[0].text.value)

Cost Management and Rate Limits
Understanding costs before building production systems is crucial:
- Set hard spending limits in platform.openai.com → Billing → Usage limits
- Log every API call with token counts – unexpected costs almost always trace back to a single workflow
- Cache responses for repeated identical queries (FAQ bots answer the same questions over and over)
- Use GPT-4o-mini for 90% of tasks and reserve GPT-4o for when it's truly needed
- Batch requests when processing many items – the Batch API is 50% cheaper
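The logging point is easy to wire up, because every chat completion response carries a usage object with exact token counts. A minimal sketch (log_usage and the in-memory usage_log are illustrative helpers, not SDK features):

```python
usage_log = []

def log_usage(tag: str, usage) -> dict:
    """Record token counts from response.usage under a per-workflow tag."""
    entry = {
        "tag": tag,
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }
    usage_log.append(entry)
    return entry

# After any call:
# response = client.chat.completions.create(...)
# log_usage("faq-bot", response.usage)
```

In production you would write these entries to your logging backend instead of a list, but even this much lets you sum costs per tag and spot the workflow that is burning tokens.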
Building a Simple Chatbot
Let's put it all together with a simple terminal chatbot that maintains conversation history:
from openai import OpenAI

client = OpenAI()

conversation_history = []
system_message = {
    "role": "system",
    "content": "You are a helpful assistant. Be concise."
}

def chat(user_input):
    conversation_history.append({
        "role": "user",
        "content": user_input
    })
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[system_message] + conversation_history,
        max_tokens=500
    )
    assistant_message = response.choices[0].message.content
    conversation_history.append({
        "role": "assistant",
        "content": assistant_message
    })
    return assistant_message

# Main loop
while True:
    user_input = input("You: ")
    if user_input.lower() == "quit":
        break
    response = chat(user_input)
    print(f"AI: {response}")

This is the foundation of every chatbot. From here, you can add system prompts for specific use cases, function calling for external data, streaming for better UX, and persistent storage for cross-session memory.
To build this into a client-facing product without code using n8n, see our n8n + OpenAI integration guide.