How to Build a Coding Agent with Pydantic AI Direct API
📚 Reference Implementation: This tutorial is based on Codantic AI - an educational coding agent that demonstrates these concepts in action.
Ready to build your own AI coding agent? In this comprehensive tutorial, we'll create a fully functional coding agent using Pydantic AI's Direct API. Instead of relying on high-level abstractions, we'll build everything from scratch to understand exactly how modern AI agents work.
By the end of this tutorial, you'll have a working coding agent with file operations, shell access, and intelligent conversation management - plus the knowledge to extend it further.
What We're Building
We'll create an AI coding agent with these capabilities:
- 📁 File Operations: Read, write, and edit files securely
- 🔍 Code Analysis: Search and analyze codebases
- 🔧 Shell Access: Execute commands safely
- 💭 Smart Conversations: Maintain context and handle complex tasks
- 🛡️ Security: Path validation and audit logging
Why Use Pydantic AI Direct API?
Most AI agent frameworks hide complexity behind convenient APIs, which is great for productivity but terrible for learning. We'll use Pydantic AI's Direct API to build everything explicitly, giving you complete control:
```python
# Direct API usage - no hidden abstractions!
model_response = model_request_sync(
    self.model_name,
    self.context,  # Full conversation history
    model_request_parameters=ModelRequestParameters(
        function_tools=tools_definitions,  # Your custom tools
        allow_text_output=True,
    ),
)
```
This approach lets you understand exactly:
- 📝 How context is managed (the `self.context` list)
- 🛠 How tools are integrated (`function_tools`)
- 🔄 How agent loops work (iteration with tool calling)
- 🧠 How context trimming prevents token overflow
Prerequisites and Setup
Before we start building, make sure you have:
- Python 3.13+
- Google Gemini API key (Get one here)
- Basic understanding of Python and async/await patterns
Let's set up our project:
```bash
mkdir my-coding-agent && cd my-coding-agent
python -m venv .venv
source .venv/bin/activate  # Windows: .venv\Scripts\activate

# Install dependencies (quotes keep the shell from treating >= as redirection)
pip install "pydantic-ai>=1.0.1" "python-dotenv>=1.1.1" "rich>=14.1.0"

# Create environment file
echo "GOOGLE_API_KEY=your_api_key_here" > .env
```
Step 1: Understanding the Agent Loop Pattern
Our coding agent will implement the classic agent loop pattern that forms the backbone of most AI coding assistants:
1. Add the user's message to the conversation context
2. Send the full context to the model, along with the tool definitions
3. If the model responds with tool calls, execute them, append the results to the context, and go back to step 2
4. If the model responds with plain text, return it as the final answer

This loop continues until either:
- The model provides a final text response (no tool calls)
- Maximum iterations are reached (safety limit)
- An error occurs
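Stripped of framework details, the pattern fits in a few lines. Here is a framework-free sketch; the stub model, the `run_agent` helper, and the tool dict are hypothetical stand-ins for `model_request_sync` and the real tools we build below:

```python
# Minimal agent loop: call model, execute tool calls, feed results back,
# stop when the model returns plain text or the iteration limit is hit.
def run_agent(user_input, model, tools, max_iterations=20):
    context = [("user", user_input)]
    for _ in range(max_iterations):
        response = model(context)            # model sees the full history
        if response["type"] == "tool_call":  # model wants to use a tool
            result = tools[response["name"]](**response["args"])
            context.append(("tool", result)) # feed the result back, loop again
        else:
            return response["text"]          # plain text = final answer
    return "Maximum iterations reached."

# Stub model: first asks to read a file, then answers using the tool result.
def stub_model(context):
    if not any(role == "tool" for role, _ in context):
        return {"type": "tool_call", "name": "read", "args": {"path": "hello.py"}}
    return {"type": "text", "text": f"The file contains: {context[-1][1]}"}

tools = {"read": lambda path: "print('hello')"}
print(run_agent("What does hello.py do?", stub_model, tools))
# → The file contains: print('hello')
```

The real implementation below follows exactly this shape, with Pydantic AI message objects in place of the tuples.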
Step 2: Creating the Core Agent Class
Let's start building! Create `agent_loop.py` and implement the core `AgentLoop` class:
```python
# agent_loop.py
from typing import List

from pydantic_ai.direct import model_request_sync
from pydantic_ai.messages import (
    ModelRequest, ModelMessage, UserPromptPart, SystemPromptPart, ToolReturnPart,
)
from pydantic_ai.models import ModelRequestParameters
from pydantic_ai.usage import RunUsage


class AgentLoop:
    """Our main coding agent with conversation history and tool calling."""

    def __init__(self, model_name: str = 'google-gla:gemini-2.5-flash',
                 working_directory: str = './code',
                 max_context_tokens: int = 100000):
        self.model_name = model_name
        self.working_directory = working_directory
        self.context: List[ModelMessage] = []
        self.max_iterations = 20  # Prevent infinite loops
        self.max_context_tokens = max_context_tokens
        self.last_input_tokens = 0  # Updated after each model response
        self.total_usage = RunUsage()

        # Load system prompt (we'll create this next)
        system_content = self._load_system_prompt()
        system_message = ModelRequest(parts=[SystemPromptPart(content=system_content)])
        self.context.append(system_message)

    def _load_system_prompt(self) -> str:
        """Load the system prompt that defines our agent's behavior."""
        return """You are a helpful AI coding assistant with access to file system tools.
Be concise and direct. Use the available tools to help users with coding tasks.
Always validate file paths for security and provide helpful error messages."""
```
This creates our foundation with proper initialization and system prompt loading.
Step 3: Building Your First Tool - File Reader
Now let's create our first tool. Create `tools/read_tool.py`:
```python
# tools/read_tool.py
import os

from pydantic import BaseModel, Field
from pydantic_ai.tools import ToolDefinition

MAX_CHARS = 10000


def read_file(working_directory: str, path: str, skip: int = 0, lines: int | None = None) -> str:
    """Securely read files with path validation."""
    try:
        # Security: validate paths to prevent directory traversal
        abs_working_dir = os.path.abspath(working_directory)
        abs_file_path = os.path.abspath(os.path.join(working_directory, path))
        if not abs_file_path.startswith(abs_working_dir + os.sep):
            return f'Error: Cannot read "{path}" - outside permitted directory'

        with open(abs_file_path, 'r') as f:
            if skip == 0 and lines is None:
                return f.read(MAX_CHARS)
            file_lines = f.readlines()
            start_idx = skip
            end_idx = start_idx + lines if lines else len(file_lines)
            return ''.join(file_lines[start_idx:end_idx])[:MAX_CHARS]
    except FileNotFoundError:
        return f"Error: File not found: {path}"
    except Exception as e:
        return f"Error: {e}"


# Pydantic model for tool parameters
class ReadParams(BaseModel):
    path: str = Field(description="Path to file, relative to working directory")
    skip: int = Field(default=0, description="Lines to skip from beginning")
    lines: int | None = Field(default=None, description="Number of lines to read")


# Tool definition for Pydantic AI
read_tool_definition = ToolDefinition(
    name='read',
    description='Read files from the filesystem with optional line limits',
    parameters_json_schema=ReadParams.model_json_schema(),
)
```
Key security features:
- Path Validation: Prevents directory traversal attacks
- Size Limits: Prevents memory exhaustion
- Error Handling: Graceful failure with helpful messages
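To see the path validation in isolation, here is a small standalone check; `is_within` is a hypothetical helper mirroring the guard inside `read_file`:

```python
import os

def is_within(working_directory: str, path: str) -> bool:
    """True if `path` resolves to a location inside `working_directory`."""
    abs_working_dir = os.path.abspath(working_directory)
    abs_file_path = os.path.abspath(os.path.join(working_directory, path))
    # Appending os.sep also rejects sibling directories (e.g. ./code-evil
    # when the working dir is ./code), not just ../ traversal.
    return abs_file_path.startswith(abs_working_dir + os.sep)

print(is_within("./code", "hello.py"))        # normal file inside the sandbox
print(is_within("./code", "../secrets.txt"))  # traversal attempt is rejected
```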
Step 4: Implementing Smart Context Management
Add context management to your `AgentLoop` class. This prevents expensive API calls when conversations get long:
```python
# Add to AgentLoop class
def _trim_context(self):
    """Trim context when over token limit, always keep system prompt."""
    # last_input_tokens is 0 until the first model response arrives
    if getattr(self, 'last_input_tokens', 0) <= self.max_context_tokens:
        return

    # Calculate how many messages to keep (50% by default)
    trim_ratio = 0.5
    target_messages = int(len(self.context) * trim_ratio)
    if target_messages < 2:  # Always keep system + 1 message
        target_messages = 2

    # Remove oldest messages (keep system at index 0)
    messages_to_remove = len(self.context) - target_messages
    for _ in range(messages_to_remove):
        if len(self.context) > 2:
            self.context.pop(1)  # Remove oldest non-system message

    print(f"🔄 Trimmed context: kept {len(self.context)} messages")

def _update_token_usage(self, model_response):
    """Track token usage from API responses."""
    self.total_usage.incr(model_response.usage)
    self.last_input_tokens = model_response.usage.input_tokens or 0
```
This smart trimming ensures:
- 🔒 System prompt is never lost (contains tool definitions)
- 📊 Token usage stays within limits (prevents expensive API calls)
- 🧠 Recent context is preserved (maintains conversation flow)
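The trimming arithmetic is easy to check in isolation. Here's a standalone sketch where plain strings stand in for the `ModelMessage` objects:

```python
# Same policy as _trim_context: keep index 0 (system), drop oldest first,
# target 50% of the current message count, never go below 2 messages.
def trim_context(context, trim_ratio=0.5):
    target = max(int(len(context) * trim_ratio), 2)
    while len(context) > target:
        context.pop(1)  # drop the oldest non-system message
    return context

context = ["system"] + [f"msg{i}" for i in range(9)]  # 10 messages total
trimmed = trim_context(context)
print(trimmed)  # system prompt survives; only the newest messages remain
```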
Step 5: Creating a Complete Tool System
As the agent grows, we need one place that maps tool names to functions. Create `tools/tool_registry.py` to manage all our tools:
```python
# tools/tool_registry.py
from .read_tool import read_file, read_tool_definition

# Tool execution mapping
TOOL_FUNCTIONS = {
    "read": read_file,
    # We'll add more tools here
}

# Tool definitions for Pydantic AI
TOOL_DEFINITIONS = [
    read_tool_definition,
    # We'll add more definitions here
]


def execute_tool(function_call_part, working_directory):
    """Execute a tool call with proper error handling."""
    function_name = function_call_part.tool_name

    if function_name not in TOOL_FUNCTIONS:
        return {"error": f"Unknown tool: {function_name}"}

    # Prepare arguments
    args = dict(function_call_part.args) if function_call_part.args else {}
    args["working_directory"] = working_directory

    try:
        result = TOOL_FUNCTIONS[function_name](**args)
        return {"result": result}
    except Exception as e:
        return {"error": f"Tool execution failed: {str(e)}"}
```
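The dispatch logic is worth understanding on its own. Here is a framework-free sketch of the same pattern, where a plain dict stands in for the `function_call_part` object and the `echo` tool is a hypothetical example:

```python
# A registry maps tool names to plain functions; the executor injects the
# working directory, dispatches by name, and wraps every outcome in a dict.
TOOL_FUNCTIONS = {"echo": lambda working_directory, text: f"{working_directory}: {text}"}

def execute_tool(call, working_directory):
    name = call["tool_name"]
    if name not in TOOL_FUNCTIONS:
        return {"error": f"Unknown tool: {name}"}
    args = dict(call.get("args") or {})
    args["working_directory"] = working_directory  # injected, not model-supplied
    try:
        return {"result": TOOL_FUNCTIONS[name](**args)}
    except Exception as e:
        return {"error": f"Tool execution failed: {e}"}

print(execute_tool({"tool_name": "echo", "args": {"text": "hi"}}, "./code"))
# → {'result': './code: hi'}
print(execute_tool({"tool_name": "rm"}, "./code"))
# → {'error': 'Unknown tool: rm'}
```

Returning an error dict instead of raising keeps the agent loop alive: a failed tool call becomes information the model can react to, not a crash.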
Now let's add more essential tools. Create `tools/write_tool.py`:
```python
# tools/write_tool.py
import os

from pydantic import BaseModel, Field
from pydantic_ai.tools import ToolDefinition


def write_file(working_directory: str, path: str, content: str, force: bool = False) -> str:
    """Write content to files with security validation."""
    try:
        abs_working_dir = os.path.abspath(working_directory)
        abs_file_path = os.path.abspath(os.path.join(working_directory, path))
        if not abs_file_path.startswith(abs_working_dir + os.sep):
            return f'Error: Cannot write "{path}" - outside permitted directory'

        # Create directories if needed
        os.makedirs(os.path.dirname(abs_file_path), exist_ok=True)

        # Check if file exists and force flag
        if os.path.exists(abs_file_path) and not force:
            return f'Error: File "{path}" already exists. Use force=True to overwrite'

        with open(abs_file_path, 'w') as f:
            f.write(content)
        return f'Successfully wrote {len(content)} characters to "{path}"'
    except Exception as e:
        return f"Error writing file: {e}"


class WriteParams(BaseModel):
    path: str = Field(description="Path to write file, relative to working directory")
    content: str = Field(description="Content to write to the file")
    force: bool = Field(default=False, description="Overwrite existing files")


write_tool_definition = ToolDefinition(
    name='write',
    description='Write content to files with security validation',
    parameters_json_schema=WriteParams.model_json_schema(),
)
```
Each tool follows this consistent pattern:
- Security validation (path checking)
- Error handling (graceful failures)
- Clear feedback (success/error messages)
- Pydantic schemas (type safety)
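The `force` flag deserves a quick demonstration, since it is what stops the model from silently clobbering existing files. Here is a condensed sketch of the overwrite guard (path validation omitted for brevity), exercised against a temporary directory:

```python
import os
import tempfile

def write_file(working_directory, path, content, force=False):
    """Condensed version of the write tool's overwrite guard."""
    abs_file_path = os.path.abspath(os.path.join(working_directory, path))
    if os.path.exists(abs_file_path) and not force:
        return f'Error: File "{path}" already exists. Use force=True to overwrite'
    with open(abs_file_path, "w") as f:
        f.write(content)
    return f'Successfully wrote {len(content)} characters to "{path}"'

with tempfile.TemporaryDirectory() as tmp:
    print(write_file(tmp, "a.txt", "hello"))            # first write succeeds
    print(write_file(tmp, "a.txt", "bye"))              # blocked without force
    print(write_file(tmp, "a.txt", "bye", force=True))  # explicit overwrite
```

Because the error comes back as a string, the model can read it and decide to retry with `force=True` only when overwriting is actually intended.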
Step 6: Implementing the Main Agent Loop
Now let's implement the core agent loop in your `AgentLoop` class. Add this method:
```python
# Add to AgentLoop class (this method needs the registry, so also add
# at the top of agent_loop.py:
#   from tools.tool_registry import TOOL_DEFINITIONS, execute_tool)
def run(self, user_input: str) -> str:
    """Run a complete conversation turn with tool calling support."""
    # Add user input to context
    user_message = ModelRequest(parts=[UserPromptPart(content=user_input)])
    self.context.append(user_message)

    # Trim context if needed
    self._trim_context()

    iteration = 0
    while iteration < self.max_iterations:
        iteration += 1
        try:
            # Make request to model with tools
            model_response = model_request_sync(
                self.model_name,
                self.context,
                model_request_parameters=ModelRequestParameters(
                    function_tools=TOOL_DEFINITIONS,
                    allow_text_output=True,
                ),
            )

            # Track token usage
            self._update_token_usage(model_response)

            # Add response to context
            self.context.append(model_response)

            # Process response parts
            has_tool_calls = False
            final_text = ""
            tool_return_parts = []

            for part in model_response.parts:
                if hasattr(part, 'tool_name') and part.tool_name:
                    has_tool_calls = True
                    # Execute the tool
                    tool_result = execute_tool(part, self.working_directory)
                    print(f"🔧 Executed {part.tool_name}: "
                          f"{tool_result.get('result', tool_result.get('error', ''))[:50]}...")

                    # Create tool return part
                    tool_return = ToolReturnPart(
                        tool_name=part.tool_name,
                        content=str(tool_result),
                        tool_call_id=getattr(part, 'tool_call_id', 'default'),
                    )
                    tool_return_parts.append(tool_return)
                else:
                    # Text response
                    final_text += getattr(part, 'content', str(part))

            # If we have tool calls, add results and continue
            if has_tool_calls:
                tool_message = ModelRequest(parts=tool_return_parts)
                self.context.append(tool_message)
                self._trim_context()
                continue

            # No tool calls = final response
            return final_text.strip() if final_text else "Task completed."
        except Exception as e:
            return f"Error: {str(e)}"

    return "Maximum iterations reached."
```
Step 7: Creating the Main Application
Finally, let's create `main.py` to tie everything together:
```python
# main.py
import os

from dotenv import load_dotenv

from agent_loop import AgentLoop
from tools.tool_registry import TOOL_FUNCTIONS, TOOL_DEFINITIONS
from tools.write_tool import write_file, write_tool_definition

# Load environment variables
load_dotenv()

# Register the write tool (the registry already includes read)
TOOL_FUNCTIONS["write"] = write_file
TOOL_DEFINITIONS.append(write_tool_definition)


def main():
    """Run the coding agent."""
    print("🤖 AI Coding Agent - Type 'quit' to exit")

    # Create code directory
    code_dir = './code'
    os.makedirs(code_dir, exist_ok=True)

    agent = AgentLoop(working_directory=code_dir)

    while True:
        try:
            user_input = input("\n👤 You: ").strip()
            if user_input.lower() in ['quit', 'exit', 'q']:
                break
            if not user_input:
                continue

            # Run the agent
            response = agent.run(user_input)
            print(f"\n🤖 Agent: {response}")
        except KeyboardInterrupt:
            print("\n👋 Goodbye!")
            break
        except Exception as e:
            print(f"\n❌ Error: {e}")


if __name__ == "__main__":
    main()
```
Step 8: Testing Your Agent
Now let's test our coding agent! Run it and try these commands:
```bash
python main.py
```
Try these interactions:
- File Creation: "Create a Python file called hello.py with a simple hello world function"
- File Reading: "Read the hello.py file I just created"
- Code Analysis: "What does this code do?" (after reading a file)
How Your Agent Works in Practice
Here's what happens when you ask: "Create a Python file with a factorial function"
- User Input: Agent receives request and adds it to context
- Model Processing: Gemini analyzes the request with available tools
- Tool Selection: Model chooses the `write` tool to create the file
- Tool Execution: Agent securely writes the factorial function
- Response: Agent confirms successful file creation
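After that exchange, the context list holds the full trace. Here is a simplified, illustrative snapshot; plain dicts stand in for the real `ModelRequest`/`ModelResponse` objects, and the contents are hypothetical:

```python
# The conversation history the model sees on its second call: the tool
# result sits between the model's tool call and its final text reply.
context = [
    {"role": "system", "content": "You are a helpful AI coding assistant..."},
    {"role": "user", "content": "Create a Python file with a factorial function"},
    {"role": "model", "tool_call": {"name": "write", "args": {"path": "factorial.py"}}},
    {"role": "tool", "content": 'Successfully wrote factorial.py'},
    {"role": "model", "content": "Created factorial.py with a factorial function."},
]

roles = [m["role"] for m in context]
print(roles)  # → ['system', 'user', 'model', 'tool', 'model']
```

The final text reply can only be produced once the tool result is in the context, which is exactly why the loop appends results and calls the model again.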
Advanced Features to Add
Once you have the basic agent working, try adding these features:
1. More Tools
```python
# tools/bash_tool.py   - Execute shell commands safely
# tools/edit_tool.py   - Edit files with string replacements
# tools/search_tool.py - Search code with patterns
```
2. Rich Display
```python
from rich.console import Console
from rich.panel import Panel

# Add beautiful formatting for tool outputs
console = Console()
console.print(Panel("Tool executed successfully!", style="green"))
```
3. Conversation Memory
```python
# Save conversation history to files
# Load previous conversations
# Export agent sessions
```
Security Best Practices
Your agent includes built-in security features:
Path Validation
```python
# Always validate file paths
abs_working_dir = os.path.abspath(working_directory)
abs_file_path = os.path.abspath(os.path.join(working_directory, path))

if not abs_file_path.startswith(abs_working_dir + os.sep):
    return 'Error: Path outside working directory'
```
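One subtlety worth knowing: a bare `startswith` prefix check can be fooled by a sibling directory whose name shares the prefix, which is why appending `os.sep` (or comparing with `os.path.commonpath`) matters. A quick sketch with illustrative paths:

```python
import os

working = os.path.abspath("/tmp/code")
sibling = os.path.abspath("/tmp/code-evil/x.txt")

# A bare startswith check lets the sibling directory slip through:
print(sibling.startswith(working))            # → True (unsafe!)

# Appending os.sep closes that gap:
print(sibling.startswith(working + os.sep))   # → False
```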
Resource Limits
- Token limits prevent expensive API calls
- Iteration limits prevent infinite loops
- File size limits prevent memory issues
- Working directory isolation prevents system access
Key Learning Points
Building this agent teaches you:
- Direct API Control: No hidden abstractions, complete transparency
- Agent Loop Patterns: The fundamental pattern behind all AI agents
- Tool Integration: How to extend AI capabilities safely
- Context Management: Balancing memory with cost efficiency
- Security Patterns: Essential protections for AI systems
Complete Project Structure
Your final project should look like this:
```
my-coding-agent/
├── .env                  # API keys
├── main.py               # Main application
├── agent_loop.py         # Core agent logic
└── tools/
    ├── __init__.py
    ├── tool_registry.py  # Tool management
    ├── read_tool.py      # File reading
    └── write_tool.py     # File writing
```
Why This Matters
Understanding these patterns prepares you to:
- Debug AI agent issues when tools fail or behave unexpectedly
- Build production agents using the same fundamental patterns
- Optimize performance by managing context and tool usage efficiently
- Implement security for AI systems that interact with files and code
- Innovate on existing frameworks by understanding their underlying architecture
Conclusion
You've successfully built a functional AI coding agent using Pydantic AI's Direct API! This minimal, educational approach reveals the core concepts that power all modern AI agents - from GitHub Copilot to Claude Code.
The agent you built demonstrates that powerful AI systems don't always require complex frameworks. Sometimes the best way to understand and build AI agents is to start from the fundamentals and work your way up.
Your agent can:
- ✅ Read and write files securely
- ✅ Maintain intelligent conversations
- ✅ Handle errors gracefully
- ✅ Manage context efficiently
- ✅ Execute tools safely
Next steps:
- Add more tools (bash, edit, search)
- Implement conversation persistence
- Add rich terminal UI formatting
- Deploy as a web service
- Scale up for production use
The patterns you've learned apply to any AI agent architecture. Whether you're building with LangChain, CrewAI, or custom solutions, these fundamentals remain the same.
Ready to build more advanced agents? The foundation you've built here will serve you well in any AI agent framework.
Ready to see it in action?
- 🔗 Complete Implementation: Codantic AI Repository
- 🎬 Live Demo: See the agent working with file operations and code analysis
- 🚀 Get Started: Clone the repo and start building your own AI agents today!