Pydantic AI

Pydantic AI agents connect to the gateway through the openai Python SDK. Create an AsyncOpenAI client pointed at the gateway and wrap it in Pydantic AI's OpenAIProvider.

Before you begin

Complete the Pydantic AI setup guide first. It covers dependency installation, API keys, and the base setup for running Pydantic AI examples.

Configure the model

Create an AsyncOpenAI client with your gateway API key and base URL, then pass it to Pydantic AI's OpenAIChatModel.

agent.py
from openai import AsyncOpenAI
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

openai_client = AsyncOpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
)

model = OpenAIChatModel(
    "google/gemini-2.5-flash",
    provider=OpenAIProvider(openai_client=openai_client),
)

All subsequent examples use this model instance.

Create an agent

Define a Pydantic AI agent with a system prompt and the gateway-backed model.

agent.py
from pydantic_ai import Agent

agent = Agent(
    model,
    system_prompt="You are a helpful assistant. Keep answers concise.",
)

Run the agent

Call agent.run() with a user message to get a response from the gateway.

run.py
import asyncio

async def main():
    result = await agent.run("What is 2+2? Reply with just the number.")
    print(result.output)
    # "4"

asyncio.run(main())

Memory with thread and resource IDs

Pass x-thread-id and x-resource-id as default headers on the AsyncOpenAI client to enable observational memory. The gateway stores observations per thread and injects them as context on subsequent requests.

agent.py
from openai import AsyncOpenAI
from pydantic_ai import Agent
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

openai_client = AsyncOpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
    default_headers={
        "x-thread-id": "my-thread-1",
        "x-resource-id": "user-42",
    },
)

model = OpenAIChatModel(
    "google/gemini-2.5-flash",
    provider=OpenAIProvider(openai_client=openai_client),
)

agent = Agent(
    model,
    system_prompt="You are a helpful assistant.",
)

memory.py
import asyncio

async def main():
    # First request: introduce yourself
    await agent.run("My name is Alex and I prefer concise answers.")

    # Second request: the gateway remembers
    result = await agent.run("What is my name?")
    print(result.output)
    # "Alex"

asyncio.run(main())

Tool calling

Register tools as async functions with the @agent.tool decorator. Each tool receives a RunContext carrying the agent's typed dependencies, plus any parameters the model should supply.

agent.py
from dataclasses import dataclass
from pydantic_ai import Agent, RunContext

@dataclass
class AgentDeps:
    user_name: str = "friend"

agent = Agent(
    model,
    deps_type=AgentDeps,
    system_prompt="You are a helpful weather assistant.",
)

@agent.tool
async def get_weather(ctx: RunContext[AgentDeps], location: str) -> str:
    """Get the current weather for a given location."""
    return f"The weather in {location} is sunny, 72°F."

Run with tools

Invoke the agent with a query that triggers the registered tool. Pass dependencies through the deps argument.

run.py
import asyncio

async def main():
    deps = AgentDeps(user_name="Alex")
    result = await agent.run("What is the weather in San Francisco?", deps=deps)
    print(result.output)
    # "The weather in San Francisco is sunny, 72°F."

asyncio.run(main())

Streaming

Stream agent responses for incremental output.

stream.py
import asyncio

async def main():
    deps = AgentDeps(user_name="Alex")

    async with agent.run_stream("Tell me about San Francisco weather.", deps=deps) as stream:
        # delta=True yields only the new text on each iteration; the default
        # (delta=False) yields the full accumulated text so far.
        async for chunk in stream.stream_text(delta=True):
            print(chunk, end="", flush=True)

asyncio.run(main())

Full example

A complete agent with gateway memory, tools, and dependencies:

agent.py
import asyncio
from dataclasses import dataclass

from openai import AsyncOpenAI
from pydantic_ai import Agent, RunContext
from pydantic_ai.models.openai import OpenAIChatModel
from pydantic_ai.providers.openai import OpenAIProvider

# Gateway client with memory
openai_client = AsyncOpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
    default_headers={
        "x-thread-id": "weather-agent-thread",
        "x-resource-id": "user-42",
    },
)

model = OpenAIChatModel(
    "google/gemini-2.5-flash",
    provider=OpenAIProvider(openai_client=openai_client),
)


@dataclass
class AgentDeps:
    user_name: str = "friend"


agent = Agent(
    model,
    deps_type=AgentDeps,
    system_prompt=(
        "You are a helpful weather assistant. "
        "Use the weather tool when the user asks about weather."
    ),
)


@agent.tool
async def get_weather(ctx: RunContext[AgentDeps], location: str) -> str:
    """Get the current weather for a given location."""
    return f"The weather in {location} is sunny, 72°F."


async def main():
    deps = AgentDeps(user_name="Alex")

    result = await agent.run("What is the weather in San Francisco?", deps=deps)
    print(result.output)
    # "The weather in San Francisco is sunny, 72°F."


asyncio.run(main())
Next steps

  • Features: Observational memory, streaming, BYOK, and gateway tools
  • Models: Supported providers and model routing
  • API reference: Complete endpoint documentation