
Deep Agents

Deep Agents builds on LangGraph to add task planning, subagent delegation, and file system tools. Point its model at the gateway through langchain-openai's ChatOpenAI class, and every LLM call is proxied through the gateway with automatic memory.

Before you begin

Complete the Deep Agents Python quickstart first. It covers installation and the base project setup for Deep Agents.

Configure the model

Create a ChatOpenAI instance with your gateway API key and base URL.

agent.py
from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    model="google/gemini-2.5-flash",
    temperature=0,
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
)
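Hardcoding the key works for a quick test, but in real projects it is safer to read it from the environment. A minimal sketch, assuming a MASTRA_GATEWAY_API_KEY environment variable (the variable name is an illustration, not something the gateway requires):

```python
import os


def gateway_api_key() -> str:
    # MASTRA_GATEWAY_API_KEY is a hypothetical variable name for this sketch.
    key = os.environ.get("MASTRA_GATEWAY_API_KEY")
    if not key:
        raise RuntimeError("Set MASTRA_GATEWAY_API_KEY before creating the model")
    return key
```

Pass api_key=gateway_api_key() to ChatOpenAI instead of the literal string, so the key never lands in source control.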

Create a deep agent

Use create_deep_agent with a model, tools, and an optional system prompt. Deep Agents adds built-in planning and file system tools automatically.

agent.py
from deepagents import create_deep_agent
from langchain_openai import ChatOpenAI


def get_weather(location: str) -> str:
    """Get the current weather for a given location."""
    return f"The weather in {location} is sunny, 72°F."


model = ChatOpenAI(
    model="google/gemini-2.5-flash",
    temperature=0,
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
)

agent = create_deep_agent(
    model=model,
    tools=[get_weather],
    system_prompt="You are a helpful assistant. Use the weather tool when asked about weather.",
)

Run the agent

The agent returned by create_deep_agent is a compiled LangGraph graph. Invoke it the same way you would any other LangGraph graph.

run.py
result = agent.invoke({
    "messages": [{"role": "user", "content": "What is the weather in Tokyo?"}]
})

reply = result["messages"][-1].content
print(reply)
# "The weather in Tokyo is sunny, 72°F."

Memory with thread and resource IDs

Pass x-thread-id and x-resource-id as default headers to enable observational memory. The gateway stores observations per thread and injects them as context on subsequent requests.

agent.py
from deepagents import create_deep_agent
from langchain_openai import ChatOpenAI

model = ChatOpenAI(
    model="google/gemini-2.5-flash",
    temperature=0,
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
    default_headers={
        "x-thread-id": "my-thread-1",
        "x-resource-id": "user-42",
    },
)

agent = create_deep_agent(
    model=model,
    tools=[get_weather],
    system_prompt="You are a helpful assistant.",
)

# First request: introduce yourself
agent.invoke({
    "messages": [{"role": "user", "content": "My name is Alex and I prefer concise answers."}]
})

# Second request: the gateway remembers
result = agent.invoke({
    "messages": [{"role": "user", "content": "What is my name?"}]
})

print(result["messages"][-1].content)
# "Alex"
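Since both headers travel together on every request, it can help to build them in one place. A small sketch, assuming a hypothetical memory_headers helper (not part of the gateway API):

```python
def memory_headers(thread_id: str, resource_id: str) -> dict:
    """Build the gateway's observational-memory headers for one conversation.

    x-thread-id scopes observations to a single conversation thread;
    x-resource-id groups threads under one user or entity.
    """
    return {
        "x-thread-id": thread_id,
        "x-resource-id": resource_id,
    }
```

You can then pass default_headers=memory_headers("my-thread-1", "user-42") when constructing ChatOpenAI, using a fresh thread ID per conversation and a stable resource ID per user.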

Subagents

Deep Agents can spawn specialized subagents to isolate context. Define each subagent as a dictionary with a name, description, system prompt, and its own tools.

agent.py
agent = create_deep_agent(
    model=model,
    tools=[get_weather],
    system_prompt=(
        "You are a helpful assistant. "
        "Delegate weather research to the weather subagent."
    ),
    subagents=[
        {
            "name": "weather_researcher",
            "description": "Investigate weather-related questions and summarize findings.",
            "system_prompt": "You help with weather requests. Use available tools and return concise notes.",
            "tools": [get_weather],
        }
    ],
)
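Because subagent specs are plain dictionaries, they can be built with a helper and reused across agents. A minimal sketch (subagent_spec is an illustrative name, not part of the deepagents API):

```python
def subagent_spec(name: str, description: str, system_prompt: str, tools: list) -> dict:
    """Assemble a subagent dictionary in the shape create_deep_agent expects."""
    return {
        "name": name,
        "description": description,
        "system_prompt": system_prompt,
        "tools": list(tools),
    }


weather_spec = subagent_spec(
    "weather_researcher",
    "Investigate weather-related questions and summarize findings.",
    "You help with weather requests. Use available tools and return concise notes.",
    [],
)
```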

Streaming

Stream responses with astream and stream_mode="messages" to print output incrementally as it is generated.

stream.py
import asyncio


async def main():
    async for event in agent.astream(
        {"messages": [{"role": "user", "content": "What is the weather in Paris?"}]},
        stream_mode="messages",
    ):
        message, metadata = event
        if message.content:
            print(message.content, end="", flush=True)


asyncio.run(main())
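Each event yielded with stream_mode="messages" is a (message, metadata) pair. The unpacking pattern above can be sketched against fake events without a live gateway call (Chunk and collect_stream are illustrative stand-ins, not part of LangGraph):

```python
from dataclasses import dataclass


@dataclass
class Chunk:
    """Stand-in for a streamed message chunk; real chunks carry more fields."""
    content: str


def collect_stream(events) -> str:
    """Concatenate the text of (message, metadata) events, skipping empty chunks."""
    parts = []
    for message, metadata in events:
        if message.content:
            parts.append(message.content)
    return "".join(parts)


fake_events = [(Chunk("It is "), {}), (Chunk(""), {}), (Chunk("sunny in Paris."), {})]
print(collect_stream(fake_events))  # It is sunny in Paris.
```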
Next steps

  • Features: Observational memory, streaming, BYOK, and gateway tools
  • Models: Supported providers and model routing
  • API reference: Complete endpoint documentation