LangGraph

LangGraph agents connect to the gateway through the ChatOpenAI class from @langchain/openai. Point it at the gateway base URL, and all LLM calls are proxied through the gateway with automatic memory.

Before you begin

Complete the LangGraph JS quickstart first. It covers installation and the base project setup for LangGraph.

Configure the model

Create a ChatOpenAI instance with your gateway API key and base URL.

src/agent.ts
import { ChatOpenAI } from "@langchain/openai";

export const model = new ChatOpenAI({
  model: "google/gemini-2.5-flash",
  configuration: {
    apiKey: "YOUR_API_KEY",
    baseURL: "https://gateway-api.mastra.ai/v1",
  },
});

All subsequent examples import this model instance from ./agent.
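Hardcoding the key is fine for a quick test, but in practice you'll likely read it from the environment. A minimal sketch, assuming the key is stored in a MASTRA_GATEWAY_API_KEY variable (the variable name is illustrative, not prescribed by the gateway):

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Read the gateway key from the environment instead of hardcoding it.
const apiKey = process.env.MASTRA_GATEWAY_API_KEY;
if (!apiKey) {
  throw new Error("MASTRA_GATEWAY_API_KEY is not set");
}

export const model = new ChatOpenAI({
  model: "google/gemini-2.5-flash",
  configuration: {
    apiKey,
    baseURL: "https://gateway-api.mastra.ai/v1",
  },
});
```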

Chat completions

Send a basic message through the model.

src/chat.ts
import { model } from "./agent";

const response = await model.invoke([
  { role: "user", content: "What is 2+2? Reply with just the number." },
]);

console.log(response.content);
// "4"
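The gateway's feature list includes streaming; assuming it is enabled for your key, ChatOpenAI's standard .stream() method should work unchanged. A sketch:

```typescript
import { model } from "./agent";

// Stream tokens as they arrive instead of waiting for the full response.
const stream = await model.stream([
  { role: "user", content: "Count from 1 to 5." },
]);

for await (const chunk of stream) {
  process.stdout.write(String(chunk.content));
}
```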

System messages

Add a system message at the start of the messages array to define the model's behavior.

const response = await model.invoke([
  { role: "system", content: "You are a calculator. Only respond with numbers, no words." },
  { role: "user", content: "What is 10 * 5?" },
]);

console.log(response.content);
// "50"

Memory with thread and resource IDs

Pass x-thread-id and x-resource-id as default headers to enable observational memory. The gateway stores observations per thread and injects them as context on subsequent requests.

src/agent.ts
import { ChatOpenAI } from "@langchain/openai";

export const model = new ChatOpenAI({
  model: "google/gemini-2.5-flash",
  configuration: {
    apiKey: "YOUR_API_KEY",
    baseURL: "https://gateway-api.mastra.ai/v1",
    defaultHeaders: {
      "x-thread-id": "my-thread-1",
      "x-resource-id": "user-42",
    },
  },
});

// First request: introduce yourself
await model.invoke([
  { role: "user", content: "My name is Alex and I prefer concise answers." },
]);

// Second request: the gateway remembers
const response = await model.invoke([
  { role: "user", content: "What is my name?" },
]);

console.log(response.content);
// "Alex"
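Because defaultHeaders are fixed when the model is constructed, one ChatOpenAI instance is pinned to one thread. If you serve many conversations, a small factory keeps each thread's memory scope separate (the modelForThread helper below is our own sketch, not part of any library):

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Build a model bound to a specific thread/resource pair.
export function modelForThread(threadId: string, resourceId: string) {
  return new ChatOpenAI({
    model: "google/gemini-2.5-flash",
    configuration: {
      apiKey: "YOUR_API_KEY",
      baseURL: "https://gateway-api.mastra.ai/v1",
      defaultHeaders: {
        "x-thread-id": threadId,
        "x-resource-id": resourceId,
      },
    },
  });
}

// Each conversation gets its own observational memory.
const aliceModel = modelForThread("thread-alice-1", "user-alice");
```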

Tool calling

Bind tools using the @langchain/core tool helper with Zod schemas.

src/tools.ts
import { ChatOpenAI } from "@langchain/openai";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const getWeather = tool(
  async ({ location }) => {
    return `The weather in ${location} is sunny, 72°F.`;
  },
  {
    name: "get_weather",
    description: "Get the current weather for a given location",
    schema: z.object({
      location: z.string().describe("The city to get weather for"),
    }),
  },
);

const modelWithTools = new ChatOpenAI({
  model: "google/gemini-2.5-flash",
  configuration: {
    apiKey: "YOUR_API_KEY",
    baseURL: "https://gateway-api.mastra.ai/v1",
  },
}).bindTools([getWeather]);

const response = await modelWithTools.invoke([
  { role: "user", content: "What is the weather in San Francisco?" },
]);

// The response contains tool_calls when the model wants to use a tool
if (response.tool_calls && response.tool_calls.length > 0) {
  console.log(response.tool_calls[0].name);  // "get_weather"
  console.log(response.tool_calls[0].args);  // { location: "San Francisco" }
}
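Outside a graph, you can also close the loop by hand: run each requested tool, append the results, and call the model again. Continuing from the example above, a sketch that relies on @langchain/core's behavior of returning a tool message when a tool is invoked with a tool call object:

```typescript
if (response.tool_calls?.length) {
  // Execute each requested tool; invoking a tool with a tool call
  // yields a tool message carrying the matching tool_call_id.
  const toolMessages = await Promise.all(
    response.tool_calls.map((toolCall) => getWeather.invoke(toolCall)),
  );

  // Send the tool results back so the model can produce a final answer.
  const final = await modelWithTools.invoke([
    { role: "user", content: "What is the weather in San Francisco?" },
    response,
    ...toolMessages,
  ]);
  console.log(final.content);
}
```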

Full agent with StateGraph

Combine ChatOpenAI, tools, and a StateGraph to build a ReAct-style agent that loops between the model and tool execution until the model stops calling tools.

src/agent.ts
import { StateGraph, MessagesAnnotation } from "@langchain/langgraph";
import { ChatOpenAI } from "@langchain/openai";
import { ToolNode } from "@langchain/langgraph/prebuilt";
import { tool } from "@langchain/core/tools";
import { z } from "zod";

const getWeather = tool(
  async ({ location }) => {
    return `The weather in ${location} is sunny, 72°F.`;
  },
  {
    name: "get_weather",
    description: "Get the current weather for a given location",
    schema: z.object({
      location: z.string().describe("The city to get weather for"),
    }),
  },
);

const tools = [getWeather];
const toolNode = new ToolNode(tools);

const model = new ChatOpenAI({
  model: "google/gemini-2.5-flash",
  temperature: 0,
  configuration: {
    apiKey: "YOUR_API_KEY",
    baseURL: "https://gateway-api.mastra.ai/v1",
    defaultHeaders: {
      "x-thread-id": "weather-agent-thread",
    },
  },
}).bindTools(tools);

async function agent(state: typeof MessagesAnnotation.State) {
  const response = await model.invoke(state.messages);
  return { messages: [response] };
}

function shouldContinue(state: typeof MessagesAnnotation.State) {
  const lastMessage = state.messages[state.messages.length - 1];
  if (
    "tool_calls" in lastMessage &&
    Array.isArray(lastMessage.tool_calls) &&
    lastMessage.tool_calls.length > 0
  ) {
    return "tools";
  }
  return "__end__";
}

const workflow = new StateGraph(MessagesAnnotation)
  .addNode("agent", agent)
  .addNode("tools", toolNode)
  .addEdge("__start__", "agent")
  .addConditionalEdges("agent", shouldContinue)
  .addEdge("tools", "agent");

const graph = workflow.compile();

// Run the agent
const result = await graph.invoke({
  messages: [{ role: "user", content: "What is the weather in San Francisco?" }],
});

const reply = result.messages.at(-1);
console.log(reply?.content);
// "The weather in San Francisco is sunny, 72°F."

The agent routes __start__ → agent → tools → agent until no tool_calls remain, then exits at __end__.
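Instead of invoke, a compiled graph can also stream intermediate state as the loop runs. With streamMode: "values", each yielded chunk is the full state at that step; a sketch, continuing from the graph above:

```typescript
const stream = await graph.stream(
  {
    messages: [
      { role: "user", content: "What is the weather in San Francisco?" },
    ],
  },
  { streamMode: "values" },
);

for await (const state of stream) {
  // Each snapshot includes every message accumulated so far.
  console.log(state.messages.at(-1)?.content);
}
```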

  • Features: Observational memory, streaming, BYOK, and gateway tools
  • Models: Supported providers and model routing
  • API reference: Complete endpoint documentation