
API reference

Base URL: https://gateway-api.mastra.ai

All requests require a valid API key with the msk_ prefix. Get your API key from the Memory Gateway dashboard.

Authentication

The gateway supports three authentication modes:

Direct mode

Send your gateway API key in the Authorization header. The gateway routes the request to the configured provider.

Authorization: Bearer msk_...

Pass-through mode (BYOK)

Send your gateway key in X-Memory-Gateway-Authorization and your own provider key in Authorization. The gateway uses your provider key for the upstream request. BYOK is available on the Teams plan and above.

X-Memory-Gateway-Authorization: Bearer msk_...
Authorization: Bearer <your-provider-key>
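
For example, a pass-through request to the chat completions endpoint might look like the following sketch (the gateway and provider keys are placeholders, and `openai/gpt-5.4` stands in for whatever model your project routes to):

```shell
# BYOK: gateway key in X-Memory-Gateway-Authorization,
# your own provider key in Authorization.
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "X-Memory-Gateway-Authorization: Bearer msk_YOUR_GATEWAY_KEY" \
  -H "Authorization: Bearer YOUR_PROVIDER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```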

Anthropic mode

Send your gateway key in the x-api-key header, matching the native Anthropic SDK format.

x-api-key: msk_...

Gateway headers

These headers control gateway behavior on LLM proxy endpoints:

x-thread-id: Thread ID for memory. The gateway loads prior observations and saves new ones to this thread. Requires x-resource-id.
x-resource-id: Resource ID for memory scoping. Identifies the end user or entity that owns the thread. Required when x-thread-id is set.
x-gateway-tools: Override gateway tool injection. Set to web_search to enable or none to disable, regardless of project settings.
Warning: Sending x-thread-id without x-resource-id returns a 400 Bad Request error. Both headers must be provided together to activate memory.

LLM proxy endpoints

POST /v1/chat/completions

Proxy requests to the OpenAI Chat Completions API. The request body and response follow the OpenAI Chat Completions format.

curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Forwarded headers: chatgpt-account-id

Supports: Streaming ("stream": true)

POST /v1/messages

Proxy requests to the Anthropic Messages API. The request body and response follow the Anthropic Messages format.

curl https://gateway-api.mastra.ai/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Forwarded headers: anthropic-version, anthropic-beta

Supports: Streaming ("stream": true)

POST /v1/responses

Proxy requests to the OpenAI Responses API. The request body and response follow the OpenAI Responses format.

curl https://gateway-api.mastra.ai/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "input": "Hello"
  }'

Forwarded headers: chatgpt-account-id

Supports: Streaming ("stream": true)

Memory endpoints

All memory endpoints are under /v1/memory and require a valid API key. All operations are scoped to the project associated with the API key.

Threads

GET /v1/memory/threads

List threads for the current project.

Query parameters:

resourceId? (string): Filter threads by resource ID.
limit? (integer, default 50): Maximum results to return. Max: 200.
offset? (integer, default 0): Number of results to skip.
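
A request combining these parameters might look like this (the API key and resource ID are placeholders):

```shell
# List up to 20 threads belonging to one resource (end user).
curl "https://gateway-api.mastra.ai/v1/memory/threads?resourceId=user_456&limit=20" \
  -H "Authorization: Bearer YOUR_API_KEY"
```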

Response:

{
  "threads": [
    {
      "id": "thread_abc",
      "projectId": "proj_123",
      "resourceId": "user_456",
      "title": "Conversation about weather",
      "metadata": { "topic": "weather" },
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}

POST /v1/memory/threads

Create a new thread.

Request body:

resourceId (string): Resource ID to associate with the thread.
id? (string): Custom thread ID. A random ID is generated if omitted.
title? (string): Display title for the thread.
metadata? (object): Arbitrary metadata to attach to the thread.
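
For example, creating a thread with a title and metadata (placeholder key and IDs):

```shell
# Create a thread for a given resource; the gateway generates
# the thread ID since none is supplied.
curl https://gateway-api.mastra.ai/v1/memory/threads \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": { "topic": "weather" }
  }'
```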

Response: 201 Created

{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

GET /v1/memory/threads/:threadId

Get a single thread by ID.

Response:

{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

Errors: 404 if the thread does not exist.

PATCH /v1/memory/threads/:threadId

Update a thread's title or metadata.

Request body:

title? (string): New title for the thread.
metadata? (object): New metadata for the thread. Replaces existing metadata.
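
For example, renaming a thread (placeholder key and thread ID):

```shell
# Update only the title; omitting metadata leaves it unchanged.
curl -X PATCH https://gateway-api.mastra.ai/v1/memory/threads/thread_abc \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "title": "Renamed thread" }'
```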

Response: The updated thread object.

Errors: 404 if the thread does not exist.

DELETE /v1/memory/threads/:threadId

Delete a thread and its associated data.

Response:

{ "ok": true }

Errors: 404 if the thread does not exist.

POST /v1/memory/threads/:threadId/clone

Clone a thread and its messages.

Response: 201 Created

{
  "thread": {
    "id": "thread_new",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

Messages

GET /v1/memory/threads/:threadId/messages

List messages in a thread.

Query parameters:

limit? (integer, default 50): Maximum results to return. Max: 200.
offset? (integer, default 0): Number of results to skip.
order? (string, default asc): Sort order. `asc` or `desc`.

Response:

{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}

Errors: 404 if the thread does not exist.

POST /v1/memory/threads/:threadId/messages

Save messages to a thread.

Request body:

messages (array): Array of message objects. Minimum 1 item. Each message has `role` (user, assistant, system, or tool), `content`, and an optional `type` (default: text).
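
For example, saving a user/assistant exchange to a thread (placeholder key and thread ID):

```shell
# Append two messages to an existing thread in one request.
curl https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Hello" },
      { "role": "assistant", "content": "Hi! How can I help?" }
    ]
  }'
```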

Response: 201 Created

{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ]
}

Errors: 404 if the thread does not exist.

DELETE /v1/memory/threads/:threadId/messages

Delete specific messages from a thread.

Request body:

messageIds (array): Array of message IDs to delete. Minimum 1 item.
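
For example, deleting two messages by ID (placeholder key, thread ID, and message IDs):

```shell
# DELETE with a JSON body listing the message IDs to remove.
curl -X DELETE https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "messageIds": ["msg_abc", "msg_def"] }'
```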

Response:

{ "ok": true }

Errors: 404 if the thread does not exist.

Observational memory

Observational memory records are created automatically when you include x-thread-id on LLM proxy requests. These endpoints allow you to read the observations the gateway has extracted.

GET /v1/memory/threads/:threadId/observations

Get the active observations for a thread.

Query parameters:

resourceId? (string): Filter by resource ID.

Response:

{
  "observations": [
    "User prefers TypeScript",
    "User is building a chat application"
  ]
}

Returns { "observations": null } if no observations exist for the thread.

GET /v1/memory/threads/:threadId/observations/record

Get the full observational memory record for a thread, including metadata about the observation process.

Query parameters:

resourceId? (string): Filter by resource ID.

Response:

{
  "record": {
    "id": "om_abc",
    "scope": "thread",
    "threadId": "thread_abc",
    "resourceId": "user_456",
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z",
    "lastObservedAt": "2025-01-01T01:00:00.000Z",
    "originType": "observation",
    "generationCount": 3,
    "activeObservations": ["User prefers TypeScript"],
    "totalTokensObserved": 1500,
    "observationTokenCount": 200,
    "pendingMessageTokens": 0,
    "isReflecting": false,
    "isObserving": false,
    "isBufferingObservation": false,
    "isBufferingReflection": false
  }
}

Returns { "record": null } if no record exists.

GET /v1/memory/threads/:threadId/observations/history

Get the history of observational memory records for a thread.

Query parameters:

limit? (integer, default 10): Maximum records to return. Max: 200.
resourceId? (string): Filter by resource ID.

Response:

{
  "records": [
    {
      "id": "om_abc",
      "scope": "thread",
      "threadId": "thread_abc",
      "resourceId": "user_456",
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z",
      "lastObservedAt": "2025-01-01T01:00:00.000Z",
      "originType": "reflection",
      "generationCount": 5,
      "activeObservations": ["User prefers TypeScript", "User is building a chat app"],
      "totalTokensObserved": 3000,
      "observationTokenCount": 400,
      "pendingMessageTokens": 0,
      "isReflecting": false,
      "isObserving": false,
      "isBufferingObservation": false,
      "isBufferingReflection": false
    }
  ]
}

Errors: 404 if the thread does not exist.