
API reference

Base URL: https://gateway-api.mastra.ai

All requests require a valid API key with the msk_ prefix. Get your API key from the Memory Gateway dashboard.

Authentication

The gateway supports three authentication modes:

Direct mode

Send your gateway API key in the Authorization header. The gateway routes the request to the configured provider.

Authorization: Bearer msk_...

Pass-through mode (BYOK)

Send your gateway key in X-Memory-Gateway-Authorization and your own provider key in Authorization. The gateway uses your provider key for the upstream request. BYOK is available on the Teams plan and above.

X-Memory-Gateway-Authorization: Bearer msk_...
Authorization: Bearer <your-provider-key>
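
For example, a pass-through request to the chat completions endpoint might look like the following sketch (the gateway and provider keys are placeholders, and `openai/gpt-5.4` stands in for whatever model your project routes to):

```shell
# BYOK: gateway key in X-Memory-Gateway-Authorization,
# your own provider key in Authorization.
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "X-Memory-Gateway-Authorization: Bearer msk_YOUR_GATEWAY_KEY" \
  -H "Authorization: Bearer YOUR_PROVIDER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{ "role": "user", "content": "Hello" }]
  }'
```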

Anthropic mode

Send your gateway key in the x-api-key header, matching the native Anthropic SDK format.

x-api-key: msk_...

Gateway headers

These headers control gateway behavior on LLM proxy endpoints:

x-thread-id: Thread ID for memory. The gateway loads prior observations and saves new ones to this thread. Requires x-resource-id.
x-resource-id: Resource ID for memory scoping. Identifies the end user or entity that owns the thread. Required when x-thread-id is set.
x-gateway-tools: Override gateway tool injection. Set to web_search to enable or none to disable, regardless of project settings.
Warning: Sending x-thread-id without x-resource-id returns a 400 Bad Request error. Both headers must be provided together to activate memory.

LLM proxy endpoints

POST /v1/chat/completions

Proxy requests to the OpenAI Chat Completions API. The request body and response follow the OpenAI Chat Completions format.

curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Forwarded headers: chatgpt-account-id

Supports: Streaming ("stream": true)

POST /v1/messages

Proxy requests to the Anthropic Messages API. The request body and response follow the Anthropic Messages format.

curl https://gateway-api.mastra.ai/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Forwarded headers: anthropic-version, anthropic-beta

Supports: Streaming ("stream": true)

POST /v1/responses

Proxy requests to the OpenAI Responses API. The request body and response follow the OpenAI Responses format.

curl https://gateway-api.mastra.ai/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "input": "Hello"
  }'

Forwarded headers: chatgpt-account-id

Supports: Streaming ("stream": true)

Memory endpoints

All memory endpoints are under /v1/memory and require a valid API key. All operations are scoped to the project associated with the API key.

Threads

GET /v1/memory/threads

List threads for the current project.

Query parameters:

resourceId? (string): Filter threads by resource ID.
limit? (integer, default 50): Maximum results to return. Max: 200.
offset? (integer, default 0): Number of results to skip.
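
A request combining these parameters might look like this (the API key and resource ID are placeholders):

```shell
# List up to 20 threads belonging to one resource (end user).
curl "https://gateway-api.mastra.ai/v1/memory/threads?resourceId=user_456&limit=20" \
  -H "Authorization: Bearer YOUR_API_KEY"
```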

Response:

{
  "threads": [
    {
      "id": "thread_abc",
      "projectId": "proj_123",
      "resourceId": "user_456",
      "title": "Conversation about weather",
      "metadata": { "topic": "weather" },
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}

POST /v1/memory/threads

Create a new thread.

Request body:

resourceId (string): Resource ID to associate with the thread.
id? (string): Custom thread ID. A random ID is generated if omitted.
title? (string): Display title for the thread.
metadata? (object): Arbitrary metadata to attach to the thread.
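
For example, creating a thread with a title and metadata (placeholder key and IDs):

```shell
# Create a thread for a given resource; the gateway generates
# the thread ID since none is supplied.
curl https://gateway-api.mastra.ai/v1/memory/threads \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": { "topic": "weather" }
  }'
```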

Response: 201 Created

{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

GET /v1/memory/threads/:threadId

Get a single thread by ID.

Response:

{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

Errors: 404 if the thread does not exist.

PATCH /v1/memory/threads/:threadId

Update a thread's title or metadata.

Request body:

title? (string): New title for the thread.
metadata? (object): New metadata for the thread. Replaces existing metadata.
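
For example, renaming a thread (placeholder key and thread ID):

```shell
# Update only the title; omitting metadata leaves it unchanged.
curl -X PATCH https://gateway-api.mastra.ai/v1/memory/threads/thread_abc \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "title": "Renamed thread" }'
```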

Response: The updated thread object.

Errors: 404 if the thread does not exist.

DELETE /v1/memory/threads/:threadId

Delete a thread and its associated data.

Response:

{ "ok": true }

Errors: 404 if the thread does not exist.

POST /v1/memory/threads/:threadId/clone

Clone a thread and its messages.

Response: 201 Created

{
  "thread": {
    "id": "thread_new",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}

Messages

GET /v1/memory/threads/:threadId/messages

List messages in a thread.

Query parameters:

limit? (integer, default 50): Maximum results to return. Max: 200.
offset? (integer, default 0): Number of results to skip.
order? (string, default asc): Sort order. `asc` or `desc`.

Response:

{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}

Errors: 404 if the thread does not exist.

POST /v1/memory/threads/:threadId/messages

Save messages to a thread.

Request body:

messages (array): Array of message objects. Minimum 1 item. Each message has `role` (user, assistant, system, or tool), `content`, and an optional `type` (default: text).
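
For example, saving a user/assistant exchange to a thread (placeholder key and thread ID):

```shell
# Append two messages to an existing thread in one request.
curl https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Hello" },
      { "role": "assistant", "content": "Hi! How can I help?" }
    ]
  }'
```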

Response: 201 Created

{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ]
}

Errors: 404 if the thread does not exist.

DELETE /v1/memory/threads/:threadId/messages

Delete specific messages from a thread.

Request body:

messageIds (array): Array of message IDs to delete. Minimum 1 item.
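
For example, deleting two messages by ID (placeholder key, thread ID, and message IDs):

```shell
# DELETE with a JSON body listing the message IDs to remove.
curl -X DELETE https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "messageIds": ["msg_abc", "msg_def"] }'
```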

Response:

{ "ok": true }

Errors: 404 if the thread does not exist.

Observational memory

Observational memory records are created automatically when you include x-thread-id on LLM proxy requests. These endpoints allow you to read the observations the gateway has extracted.

GET /v1/memory/threads/:threadId/observations

Get the active observations for a thread.

Query parameters:

resourceId? (string): Filter by resource ID.

Response:

{
  "observations": [
    "User prefers TypeScript",
    "User is building a chat application"
  ]
}

Returns { "observations": null } if no observations exist for the thread.

GET /v1/memory/threads/:threadId/observations/record

Get the full observational memory record for a thread, including metadata about the observation process.

Query parameters:

resourceId? (string): Filter by resource ID.

Response:

{
  "record": {
    "id": "om_abc",
    "scope": "thread",
    "threadId": "thread_abc",
    "resourceId": "user_456",
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z",
    "lastObservedAt": "2025-01-01T01:00:00.000Z",
    "originType": "observation",
    "generationCount": 3,
    "activeObservations": ["User prefers TypeScript"],
    "totalTokensObserved": 1500,
    "observationTokenCount": 200,
    "pendingMessageTokens": 0,
    "isReflecting": false,
    "isObserving": false,
    "isBufferingObservation": false,
    "isBufferingReflection": false
  }
}

Returns { "record": null } if no record exists.

GET /v1/memory/threads/:threadId/observations/history

Get the history of observational memory records for a thread.

Query parameters:

limit? (integer, default 10): Maximum records to return. Max: 200.
resourceId? (string): Filter by resource ID.

Response:

{
  "records": [
    {
      "id": "om_abc",
      "scope": "thread",
      "threadId": "thread_abc",
      "resourceId": "user_456",
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z",
      "lastObservedAt": "2025-01-01T01:00:00.000Z",
      "originType": "reflection",
      "generationCount": 5,
      "activeObservations": ["User prefers TypeScript", "User is building a chat app"],
      "totalTokensObserved": 3000,
      "observationTokenCount": 400,
      "pendingMessageTokens": 0,
      "isReflecting": false,
      "isObserving": false,
      "isBufferingObservation": false,
      "isBufferingReflection": false
    }
  ]
}

Errors: 404 if the thread does not exist.