API reference
Base URL: https://gateway-api.mastra.ai
All requests require a valid API key with the msk_ prefix. Get your API key from the Memory Gateway dashboard.
Authentication
The gateway supports three authentication modes:
Direct mode
Send your gateway API key in the Authorization header. The gateway routes the request to the configured provider.
```
Authorization: Bearer msk_...
```
Pass-through mode (BYOK)
Send your gateway key in X-Memory-Gateway-Authorization and your own provider key in Authorization. The gateway uses your provider key for the upstream request. BYOK is available on the Teams plan and above.
```
X-Memory-Gateway-Authorization: Bearer msk_...
Authorization: Bearer <your-provider-key>
```
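Putting both headers together, a BYOK request to the chat completions endpoint might look like the following sketch; YOUR_PROVIDER_KEY is a placeholder for your own provider key:

```shell
# Gateway key authenticates to the gateway; provider key is forwarded upstream
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "X-Memory-Gateway-Authorization: Bearer msk_..." \
  -H "Authorization: Bearer YOUR_PROVIDER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```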
Anthropic mode
Send your gateway key in the x-api-key header, matching the native Anthropic SDK format.
```
x-api-key: msk_...
```
Gateway headers
These headers control gateway behavior on LLM proxy endpoints:
| Header | Description |
|---|---|
| x-thread-id | Thread ID for memory. The gateway loads prior observations and saves new ones to this thread. Requires x-resource-id. |
| x-resource-id | Resource ID for memory scoping. Identifies the end user or entity that owns the thread. Required when x-thread-id is set. |
| x-gateway-tools | Override gateway tool injection. Set to web_search to enable or none to disable, regardless of project settings. |
Sending x-thread-id without x-resource-id returns a 400 Bad Request error. Both headers must be provided together to activate memory.
LLM proxy endpoints
POST /v1/chat/completions
Proxy requests to the OpenAI Chat Completions API. The request body and response follow the OpenAI Chat Completions format.
```shell
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```
Forwarded headers: chatgpt-account-id
Supports: Streaming ("stream": true)
POST /v1/messages
Proxy requests to the Anthropic Messages API. The request body and response follow the Anthropic Messages format.
```shell
curl https://gateway-api.mastra.ai/v1/messages \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "anthropic-version: 2023-06-01" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "max_tokens": 1024,
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```
Forwarded headers: anthropic-version, anthropic-beta
Supports: Streaming ("stream": true)
POST /v1/responses
Proxy requests to the OpenAI Responses API. The request body and response follow the OpenAI Responses format.
```shell
curl https://gateway-api.mastra.ai/v1/responses \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: thread_123" \
  -H "x-resource-id: user_456" \
  -d '{
    "model": "openai/gpt-5.4",
    "input": "Hello"
  }'
```
Forwarded headers: chatgpt-account-id
Supports: Streaming ("stream": true)
Memory endpoints
All memory endpoints live under /v1/memory and require a valid API key. All operations are scoped to the project associated with the API key.
Threads
GET /v1/memory/threads
List threads for the current project.
Query parameters:
Response:
```json
{
  "threads": [
    {
      "id": "thread_abc",
      "projectId": "proj_123",
      "resourceId": "user_456",
      "title": "Conversation about weather",
      "metadata": { "topic": "weather" },
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}
```
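A minimal list request, omitting query parameters, might look like:

```shell
# Lists threads in the project associated with the API key
curl https://gateway-api.mastra.ai/v1/memory/threads \
  -H "Authorization: Bearer YOUR_API_KEY"
```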
POST /v1/memory/threads
Create a new thread.
Request body:
Response: 201 Created
```json
{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}
```
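As a sketch, a create request might look like the following; the resourceId and title request fields are inferred from the response shape above, so confirm them against the request body schema:

```shell
# resourceId and title are inferred field names, not confirmed by this reference
curl -X POST https://gateway-api.mastra.ai/v1/memory/threads \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "resourceId": "user_456",
    "title": "My thread"
  }'
```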
GET /v1/memory/threads/:threadId
Get a single thread by ID.
Response:
```json
{
  "thread": {
    "id": "thread_abc",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}
```
Errors: 404 if the thread does not exist.
PATCH /v1/memory/threads/:threadId
Update a thread's title or metadata.
Request body:
Response: The updated thread object.
Errors: 404 if the thread does not exist.
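For example, to rename a thread (assuming the request body accepts the same title field returned in the thread object):

```shell
# title is assumed to be an accepted request field; verify against the schema
curl -X PATCH https://gateway-api.mastra.ai/v1/memory/threads/thread_abc \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "title": "Renamed thread" }'
```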
DELETE /v1/memory/threads/:threadId
Delete a thread and its associated data.
Response:
{ "ok": true }
Errors: 404 if the thread does not exist.
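For example:

```shell
# Deletes the thread and its associated data
curl -X DELETE https://gateway-api.mastra.ai/v1/memory/threads/thread_abc \
  -H "Authorization: Bearer YOUR_API_KEY"
```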
POST /v1/memory/threads/:threadId/clone
Clone a thread and its messages.
Response: 201 Created
```json
{
  "thread": {
    "id": "thread_new",
    "projectId": "proj_123",
    "resourceId": "user_456",
    "title": "My thread",
    "metadata": null,
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z"
  }
}
```
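No request body is documented for cloning, so a bare POST should suffice:

```shell
# Clones thread_abc, returning the new thread object
curl -X POST https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/clone \
  -H "Authorization: Bearer YOUR_API_KEY"
```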
Messages
GET /v1/memory/threads/:threadId/messages
List messages in a thread.
Query parameters:
Response:
```json
{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ],
  "total": 1
}
```
Errors: 404 if the thread does not exist.
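A minimal list request, omitting query parameters:

```shell
# Lists messages in thread_abc
curl https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY"
```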
POST /v1/memory/threads/:threadId/messages
Save messages to a thread.
Request body:
Response: 201 Created
```json
{
  "messages": [
    {
      "id": "msg_abc",
      "threadId": "thread_abc",
      "role": "user",
      "content": "Hello",
      "type": "text",
      "createdAt": "2025-01-01T00:00:00.000Z"
    }
  ]
}
```
Errors: 404 if the thread does not exist.
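As a sketch, assuming the request body takes a messages array with the same role, content, and type fields shown in the response:

```shell
# The messages array shape is inferred from the response; verify against the schema
curl -X POST https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "Hello", "type": "text" }
    ]
  }'
```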
DELETE /v1/memory/threads/:threadId/messages
Delete specific messages from a thread.
Request body:
Response:
{ "ok": true }
Errors: 404 if the thread does not exist.
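The request body field is not documented here; the following is a sketch using a hypothetical messageIds array:

```shell
# messageIds is a hypothetical field name; check the request body schema
curl -X DELETE https://gateway-api.mastra.ai/v1/memory/threads/thread_abc/messages \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{ "messageIds": ["msg_abc"] }'
```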
Observational memory
Observational memory records are created automatically when you include x-thread-id on LLM proxy requests. These endpoints allow you to read the observations the gateway has extracted.
GET /v1/memory/threads/:threadId/observations
Get the active observations for a thread.
Query parameters:
Response:
```json
{
  "observations": [
    "User prefers TypeScript",
    "User is building a chat application"
  ]
}
```
Returns { "observations": null } if no observations exist for the thread.
GET /v1/memory/threads/:threadId/observations/record
Get the full observational memory record for a thread, including metadata about the observation process.
Query parameters:
Response:
```json
{
  "record": {
    "id": "om_abc",
    "scope": "thread",
    "threadId": "thread_abc",
    "resourceId": "user_456",
    "createdAt": "2025-01-01T00:00:00.000Z",
    "updatedAt": "2025-01-01T00:00:00.000Z",
    "lastObservedAt": "2025-01-01T01:00:00.000Z",
    "originType": "observation",
    "generationCount": 3,
    "activeObservations": ["User prefers TypeScript"],
    "totalTokensObserved": 1500,
    "observationTokenCount": 200,
    "pendingMessageTokens": 0,
    "isReflecting": false,
    "isObserving": false,
    "isBufferingObservation": false,
    "isBufferingReflection": false
  }
}
```
Returns { "record": null } if no record exists.
GET /v1/memory/threads/:threadId/observations/history
Get the history of observational memory records for a thread.
Query parameters:
Response:
```json
{
  "records": [
    {
      "id": "om_abc",
      "scope": "thread",
      "threadId": "thread_abc",
      "resourceId": "user_456",
      "createdAt": "2025-01-01T00:00:00.000Z",
      "updatedAt": "2025-01-01T00:00:00.000Z",
      "lastObservedAt": "2025-01-01T01:00:00.000Z",
      "originType": "reflection",
      "generationCount": 5,
      "activeObservations": ["User prefers TypeScript", "User is building a chat app"],
      "totalTokensObserved": 3000,
      "observationTokenCount": 400,
      "pendingMessageTokens": 0,
      "isReflecting": false,
      "isObserving": false,
      "isBufferingObservation": false,
      "isBufferingReflection": false
    }
  ]
}
```
Errors: 404 if the thread does not exist.