# Features

The Mastra Memory Gateway combines an OpenAI-compatible API proxy with built-in memory and tool capabilities. This page covers each feature in detail.

## Observational Memory

[Observational Memory](https://mastra.ai/research/observational-memory) is the gateway's core differentiator. It automatically extracts and stores observations from every conversation, then injects relevant context into future requests. Your application does not need any memory management code.

### How it works

1. Your app sends a request with `x-thread-id` and `x-resource-id` headers
2. The gateway loads existing observations for that thread and resource
3. Observations are injected into the context
4. The request is proxied to the model provider
5. New observations are extracted from the response and stored
6. The response is streamed back to your app

Your application only sends the current message. The gateway handles all context assembly behind the scenes.

### Thread and resource headers

Observations are scoped per thread. Each thread maintains its own observation history.

- **Thread ID** (`x-thread-id`): Groups messages into a conversation. Use a unique ID per conversation, topic, or session.
- **Resource ID** (`x-resource-id`): Identifies the end user or entity that owns the thread. Use this to associate threads with a user, account, or organization in your application.

Both headers are required to activate Observational Memory. If `x-thread-id` is provided without `x-resource-id`, the gateway returns a `400` error.

> **Note:** A thread is bound to its resource ID on creation. If a subsequent request sends a different `x-resource-id` for the same `x-thread-id`, the gateway rejects it with a `404` because one resource cannot access another resource's thread.
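Outside of an SDK, a memory-enabled request is just two extra headers on an ordinary chat completion call. Here is a minimal Python sketch using only the standard library; the request is constructed but not sent, so uncomment the `urlopen` line (with a real key) to actually call the gateway:

```python
import json
import urllib.request

# Build a memory-enabled chat completion request.
payload = {
    "model": "google/gemini-2.5-flash",
    "messages": [{"role": "user", "content": "Explain Newton's first law."}],
}

req = urllib.request.Request(
    "https://gateway-api.mastra.ai/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": "Bearer YOUR_API_KEY",
        "Content-Type": "application/json",
        # Both headers are required to activate Observational Memory.
        "x-thread-id": "topic-physics-101",
        "x-resource-id": "student-jane",
    },
    method="POST",
)

# response = urllib.request.urlopen(req)  # uncomment to send
```

Reusing the same `x-thread-id` on later requests is what lets the gateway inject the stored observations back into context.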
```bash
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: topic-physics-101" \
  -H "x-resource-id: student-jane" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [{ "role": "user", "content": "Explain Newton'\''s first law." }]
  }'
```

### Memory configuration

Configure observational memory per project in the [Memory Gateway dashboard](https://gateway.mastra.ai) under **Settings → Observational Memory Thresholds**.

**observationTokens** (`number`): Maximum token budget for the observation context injected into prompts. (Default: `30000`)

**reflectionTokens** (`number`): Maximum token budget for reflection summaries generated from observations. (Default: `40000`)

## Model selection

Switch between models by changing the `model` field in your request. No provider configuration, SDK changes, or API key swaps required.

```bash
# Use Claude
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "anthropic/claude-sonnet-4-6",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'

# Switch to GPT, same endpoint, same key
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
```

The gateway supports hundreds of models across multiple providers. See the [Models](https://gateway.mastra.ai/docs/models) page for the full list of supported providers and routing details.

## Bring your own key

By default, requests are routed through shared infrastructure with no provider keys needed. With Bring Your Own Key (BYOK), you use your own provider API keys for direct access to OpenAI, Anthropic, Google, and other providers while keeping the gateway's memory and tool features.
> **Note:** BYOK is available on the **Teams plan** and above.

### Configure in the dashboard

Add provider keys in the [Memory Gateway dashboard](https://gateway.mastra.ai) under **Settings → Bring your own key**:

1. Open your project settings
2. In the **Bring your own key** section, click **Add key**
3. Select a provider (OpenAI, Anthropic, Google, or a custom provider)
4. Paste your provider API key

Once configured, all requests for that provider are routed directly using your key.

### Per-request pass-through

To send a provider key with a single request instead of storing it, pass your provider key in the `Authorization` header and your Mastra key in `X-Memory-Gateway-Authorization`:

```bash
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "X-Memory-Gateway-Authorization: Bearer YOUR_MASTRA_KEY" \
  -H "Authorization: Bearer YOUR_OPENAI_KEY" \
  -H "Content-Type: application/json" \
  -H "x-thread-id: my-thread" \
  -H "x-resource-id: user-42" \
  -d '{
    "model": "openai/gpt-5.4",
    "messages": [{ "role": "user", "content": "Hello!" }]
  }'
```

The gateway authenticates with the Mastra key, resolves the provider from the model ID, and forwards the request using your provider key. Memory features continue to work normally.

## API compatibility

The gateway exposes three proxy endpoints that match the native API formats. Visit the [API reference](https://gateway.mastra.ai/docs/api/overview) for full details on each endpoint.

### OpenAI Chat Completions API

OpenAI Chat Completions format. Works with any OpenAI-compatible SDK or HTTP client.

### Anthropic Messages API

Anthropic Messages API format. Use the `x-api-key` header or Anthropic SDK for authentication. A successful response returns a `content` array containing the model's reply.

### OpenAI Responses API

OpenAI Responses format for multi-turn, agentic workflows.

## Gateway tools

The gateway can inject server-side tools into requests.
Tools are added transparently: the model sees them as available functions, and the gateway handles execution.

### Web search

The `web_search` tool gives models access to current information from the web. Enable it per project in the dashboard or per request with the `x-gateway-tools` header.

```bash
# Enable web search for this request
curl https://gateway-api.mastra.ai/v1/chat/completions \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "x-gateway-tools: web_search" \
  -d '{
    "model": "google/gemini-2.5-flash",
    "messages": [{ "role": "user", "content": "What happened in tech news today?" }]
  }'
```

### Tool header overrides

The `x-gateway-tools` header controls tool injection per request:

| Header value | Behavior                                           |
| ------------ | -------------------------------------------------- |
| `web_search` | Enable web search (even if not in project config)  |
| `none`       | Disable all gateway tools for this request         |
| _(omitted)_  | Fall back to project-level tool configuration      |

Gateway tools are only injected when the model doesn't already define a tool with the same name in the request body.

## Streaming

All three proxy endpoints support streaming. The gateway passes through the upstream stream as-is, so standard SDK streaming patterns work without changes.
**Python**:

```python title="main.py"
from openai import OpenAI

client = OpenAI(
    api_key="YOUR_API_KEY",
    base_url="https://gateway-api.mastra.ai/v1",
)

stream = client.chat.completions.create(
    model="google/gemini-2.5-flash",
    messages=[{"role": "user", "content": "Tell me a story."}],
    stream=True,
    extra_headers={
        "x-thread-id": "story-thread",
        "x-resource-id": "user-1",
    },
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

**TypeScript**:

```typescript title="index.ts"
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: "YOUR_API_KEY",
  baseURL: "https://gateway-api.mastra.ai/v1",
});

const stream = await client.chat.completions.create(
  {
    model: "google/gemini-2.5-flash",
    messages: [{ role: "user", content: "Tell me a story." }],
    stream: true,
  },
  {
    headers: {
      "x-thread-id": "story-thread",
      "x-resource-id": "user-1",
    },
  },
);

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content;
  if (content) process.stdout.write(content);
}
```

## Authentication

The gateway supports two authentication modes:

| Mode                | Headers                                                                                    | Use case                                          |
| ------------------- | ------------------------------------------------------------------------------------------ | ------------------------------------------------- |
| Direct              | `Authorization: Bearer msk_...`                                                            | Standard usage with gateway-managed provider keys |
| Pass-through (BYOK) | `X-Memory-Gateway-Authorization: Bearer msk_...` + `Authorization: Bearer <provider key>`  | Use your own provider key with gateway memory     |

All API keys use the `msk_` prefix and are created from the Mastra dashboard.

The Anthropic SDK sends credentials via `x-api-key` instead of `Authorization`.
The gateway accepts both formats, so the Anthropic SDK works without any auth workarounds:

```python title="main.py"
import anthropic

# x-api-key is sent automatically by the SDK
client = anthropic.Anthropic(
    api_key="msk_...",
    base_url="https://gateway-api.mastra.ai/v1",
)
```

## Memory API

In addition to automatic memory through proxy requests, the gateway provides a REST API for direct memory management. Use it to create threads, retrieve conversation history, and inspect observations.

See the [API reference](https://gateway.mastra.ai/docs/api/overview) for the full endpoint documentation.
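As a rough illustration, reading a thread through the Memory API might look like the sketch below. The `/v1/memory/threads/{id}` path is hypothetical, invented for this example; consult the API reference for the actual routes. The request is constructed but not sent:

```python
import urllib.request

# NOTE: the path below is hypothetical, for illustration only --
# check the API reference for the real Memory API routes.
thread_id = "topic-physics-101"

req = urllib.request.Request(
    f"https://gateway-api.mastra.ai/v1/memory/threads/{thread_id}",
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    method="GET",
)

# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode())  # thread metadata and observations
```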