Limits
The Mastra Memory Gateway enforces rate limits and pagination constraints to protect the service and ensure fair usage.
Rate limits
Rate limits are applied per API key. When a limit is exceeded, the gateway returns a 429 status code with a Retry-After header.
LLM proxy
Applies to POST /v1/chat/completions, POST /v1/messages, and POST /v1/responses.
| Rule | Default limit | Window |
|---|---|---|
| Global (all endpoints) | 5,000 requests | 60 seconds |
| LLM proxy | 2,000 requests | 60 seconds |
| LLM burst | 400 requests | 10 seconds |
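A client can avoid most 429 responses by tracking its own request timestamps against both the 60-second limit and the 10-second burst limit. The following is a minimal sketch rather than gateway code: the `SlidingWindowLimiter` class is hypothetical, and it assumes a sliding window, which stays within the limit whether the gateway counts fixed or sliding windows.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side guard; not part of the gateway API."""

    def __init__(self, rules, clock=time.monotonic):
        # rules: list of (max_requests, window_seconds) pairs
        self.rules = rules
        self.clock = clock
        self.sent = deque()  # timestamps of requests already sent

    def wait_time(self):
        """Return seconds to wait until one more request fits every rule."""
        now = self.clock()
        longest = max(window for _, window in self.rules)
        # Forget timestamps that are older than the longest window.
        while self.sent and now - self.sent[0] >= longest:
            self.sent.popleft()
        delay = 0.0
        for max_requests, window in self.rules:
            recent = [t for t in self.sent if now - t < window]
            if len(recent) >= max_requests:
                # This request must age out before a new one fits.
                blocker = recent[len(recent) - max_requests]
                delay = max(delay, blocker + window - now)
        return delay

    def record(self):
        """Note that a request was just sent."""
        self.sent.append(self.clock())


# Default LLM proxy limits from the table above.
llm_limiter = SlidingWindowLimiter([(2000, 60), (400, 10)])
```

Before each request, sleep for `llm_limiter.wait_time()` seconds, then call `llm_limiter.record()`. The gateway's own counters remain authoritative, so a 429 can still occur, for example when several processes share one API key.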
Memory API
Applies to the /v1/memory/* endpoints. Read and write operations have separate limits.
| Rule | Default limit | Window |
|---|---|---|
| Memory read (GET, HEAD) | 1,200 requests | 60 seconds |
| Memory write (POST, PATCH, DELETE) | 600 requests | 60 seconds |
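To reason about which limit a given request counts against, the method and path rules can be written as a small lookup. This is an illustrative sketch: `expected_scope` is a hypothetical helper, the returned names follow the X-RateLimit-Scope values listed on this page, and the `/v1/memory/threads` path in the usage note is a made-up example.

```python
LLM_PROXY_PATHS = {"/v1/chat/completions", "/v1/messages", "/v1/responses"}
READ_METHODS = {"GET", "HEAD"}
WRITE_METHODS = {"POST", "PATCH", "DELETE"}


def expected_scope(method, path):
    """Guess which rate limit scope a request falls under, or None."""
    method = method.upper()
    if method == "POST" and path in LLM_PROXY_PATHS:
        return "llm_proxy"
    if path.startswith("/v1/memory/"):
        if method in READ_METHODS:
            return "memory_read"
        if method in WRITE_METHODS:
            return "memory_write"
    # Anything else is only covered by the global limit.
    return None
```

For example, `expected_scope("GET", "/v1/memory/threads")` yields `"memory_read"`, so that request would draw from the 1,200 requests per 60 seconds budget.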
Rate limit headers
Every response includes rate limit headers:
| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Scope | Which limit applies (llm_proxy, memory_read, or memory_write) |
| Retry-After | Seconds until the limit resets (only present on 429 responses) |
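These headers can be read into a small structure so that a client can slow down before it hits the limit. The sketch below is one possible shape, not an official client: `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, and it assumes the three core headers are present with numeric values.

```python
import time
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_at: float              # Unix timestamp when the window resets
    scope: Optional[str]
    retry_after: Optional[int]   # only set on 429 responses

    def seconds_until_reset(self, now=None):
        """Seconds left in the current window, never negative."""
        now = time.time() if now is None else now
        return max(0.0, self.reset_at - now)


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    retry_after = lowered.get("retry-after")
    return RateLimitInfo(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=float(lowered["x-ratelimit-reset"]),
        scope=lowered.get("x-ratelimit-scope"),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

A client might pause for `seconds_until_reset()` when `remaining` reaches 0 instead of sending a request that would be rejected.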
Rate limit error response
{
  "error": {
    "message": "Rate limit exceeded. Please retry after the reset time.",
    "type": "rate_limit_error",
    "scope": "llm_proxy",
    "retry_after_seconds": 12
  }
}
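A 429 response carries the wait time in two places: the Retry-After header and the `retry_after_seconds` field of the body. The sketch below shows one way to honor it. The helper names are hypothetical, `send` stands in for whatever HTTP call the client makes and is assumed to return a `(status, headers, body_text)` tuple, and the one-second fallback is an arbitrary choice for when neither value can be read.

```python
import json
import time


def retry_delay(status, headers, body_text, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Prefer the Retry-After header when it holds a plain number.
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return float(value)
            except ValueError:
                break
    # Otherwise fall back to the JSON error body.
    try:
        error = json.loads(body_text).get("error", {})
        return float(error["retry_after_seconds"])
    except (ValueError, KeyError, TypeError, AttributeError):
        return default


def call_with_retry(send, max_attempts=3, sleep=time.sleep):
    """Call send() until it is not rate limited or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body_text = send()
        delay = retry_delay(status, headers, body_text)
        if delay is None or attempt == max_attempts - 1:
            return status, headers, body_text
        sleep(delay)
```

Capping the number of attempts keeps a client from looping indefinitely when the limit is being consumed by other processes that use the same API key.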
Pagination limits
List endpoints accept limit and offset query parameters.
| Parameter | Default | Maximum |
|---|---|---|
| limit (threads, messages) | 50 | 200 |
| limit (observation history) | 10 | 200 |
| offset | 0 | — |
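Walking a full list means repeating the request with a growing offset. The generator below sketches that loop under two assumptions that this page does not confirm: `fetch_page` is a hypothetical callable that performs the request and returns a list of items, and a page shorter than `limit` marks the end of the list.

```python
def iter_pages(fetch_page, limit=50, max_limit=200):
    """Yield every item from a list endpoint using limit and offset."""
    # Clamp to the documented maximum so the request is not rejected.
    limit = max(1, min(limit, max_limit))
    offset = 0
    while True:
        items = fetch_page(limit=limit, offset=offset)
        yield from items
        if len(items) < limit:
            break  # a short page is taken to mean there is nothing left
        offset += len(items)
```

Requesting the maximum page size of 200 reduces the number of calls, which also leaves more of the memory read budget for other work.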
Authentication
- All API keys use the msk_ prefix
- An invalid or missing API key returns a 401 error
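Because every key shares the msk_ prefix, a client can catch an obviously wrong value, such as a key for a different service, before sending a request. This is a hypothetical local pre-check only. It says nothing about whether the gateway will accept the key, and this page does not specify how the key is transmitted.

```python
KEY_PREFIX = "msk_"


def looks_like_gateway_key(api_key):
    """Return True if the value has the msk_ prefix plus some key material."""
    return (
        isinstance(api_key, str)
        and api_key.startswith(KEY_PREFIX)
        and len(api_key) > len(KEY_PREFIX)
    )
```

A value that fails this check would produce a 401 anyway, so failing early mainly gives a clearer error message during configuration.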
Error format
All error responses follow the same structure:
{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type"
  }
}
Common error types:
| Type | Status code | Description |
|---|---|---|
| authentication_error | 401 | Invalid or missing API key |
| not_found | 404 | Thread or resource not found |
| rate_limit_error | 429 | Rate limit exceeded |
| invalid_request_error | 400 | Malformed request or unsupported provider/endpoint combination |
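Since every error shares one structure, a client can turn the `type` field into a specific exception class. The sketch below is one way to do that. All class and function names are hypothetical, and unknown or unparseable bodies fall back to the base class rather than being guessed at.

```python
import json


class GatewayError(Exception):
    """Base class for errors decoded from the shared error structure."""

    def __init__(self, status, error_type, message):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


class AuthenticationError(GatewayError):
    pass


class NotFoundError(GatewayError):
    pass


class RateLimitError(GatewayError):
    pass


class InvalidRequestError(GatewayError):
    pass


ERROR_CLASSES = {
    "authentication_error": AuthenticationError,
    "not_found": NotFoundError,
    "rate_limit_error": RateLimitError,
    "invalid_request_error": InvalidRequestError,
}


def error_from_response(status, body_text):
    """Build an exception from a status code and a raw response body."""
    try:
        error = json.loads(body_text)["error"]
        error_type = error["type"]
        message = error["message"]
    except (ValueError, KeyError, TypeError):
        # Not the documented structure; keep the raw body for debugging.
        error_type, message = "unknown", body_text
    cls = ERROR_CLASSES.get(error_type, GatewayError)
    return cls(status, error_type, message)
```

Callers can then write `except RateLimitError:` to retry and let the other error types propagate, since retrying a 400, 401, or 404 without changing the request is unlikely to help.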