Limits

The Mastra Memory Gateway enforces rate limits and pagination constraints to protect the service and ensure fair usage.

Rate limits

Rate limits are applied per API key. When a limit is exceeded, the gateway returns a 429 status code with a Retry-After header.

LLM proxy

Applies to POST /v1/chat/completions, POST /v1/messages, and POST /v1/responses.

| Rule | Default limit | Window |
| --- | --- | --- |
| Global (all endpoints) | 5,000 requests | 60 seconds |
| LLM proxy | 2,000 requests | 60 seconds |
| LLM burst | 400 requests | 10 seconds |
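
A client can avoid most 429 responses by pacing itself against these limits before sending. The following is a minimal sketch, not part of the gateway or any Mastra SDK: `SlidingWindowLimiter` is a hypothetical helper, and it assumes a sliding window on the client side. The gateway's own windowing may differ, but a client that stays under a sliding window cannot send more than the limit within any single window of the same length.

```python
import time
from collections import deque
from typing import Callable, Deque, List, Tuple


class SlidingWindowLimiter:
    """Client-side guard that tracks send times against (limit, window) rules."""

    def __init__(self, rules: List[Tuple[int, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        # Each rule is (max_requests, window_seconds).
        self._rules = rules
        self._clock = clock
        self._history: List[Deque[float]] = [deque() for _ in rules]

    def wait_time(self) -> float:
        """Seconds to wait until one more request satisfies every rule."""
        now = self._clock()
        delay = 0.0
        for (limit, window), stamps in zip(self._rules, self._history):
            # Drop timestamps that have aged out of this rule's window.
            while stamps and now - stamps[0] >= window:
                stamps.popleft()
            if len(stamps) >= limit:
                # Wait until enough old requests leave the window.
                blocking = stamps[len(stamps) - limit]
                delay = max(delay, blocking + window - now)
        return delay

    def record(self) -> None:
        """Note that a request was just sent."""
        now = self._clock()
        for stamps in self._history:
            stamps.append(now)


# Defaults from the table above: 2,000 per 60 s plus the 400 per 10 s burst rule.
llm_limiter = SlidingWindowLimiter([(2000, 60.0), (400, 10.0)])
```

Before each request, sleep for `llm_limiter.wait_time()` seconds if it is non-zero, then call `llm_limiter.record()` once the request is sent. The limiter only sees traffic from one process, so several workers sharing an API key would each need a smaller share of the limit.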

Memory API

Applies to the /v1/memory/* endpoints. Read and write operations have separate limits.

| Rule | Default limit | Window |
| --- | --- | --- |
| Memory read (GET, HEAD) | 1,200 requests | 60 seconds |
| Memory write (POST, PATCH, DELETE) | 600 requests | 60 seconds |

Rate limit headers

Every response includes rate limit headers:

| Header | Description |
| --- | --- |
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Remaining requests in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
| X-RateLimit-Scope | Which limit applies (llm_proxy, memory_read, or memory_write) |
| Retry-After | Seconds until the limit resets (only present on 429 responses) |
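
Reading these headers lets a client track its remaining budget without waiting for a 429. The sketch below is illustrative only: `RateLimitInfo` and `parse_rate_limit_headers` are hypothetical names, and it assumes each numeric header carries a plain integer string.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_at: int               # X-RateLimit-Reset (Unix timestamp)
    scope: Optional[str]        # X-RateLimit-Scope
    retry_after: Optional[int]  # Retry-After, only set on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitInfo(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=int(h["x-ratelimit-reset"]),
        scope=h.get("x-ratelimit-scope"),
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

A caller might slow down when `remaining` approaches zero and resume once the current time passes `reset_at`.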

Rate limit error response

{
  "error": {
    "message": "Rate limit exceeded. Please retry after the reset time.",
    "type": "rate_limit_error",
    "scope": "llm_proxy",
    "retry_after_seconds": 12
  }
}
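
One way a client might decide how long to back off after this response is sketched below. `retry_delay_seconds` is a hypothetical helper, not a gateway or SDK function. It assumes the Retry-After header holds a number of seconds, falls back to `retry_after_seconds` in the body, and uses an arbitrary one-second default if neither is usable.

```python
from typing import Any, Mapping, Optional


def retry_delay_seconds(status: int,
                        headers: Mapping[str, str],
                        body: Optional[Mapping[str, Any]],
                        default: float = 1.0) -> Optional[float]:
    """Return how long to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Prefer the Retry-After header (header names are case-insensitive).
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return float(value)
            except ValueError:
                break  # not a number of seconds; fall back to the body
    error = body.get("error") if isinstance(body, Mapping) else None
    seconds = error.get("retry_after_seconds") if isinstance(error, Mapping) else None
    if isinstance(seconds, (int, float)):
        return float(seconds)
    return default  # assumed fallback, not a documented value
```

For the sample body above, `retry_delay_seconds(429, {}, body)` yields `12.0`, after which the request can be retried.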

Pagination limits

List endpoints accept limit and offset query parameters.

| Parameter | Default | Maximum |
| --- | --- | --- |
| limit (threads, messages) | 50 | 200 |
| limit (observation history) | 10 | 200 |
| offset | 0 | |
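
Walking a full list with limit and offset can be sketched as follows. This is an illustration under stated assumptions: `fetch_page` is a hypothetical callable that performs the HTTP request and returns the items of one page, since the response shape of the list endpoints is not described here. The loop also assumes that a page shorter than `limit` marks the end of the list, and that `limit` must be at least 1.

```python
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")

MAX_LIMIT = 200  # documented maximum for `limit`


def iter_pages(fetch_page: Callable[[int, int], List[T]],
               limit: int = 50) -> Iterator[T]:
    """Yield every item by advancing `offset` until a short page is returned."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    offset = 0
    while True:
        page = fetch_page(limit, offset)  # called as fetch_page(limit, offset)
        yield from page
        if len(page) < limit:
            break  # a short or empty page means nothing is left
        offset += limit
```

Using the maximum `limit` of 200 keeps the number of requests low, which matters because each page counts against the memory read limit.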

Authentication

  • All API keys use the msk_ prefix
  • An invalid or missing API key returns a 401 error

Error format

All error responses share the same base structure. Some error types add fields, such as scope and retry_after_seconds on rate limit errors:

{
  "error": {
    "message": "Human-readable error description",
    "type": "error_type"
  }
}

Common error types:

| Type | Status code | Description |
| --- | --- | --- |
| authentication_error | 401 | Invalid or missing API key |
| not_found | 404 | Thread or resource not found |
| rate_limit_error | 429 | Rate limit exceeded |
| invalid_request_error | 400 | Malformed request or unsupported provider/endpoint combination |
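
Because every error uses the same envelope, a client can map it to a single exception type. The sketch below is one possible approach: `GatewayError` and `error_from_response` are hypothetical names, and `unknown_error` is a placeholder used here for bodies that lack a type, not a documented error type.

```python
from typing import Any, Mapping


class GatewayError(Exception):
    """Error built from the gateway's error envelope."""

    def __init__(self, status: int, type_: str, message: str) -> None:
        super().__init__(f"{status} {type_}: {message}")
        self.status = status
        self.type = type_
        self.message = message

    @property
    def retryable(self) -> bool:
        # Only rate limiting is expected to clear on its own; the other
        # documented types need a change to the request or the API key.
        return self.type == "rate_limit_error"


def error_from_response(status: int, body: Mapping[str, Any]) -> GatewayError:
    error = body.get("error")
    if not isinstance(error, Mapping):
        error = {}  # tolerate a missing or malformed envelope
    return GatewayError(
        status=status,
        type_=str(error.get("type", "unknown_error")),
        message=str(error.get("message", "")),
    )
```

A caller would raise the returned error for any non-2xx response and retry only when `retryable` is true.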