Limits

The Mastra Memory Gateway enforces rate limits and pagination constraints to protect the service and ensure fair usage.

Rate limits

Rate limits are applied per API key. When a limit is exceeded, the gateway returns a 429 status code with a Retry-After header.

LLM proxy

Applies to POST /v1/chat/completions, POST /v1/messages, and POST /v1/responses.

Rule	Default limit	Window
Global (all endpoints)	5,000 requests	60 seconds
LLM proxy	2,000 requests	60 seconds
LLM burst	400 requests	10 seconds

Memory API

Applies to the /v1/memory/* endpoints. Read and write operations have separate limits.

Rule	Default limit	Window
Memory read (GET, HEAD)	1,200 requests	60 seconds
Memory write (POST, PATCH, DELETE)	600 requests	60 seconds
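As a rough illustration of how the tables above partition traffic, the sketch below maps a request to the scope that would most likely count it. The function name `rate_limit_scope` and the example paths under `/v1/memory/` are hypothetical. The mapping is inferred from the tables on this page and is not taken from the gateway's implementation.

```python
from typing import Optional

# Endpoints listed under "LLM proxy" on this page.
LLM_PROXY_PATHS = {"/v1/chat/completions", "/v1/messages", "/v1/responses"}


def rate_limit_scope(method: str, path: str) -> Optional[str]:
    """Guess which documented scope a request falls under (client-side only)."""
    method = method.upper()
    if method == "POST" and path in LLM_PROXY_PATHS:
        return "llm_proxy"
    if path.startswith("/v1/memory/"):
        if method in {"GET", "HEAD"}:
            return "memory_read"
        if method in {"POST", "PATCH", "DELETE"}:
            return "memory_write"
    # Anything else is not covered by the tables above.
    return None
```

A client could use a helper like this to keep separate local counters per scope, so that a burst of memory writes does not get throttled by logic meant for LLM proxy calls.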

Rate limit headers

Every response includes rate limit headers:

Header	Description
`X-RateLimit-Limit`	Maximum requests allowed in the current window
`X-RateLimit-Remaining`	Remaining requests in the current window
`X-RateLimit-Reset`	Unix timestamp when the window resets
`X-RateLimit-Scope`	Which limit applies (`llm_proxy`, `memory_read`, or `memory_write`)
`Retry-After`	Seconds until the limit resets (only present on `429` responses)
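The headers in the table can be read into a small structure for logging or throttling decisions. This is a minimal sketch: `RateLimitInfo` and `parse_rate_limit_headers` are illustrative names, and the code assumes every header value is a plain integer string, as the descriptions above suggest.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int
    remaining: int
    reset_at: int  # Unix timestamp from X-RateLimit-Reset
    scope: str
    retry_after: Optional[int] = None  # only present on 429 responses


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitInfo:
    # HTTP header names are case-insensitive, so normalize before lookup.
    h = {name.lower(): value for name, value in headers.items()}
    retry_after = h.get("retry-after")
    return RateLimitInfo(
        limit=int(h["x-ratelimit-limit"]),
        remaining=int(h["x-ratelimit-remaining"]),
        reset_at=int(h["x-ratelimit-reset"]),
        scope=h["x-ratelimit-scope"],
        retry_after=int(retry_after) if retry_after is not None else None,
    )
```

Checking `remaining` before sending a batch lets a client slow down ahead of the limit instead of reacting to a `429`.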

Rate limit error response

{
	"error": {
		"message": "Rate limit exceeded. Please retry after the reset time.",
		"type": "rate_limit_error",
		"scope": "llm_proxy",
		"retry_after_seconds": 12
	}
}
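One way a client might decide how long to wait after a rate limit error is sketched below. The helper `retry_delay_seconds` is a hypothetical name. It prefers the `Retry-After` header, falls back to `retry_after_seconds` in the body, and uses a one-second default that is purely an assumption of this example.

```python
import json
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int, headers: Mapping[str, str], body: str
) -> Optional[float]:
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # Prefer the Retry-After header (header names are case-insensitive).
    for name, value in headers.items():
        if name.lower() == "retry-after":
            try:
                return float(value)
            except ValueError:
                break  # unparseable header: fall back to the body
    # Fall back to the retry_after_seconds field of the error body.
    try:
        error = json.loads(body).get("error", {})
        return float(error["retry_after_seconds"])
    except (ValueError, KeyError, TypeError, AttributeError):
        return 1.0  # assumed default, not specified by the gateway
```

The caller would sleep for the returned duration and then resend the request, ideally with a cap on the number of attempts.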

Pagination limits

List endpoints accept `limit` and `offset` query parameters.

Parameter	Default	Maximum
`limit` (threads, messages)	50	200
`limit` (observation history)	10	200
`offset`	0	—
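Walking a list endpoint with these parameters might look like the sketch below. `paginate` and the `fetch_page(limit, offset)` callback are hypothetical, and the callback stands in for whatever HTTP call the client makes. The example assumes that a page shorter than `limit` marks the end of the results, which this page does not state.

```python
from typing import Callable, Iterator, List

MAX_LIMIT = 200  # documented maximum for `limit`


def paginate(
    fetch_page: Callable[[int, int], List[dict]], limit: int = 50
) -> Iterator[dict]:
    """Yield every item from a list endpoint, one page at a time."""
    # Clamp the page size to the documented maximum.
    limit = max(1, min(limit, MAX_LIMIT))
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page
        # Assumption: a short (or empty) page means there is nothing left.
        if len(page) < limit:
            return
        offset += len(page)
```

Each page fetch is a separate request, so a long listing counts against the memory read limit once per page.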

Authentication

All API keys use the `msk_` prefix. An invalid or missing API key returns a `401` error.

Error format

All error responses follow the same structure:

{
	"error": {
		"message": "Human-readable error description",
		"type": "error_type"
	}
}

Common error types:

Type	Status code	Description
`authentication_error`	401	Invalid or missing API key
`not_found`	404	Thread or resource not found
`rate_limit_error`	429	Rate limit exceeded
`invalid_request_error`	400	Malformed request or unsupported provider/endpoint combination
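Because every error shares one structure, a client can translate the `type` field into typed exceptions. The class names and `error_from_response` below are illustrative and not part of any official SDK. Unknown or unparseable bodies fall back to a generic error with the assumed type `unknown_error`.

```python
import json


class GatewayError(Exception):
    """Base class for errors returned by the gateway."""

    def __init__(self, status: int, error_type: str, message: str):
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


class AuthenticationError(GatewayError): ...
class NotFoundError(GatewayError): ...
class RateLimitError(GatewayError): ...
class InvalidRequestError(GatewayError): ...


# Maps the documented `type` values to exception classes.
ERROR_CLASSES = {
    "authentication_error": AuthenticationError,
    "not_found": NotFoundError,
    "rate_limit_error": RateLimitError,
    "invalid_request_error": InvalidRequestError,
}


def error_from_response(status: int, body: str) -> GatewayError:
    """Build an exception from an error response body."""
    try:
        error = json.loads(body)["error"]
        error_type, message = error["type"], error["message"]
    except (ValueError, KeyError, TypeError):
        # Body was not the documented shape; keep the raw text for debugging.
        error_type, message = "unknown_error", body
    cls = ERROR_CLASSES.get(error_type, GatewayError)
    return cls(status, error_type, message)
```

Calling code can then catch `RateLimitError` to retry and let the other types propagate.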
