Rate limits

How requests are rate-limited per API key, how to detect throttling, and how to back off cleanly

Every Partners API request is rate-limited per API key. Limits exist to keep the platform stable for every integrator. Most integrations never hit the limits in normal operation; bulk sync and migration jobs do, and need to handle throttling explicitly.

How rate limiting works

Limits are applied per API key, not per tenant or per IP.
Reads and writes are counted separately, each in a rolling one-minute window:
- Reads: 100 requests per minute.
- Writes: 20 requests per minute.
For REST requests, the request method decides the bucket: safe methods (GET, HEAD, OPTIONS) count as reads, everything else counts as writes. For GraphQL, the bucket is decided by operation type — queries count as reads, mutations as writes.
Requests over the limit return a 429 Too Many Requests response.
The Retry-After response header indicates how long to wait before the next request is permitted.
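
The bucket rules above can be sketched as a small client-side helper, for example to work out which limit a request will count against before it is sent. This is an illustration only: the names `rest_bucket`, `graphql_bucket`, and `LIMITS_PER_MINUTE` are hypothetical, and rejecting operation types other than queries and mutations is an assumption, since this page does not describe them.

```python
# Per-key limits quoted on this page, in requests per rolling minute.
LIMITS_PER_MINUTE = {"read": 100, "write": 20}

# Safe methods count as reads; every other method counts as a write.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS"}


def rest_bucket(method: str) -> str:
    """Return "read" for safe HTTP methods and "write" for everything else."""
    return "read" if method.upper() in SAFE_METHODS else "write"


def graphql_bucket(operation_type: str) -> str:
    """Return "read" for queries and "write" for mutations."""
    op = operation_type.lower()
    if op == "query":
        return "read"
    if op == "mutation":
        return "write"
    # Assumption: other operation types are not covered by this page.
    raise ValueError(f"unsupported GraphQL operation type: {operation_type!r}")
```

For instance, `LIMITS_PER_MINUTE[rest_bucket("PATCH")]` gives the write limit, because PATCH is not a safe method.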

If the rate-limiting service is temporarily unavailable, requests are allowed through rather than rejected, so a transient outage on our side won't break your integration.

Detecting throttling

When a request is throttled, you'll see:

HTTP/1.1 429 Too Many Requests
Retry-After: 5
Content-Type: application/json

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please retry later."
}

The Retry-After value is in seconds. Some endpoints return a longer wait for heavier operations.
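
As a rough sketch of the check described above, the helper below takes a status code and a header mapping and returns how many seconds to wait, or `None` if the response was not throttled. The function name `throttle_delay` and the one-second fallback for a missing or unparseable header are assumptions made for illustration, not documented behaviour.

```python
from typing import Mapping, Optional

# Assumption: wait this long if a 429 arrives without a usable Retry-After.
DEFAULT_DELAY_SECONDS = 1.0


def throttle_delay(status: int, headers: Mapping[str, str]) -> Optional[float]:
    """Return the seconds to wait if the response is a 429, else None."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalise before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    raw = lowered.get("retry-after")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        # Header missing (None) or not a plain number of seconds.
        return DEFAULT_DELAY_SECONDS
```

With the example response shown above, `throttle_delay(429, {"Retry-After": "5"})` yields a five-second wait.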

Backing off cleanly

Respect Retry-After. Sleep for the number of seconds the header gives you, then retry the request.
Add jitter. When multiple clients are throttled simultaneously, retrying at the same time creates a thundering herd. Randomise the retry delay slightly (e.g. Retry-After + random(0,2) seconds).
Use exponential backoff for repeated 429s. If the same request hits 429 multiple times, double the delay each retry up to a sane maximum (e.g. 60 seconds).
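
The three rules above can be combined in one retry loop. The following is a minimal sketch, not a reference client: `send` is a hypothetical callable that performs one request and returns `(status, retry_after_seconds, body)`, the retry count of 5 and the one-second fallback are assumptions, and the 60-second cap and 2-second jitter range are simply the example values from the text.

```python
import random
import time
from typing import Any, Callable, Optional, Tuple

MAX_DELAY_SECONDS = 60.0       # example cap from the text
MAX_JITTER_SECONDS = 2.0       # example jitter range from the text
FALLBACK_DELAY_SECONDS = 1.0   # assumption: used when Retry-After is missing

# (status code, Retry-After in seconds or None, response body)
Response = Tuple[int, Optional[float], Any]


def backoff_delay(retry_after: float, attempt: int,
                  rng: Callable[[], float] = random.random) -> float:
    """Delay before retry number `attempt` (0 is the first retry).

    Starts at Retry-After, doubles on each further retry, caps the doubled
    value, then adds jitter so capped clients still spread out.
    """
    base = min(retry_after * (2 ** attempt), MAX_DELAY_SECONDS)
    return base + rng() * MAX_JITTER_SECONDS


def call_with_backoff(send: Callable[[], Response], max_retries: int = 5,
                      sleep: Callable[[float], None] = time.sleep
                      ) -> Tuple[int, Any]:
    """Call `send` until it is not throttled or the retries run out."""
    for attempt in range(max_retries + 1):
        status, retry_after, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            wait = FALLBACK_DELAY_SECONDS if retry_after is None else retry_after
            sleep(backoff_delay(wait, attempt))
    raise RuntimeError(f"still throttled after {max_retries} retries")
```

Passing `sleep` and `rng` as parameters keeps the delay logic testable without real waiting; in production the defaults (`time.sleep`, `random.random`) apply.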

Bulk operations

For initial sync or large migration jobs:

- Sequence requests rather than firing them in parallel. A single-threaded loop with appropriate delays is usually fast enough for most data sizes.
- Use field selection (see Field selection) to keep responses small.
- Cache static data (collection definitions, tenant info) on your side rather than re-fetching.
- Contact sales if you need a temporary higher limit for a one-off migration window.
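
One way to sequence a bulk job is to pace requests on the client so that the loop rarely reaches the server-side limit at all. The sketch below assumes a rolling one-minute window like the one described on this page; `RollingWindowPacer`, `sync_all`, and the `fetch` callable are hypothetical names, and the server remains the authority, so 429 handling is still needed alongside it.

```python
import time
from collections import deque
from typing import Callable, Deque, Iterable, List, TypeVar

T = TypeVar("T")
R = TypeVar("R")


class RollingWindowPacer:
    """Blocks so that at most `limit` calls start within any `window` seconds."""

    def __init__(self, limit: int, window: float = 60.0,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        if limit < 1:
            raise ValueError("limit must be at least 1")
        self._limit = limit
        self._window = window
        self._clock = clock
        self._sleep = sleep
        self._sent: Deque[float] = deque()  # start times of recent calls

    def _evict(self, now: float) -> None:
        # Forget calls that have left the rolling window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()

    def wait(self) -> None:
        """Sleep until one more call fits in the window, then record it."""
        now = self._clock()
        self._evict(now)
        while len(self._sent) >= self._limit:
            # Sleep until the oldest recorded call leaves the window.
            self._sleep(self._window - (now - self._sent[0]))
            now = self._clock()
            self._evict(now)
        self._sent.append(now)


def sync_all(items: Iterable[T], fetch: Callable[[T], R],
             pacer: RollingWindowPacer) -> List[R]:
    """Process items one at a time, pacing each call through `pacer`."""
    results: List[R] = []
    for item in items:
        pacer.wait()
        results.append(fetch(item))
    return results
```

A read-heavy sync might use `RollingWindowPacer(limit=100)` and a write-heavy migration `RollingWindowPacer(limit=20)`, matching the per-minute limits above; choosing a slightly lower value leaves headroom for other traffic on the same API key.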

Related

Errors: the full error envelope, including 429 responses.
Pagination and sorting: paginated requests count toward the same limit, so small page sizes mean more requests.