Rate Limits & Quotas

Signa enforces two independent controls per organization:

Monthly quotas cap how many requests you can make against each pool over a billing period. Quotas reset on your billing anchor day (usually the first of the month).
Rate limits cap how many requests per minute you can make against each tier. Rate limits slide in real time and are separate from monthly quotas — a single request counts against both.

All API keys under the same organization share one quota and one rate-limit window.

Monthly Quota Pools (Beta)

Endpoints are grouped into quota pools. Each pool has its own monthly allowance. During beta, every customer is on the same plan with the following limits:

Pool	Beta monthly quota	Rate limit	Endpoints
search	100,000	1,000/min	`GET /v1/trademarks`, `POST /v1/trademarks`, `GET /v1/trademarks/suggest`, `GET /v1/suggest`, `GET /v1/owners`, `GET /v1/attorneys`, `GET /v1/firms`
read	500,000	10,000/min	`GET /v1/trademarks/{id}`, `POST /v1/trademarks/batch`, `GET /v1/trademarks/{id}/history`, `GET /v1/trademarks/{id}/changes`, `GET /v1/trademarks/{id}/related`, `GET /v1/trademarks/{id}/proceedings`, `GET /v1/trademarks/{id}/coverage`, `GET /v1/trademarks/{id}/source`, `GET /v1/owners/{id}`, `GET /v1/owners/{id}/*`, `GET /v1/attorneys/{id}`, `GET /v1/attorneys/{id}/*`, `GET /v1/firms/{id}`, `GET /v1/firms/{id}/*`, `GET /v1/proceedings`, `GET /v1/proceedings/{id}`
reference	unmetered	10,000/min	`GET /v1/offices`, `GET /v1/jurisdictions`, `GET /v1/classifications`, `GET /v1/design-codes`, `GET /v1/event-types`, `GET /v1/deadline-rules`, `GET /v1/opposition-rules`
utility	unmetered	10,000/min	`GET /v1/organization/*`, `GET /health/*`, `GET /docs`, `GET /v1/openapi.json`, `/mcp/*`

Pools marked screening (1,000/mo), clearance (100/mo), and image_search (5,000/mo) appear in /v1/organization/plan for forward-compatibility, but no endpoints are shipped against them yet. They will appear in this table when the corresponding endpoints ship.

Pool Rules of Thumb

List endpoints (returning many results) are search.
Detail endpoints (returning one resource, or a batch of IDs) are read.
Static taxonomies (offices, classifications, design codes) are reference and do not count against any monthly quota.
Dashboard / health / docs are utility and do not count against any monthly quota.

To find your current usage per pool, call GET /v1/organization/usage. Your plan limits are also included in every /v1/organization/usage response under by_endpoint_type[*].limit.

Rate Limit Tiers

Rate limits are per-minute sliding-window caps applied independently of monthly quotas. Each request is evaluated against both the tier ceiling and your plan's advertised rpm, and the lower of the two is enforced. During beta, the plan rpm matches the tier ceiling except in the Search tier, where the beta plan rpm (1,000/min) is below the tier ceiling (10,000/min).

Tier	Beta limit (effective)	Endpoints
Reads (tier-1)	10,000/min	All GET/HEAD/OPTIONS requests not in the Search tier
Search (tier-2)	1,000/min (tier ceiling 10,000/min; beta plan rpm caps to 1,000)	`GET /v1/trademarks`, `POST /v1/trademarks`, `GET /v1/trademarks/suggest`, `GET /v1/suggest`
Writes (tier-3)	1,000/min	All POST/PATCH/DELETE not in Search tier
MCP	1,000/min	`/mcp` endpoints

POST /v1/trademarks is the search endpoint (POST is used only to accept JSON filter bodies too large for a URL) — it counts against the search rate tier and the search monthly quota pool, not writes. POST /v1/trademarks/batch is a bulk-lookup read and counts against the writes tier + read monthly pool.

Per-plan rate limits (future, post-beta)

When pricing launches, each plan will also advertise a per-plan rpm that further caps the tier ceiling. Effective limit is always min(tier, plan_rpm). Beta customers see no change; paid plans see their advertised rpm become the enforced cap.

Plan (future)	reads	search	writes
Free	60/min	30/min	30/min
Starter	300/min	300/min	300/min
Pro	1,000/min	1,000/min	1,000/min
Enterprise	5,000/min	5,000/min	5,000/min
Beta (current)	10,000/min	1,000/min	1,000/min

These paid-plan numbers are provisional and will be tuned based on beta usage data before pricing launch. If you hit them in testing, tell us — that’s valuable signal.
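
The min(tier, plan_rpm) rule can be sketched in a few lines. This is illustrative only: the function name is an assumption, the 10,000/min Search ceiling comes from the Rate Limit Tiers table, and the plan figures are the provisional ones listed above.

```typescript
// Sketch only: `effectiveRpm` is a hypothetical helper, not part of the Signa API.
// Effective limit = min(tier ceiling, plan rpm).
function effectiveRpm(tierCeiling: number, planRpm: number): number {
  return Math.min(tierCeiling, planRpm);
}

// Search tier ceiling (10,000/min) capped by the beta plan rpm (1,000/min).
const betaSearch = effectiveRpm(10_000, 1_000);
// Provisional Free plan search rpm (30/min) under the same ceiling.
const freeSearch = effectiveRpm(10_000, 30);

console.log({ betaSearch, freeSearch });
```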

Rate Limit Headers

Every API response includes IETF-standard rate limit headers so you can monitor your usage in real time:

HTTP/1.1 200 OK
RateLimit-Policy: 10000;w=60
RateLimit: remaining=9994, reset=30

Header	Format	Description
`RateLimit-Policy`	`{limit};w={windowSec}`	The limit and window size in seconds (e.g., `1000;w=60` means 1,000 requests per 60-second window).
`RateLimit`	`remaining={N}, reset={seconds}`	Remaining requests in the current window and seconds until the window resets.
`Retry-After`	`{seconds}`	Only present on `429` responses. Number of seconds to wait before retrying.

Monitor the remaining value proactively. If it drops below 10% of your limit, slow down requests rather than hitting 429 errors.
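
A client could read the two headers and apply the 10% rule roughly as follows. This is a minimal sketch: the helper names are assumptions, and the parsing relies only on the header formats shown in the table above.

```typescript
// Sketch only: `parseRateLimit` and `shouldSlowDown` are hypothetical helpers.
interface RateLimitState {
  limit: number;     // from RateLimit-Policy, e.g. 10000
  windowSec: number; // from the w= parameter of RateLimit-Policy
  remaining: number; // from RateLimit remaining=
  resetSec: number;  // from RateLimit reset=
}

function parseRateLimit(policy: string, rateLimit: string): RateLimitState | null {
  const p = /^\s*(\d+)\s*;\s*w=(\d+)/.exec(policy);
  const remaining = /remaining=(\d+)/.exec(rateLimit);
  const reset = /reset=(\d+)/.exec(rateLimit);
  // Unexpected format: return null and let the caller decide what to do.
  if (!p || !remaining || !reset) return null;
  return {
    limit: Number(p[1]),
    windowSec: Number(p[2]),
    remaining: Number(remaining[1]),
    resetSec: Number(reset[1]),
  };
}

// True when fewer than `threshold` (default 10%) of the window's requests remain.
function shouldSlowDown(state: RateLimitState, threshold = 0.1): boolean {
  return state.remaining < state.limit * threshold;
}
```

With the example response above (`10000;w=60` and `remaining=9994, reset=30`), `shouldSlowDown` returns false; it turns true once `remaining` falls below 1,000.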

Daily Sub-Caps

In addition to the monthly quota pool, each pool has a daily sub-cap set at 10% of the monthly limit. This is a hard cap for all plans (including paid plans that allow monthly overage) — its purpose is to prevent a single client from exhausting an entire month’s allowance in minutes.

Pool	Beta monthly	Beta daily
search	100,000	10,000
read	500,000	50,000

The daily counter resets at UTC midnight.

Daily Quota Headers

Every metered response includes three additional headers alongside the existing rate-limit headers:

Header	Format	Description
`X-Quota-Daily-Limit`	integer	Total daily units allowed for this pool.
`X-Quota-Daily-Remaining`	integer	Remaining daily units.
`X-Quota-Daily-Reset`	ISO 8601 timestamp	Next UTC midnight, when the daily counter resets.

The existing X-Quota-Limit, X-Quota-Remaining, and X-Quota-Reset headers continue to report the monthly counter.
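
One way to read the daily headers off a response is sketched below. The type and function names are assumptions, and the sketch also assumes that unmetered responses simply omit these headers, which the text above implies but does not state outright.

```typescript
// Sketch only: `HeaderSource` and `readDailyQuota` are hypothetical names.
// A fetch `Headers` object satisfies HeaderSource.
type HeaderSource = { get(name: string): string | null | undefined };

interface DailyQuota {
  limit: number;
  remaining: number;
  resetsAt: Date; // next UTC midnight
}

function readDailyQuota(headers: HeaderSource): DailyQuota | null {
  const limit = headers.get("X-Quota-Daily-Limit");
  const remaining = headers.get("X-Quota-Daily-Remaining");
  const reset = headers.get("X-Quota-Daily-Reset");
  // Assumption: unmetered pools (reference, utility) do not send these headers.
  if (limit == null || remaining == null || reset == null) return null;
  return {
    limit: Number(limit),
    remaining: Number(remaining),
    resetsAt: new Date(reset),
  };
}
```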

429 with `quota_scope`

If a request exceeds either cap, you receive a 429 with error.type = "quota_exceeded". The error.quota_scope field tells you which counter was breached:

{
  "error": {
    "type": "quota_exceeded",
    "title": "Daily quota exceeded",
    "detail": "You have used 10000 of 10000 units today. Quota resets at 2026-05-13T00:00:00.000Z.",
    "quota_scope": "daily",
    "quota_limit": 10000,
    "quota_used": 10000,
    "quota_resets_at": "2026-05-13T00:00:00.000Z"
  },
  "request_id": "req_..."
}

quota_scope is "monthly" or "daily". The error.type stays "quota_exceeded" regardless; switch on quota_scope if you need to distinguish.
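
Switching on quota_scope might look like the sketch below. The interface mirrors the fields of the example body above; the function names are assumptions.

```typescript
// Sketch only: `QuotaError` mirrors the example body; function names are hypothetical.
interface QuotaError {
  type: "quota_exceeded";
  quota_scope: "monthly" | "daily";
  quota_limit: number;
  quota_used: number;
  quota_resets_at: string; // ISO 8601 timestamp
}

// Milliseconds until the breached counter resets (never negative).
function msUntilQuotaReset(error: QuotaError, now: Date = new Date()): number {
  return Math.max(0, Date.parse(error.quota_resets_at) - now.getTime());
}

function describeQuotaError(error: QuotaError): string {
  const usage = `${error.quota_used}/${error.quota_limit}`;
  switch (error.quota_scope) {
    case "daily":
      return `Daily cap reached (${usage}); resets at ${error.quota_resets_at}`;
    case "monthly":
      return `Monthly quota reached (${usage}); resets at ${error.quota_resets_at}`;
  }
}
```

A daily breach clears at the next UTC midnight, so a job can usually pause and resume; a monthly breach lasts until the billing anchor day, so it is more likely something to surface to an operator.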

Per-Endpoint Classification

Each endpoint falls into the tier determined by its HTTP method and path. Notable classifications:

Endpoint	Tier	Beta limit	Notes
`GET /v1/trademarks`	Search	1,000/min	The list/search endpoint. Routes to the search tier (POST shape is an alias).
`POST /v1/trademarks`	Search	1,000/min	Body-shaped search — counts against search tier + search quota pool, not writes.
`GET /v1/trademarks/suggest`	Search	1,000/min	Explicitly routed to the search tier.
`GET /v1/suggest`	Search	1,000/min	Cross-entity suggest — search tier.
`POST /v1/trademarks/batch`	Writes	1,000/min	Bulk-lookup read — one request per batch regardless of size. Exempt from `Idempotency-Key`.
`POST /v1/organization/api-keys`	Writes	1,000/min	Mint an API key — requires `Idempotency-Key`.
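
For client-side budgeting, the classification can be approximated as below. This is a sketch derived from the tables on this page, not Signa's actual routing logic; the function name is an assumption and `path` is expected without a query string.

```typescript
// Sketch only: a client-side approximation of the tier tables, not the server's router.
type Tier = "reads" | "search" | "writes" | "mcp";

// Routes listed under the Search tier.
const SEARCH_ROUTES = new Set([
  "GET /v1/trademarks",
  "POST /v1/trademarks",
  "GET /v1/trademarks/suggest",
  "GET /v1/suggest",
]);

function rateTier(method: string, path: string): Tier {
  const m = method.toUpperCase();
  if (path === "/mcp" || path.startsWith("/mcp/")) return "mcp";
  if (SEARCH_ROUTES.has(`${m} ${path}`)) return "search";
  if (m === "GET" || m === "HEAD" || m === "OPTIONS") return "reads";
  return "writes"; // remaining POST/PATCH/DELETE requests
}
```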

429 Response

When you exceed your rate limit, the API returns a 429 Too Many Requests status with details about when you can retry:

HTTP/1.1 429 Too Many Requests
RateLimit-Policy: 10000;w=60
RateLimit: remaining=0, reset=12
Retry-After: 12

{
  "error": {
    "type": "rate_limited",
    "title": "Rate limit exceeded",
    "status": 429,
    "detail": "Rate limit exceeded. Retry after 12 seconds.",
    "retryable": true,
    "retry_after": 12
  },
  "request_id": "req_abc123"
}

The Retry-After response header and the retry_after field in the body both contain the number of seconds to wait.

Ignoring 429 responses and continuing to send requests will not help — those requests are also rejected. In extreme cases, sustained limit violations may result in a temporary block of your API key.

Monitoring Usage

Check your current billing period usage and rate limit status with Get Usage:

GET /v1/organization/usage
Authorization: Bearer sig_YOUR_KEY

{
  "object": "usage",
  "billing_period": {
    "start": "2026-04-01T00:00:00Z",
    "end": "2026-04-30T23:59:59Z"
  },
  "by_endpoint_type": {
    "search": { "used": 1204, "limit": 100000 },
    "read": { "used": 8932, "limit": 500000 },
    "screening": { "used": 42, "limit": 1000 },
    "clearance": { "used": 5, "limit": 500 },
    "image_search": { "used": 18, "limit": 1000 },
    "check": { "used": 3, "limit": 100 },
    "export": { "used": 0, "limit": 10 }
  },
  "rate_limit": {
    "limit": 10000,
    "resets_at": "2026-04-18T12:30:00Z"
  },
  "request_id": "req_abc123"
}

by_endpoint_type reports used and limit for every metered endpoint type in the current billing period. A limit of null means unlimited; 0 means the endpoint type is not allowed on your plan. rate_limit shows the current sliding-window status.
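
The null and 0 conventions are easy to mishandle, so here is a small sketch that turns by_endpoint_type into remaining units per pool. The helper name is an assumption; the types mirror the example response.

```typescript
// Sketch only: `remainingByPool` is a hypothetical helper.
interface PoolUsage {
  used: number;
  limit: number | null; // null = unlimited, 0 = not allowed on this plan
}

// Remaining units per pool: Infinity when unlimited, 0 when exhausted or not allowed.
function remainingByPool(byEndpointType: Record<string, PoolUsage>): Record<string, number> {
  const out: Record<string, number> = {};
  for (const [pool, { used, limit }] of Object.entries(byEndpointType)) {
    out[pool] = limit === null ? Infinity : Math.max(0, limit - used);
  }
  return out;
}
```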

Avoid polling /v1/organization/usage in a tight loop. The endpoint is in the unmetered utility pool, but each call still counts against the Reads rate-limit tier.

Handling 429 in Code

The Signa TypeScript SDK handles 429 responses automatically with built-in retry logic — see SDK Error Handling. If you are implementing your own retry logic, wait for the Retry-After duration and retry with exponential backoff.
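
If you do roll your own, the retry loop could look something like this. It is a sketch under stated assumptions, not the SDK's implementation: every name here is hypothetical, `send` stands for whatever function performs one request, and the base and maximum delays are arbitrary defaults.

```typescript
// Sketch only: all names are hypothetical; this is not the SDK's retry logic.
interface MinimalResponse {
  status: number;
  retryAfterSec: number | null; // parsed Retry-After header, if present
}

// Delay before retry number `attempt` (0-based). Waits at least Retry-After;
// exponential backoff (capped at `maxMs`) lengthens the wait on repeated failures.
function retryDelayMs(attempt: number, retryAfterSec: number | null, baseMs = 500, maxMs = 30_000): number {
  const backoff = Math.min(maxMs, baseMs * 2 ** attempt);
  return retryAfterSec !== null ? Math.max(retryAfterSec * 1000, backoff) : backoff;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `send` until it returns a non-429 response or `maxRetries` is exhausted.
async function withRetry<T extends MinimalResponse>(send: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const res = await send();
    if (res.status !== 429 || attempt >= maxRetries) return res;
    await sleep(retryDelayMs(attempt, res.retryAfterSec));
  }
}
```

A 429 with error.type "quota_exceeded" will not clear until the daily or monthly counter resets, so a real client would likely check the error type and skip short retries in that case.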

Best Practices

Use batch endpoints to reduce request count

A single batch request of 100 IDs counts as 1 request against your rate limit, compared to 100 individual GET requests. See Batch Get Trademarks.
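
Splitting a long ID list into batches is a one-function job. In this sketch the batch size of 100 is taken from the example above and is an assumption rather than a documented maximum; the ID format is made up for illustration.

```typescript
// Sketch only: `chunk` is a generic helper, not part of the Signa SDK.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// 250 IDs become 3 batch requests instead of 250 individual GETs.
const ids = Array.from({ length: 250 }, (_, i) => `tm_${i}`); // hypothetical ID format
const batches = chunk(ids, 100);
console.log(batches.length);
```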

Cache responses with ETags

Using If-None-Match headers with ETags avoids downloading unchanged response bodies, saving bandwidth and processing time. While 304 Not Modified responses still count against your rate limit, they are significantly cheaper for both client and server. See the caching guide for implementation patterns.
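
As a rough illustration of the pattern, the sketch below keeps one ETag and body per URL in memory. The class and method names are assumptions, and it relies only on standard HTTP conditional-request behavior (If-None-Match, 304 Not Modified); the caching guide remains the reference for Signa-specific details.

```typescript
// Sketch only: `EtagCache` is a hypothetical in-memory helper.
interface CachedEntry {
  etag: string;
  body: string;
}

class EtagCache {
  private entries = new Map<string, CachedEntry>();

  // Headers to send for `url`: If-None-Match when a cached ETag exists.
  requestHeaders(url: string): Record<string, string> {
    const hit = this.entries.get(url);
    return hit ? { "If-None-Match": hit.etag } : {};
  }

  // Reuse the cached body on 304; store and return the fresh body on 200.
  resolve(url: string, status: number, etag: string | null, body: string | null): string | null {
    if (status === 304) return this.entries.get(url)?.body ?? null;
    if (status === 200 && etag !== null && body !== null) {
      this.entries.set(url, { etag, body });
    }
    return body;
  }
}
```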

Use targeted lookups instead of frequent searches

If you are periodically checking for trademark status changes, use Trademark History on specific marks rather than re-running broad searches.

Spread requests evenly

Bursting 500 requests in the first second of a window is more likely to trigger rate limiting than spreading them evenly across the minute. If you need to process a large batch, add a small delay (50–100 ms) between requests.
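
A simple pacing loop is sketched below. The names are assumptions, and `headroom` is a made-up knob for how much of the per-minute limit you are willing to consume.

```typescript
// Sketch only: names are hypothetical. Spreads requests evenly across a 60-second window.
// `headroom` (0..1] is the fraction of the per-minute limit you intend to use.
function pacingIntervalMs(limitPerMin: number, headroom = 1): number {
  return Math.ceil(60_000 / (limitPerMin * headroom));
}

// Runs tasks one at a time, waiting `intervalMs` between consecutive tasks.
async function runPaced<T>(tasks: Array<() => Promise<T>>, intervalMs: number): Promise<T[]> {
  const results: T[] = [];
  for (let i = 0; i < tasks.length; i++) {
    if (i > 0) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
    results.push(await tasks[i]());
  }
  return results;
}
```

For a 1,000/min limit, `pacingIntervalMs(1000)` gives 60 ms, which sits inside the delay range suggested above.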

Use separate API keys per concern

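If your application has both a user-facing dashboard and a background sync job, create separate API keys for each so the two workloads can be managed independently. Note that all keys in an organization share one quota and one rate-limit window, so separate keys alone do not stop a background job from exhausting the rate limit that your dashboard users depend on; throttle the background job client-side to leave headroom.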

Need Higher Limits?

If you are consistently hitting rate limits, reach out at support@signa.so to discuss higher limits tailored to your workload.
