- Monthly quotas cap how many requests you can make against each pool over a billing period. Quotas reset on your billing anchor day (usually the first of the month).
- Rate limits cap how many requests per minute you can make against each tier. Rate limits slide in real time and are separate from monthly quotas — a single request counts against both.
Monthly Quota Pools (Beta)
Endpoints are grouped into quota pools. Each pool has its own monthly allowance. During beta, every customer is on the same plan with the following limits:

| Pool | Beta monthly quota | Rate limit | Endpoints |
|---|---|---|---|
| search | 100,000 | 10,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest, GET /v1/owners, GET /v1/attorneys, GET /v1/firms |
| read | 500,000 | 10,000/min | GET /v1/trademarks/{id}, POST /v1/trademarks/batch, GET /v1/trademarks/{id}/history, GET /v1/trademarks/{id}/changes, GET /v1/trademarks/{id}/related, GET /v1/trademarks/{id}/proceedings, GET /v1/trademarks/{id}/coverage, GET /v1/trademarks/{id}/source, GET /v1/owners/{id}, GET /v1/owners/{id}/*, GET /v1/attorneys/{id}, GET /v1/attorneys/{id}/*, GET /v1/firms/{id}, GET /v1/firms/{id}/*, GET /v1/proceedings, GET /v1/proceedings/{id} |
| reference | unmetered | 10,000/min | GET /v1/offices, GET /v1/jurisdictions, GET /v1/classifications, GET /v1/design-codes, GET /v1/event-types, GET /v1/deadline-rules |
| utility | unmetered | 10,000/min | GET /v1/organization/*, GET /health/*, GET /docs, GET /v1/openapi.json, /mcp/* |

Pools marked screening (1,000/mo), clearance (100/mo), and image_search (5,000/mo) appear in /v1/organization/plan for forward-compatibility, but no endpoints are shipped against them yet. They will appear in this table when the corresponding endpoints ship.

Pool Rules of Thumb
- List endpoints (returning many results) are search.
- Detail endpoints (returning one resource, or a batch of IDs) are read.
- Static taxonomies (offices, classifications, design codes) are reference and do not count against any monthly quota.
- Dashboard / health / docs are utility and do not count against any monthly quota.
To see how much of each pool you have used in the current billing period, call GET /v1/organization/usage. Your plan limits are also included in every /v1/organization/usage response under by_endpoint_type[*].limit.
Rate Limit Tiers
Rate limits are per-minute sliding-window caps applied independently of monthly quotas. Each request is evaluated against both the tier ceiling and your plan’s advertised rpm; the lower of the two is enforced. During beta, the plan rpm matches the tier ceiling, so the two values are equivalent.

| Tier | Beta limit (effective) | Endpoints |
|---|---|---|
| Reads (tier-1) | 10,000/min | All GET/HEAD/OPTIONS requests |
| Search (tier-2) | 10,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest |
| Writes (tier-3) | 1,000/min | All POST/PATCH/DELETE not in Search tier |
| MCP | 1,000/min | /mcp endpoints |

POST /v1/trademarks is the search endpoint (POST is used only to accept JSON filter bodies too large for a URL), so it counts against the search rate tier and the search monthly quota pool, not writes. POST /v1/trademarks/batch is a bulk-lookup read, but because it is a POST outside the Search tier it counts against the writes rate tier (1,000/min) and the read monthly quota pool.
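
To make the "sliding window" behaviour concrete, here is a minimal client-side sketch of a per-minute sliding-window counter. It is an illustration only: the server's actual algorithm is not described on this page, and the class name `SlidingWindowCounter` and its methods are hypothetical, not part of any Signa SDK.

```python
from collections import deque


class SlidingWindowCounter:
    """Client-side estimate of a sliding-window cap (hypothetical helper)."""

    def __init__(self, limit, window_sec=60.0):
        self.limit = limit
        self.window_sec = window_sec
        self._sent = deque()  # timestamps of requests still inside the window

    def _evict(self, now):
        # Drop timestamps that have slid out of the window.
        while self._sent and now - self._sent[0] >= self.window_sec:
            self._sent.popleft()

    def try_acquire(self, now):
        """Record a request at time `now` if the window still has room."""
        self._evict(now)
        if len(self._sent) >= self.limit:
            return False
        self._sent.append(now)
        return True
```

Unlike a fixed window that resets on the minute boundary, capacity here frees up one request at a time as each earlier request ages past the window length.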
Per-plan rate limits (future, post-beta)
When pricing launches, each plan will also advertise a per-plan rpm that further caps the tier ceiling. Effective limit is always min(tier, plan_rpm). Beta customers see no change; paid plans see their advertised rpm become the enforced cap.

| Plan (future) | reads | search | writes |
|---|---|---|---|
| Free | 60/min | 30/min | 30/min |
| Starter | 300/min | 300/min | 300/min |
| Pro | 1,000/min | 1,000/min | 1,000/min |
| Enterprise | 5,000/min | 5,000/min | 5,000/min |
| Beta (current) | 10,000/min | 10,000/min | 1,000/min |
These paid-plan numbers are provisional and will be tuned based on beta usage data before pricing launch. If you hit them in testing, tell us — that’s valuable signal.
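
The min(tier, plan_rpm) rule can be sketched as a small lookup. The numbers below are copied from the tables on this page; the assumption that tier ceilings stay at their beta values after pricing launches is mine, and the names `TIER_CEILING`, `PLAN_RPM`, and `effective_limit` are hypothetical.

```python
# Tier ceilings (requests/min), taken from the beta tier table.
# Assumption: these ceilings are unchanged after pricing launches.
TIER_CEILING = {"reads": 10_000, "search": 10_000, "writes": 1_000}

# Provisional per-plan rpm values from the table above.
PLAN_RPM = {
    "free": {"reads": 60, "search": 30, "writes": 30},
    "starter": {"reads": 300, "search": 300, "writes": 300},
    "pro": {"reads": 1_000, "search": 1_000, "writes": 1_000},
    "enterprise": {"reads": 5_000, "search": 5_000, "writes": 5_000},
    "beta": {"reads": 10_000, "search": 10_000, "writes": 1_000},
}


def effective_limit(plan, tier):
    """Return the enforced requests/min: the lower of tier ceiling and plan rpm."""
    return min(TIER_CEILING[tier], PLAN_RPM[plan][tier])
```

Under that assumption, `effective_limit("free", "search")` is 30, while `effective_limit("enterprise", "writes")` is 1,000 because the writes tier ceiling is lower than the plan's advertised 5,000.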
Rate Limit Headers
Every API response includes IETF-standard rate limit headers so you can monitor your usage in real time:

| Header | Format | Description |
|---|---|---|
| RateLimit-Policy | {limit};w={windowSec} | The limit and window size in seconds (e.g., 1000;w=60 means 1,000 requests per 60-second window). |
| RateLimit | remaining={N}, reset={seconds} | Remaining requests in the current window and seconds until the window resets. |
| Retry-After | {seconds} | Only present on 429 responses. Number of seconds to wait before retrying. |

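
If you read these headers yourself, the two formats in the table can be parsed with plain string handling. This is a minimal sketch that assumes the values arrive exactly in the documented shapes; the function names are hypothetical.

```python
def parse_ratelimit_policy(value):
    """Parse '{limit};w={windowSec}', e.g. '1000;w=60'."""
    limit_part, _, params = value.partition(";")
    window_sec = None
    for param in params.split(";"):
        key, _, val = param.strip().partition("=")
        if key == "w":
            window_sec = int(val)
    return {"limit": int(limit_part.strip()), "window_sec": window_sec}


def parse_ratelimit(value):
    """Parse 'remaining={N}, reset={seconds}'."""
    fields = {}
    for item in value.split(","):
        key, _, val = item.strip().partition("=")
        if key in ("remaining", "reset"):
            fields[key] = int(val)
    return fields
```

For example, `parse_ratelimit_policy("1000;w=60")` gives a limit of 1000 over a 60-second window, and `parse_ratelimit("remaining=998, reset=42")` gives 998 remaining with 42 seconds until reset.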
Per-Endpoint Classification
Each endpoint falls into the tier determined by its HTTP method and path. Notable classifications:

| Endpoint | Tier | Beta limit | Notes |
|---|---|---|---|
| GET /v1/trademarks | Search | 10,000/min | The list/search endpoint. Routes to the search tier (POST shape is an alias). |
| POST /v1/trademarks | Search | 10,000/min | Body-shaped search. Counts against search tier + search quota pool, not writes. |
| GET /v1/trademarks/suggest | Search | 10,000/min | Explicitly routed to the search tier. |
| GET /v1/suggest | Search | 10,000/min | Cross-entity suggest; search tier. |
| POST /v1/trademarks/batch | Writes | 1,000/min | Bulk-lookup read; one request per batch regardless of size. Exempt from Idempotency-Key. |
| POST /v1/organization/api-keys | Writes | 1,000/min | Mint an API key; requires Idempotency-Key. |

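
The tier rules can be approximated client-side, which is handy when budgeting requests per tier. The sketch below follows the tier table, with two assumptions of my own: /mcp paths are checked before anything else, and any method not listed as a read (for example PUT) is treated as a write. `rate_tier` is a hypothetical helper, not an SDK function.

```python
# Routes the tier table assigns explicitly to the Search tier.
SEARCH_ROUTES = {
    ("GET", "/v1/trademarks"),
    ("POST", "/v1/trademarks"),
    ("GET", "/v1/trademarks/suggest"),
    ("GET", "/v1/suggest"),
}


def rate_tier(method, path):
    """Guess the rate limit tier for a request from its method and path."""
    method = method.upper()
    # Ignore the query string and any trailing slash.
    path = path.split("?", 1)[0].rstrip("/") or "/"
    if path == "/mcp" or path.startswith("/mcp/"):
        return "mcp"  # assumption: MCP takes precedence over method rules
    if (method, path) in SEARCH_ROUTES:
        return "search"
    if method in {"GET", "HEAD", "OPTIONS"}:
        return "reads"
    return "writes"  # POST/PATCH/DELETE outside the Search tier
```

With these rules, `rate_tier("POST", "/v1/trademarks")` is "search" while `rate_tier("POST", "/v1/trademarks/batch")` is "writes", matching the table.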
429 Response
When you exceed your rate limit, the API returns a 429 Too Many Requests status with details about when you can retry.
The Retry-After response header and the retry_after field in the body both contain the number of seconds to wait.
Monitoring Usage
Check your current billing period usage and rate limit status with Get Usage.
by_endpoint_type reports used and limit for every metered endpoint type in the current billing period. A limit of null means unlimited; 0 means the endpoint type is not allowed on your plan. rate_limit shows the current sliding-window status.
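
The three meanings of limit (a number, null, or 0) are easy to mix up in client code. The sketch below interprets a single by_endpoint_type entry, assuming it has been decoded from JSON into a dict with `used` and `limit` keys; `quota_status` is a hypothetical helper and the full response shape is not shown on this page.

```python
def quota_status(entry):
    """Describe one by_endpoint_type entry with 'used' and 'limit' keys."""
    used, limit = entry["used"], entry["limit"]
    if limit is None:  # JSON null decodes to None
        return "unlimited"
    if limit == 0:
        return "not allowed on this plan"
    remaining = max(limit - used, 0)
    return f"{remaining} of {limit} remaining"
```

Checking for None before 0 matters: a truthiness test such as `if not limit` would treat "unlimited" and "not allowed" as the same case.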
Avoid polling /v1/organization/usage in a tight loop. It counts against the standard Reads rate limit tier.

Handling 429 in Code
The Signa TypeScript SDK handles 429 responses automatically with built-in retry logic. See SDK Error Handling. If you are implementing your own retry logic, wait for the Retry-After duration and retry with exponential backoff.
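
If you write your own retry loop, one reasonable reading of that advice is to wait for whichever is longer: the Retry-After value or an exponentially growing backoff. The sketch below is not the SDK's implementation. `send` is a hypothetical callable that performs one request and returns `(status, headers, body)`, with `headers` as a dict keyed by the exact header name; the base delay, cap, and attempt count are illustrative defaults.

```python
import time


def retry_delay(attempt, retry_after, base=1.0, cap=60.0):
    """Seconds to wait before retry number `attempt` (0-based)."""
    backoff = min(cap, base * (2 ** attempt))
    if retry_after is None:
        return backoff
    # Never retry sooner than the server asked for.
    return max(retry_after, backoff)


def call_with_retry(send, max_attempts=5, sleep=time.sleep):
    """Call `send` until it returns a non-429 status or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429 or attempt == max_attempts - 1:
            break
        raw = headers.get("Retry-After")
        retry_after = float(raw) if raw is not None else None
        sleep(retry_delay(attempt, retry_after))
    return status, headers, body
```

Passing `sleep` as a parameter keeps the loop testable without real waiting. In production you may also want to add random jitter so that many clients do not retry at the same instant.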
Best Practices
Use batch endpoints to reduce request count
A single batch request of 100 IDs counts as 1 request against your rate limit, compared to 100 individual GET requests. See Batch Get Trademarks.
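
To take advantage of this, split a long ID list into batch-sized chunks before calling the batch endpoint. A minimal sketch follows; the chunk size of 100 comes from the example above and is not a documented maximum, and `chunked` is a hypothetical helper.

```python
def chunked(ids, size=100):
    """Split a list of IDs into consecutive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [ids[i:i + size] for i in range(0, len(ids), size)]
```

With 250 IDs this produces three chunks (100, 100, and 50), so the lookup costs 3 requests instead of 250.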
Cache responses with ETags
Use targeted lookups instead of frequent searches
If you are periodically checking for trademark status changes, use Trademark History on specific marks rather than re-running broad searches.
Spread requests evenly
Bursting 500 requests in the first second of a window is more likely to trigger rate limiting than spreading them evenly across the minute. If you need to process a large batch, add a small delay (50-100 ms) between requests.
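
A fixed delay between requests can be added with a small generator that wraps whatever you are iterating over. This is a sketch under the assumption that requests are sent one at a time from a single worker; `paced` is a hypothetical helper.

```python
import time


def paced(items, delay_sec=0.05, sleep=time.sleep):
    """Yield items one by one, pausing `delay_sec` between consecutive items."""
    for index, item in enumerate(items):
        if index:  # no pause before the first item
            sleep(delay_sec)
        yield item
```

Ignoring request latency, a 50 ms delay caps a single worker at about 1,200 requests per minute and a 100 ms delay at about 600, so for endpoints in the 1,000/min writes tier the longer delay is the safer choice.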
Use separate API keys per concern
If your application has both a user-facing dashboard and a background sync job, create separate API keys for each. This prevents a background job from exhausting the rate limit that your dashboard users depend on.