Signa enforces two independent controls per organization:
  • Monthly quotas cap how many requests you can make against each pool over a billing period. Quotas reset on your billing anchor day (usually the first of the month).
  • Rate limits cap how many requests per minute you can make against each tier. Rate limits use a sliding window evaluated in real time and are separate from monthly quotas: a single request counts against both.
All API keys under the same organization share one quota and one rate-limit window.

Monthly Quota Pools (Beta)

Endpoints are grouped into quota pools. Each pool has its own monthly allowance. During beta, every customer is on the same plan with the following limits:
Pool | Beta monthly quota | Rate limit | Endpoints
search | 100,000 | 10,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest, GET /v1/owners, GET /v1/attorneys, GET /v1/firms
read | 500,000 | 10,000/min | GET /v1/trademarks/{id}, POST /v1/trademarks/batch, GET /v1/trademarks/{id}/history, GET /v1/trademarks/{id}/changes, GET /v1/trademarks/{id}/related, GET /v1/trademarks/{id}/proceedings, GET /v1/trademarks/{id}/coverage, GET /v1/trademarks/{id}/source, GET /v1/owners/{id}, GET /v1/owners/{id}/*, GET /v1/attorneys/{id}, GET /v1/attorneys/{id}/*, GET /v1/firms/{id}, GET /v1/firms/{id}/*, GET /v1/proceedings, GET /v1/proceedings/{id}
reference | unmetered | 10,000/min | GET /v1/offices, GET /v1/jurisdictions, GET /v1/classifications, GET /v1/design-codes, GET /v1/event-types, GET /v1/deadline-rules
utility | unmetered | 10,000/min | GET /v1/organization/*, GET /health/*, GET /docs, GET /v1/openapi.json, /mcp/*
The screening (1,000/mo), clearance (100/mo), and image_search (5,000/mo) pools appear in /v1/organization/plan for forward-compatibility, but no endpoints are shipped against them yet. They will be added to this table when the corresponding endpoints ship.

Pool Rules of Thumb

  • List endpoints (returning many results) are search.
  • Detail endpoints (returning one resource, or a batch of IDs) are read.
  • Static taxonomies (offices, classifications, design codes) are reference and do not count against any monthly quota.
  • Dashboard / health / docs are utility and do not count against any monthly quota.
To find your current usage per pool, call GET /v1/organization/usage. Your plan limits are also included in every /v1/organization/usage response under by_endpoint_type[*].limit.
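As a rough illustration of how the pool table and the rules of thumb fit together, the sketch below maps a request path to a quota pool. The function name `quota_pool` and the regular expressions are hypothetical, written by hand from the table above; the sketch ignores the HTTP method and is not how the API itself classifies requests.

```python
import re
from typing import Optional

# Hypothetical patterns transcribed from the pool table; first match wins.
# The HTTP method is ignored, so this is only an approximation.
_POOLS = [
    ("utility", re.compile(r"/v1/organization/.*|/health/.*|/docs|/v1/openapi\.json|/mcp/.*")),
    ("reference", re.compile(
        r"/v1/(offices|jurisdictions|classifications|design-codes|event-types|deadline-rules)")),
    ("search", re.compile(r"/v1/(trademarks(/suggest)?|suggest|owners|attorneys|firms)")),
    ("read", re.compile(
        r"/v1/(trademarks|owners|attorneys|firms)/[^/]+(/.*)?|/v1/proceedings(/[^/]+)?")),
]


def quota_pool(path: str) -> Optional[str]:
    """Return the quota pool for a path, or None if the path is not in the table."""
    for pool, pattern in _POOLS:
        if pattern.fullmatch(path):
            return pool
    return None
```

The search patterns are checked before the read patterns so that /v1/trademarks/suggest is not mistaken for a detail lookup with the ID "suggest".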

Rate Limit Tiers

Rate limits are per-minute sliding-window caps applied independently of monthly quotas. Each request is evaluated against both the tier ceiling and your plan's advertised requests-per-minute (rpm) limit; the lower of the two is enforced. During beta, the plan rpm matches the tier ceiling, so the two values are equivalent.
Tier | Beta limit (effective) | Endpoints
Reads (tier-1) | 10,000/min | All GET/HEAD/OPTIONS requests not in the Search tier
Search (tier-2) | 10,000/min | GET /v1/trademarks, POST /v1/trademarks, GET /v1/trademarks/suggest, GET /v1/suggest
Writes (tier-3) | 1,000/min | All POST/PATCH/DELETE requests not in the Search tier
MCP | 1,000/min | /mcp endpoints
POST /v1/trademarks is the search endpoint (POST is used only to accept JSON filter bodies too large for a URL), so it counts against the search rate tier and the search monthly quota pool, not writes. POST /v1/trademarks/batch is a bulk-lookup read: it counts against the writes rate tier and the read monthly quota pool.
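The tier assignment described above can be sketched as a small lookup. The function name `rate_tier` is hypothetical, and two details are assumptions rather than documented behavior: that the /mcp check takes precedence over the method check, and that methods outside the table are rejected.

```python
# Routes explicitly assigned to the Search tier in the tier table.
SEARCH_ROUTES = {
    ("GET", "/v1/trademarks"),
    ("POST", "/v1/trademarks"),
    ("GET", "/v1/trademarks/suggest"),
    ("GET", "/v1/suggest"),
}


def rate_tier(method: str, path: str) -> str:
    """Return 'mcp', 'search', 'reads', or 'writes' for a request."""
    method = method.upper()
    # Assumption: MCP paths are matched first, regardless of method.
    if path == "/mcp" or path.startswith("/mcp/"):
        return "mcp"
    if (method, path) in SEARCH_ROUTES:
        return "search"
    if method in {"GET", "HEAD", "OPTIONS"}:
        return "reads"
    if method in {"POST", "PATCH", "DELETE"}:
        return "writes"
    # Assumption: methods not listed in the table are treated as unsupported.
    raise ValueError(f"no tier listed for method {method}")
```

Note that `rate_tier("POST", "/v1/trademarks/batch")` yields the writes tier even though the same request is billed to the read quota pool; the rate tier and the quota pool are decided separately.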

Per-plan rate limits (future, post-beta)

When pricing launches, each plan will also advertise a per-plan rpm that further caps the tier ceiling. Effective limit is always min(tier, plan_rpm). Beta customers see no change; paid plans see their advertised rpm become the enforced cap.
Plan (future) | reads | search | writes
Free | 60/min | 30/min | 30/min
Starter | 300/min | 300/min | 300/min
Pro | 1,000/min | 1,000/min | 1,000/min
Enterprise | 5,000/min | 5,000/min | 5,000/min
Beta (current) | 10,000/min | 10,000/min | 1,000/min
These paid-plan numbers are provisional and will be tuned based on beta usage data before pricing launch. If you hit them in testing, tell us — that’s valuable signal.
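To make the min(tier, plan_rpm) rule concrete, here is a minimal sketch using the provisional numbers from the table. The dictionary and function names are hypothetical, and the figures will change if the plan numbers are retuned.

```python
# Tier ceilings (beta values) and provisional per-plan rpm, copied from the tables.
TIER_CEILING = {"reads": 10_000, "search": 10_000, "writes": 1_000}

PLAN_RPM = {
    "free": {"reads": 60, "search": 30, "writes": 30},
    "starter": {"reads": 300, "search": 300, "writes": 300},
    "pro": {"reads": 1_000, "search": 1_000, "writes": 1_000},
    "enterprise": {"reads": 5_000, "search": 5_000, "writes": 5_000},
    "beta": {"reads": 10_000, "search": 10_000, "writes": 1_000},
}


def effective_limit(plan: str, tier: str) -> int:
    """Requests per minute actually enforced: the lower of tier ceiling and plan rpm."""
    return min(TIER_CEILING[tier], PLAN_RPM[plan][tier])
```

With these numbers, an Enterprise plan's writes would be capped by the 1,000/min tier ceiling rather than by its 5,000/min plan rpm.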

Rate Limit Headers

Every API response includes rate limit headers modeled on the IETF RateLimit header fields draft, so you can monitor your usage in real time:
HTTP/1.1 200 OK
RateLimit-Policy: 10000;w=60
RateLimit: remaining=9994, reset=30
Header | Format | Description
RateLimit-Policy | {limit};w={windowSec} | The limit and window size in seconds (e.g., 1000;w=60 means 1,000 requests per 60-second window).
RateLimit | remaining={N}, reset={seconds} | Remaining requests in the current window and seconds until the window resets.
Retry-After | {seconds} | Only present on 429 responses. Number of seconds to wait before retrying.
Monitor the remaining value proactively. If it drops below 10% of your limit, slow down requests rather than hitting 429 errors.
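One way to act on these headers is to parse them after each response and compare remaining against the limit. The sketch below assumes the headers arrive in a plain dict with the exact names and formats shown above; `parse_rate_limit` and `should_slow_down` are hypothetical helper names.

```python
from typing import Dict, Mapping


def parse_rate_limit(headers: Mapping[str, str]) -> Dict[str, int]:
    """Parse 'RateLimit-Policy: 10000;w=60' and 'RateLimit: remaining=9994, reset=30'."""
    limit_text, _, params = headers["RateLimit-Policy"].partition(";")
    state = {"limit": int(limit_text)}
    for param in params.split(";"):
        key, _, value = param.strip().partition("=")
        if key == "w":
            state["window"] = int(value)
    for item in headers["RateLimit"].split(","):
        key, _, value = item.strip().partition("=")
        if key in ("remaining", "reset"):
            state[key] = int(value)
    return state


def should_slow_down(state: Mapping[str, int], threshold: float = 0.10) -> bool:
    """True when fewer than `threshold` of the window's requests remain."""
    return state["remaining"] < state["limit"] * threshold
```

Many HTTP clients expose headers through a case-insensitive mapping, in which case the lookups work unchanged; with a plain dict the names must match exactly.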

Per-Endpoint Classification

Each endpoint falls into the tier determined by its HTTP method and path. Notable classifications:
Endpoint | Tier | Beta limit | Notes
GET /v1/trademarks | Search | 10,000/min | The list/search endpoint. Routes to the search tier (POST shape is an alias).
POST /v1/trademarks | Search | 10,000/min | Body-shaped search: counts against search tier + search quota pool, not writes.
GET /v1/trademarks/suggest | Search | 10,000/min | Explicitly routed to the search tier.
GET /v1/suggest | Search | 10,000/min | Cross-entity suggest, search tier.
POST /v1/trademarks/batch | Writes | 1,000/min | Bulk-lookup read, one request per batch regardless of size. Exempt from Idempotency-Key.
POST /v1/organization/api-keys | Writes | 1,000/min | Mint an API key. Requires Idempotency-Key.

429 Response

When you exceed your rate limit, the API returns a 429 Too Many Requests status with details about when you can retry:
HTTP/1.1 429 Too Many Requests
RateLimit-Policy: 10000;w=60
RateLimit: remaining=0, reset=12
Retry-After: 12
{
  "error": {
    "type": "rate_limited",
    "title": "Rate limit exceeded",
    "status": 429,
    "detail": "Rate limit exceeded. Retry after 12 seconds.",
    "retryable": true,
    "retry_after": 12
  },
  "request_id": "req_abc123"
}
The Retry-After response header and the retry_after field in the body both contain the number of seconds to wait.
Ignoring 429 responses and continuing to send requests will not help — those requests are also rejected. In extreme cases, sustained limit violations may result in a temporary block of your API key.
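Because the wait time is available in two places, a client can read the header first and fall back to the body. The following is a minimal sketch under that assumption; `retry_delay_seconds` is a hypothetical helper, and the one-second default for a 429 with no usable hint is an arbitrary choice.

```python
import json
from typing import Mapping, Optional


def retry_delay_seconds(
    status: int,
    headers: Mapping[str, str],
    body_text: str,
    default: float = 1.0,
) -> Optional[float]:
    """Return seconds to wait after a 429, or None if the response is not a 429."""
    if status != 429:
        return None
    header_value = headers.get("Retry-After")
    if header_value is not None:
        try:
            return float(header_value)
        except ValueError:
            pass  # Not a number of seconds; fall through to the body.
    try:
        return float(json.loads(body_text)["error"]["retry_after"])
    except (ValueError, KeyError, TypeError):
        return default  # Assumption: wait a short fixed time when no hint is present.
```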

Monitoring Usage

Check your current billing period usage and rate limit status with Get Usage:
GET /v1/organization/usage
Authorization: Bearer sig_YOUR_KEY
{
  "object": "usage",
  "billing_period": {
    "start": "2026-04-01T00:00:00Z",
    "end": "2026-04-30T23:59:59Z"
  },
  "by_endpoint_type": {
    "search": { "used": 1204, "limit": 100000 },
    "read": { "used": 8932, "limit": 500000 },
    "screening": { "used": 42, "limit": 1000 },
    "clearance": { "used": 5, "limit": 100 },
    "image_search": { "used": 18, "limit": 5000 },
    "check": { "used": 3, "limit": 100 },
    "export": { "used": 0, "limit": 10 }
  },
  "rate_limit": {
    "limit": 10000,
    "resets_at": "2026-04-18T12:30:00Z"
  },
  "request_id": "req_abc123"
}
by_endpoint_type reports used and limit for every metered endpoint type in the current billing period. A limit of null means unlimited; 0 means the endpoint type is not allowed on your plan. rate_limit shows the current sliding-window status.
Avoid polling /v1/organization/usage in a tight loop. It is in the unmetered utility pool, but each call still counts against the Reads rate-limit tier.
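To turn a usage response into something actionable, you can compute the remaining allowance per pool while respecting the null and 0 conventions described above. This is a sketch that operates on an already-decoded response; `quota_remaining` is a hypothetical helper name.

```python
from typing import Any, Dict, Mapping, Optional


def quota_remaining(usage: Mapping[str, Any]) -> Dict[str, Optional[int]]:
    """Map each endpoint type to requests left this period (None means unlimited)."""
    remaining: Dict[str, Optional[int]] = {}
    for pool, entry in usage["by_endpoint_type"].items():
        limit = entry["limit"]
        if limit is None:
            remaining[pool] = None  # null limit: unlimited
        else:
            # A limit of 0 (not allowed on the plan) naturally yields 0 remaining.
            remaining[pool] = max(limit - entry["used"], 0)
    return remaining
```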

Handling 429 in Code

The Signa TypeScript SDK handles 429 responses automatically with built-in retry logic — see SDK Error Handling. If you are implementing your own retry logic, wait for the Retry-After duration and retry with exponential backoff.
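If you are writing the retry loop yourself, one possible shape is sketched below. It is not the SDK's implementation. The `send` callable is hypothetical: it stands in for your own request function and is assumed to return the HTTP status together with the Retry-After value in seconds (or None). The attempt cap, base delay, and jitter are illustrative choices.

```python
import random
import time
from typing import Callable, Optional, Tuple


def call_with_retry(
    send: Callable[[], Tuple[int, Optional[float]]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it stops returning 429, waiting between attempts."""
    for attempt in range(max_attempts):
        status, retry_after = send()
        if status != 429:
            return status
        if attempt == max_attempts - 1:
            break  # Out of attempts; do not sleep again.
        # Exponential backoff: 1s, 2s, 4s, ... capped at max_delay.
        backoff = min(base_delay * 2 ** attempt, max_delay)
        # Never wait less than the server asked for.
        delay = max(retry_after or 0.0, backoff)
        # Up to 10% jitter so parallel clients do not retry in lockstep.
        sleep(delay + random.uniform(0, 0.1 * delay))
    raise RuntimeError("still rate limited after retries")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; in production the default `time.sleep` applies.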

Best Practices

A single batch request of 100 IDs counts as 1 request against your rate limit, compared to 100 individual GET requests. See Batch Get Trademarks.
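A simple way to take advantage of this is to split a long ID list into batches before calling the batch endpoint. The sketch below only does the splitting; `chunked` is a hypothetical helper, and the default size of 100 is taken from the example above rather than from a documented maximum.

```python
from typing import Iterator, List, Sequence


def chunked(ids: Sequence[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `ids`, each with at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

For 250 IDs this produces three batches, so three requests count against the rate limit instead of 250.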
Using If-None-Match headers with ETags avoids downloading unchanged response bodies, saving bandwidth and processing time. While 304 Not Modified responses still count against your rate limit, they are significantly cheaper for both client and server. See the caching guide for implementation patterns.
If you are periodically checking for trademark status changes, use Trademark History on specific marks rather than re-running broad searches.
Bursting 500 requests in the first second of a window is more likely to trigger rate limiting than spreading them evenly across the minute. If you need to process a large batch, add a small delay (50 to 100 ms) between requests.
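Spacing requests out can be as simple as a generator that pauses between items. This is a minimal sketch; `paced` is a hypothetical helper, and the 50 ms default comes from the lower end of the range suggested above.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def paced(
    items: Iterable[T],
    delay_s: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[T]:
    """Yield items one at a time, sleeping `delay_s` seconds between them."""
    for index, item in enumerate(items):
        if index > 0:
            sleep(delay_s)  # No delay before the first item.
        yield item
```

A caller would loop with `for trademark_id in paced(ids): ...` and issue one request per iteration.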
If your application has both a user-facing dashboard and a background sync job, throttle the background job so it cannot exhaust the rate limit that your dashboard users depend on. Separate API keys for each do not isolate them on their own, because all keys under the same organization share one rate-limit window.

Need Higher Limits?

If you are consistently hitting rate limits, reach out at support@signa.so to discuss higher limits tailored to your workload.