Signa ingests trademark data from multiple offices (2 in production today, EUIPO in beta, 18 more planned). The same company can appear with dozens of name variations across offices. Entity resolution is the process that links these variations to a single canonical owner record.

The Problem

A single company files trademarks under many names:
| Office | Name as Filed |
| --- | --- |
| USPTO | APPLE INC. |
| EUIPO | Apple Inc |
| DPMA | Apple Inc., Cupertino, California, US |
| CIPO | APPLE INC |
| WIPO | Apple Inc. |
Without entity resolution, you would see five separate owners for one company.

How It Works

Signa resolves entities in three stages:
  1. Name normalization — an 11-step pipeline that reduces every name to a canonical form
  2. Public company matching — linking owners to SEC and GLEIF records
  3. Corporate parent linking — connecting subsidiaries to parent companies via GLEIF Level 2 data
Note: the rollout is phased, independently of the three stages above. Phase 1, which is live today, uses case-insensitive exact name matching within the same office. Cross-office matching, EUIPO applicant ID lookup, and near-miss detection are Phase 2 features currently in development.
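As a rough illustration of what "case-insensitive exact name matching within the same office" means in practice, the sketch below groups filings by office and case-folded name. The `(office, name)` tuple shape and the function name `group_phase1` are assumptions made for this example, not Signa's schema or code.

```python
from collections import defaultdict


def group_phase1(records):
    """Group filings whose names match exactly, ignoring case, within one office.

    `records` is a list of (office, name_as_filed) tuples (hypothetical shape).
    """
    groups = defaultdict(list)
    for office, name in records:
        # The key includes the office, so identical names filed at different
        # offices are NOT linked at this phase.
        groups[(office, name.casefold())].append(name)
    return dict(groups)


filings = [
    ("USPTO", "APPLE INC."),
    ("USPTO", "Apple Inc."),
    ("CIPO", "APPLE INC"),
]
groups = group_phase1(filings)
```

Under these assumptions the two USPTO filings fall into one group, while the CIPO filing stays separate because it belongs to a different office.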

Stage 1: Name Normalization

Every owner and attorney name passes through the normalizeName pipeline when ingested. The pipeline produces a canonical_name and an optional normalized_suffix stored in separate columns. The 11 steps, in order:
| Step | Operation | Example |
| --- | --- | --- |
| 1 | Unicode NFC normalization | Ensure consistent byte representation |
| 2 | Strip diacritical marks | Nestlé becomes Nestle |
| 3 | Lowercase | apple inc |
| 4 | Strip leading articles | the, die, la, le, el, les, etc. |
| 5 | Normalize conjunctions | and, und, et, y all become & |
| 6 | Strip punctuation (keep &) | Hyphens become spaces; commas, periods removed |
| 7 | Collapse whitespace | Multiple spaces become one |
| 8 | Expand abbreviations | intl becomes international, mfg becomes manufacturing |
| 9 | Extract legal suffix | inc, gmbh, ltd, sa stored separately |
| 10 | Final trim | Remove trailing whitespace |
| 11 | Return canonical + suffix | { canonicalName, normalizedSuffix } |
The legal suffix dictionary covers 40+ forms across English, German, French, Italian, Dutch, Nordic, Spanish, Portuguese, Japanese, Korean, and Turkish corporate designations.
curl -s "https://api.signa.so/v1/owners?q=apple" \
  -H "Authorization: Bearer sig_live_xxx" | jq '.data[0]'
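The 11 steps can be approximated in a few lines of Python. This is a minimal sketch under stated assumptions, not the actual normalizeName implementation: the dictionaries hold only the examples named in the table, the function and type names are hypothetical, and the diacritic handling only covers precomposed Latin letters.

```python
import re
import unicodedata
from typing import NamedTuple, Optional

# Partial, illustrative dictionaries taken from the examples in the table.
LEADING_ARTICLES = ("the", "die", "la", "le", "el", "les")
CONJUNCTIONS = ("and", "und", "et", "y")
ABBREVIATIONS = {"intl": "international", "mfg": "manufacturing"}
LEGAL_SUFFIXES = {"inc", "gmbh", "ltd", "sa"}


class NormalizedName(NamedTuple):
    canonical_name: str
    normalized_suffix: Optional[str]


def _strip_latin_diacritics(text: str) -> str:
    """Drop combining marks only when the base character is a Latin letter."""
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        base_name = unicodedata.name(decomposed[0], "")
        if len(decomposed) > 1 and base_name.startswith("LATIN"):
            out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
        else:
            out.append(ch)  # non-Latin characters are left untouched
    return "".join(out)


def _strip_punctuation(text: str) -> str:
    """Turn dashes into spaces and remove other punctuation, keeping '&'."""
    out = []
    for ch in text:
        category = unicodedata.category(ch)
        if category == "Pd":
            out.append(" ")
        elif ch == "&" or not category.startswith("P"):
            out.append(ch)
    return "".join(out)


def normalize_name(raw: str) -> NormalizedName:
    text = unicodedata.normalize("NFC", raw)                   # step 1
    text = _strip_latin_diacritics(text)                       # step 2
    text = text.lower()                                        # step 3
    articles = "|".join(LEADING_ARTICLES)
    text = re.sub(rf"^\s*(?:{articles})\s+", "", text)          # step 4
    conjunctions = "|".join(CONJUNCTIONS)
    text = re.sub(rf"(?<!\S)(?:{conjunctions})(?!\S)", "&", text)  # step 5
    text = _strip_punctuation(text)                            # step 6
    text = re.sub(r"\s+", " ", text)                           # step 7
    tokens = [ABBREVIATIONS.get(t, t) for t in text.split()]   # step 8
    suffix = None
    if len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:       # step 9
        suffix = tokens.pop()
    canonical = " ".join(tokens).strip()                       # step 10
    return NormalizedName(canonical, suffix)                   # step 11


apple = normalize_name("APPLE INC.")
nestle = normalize_name("The Nestlé S.A.")
```

With this sketch, "APPLE INC.", "Apple Inc", "APPLE INC", and "Apple Inc." all reduce to the same canonical name and suffix. The DPMA form that includes an address does not, which is the kind of near-miss that exact matching on the canonical name cannot link.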

Stage 2: Public Company Matching

Signa maintains a public_companies table populated from two authoritative sources:
| Source | Coverage | Identifier |
| --- | --- | --- |
| SEC EDGAR | ~10,000 US-listed companies | CIK (Central Index Key) |
| GLEIF | ~2.5M legal entities worldwide | LEI (Legal Entity Identifier) |
Matching is performed by comparing the owner’s canonical_name + country_code against the normalized legal names from SEC and GLEIF records. When a match is found, the owner detail response includes the linked company data:
curl -s https://api.signa.so/v1/owners/own_abc123 \
  -H "Authorization: Bearer sig_live_xxx" | jq '.public_companies'
You can also filter owners by public company identifiers:
| Filter | Description |
| --- | --- |
| ticker=AAPL | Find the owner linked to this stock ticker |
| lei=HWUPKR0... | Find the owner linked to this LEI |
| has_public_company=true | Only return owners with a public company match |
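The name-plus-country comparison described in this section amounts to a keyed lookup. The sketch below shows one way to picture it; the row fields (`normalized_name`, `country_code`, `source`, `identifier`), the function names, and the sample company are all hypothetical, and the identifiers are placeholders rather than real CIK or LEI values.

```python
def build_index(public_companies):
    """Index public company rows by (normalized legal name, country code)."""
    index = {}
    for company in public_companies:
        key = (company["normalized_name"], company["country_code"])
        index.setdefault(key, []).append(company)
    return index


def match_owner(owner, index):
    """Return every public company row that shares the owner's name and country."""
    return index.get((owner["canonical_name"], owner["country_code"]), [])


# Placeholder rows for a fictional company; identifiers are not real.
public_companies = [
    {"normalized_name": "example", "country_code": "US",
     "source": "SEC", "identifier": "CIK-PLACEHOLDER"},
    {"normalized_name": "example", "country_code": "US",
     "source": "GLEIF", "identifier": "LEI-PLACEHOLDER"},
]
index = build_index(public_companies)

us_owner = {"canonical_name": "example", "country_code": "US"}
de_owner = {"canonical_name": "example", "country_code": "DE"}
us_matches = match_owner(us_owner, index)
de_matches = match_owner(de_owner, index)
```

Including the country code in the key keeps two unrelated companies with the same normalized name in different countries from being linked to the same public record.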

Stage 3: Corporate Parent Linking

Using GLEIF Level 2 relationship data (467,000+ records), Signa links subsidiaries to their corporate parents via IS_DIRECTLY_CONSOLIDATED_BY relationships. This populates the parent_owner_id field on owner records and powers the /v1/owners/{id}/related endpoint.
curl -s "https://api.signa.so/v1/owners/own_abc123/related" \
  -H "Authorization: Bearer sig_live_xxx"
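Because IS_DIRECTLY_CONSOLIDATED_BY describes only the direct parent, finding the top of a corporate tree means following parent_owner_id repeatedly. A minimal client-side sketch, assuming you have already loaded the links into a dictionary (the owner IDs and the `ultimate_parent` helper are hypothetical):

```python
def ultimate_parent(owner_id, parent_of):
    """Follow parent_owner_id links upward until an owner has no parent.

    `parent_of` maps an owner ID to its parent_owner_id, or None for a root.
    """
    seen = {owner_id}
    current = owner_id
    while parent_of.get(current) is not None:
        current = parent_of[current]
        if current in seen:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError(f"cycle in parent links at {current}")
        seen.add(current)
    return current


# Hypothetical three-level hierarchy.
parent_of = {
    "own_sub2": "own_sub1",
    "own_sub1": "own_root",
    "own_root": None,
}
top = ultimate_parent("own_sub2", parent_of)
```

The cycle check is defensive: relationship data from external registries can occasionally be inconsistent, and an unguarded loop would never terminate.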

Merge Handling

When entity resolution determines that two owner (or attorney) records represent the same entity, the duplicate is merged into the canonical record. The old record’s merged_into_id is set, and all associated trademarks are reassigned.

What happens when you request a merged entity: any GET request to the old ID returns a 410 Gone response with a pointer to the canonical record:
curl -s https://api.signa.so/v1/owners/own_old123 \
  -H "Authorization: Bearer sig_live_xxx"
# HTTP 410
# {
#   "error": {
#     "type": "https://api.signa.so/errors/entity_merged",
#     "status": 410,
#     "detail": "Owner own_old123 has been merged into own_abc123.",
#     "merged_into": "own_abc123",
#     "suggestion": "Use GET /v1/owners/own_abc123 instead."
#   }
# }
Merge handling applies to owners, attorneys, and firms. All three entity types have a merged_into_id field and return 410 Gone when the entity has been merged into another record.
Best practice: Always handle 410 responses in your integration. If you cache owner IDs, replace the cached ID with the merged_into value whenever you receive a 410 for it.
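One way to handle merges on the client is to follow the merged_into pointer until a non-410 response comes back. The sketch below assumes the error body shape shown above; the `fetch` callable, the `resolve_owner` name, the hop limit, and the shape of the success body are assumptions for illustration. Injecting `fetch` keeps the example independent of any particular HTTP client.

```python
def resolve_owner(owner_id, fetch, max_hops=5):
    """Return (canonical_id, body), following 410 merge pointers.

    `fetch(owner_id)` must return (status_code, parsed_json_body).
    """
    current = owner_id
    for _ in range(max_hops):
        status, body = fetch(current)
        if status != 410:
            return current, body
        merged_into = body.get("error", {}).get("merged_into")
        if not merged_into:
            raise RuntimeError(f"410 for {current} without a merged_into pointer")
        current = merged_into  # retry against the canonical record
    raise RuntimeError(f"too many merge hops starting from {owner_id}")


def fake_fetch(owner_id):
    """Stand-in for a real HTTP call, used only to exercise the sketch."""
    if owner_id == "own_old123":
        return 410, {"error": {"status": 410, "merged_into": "own_abc123"}}
    return 200, {"id": owner_id}  # hypothetical success body


canonical_id, owner = resolve_owner("own_old123", fake_fetch)
```

After a call like this, a cache keyed by owner ID can store `canonical_id` in place of the old ID so that later requests skip the extra round trip.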

Suppressed Owners

In some cases, an owner record is invalid or a duplicate that cannot be cleanly merged. These records are flagged with is_suppressed = true and are hidden from all public API list endpoints. If you have a direct reference to a suppressed owner ID, the detail endpoint still returns data, but the record will not appear in search results or list views.

FAQ

Why do I still see duplicate owners for the same company?
Entity resolution runs periodically, not in real time. Newly ingested records may temporarily appear as separate owners until the next resolution pass links them. If duplicates persist, the names may differ enough to prevent automatic matching. You can report suspected duplicates to support.

Can I look up an owner by stock ticker?
Yes. Use the ticker filter on the owners list endpoint: GET /v1/owners?ticker=AAPL. This performs a join through the public_companies table.

How often is public company data refreshed?
SEC data is refreshed daily from the company_tickers.json feed (~10,000 US companies). GLEIF data is refreshed weekly from the LEI database (~2.5M entities). GLEIF Level 2 relationship data (corporate parents) is refreshed on the same weekly schedule.

What happens to my portfolios when an owner is merged?
Portfolios reference trademarks, not owners directly. When an owner is merged, its trademarks are reassigned to the canonical owner. Your portfolio contents remain unchanged.

Does name normalization preserve non-Latin scripts?
Yes. The pipeline preserves CJK, Cyrillic, Arabic, Thai, and other Unicode scripts. Only Latin diacritics are stripped (step 2). The punctuation removal (step 6) uses a Unicode-aware regex that preserves non-Latin letters.