The Problem
A single company files trademarks under many names:

| Office | Name as Filed |
|---|---|
| USPTO | APPLE INC. |
| EUIPO | Apple Inc |
| DPMA | Apple Inc., Cupertino, California, US |
| CIPO | APPLE INC |
| WIPO | Apple Inc. |
How It Works
Signa resolves entities in three stages:

- Name normalization: an 11-step pipeline that reduces every name to a canonical form
- Public company matching: linking owners to SEC and GLEIF records
- Corporate parent linking: connecting subsidiaries to parent companies via GLEIF Level 2 data
Phase 1 uses case-insensitive exact name matching within the same office. The cross-office matching, EUIPO applicant ID lookup, and near-miss detection described below are Phase 2 features currently in development.
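As a rough illustration of the Phase 1 rule, the sketch below groups owner records by office and case-folded name. The record keys (`office`, `name`) and the function name are hypothetical and chosen for illustration; they are not Signa's actual schema.

```python
from collections import defaultdict


def group_exact_matches(records):
    """Group records whose names match case-insensitively within one office.

    `records` is a list of dicts with the hypothetical keys "office" and "name".
    """
    groups = defaultdict(list)
    for record in records:
        # casefold() is a stronger form of lower() intended for caseless matching
        key = (record["office"], record["name"].casefold())
        groups[key].append(record)
    return dict(groups)


records = [
    {"office": "USPTO", "name": "APPLE INC."},
    {"office": "USPTO", "name": "Apple Inc."},
    {"office": "WIPO", "name": "Apple Inc."},
]
groups = group_exact_matches(records)
```

Under this rule the two USPTO filings fall into one group, while the WIPO filing stays separate because Phase 1 does not match across offices.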
Stage 1: Name Normalization
Every owner and attorney name passes through the normalizeName pipeline when ingested. The pipeline produces a canonical_name and an optional normalized_suffix, stored in separate columns.
The 11 steps, in order:
| Step | Operation | Example |
|---|---|---|
| 1 | Unicode NFC normalization | Ensure consistent byte representation |
| 2 | Strip diacritical marks | Nestlé becomes Nestle |
| 3 | Lowercase | apple inc |
| 4 | Strip leading articles | the, die, la, le, el, les, etc. |
| 5 | Normalize conjunctions | and, und, et, y all become & |
| 6 | Strip punctuation (keep &) | Hyphens become spaces; commas, periods removed |
| 7 | Collapse whitespace | Multiple spaces become one |
| 8 | Expand abbreviations | intl becomes international, mfg becomes manufacturing |
| 9 | Extract legal suffix | inc, gmbh, ltd, sa stored separately |
| 10 | Final trim | Remove trailing whitespace |
| 11 | Return canonical + suffix | { canonicalName, normalizedSuffix } |
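The 11 steps can be sketched in Python as follows. This is a minimal approximation, not the actual normalizeName implementation: the word lists contain only the examples from the table, the helper names are hypothetical, and the suffix is extracted only when it is the last word of the name (an assumption; the source does not say how embedded suffixes are handled).

```python
import re
import unicodedata
from typing import NamedTuple, Optional

# Illustrative word lists only; the real pipeline's lists are not documented here.
LEADING_ARTICLES = ("the", "die", "la", "le", "el", "les")
CONJUNCTIONS = ("and", "und", "et", "y")
ABBREVIATIONS = {"intl": "international", "mfg": "manufacturing"}
LEGAL_SUFFIXES = {"inc", "gmbh", "ltd", "sa"}


class NormalizedName(NamedTuple):
    canonical_name: str
    normalized_suffix: Optional[str]


def strip_latin_diacritics(text: str) -> str:
    """Drop combining marks that follow a Latin base letter; keep all others."""
    out = []
    base_is_latin = False
    for ch in unicodedata.normalize("NFD", text):
        if unicodedata.combining(ch):
            if base_is_latin:
                continue  # accent on a Latin letter: drop it
        else:
            base_is_latin = unicodedata.name(ch, "").startswith("LATIN")
        out.append(ch)
    return unicodedata.normalize("NFC", "".join(out))


def strip_punctuation(text: str) -> str:
    """Turn hyphens into spaces; keep letters, marks, digits, whitespace, and &."""
    kept = []
    for ch in text.replace("-", " "):
        if ch == "&" or ch.isspace() or unicodedata.category(ch)[0] in "LMN":
            kept.append(ch)
    return "".join(kept)


def normalize_name(raw: str) -> NormalizedName:
    text = unicodedata.normalize("NFC", raw)                       # step 1
    text = strip_latin_diacritics(text)                            # step 2
    text = text.lower()                                            # step 3
    articles = "|".join(LEADING_ARTICLES)
    text = re.sub(rf"^\s*(?:{articles})\s+", "", text)             # step 4
    conjunctions = "|".join(CONJUNCTIONS)
    text = re.sub(rf"(?<=\s)(?:{conjunctions})(?=\s)", "&", text)  # step 5
    text = strip_punctuation(text)                                 # step 6
    text = re.sub(r"\s+", " ", text)                               # step 7
    tokens = [ABBREVIATIONS.get(t, t) for t in text.split()]       # step 8
    suffix = None
    if len(tokens) > 1 and tokens[-1] in LEGAL_SUFFIXES:           # step 9
        suffix = tokens.pop()
    canonical = " ".join(tokens).strip()                           # step 10
    return NormalizedName(canonical, suffix)                       # step 11
```

With this sketch, `normalize_name("APPLE INC.")` and `normalize_name("Apple Inc")` both yield the canonical name `apple` with suffix `inc`. A form such as "Apple Inc., Cupertino, California, US" does not collapse to the same result here, because the suffix is no longer the last word.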
Stage 2: Public Company Matching
Signa maintains a public_companies table populated from two authoritative sources:
| Source | Coverage | Identifier |
|---|---|---|
| SEC EDGAR | ~10,000 US-listed companies | CIK (Central Index Key) |
| GLEIF | ~2.5M legal entities worldwide | LEI (Legal Entity Identifier) |
Owners are matched by comparing their canonical_name + country_code against the normalized legal names from SEC and GLEIF records. When a match is found, the owner detail response includes the linked company data.

Owners can also be filtered by linked company data:

| Filter | Description |
|---|---|
| ticker=AAPL | Find the owner linked to this stock ticker |
| lei=HWUPKR0... | Find the owner linked to this LEI |
| has_public_company=true | Only return owners with a public company match |
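The matching rule in this stage (comparing canonical_name plus country_code) can be sketched as a dictionary lookup. The record shapes, the IDs, and the first-record-wins collision policy below are assumptions made for illustration, not a description of Signa's internals.

```python
def build_company_index(companies):
    """Index public company records by (canonical_name, country_code)."""
    index = {}
    for company in companies:
        key = (company["canonical_name"], company["country_code"])
        # Assumed policy: keep the first record seen for a key
        index.setdefault(key, company)
    return index


def match_owner(owner, index):
    """Return the public company sharing the owner's key, or None."""
    return index.get((owner["canonical_name"], owner["country_code"]))


# Illustrative records; field names are hypothetical
companies = [
    {"canonical_name": "apple", "country_code": "US", "ticker": "AAPL"},
]
index = build_company_index(companies)

us_owner = {"id": "own_1", "canonical_name": "apple", "country_code": "US"}
de_owner = {"id": "own_2", "canonical_name": "apple", "country_code": "DE"}

us_match = match_owner(us_owner, index)
de_match = match_owner(de_owner, index)
```

Including the country code in the key keeps identically named companies in different jurisdictions from being linked to the same public record.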
Stage 3: Corporate Parent Linking
Using GLEIF Level 2 relationship data (467,000+ records), Signa links subsidiaries to their corporate parents via IS_DIRECTLY_CONSOLIDATED_BY relationships. This populates the parent_owner_id field on owner records and powers the /v1/owners/{id}/related endpoint.
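A minimal sketch of the linking step, assuming relationship rows have already been flattened into dicts with the hypothetical keys `child_lei`, `parent_lei`, and `relationship_type`, and that a mapping from LEI to owner ID is available from Stage 2. The LEI strings are placeholders, not real identifiers.

```python
def link_parents(relationships, owner_id_by_lei):
    """Map child owner ID to parent owner ID using direct-consolidation rows."""
    parent_owner_id = {}
    for rel in relationships:
        if rel["relationship_type"] != "IS_DIRECTLY_CONSOLIDATED_BY":
            continue  # only direct parents are linked in this sketch
        child_id = owner_id_by_lei.get(rel["child_lei"])
        parent_id = owner_id_by_lei.get(rel["parent_lei"])
        # Link only when both sides resolve to known owners
        if child_id is not None and parent_id is not None:
            parent_owner_id[child_id] = parent_id
    return parent_owner_id


owner_id_by_lei = {"LEI-CHILD": "own_child", "LEI-PARENT": "own_parent"}
relationships = [
    {"child_lei": "LEI-CHILD", "parent_lei": "LEI-PARENT",
     "relationship_type": "IS_DIRECTLY_CONSOLIDATED_BY"},
    {"child_lei": "LEI-CHILD", "parent_lei": "LEI-OTHER",
     "relationship_type": "IS_ULTIMATELY_CONSOLIDATED_BY"},
]
parents = link_parents(relationships, owner_id_by_lei)
```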
Merge Handling
When entity resolution determines that two owner (or attorney) records represent the same entity, the duplicate is merged into the canonical record. The old record’s merged_into_id is set, and all associated trademarks are reassigned.
What happens when you request a merged entity:
Any GET request to the old ID returns a 410 Gone response with a pointer to the canonical record.
Merge handling applies to owners, attorneys, and firms. All three entity types have a merged_into_id field and return 410 Gone when the entity has been merged into another record.
Handle 410 responses in your integration. If you cache owner IDs, update your cache when you receive a merge redirect.
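One way a client might handle merged entities is sketched below. The `fetch` callable stands in for whatever HTTP client you use and returns a `(status, body)` pair; the assumption that the 410 body exposes the pointer under a `merged_into_id` key is ours, so check the actual response shape before relying on it.

```python
def resolve_owner(owner_id, fetch, cache, max_hops=5):
    """Follow 410 Gone merge pointers until a live record is returned.

    `fetch(owner_id)` is a caller-supplied function returning (status, body).
    The body key "merged_into_id" is an assumed shape for the 410 pointer.
    """
    current = cache.get(owner_id, owner_id)
    for _ in range(max_hops):
        status, body = fetch(current)
        if status == 410:
            current = body["merged_into_id"]  # hop to the canonical record
            continue
        if current != owner_id:
            cache[owner_id] = current  # remember the canonical ID for next time
        return status, body
    raise RuntimeError("too many merge hops")


# Stand-in for the API: one merged record pointing at one live record
fake_api = {
    "own_old": (410, {"merged_into_id": "own_new"}),
    "own_new": (200, {"id": "own_new"}),
}
cache = {}
status, body = resolve_owner("own_old", fake_api.__getitem__, cache)
```

The hop limit guards against an unexpected chain or cycle of merges, and the cache update means later lookups for the old ID go straight to the canonical record.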
Suppressed Owners
In some cases, an owner record is invalid or a duplicate that cannot be cleanly merged. These records are flagged with is_suppressed = true and are hidden from all public API list endpoints. If you have a direct reference to a suppressed owner ID, the detail endpoint still returns data, but the record will not appear in search results or list views.
FAQ
Why do I see the same company with different owner IDs across offices?
Entity resolution runs periodically, not in real time. Newly ingested records may temporarily appear as separate owners until the next resolution pass links them. If duplicates persist, the names may differ enough to prevent automatic matching. You can report suspected duplicates to support.
Can I look up an owner by stock ticker?
Yes. Use the ticker filter on the owners list endpoint: GET /v1/owners?ticker=AAPL. This performs a join through the public_companies table.
How often is the public company data refreshed?
SEC data is refreshed daily from the company_tickers.json feed (~10,000 US companies). GLEIF data is refreshed weekly from the LEI database (~2.5M entities). GLEIF Level 2 relationship data (corporate parents) is refreshed on the same schedule.
What happens to my portfolio when an owner is merged?
Portfolios reference trademarks, not owners directly. When an owner is merged, its trademarks are reassigned to the canonical owner. Your portfolio contents remain unchanged.
Does normalization handle CJK characters?
Yes. The pipeline preserves CJK, Cyrillic, Arabic, Thai, and other Unicode scripts. Only Latin diacritics are stripped (step 2). The punctuation removal (step 6) uses a Unicode-aware regex that preserves non-Latin letters.