
# Entity Resolution

> How Signa links trademark owners into a single resolved entity across offices

The same company files trademarks in every office it cares about, and each office stores its own owner record with its own spelling. Entity resolution is how Signa links those per-office records into one **resolved entity** — so you can see a company's complete footprint, across every office, as a single object.

## The problem

Each office records a company however it was filed — a different legal entity, a different language, a different level of detail. Resolving these is hard in **both** directions: the same company often looks completely different across offices, and different companies often look identical.

**The same company, filed three different ways.** Procter & Gamble's records name three different legal entities, so an exact string match joins none of them:

| Owner as filed                                 | Country | Linked by             |
| ---------------------------------------------- | ------- | --------------------- |
| The Procter & Gamble Company                   | US      | shared public company |
| Procter & Gamble Manufacturing Cologne GmbH    | DE      | Madrid registration   |
| Procter & Gamble International Operations S.A. | CH      | Madrid registration   |

Three different legal entities, in three countries — joined into one entity not by their names, but by shared Madrid registrations and a common public company.

**It gets harder — some links cross scripts entirely.** One Russian holder appears across offices as `Limited Liability Company «DVA MYACHA»`, as a truncated `dva y`, and in Cyrillic as `«DVA МYАСНА»`. Nothing but a shared Madrid registration number connects them.

**And the reverse is just as dangerous.** Two companies can share an identical name and *not* be the same:

| Owner    | Country | Business       | LEI                  |
| -------- | ------- | -------------- | -------------------- |
| EQT Corp | US      | Natural gas    | 4NT01YGM4X7ZX86ISY52 |
| EQT AB   | SE      | Private equity | 213800U7P9GOIRKCTB34 |

Both are public companies named "EQT." Merging them would be a serious error — so Signa keeps them apart, distinguished by country and LEI and confirmed by human review.

Naive name matching is therefore neither sufficient nor safe. Entity resolution is how Signa gets both directions right.

## Owners and entities

Signa models this with two layers:

| Layer      | ID      | What it is                                                                            |
| ---------- | ------- | ------------------------------------------------------------------------------------- |
| **Owner**  | `own_*` | A single office's applicant/registrant record, exactly as that office filed it.       |
| **Entity** | `ent_*` | The cross-office identity that links the owner records belonging to the same company. |

We **link, we don't merge.** The original per-office owner records are never rewritten or collapsed — an entity points at them. That means **`own_` IDs are stable and never break**, even as resolution improves and links are added or corrected over time.

```bash theme={null}
GET /v1/entities/ent_R3jK9mN2
```
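
As a rough illustration of the link-don't-merge model, the sketch below keeps owner records immutable and has an entity hold only references to them. The class names, fields, and `own_` IDs are hypothetical and are not Signa's actual schema.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)  # frozen: an owner record is never rewritten
class Owner:
    id: str       # hypothetical own_* ID
    name: str     # exactly as the office filed it
    country: str


@dataclass
class Entity:
    id: str
    member_owner_ids: list = field(default_factory=list)  # pointers, not copies


owners = {
    "own_us1": Owner("own_us1", "The Procter & Gamble Company", "US"),
    "own_de1": Owner("own_de1", "Procter & Gamble Manufacturing Cologne GmbH", "DE"),
    "own_ch1": Owner("own_ch1", "Procter & Gamble International Operations S.A.", "CH"),
}

entity = Entity("ent_R3jK9mN2", ["own_us1", "own_de1"])

# A later resolution pass adds a link; no owner record changes.
entity.member_owner_ids.append("own_ch1")

member_names = [owners[oid].name for oid in entity.member_owner_ids]
print(member_names)
```

Because the entity stores only owner IDs, correcting or removing a link is a change to the entity's member list and leaves every `own_` record intact.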

### Resolved vs. derived entities

Every owner is reachable as an entity, so you can always traverse from a mark to its company-level identity:

* **Resolved** (`entity_id_type: "resolved"`) — a materialized entity linking two or more owner records.
* **Derived** (`entity_id_type: "derived"`) — a singleton `ent_<owner-uuid>` for an owner that isn't linked to anything yet. It has exactly one member (itself).

If an owner is later linked, its derived ID transparently resolves to the resolved entity: the response `id` may differ from the one you requested, and cached derived IDs never return a 404.
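
On the client side, one way to cope with an ID that changes under you is to compare the requested ID with the returned one and remember the mapping. This is a minimal sketch: the helper name, the alias dictionary, and the example IDs are hypothetical, and the response is reduced to the `id` and `entity_id_type` fields mentioned above.

```python
def remember_canonical_id(aliases: dict, requested_id: str, response: dict) -> str:
    """Record requested -> returned ID when they differ; return the ID to store."""
    returned_id = response["id"]
    if returned_id != requested_id:
        aliases[requested_id] = returned_id
    return returned_id


aliases = {}

# Hypothetical derived ID and trimmed response body, for illustration only.
stored_id = remember_canonical_id(
    aliases,
    "ent_derived_example",
    {"id": "ent_R3jK9mN2", "entity_id_type": "resolved"},
)
print(stored_id, aliases)
```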

## What a resolved entity gives you

<CardGroup cols={2}>
  <Card title="One identity across offices" icon="layer-group">
    Member owners (one per office) with the evidence that justified each link.
  </Card>

  <Card title="Global portfolio" icon="folder-tree">
    Every mark across all member owners, in one paginated, fully filterable list.
  </Card>

  <Card title="Corporate family" icon="sitemap">
    The GLEIF parent and direct subsidiaries behind a brand.
  </Card>

  <Card title="Public-company facts" icon="building-columns">
    Ticker and LEI, aggregated across the company's office records.
  </Card>
</CardGroup>

The detail response embeds each member owner along with a **whitelisted link evidence** projection containing the `signal`, `tier`, and `confidence` behind the link, plus a `decided_by` value of `auto`, `llm`, or `human`:

<CodeGroup>
  ```bash cURL theme={null}
  curl -s "https://api.signa.so/v1/entities/ent_R3jK9mN2" \
    -H "Authorization: Bearer sig_xxx" | jq '.members[].link'
  ```

  ```typescript TypeScript theme={null}
  import { Signa } from "@signa-so/sdk";

  const signa = new Signa({ api_key: process.env.SIGNA_API_KEY });
  const entity = await signa.entities.retrieve("ent_R3jK9mN2");
  console.log(entity.name, entity.member_count, entity.tickers);
  ```

  ```python Python theme={null}
  import signa

  client = signa.Client("sig_xxx")
  entity = client.entities.retrieve("ent_R3jK9mN2")
  print(entity.name, entity.member_count, entity.tickers)
  ```
</CodeGroup>

<Note>
  The `link` evidence is deliberately narrow. The model id, prompt hash, and any
  LLM rationale behind a link are operational data and are never returned through
  the API.
</Note>

See [Get Entity](/api-reference/parties/get-entity) for the full response shape.
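
To audit how an entity's links were decided, you could tally the `decided_by` values across its members. The sketch below works on the parsed JSON body rather than an SDK object; the sample payload and its owner IDs are invented for illustration, and only the `members[].link.decided_by` path is taken from this page.

```python
from collections import Counter


def summarize_link_decisions(entity: dict) -> Counter:
    """Count members by who decided their link (auto, llm, or human)."""
    return Counter(
        member["link"]["decided_by"]
        for member in entity.get("members", [])
        if member.get("link")  # tolerate members without link evidence
    )


# Hypothetical, trimmed response body.
sample_entity = {
    "id": "ent_R3jK9mN2",
    "members": [
        {"id": "own_a", "link": {"decided_by": "auto"}},
        {"id": "own_b", "link": {"decided_by": "auto"}},
        {"id": "own_c", "link": {"decided_by": "human"}},
    ],
}

decisions = summarize_link_decisions(sample_entity)
print(dict(decisions))
```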

## How resolution works

### 1. Name normalization

Every party name runs through a canonical normalization pass — case folding, diacritic stripping, Unicode-aware punctuation/whitespace handling, common-abbreviation expansion, and legal-form extraction (recognizing `Inc.`, `GmbH`, `S.A.` and friends as detachable suffixes). The result is a `canonical_name` that's stable across the spelling variants one company gets filed under, plus the original display `name`.
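
The pass described above might look roughly like the following. This is an illustrative sketch, not Signa's implementation: the legal-form and abbreviation tables are tiny hypothetical stand-ins, and the function names are made up. It strips diacritics only from Latin base letters, so other scripts pass through unchanged.

```python
import unicodedata

# Hypothetical, deliberately tiny lookup tables.
LEGAL_FORMS = {"inc", "corp", "llc", "ltd", "gmbh", "sa", "ab"}
ABBREVIATIONS = {"intl": "international", "mfg": "manufacturing"}


def strip_latin_diacritics(text: str) -> str:
    """Drop combining marks from Latin letters; leave other scripts alone."""
    out = []
    for ch in unicodedata.normalize("NFC", text):
        decomposed = unicodedata.normalize("NFD", ch)
        base = decomposed[0]
        if len(decomposed) > 1 and "LATIN" in unicodedata.name(base, ""):
            out.append(base)
        else:
            out.append(ch)
    return "".join(out)


def normalize_name(name: str):
    """Return (canonical_name, legal_form or None)."""
    text = strip_latin_diacritics(name).casefold()
    cleaned = []
    for ch in text:
        if ch == ".":
            continue  # "S.A." -> "sa", "Inc." -> "inc"
        # Punctuation, symbols, and separators all become plain spaces.
        cleaned.append(" " if unicodedata.category(ch)[0] in "PSZ" else ch)
    tokens = [ABBREVIATIONS.get(t, t) for t in "".join(cleaned).split()]
    legal_form = []
    # Detach trailing legal-form tokens, always keeping at least one name token.
    while len(tokens) > 1 and tokens[-1] in LEGAL_FORMS:
        legal_form.insert(0, tokens.pop())
    return " ".join(tokens), " ".join(legal_form) or None


print(normalize_name("Société Générale S.A."))
print(normalize_name("EQT Corp"), normalize_name("EQT AB"))
```

Under this sketch, `EQT Corp` and `EQT AB` collapse to the same canonical name, which is exactly why a normalized name cannot be treated as an identity on its own.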

### 2. Linking signals

Pairs of owners are linked **only on strong, specific evidence**. Name similarity alone never links — there must be a second corroborating signal:

| Signal (`tier`)     | What links the owners                                                                                                  |
| ------------------- | ---------------------------------------------------------------------------------------------------------------------- |
| `office_identifier` | The same applicant identifier reported by an office.                                                                   |
| `madrid_ir`         | A shared Madrid international registration number, behind a distinctive-token name guard to prevent mass false merges. |
| `shared_pco`        | Both owners link to the **same public company** (SEC/GLEIF) at high confidence.                                        |
| `portfolio_overlap` | Matching name plus overlapping trademark portfolios — the signature of one company across offices.                     |

Unambiguous matches (shared identifiers, guarded Madrid numbers, the same public company) are **auto-linked deterministically**. Anything ambiguous is queued as a candidate for adjudication rather than guessed at.
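
The decision flow above can be sketched as a small function. Everything here is an assumption made for illustration: the `PairEvidence` fields, the generic-token list behind the name guard, the ordering of the checks, and in particular the choice to queue `portfolio_overlap` pairs rather than auto-link them. Names are expected to be already normalized.

```python
from dataclasses import dataclass

# Hypothetical tokens too common to count as distinctive.
GENERIC_TOKENS = {"the", "company", "limited", "liability", "international",
                  "manufacturing", "operations", "gmbh", "sa", "inc"}


@dataclass
class PairEvidence:
    name_a: str
    name_b: str
    same_office_identifier: bool = False
    shared_madrid_ir: bool = False
    same_public_company: bool = False
    portfolio_overlap: bool = False


def distinctive_tokens(name: str) -> set:
    return {t for t in name.lower().split() if t not in GENERIC_TOKENS}


def decide(ev: PairEvidence) -> tuple:
    """Return (action, signal); action is auto_link, queue, or no_link."""
    if ev.same_office_identifier:
        return ("auto_link", "office_identifier")
    # Name guard: the two names must share at least one distinctive token.
    name_guard = bool(distinctive_tokens(ev.name_a) & distinctive_tokens(ev.name_b))
    if ev.shared_madrid_ir and name_guard:
        return ("auto_link", "madrid_ir")
    if ev.same_public_company:
        return ("auto_link", "shared_pco")
    if ev.shared_madrid_ir or ev.portfolio_overlap:
        return ("queue", None)  # ambiguous: send to adjudication
    return ("no_link", None)    # name similarity alone never links


print(decide(PairEvidence("eqt", "eqt")))
print(decide(PairEvidence(
    "procter gamble manufacturing cologne gmbh",
    "procter gamble international operations sa",
    shared_madrid_ir=True,
)))
```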

### 3. Adjudication

Ambiguous candidates are resolved by a combination of deterministic rules and model-assisted review, and **every decision is durable**: a confirmed "different" verdict is recorded so the same pair is never silently re-linked later. Links can also be split when new evidence contradicts an earlier decision, again without ever rewriting the underlying owner records.
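
One simple way to picture a durable "different" verdict is a store of blocked pairs that every later linking pass consults first. The class and the owner IDs below are hypothetical; this page does not describe how Signa actually persists verdicts.

```python
class VerdictLog:
    """Remembers owner pairs that were confirmed to be different companies."""

    def __init__(self):
        self._different = set()

    @staticmethod
    def _key(owner_a: str, owner_b: str) -> frozenset:
        # Order-independent key: (a, b) and (b, a) are the same pair.
        return frozenset((owner_a, owner_b))

    def record_different(self, owner_a: str, owner_b: str) -> None:
        self._different.add(self._key(owner_a, owner_b))

    def may_link(self, owner_a: str, owner_b: str) -> bool:
        return self._key(owner_a, owner_b) not in self._different


log = VerdictLog()
log.record_different("own_eqt_us", "own_eqt_se")  # hypothetical IDs

print(log.may_link("own_eqt_se", "own_eqt_us"))
print(log.may_link("own_eqt_us", "own_other"))
```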

### 4. Accuracy and validation

Linking two companies that *aren't* the same is the worst mistake an entity resolver can make — far more damaging than a missed link — so Signa is built to earn every link:

* **Adjudicated, not guessed.** Ambiguous pairs are never auto-linked on a hunch. More than **57,000 owner pairs** have been individually adjudicated by AI judges and human reviewers.
* **LLM-as-judge, cross-checked across models.** Each ambiguous pair is evaluated by a large language model acting as an impartial judge, then independently cross-checked across multiple frontier models — with disagreements escalated to web research and human review. A link only stands when the evidence agrees.
* **Held to hard gates.** New links must clear strict precision and recall thresholds before they're written, and the system is tuned aggressively against false merges.
* **Decisions are durable.** A confirmed "different" verdict — like keeping EQT Corp and EQT AB apart — permanently blocks that merge from ever recurring.
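
The cross-checking rule in the list above amounts to requiring agreement before a verdict stands. A minimal sketch of that idea follows; the verdict labels, the `escalate` outcome name, and the minimum of two judges are assumptions, not documented behavior.

```python
def cross_check(verdicts: list) -> str:
    """Return the shared verdict if all judges agree, else 'escalate'."""
    distinct = set(verdicts)
    if len(verdicts) >= 2 and len(distinct) == 1:
        return distinct.pop()
    return "escalate"  # disagreement, or too few opinions to cross-check


print(cross_check(["same", "same", "same"]))
print(cross_check(["same", "different", "same"]))
```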

This runs at scale: across **13 million** owner records, Signa has linked more than **one million** of them into over **335,000** resolved cross-office entities.

<Note>
  Resolution runs continuously as data is ingested, not in real time per request.
  Newly ingested owners may briefly appear unlinked until the next pass connects
  them.
</Note>

## Public-company enrichment

Signa maintains a `public_companies` table from two authoritative sources, and links owners to it:

| Source        | Coverage                        | Identifier |
| ------------- | ------------------------------- | ---------- |
| **SEC EDGAR** | \~10,000 US-listed companies    | CIK        |
| **GLEIF**     | \~2.5M legal entities worldwide | LEI        |

These facts are **aggregated onto the entity** — an entity carries a company's ticker even when only one office record was matched. You can filter both entities and owners by them:

| Filter                 | Description                                 |
| ---------------------- | ------------------------------------------- |
| `ticker=AAPL`          | Entities/owners linked to this stock ticker |
| `lei=HWUPKR0…`         | Linked to this LEI                          |
| `publicly_traded=true` | Has a confirmed active SEC ticker match     |
| `has_lei=true`         | Has a confirmed GLEIF LEI match             |

```bash theme={null}
GET /v1/entities?publicly_traded=true&country_code=US&sort=-trademark_count
```
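
If you build these query strings by hand, a small helper keeps the encoding consistent. The sketch below uses only the Python standard library and the filter names shown on this page; the helper name `entities_url` is made up, and rendering booleans as lowercase `true`/`false` is an assumption based on the examples above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.signa.so/v1"


def entities_url(**filters) -> str:
    """Build a /v1/entities URL, skipping filters that are None."""
    params = {}
    for key, value in filters.items():
        if value is None:
            continue
        if isinstance(value, bool):
            value = "true" if value else "false"
        params[key] = str(value)
    query = urlencode(params)
    return f"{BASE_URL}/entities" + (f"?{query}" if query else "")


url = entities_url(publicly_traded=True, country_code="US", sort="-trademark_count")
print(url)
```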

## Corporate families

Using GLEIF Level 2 relationship data (460,000+ parent–subsidiary records), Signa connects an entity to its **direct corporate parent and subsidiaries**:

<CodeGroup>
  ```bash cURL theme={null}
  curl -s "https://api.signa.so/v1/entities/ent_R3jK9mN2/family" \
    -H "Authorization: Bearer sig_xxx"
  ```

  ```typescript TypeScript theme={null}
  const family = await signa.entities.family("ent_R3jK9mN2");
  console.log(family.parent?.name, family.children.length);
  ```
</CodeGroup>

<Note>
  GLEIF Level 2 covers LEI-reporting companies only. An absent edge does **not**
  imply the absence of a corporate relationship — see `coverage_caveat` on the
  response. See [Entity Family](/api-reference/parties/entity-family).
</Note>
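
When consuming the family response, treat a missing parent as "unknown" rather than "none". The sketch below reads the parsed JSON body directly; the `parent`, `children`, and `coverage_caveat` field names come from this page, while the sample payload and the helper name are hypothetical.

```python
def describe_family(family: dict) -> str:
    """Summarize a family response without over-reading missing edges."""
    parent = family.get("parent")
    children = family.get("children") or []
    parent_name = parent["name"] if parent else "no parent on record"
    line = f"parent: {parent_name}; direct subsidiaries: {len(children)}"
    caveat = family.get("coverage_caveat")
    if caveat:
        line += f" (note: {caveat})"
    return line


# Hypothetical, trimmed response body.
sample_family = {
    "parent": None,
    "children": [{"name": "Example Subsidiary GmbH"}],
    "coverage_caveat": "GLEIF Level 2 covers LEI-reporting companies only.",
}

summary = describe_family(sample_family)
print(summary)
```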

## Endpoints

| Endpoint                                                                       | Purpose                                   |
| ------------------------------------------------------------------------------ | ----------------------------------------- |
| [`GET /v1/entities`](/api-reference/parties/list-entities)                     | Search and filter resolved entities       |
| [`GET /v1/entities/{id}`](/api-reference/parties/get-entity)                   | One entity with members + link evidence   |
| [`GET /v1/entities/{id}/trademarks`](/api-reference/parties/entity-trademarks) | Global portfolio across all member owners |
| [`GET /v1/entities/{id}/family`](/api-reference/parties/entity-family)         | GLEIF corporate parent and subsidiaries   |
| [`GET /v1/owners/{id}`](/api-reference/parties/get-owner)                      | A single per-office owner record          |

## FAQ

<AccordionGroup>
  <Accordion title="What's the difference between an owner and an entity?">
    An **owner** (`own_`) is one office's record of an applicant/registrant. An **entity** (`ent_`) links the owner records that belong to the same company across offices. Use entities for company-level questions ("everything Apple owns, everywhere"); use owners when you need the exact per-office record.
  </Accordion>

  <Accordion title="Why did the entity ID I requested come back with a different id?">
    You requested a **derived** singleton ID for an owner that has since been linked into a real entity. Signa transparently resolves it to the canonical entity and returns that `id`. This is expected — store the returned `id`.
  </Accordion>

  <Accordion title="Do owner IDs ever break?">
    No — owners are linked, not merged, so `own_` IDs are stable. Only **entities** are ever fused; when that happens the old `ent_` ID returns `410 Gone` with a pointer to its successor in `merged_into`, so handle that response in your integration.
  </Accordion>

  <Accordion title="Can I look up a company by stock ticker?">
    Yes — `GET /v1/entities?ticker=AAPL` returns the resolved entity, with public-company facts aggregated across its office records. The same filter works on `/v1/owners`.
  </Accordion>

  <Accordion title="How often is the data refreshed?">
    SEC data refreshes daily (\~10,000 US companies); GLEIF and its Level 2 corporate-parent relationships refresh weekly (\~2.5M entities). Entity resolution runs continuously as new records are ingested.
  </Accordion>

  <Accordion title="Does normalization handle non-Latin scripts?">
    Yes. Normalization preserves CJK, Cyrillic, Arabic, Thai, and other scripts. Only Latin diacritics are stripped, and punctuation handling is Unicode-aware.
  </Accordion>
</AccordionGroup>
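
For the `410 Gone` case described in the FAQ, an integration can follow `merged_into` until it reaches a live entity. This is a sketch under assumptions: `fetch` is a hypothetical callable returning `(status_code, json_body)`, `merged_into` is assumed to be a top-level field holding the successor ID, and the hop limit is an arbitrary safety bound.

```python
def fetch_entity(fetch, entity_id: str, max_hops: int = 5) -> dict:
    """Follow merged_into pointers on 410 Gone until a 200 response arrives."""
    current = entity_id
    for _ in range(max_hops):
        status, body = fetch(current)
        if status == 200:
            return body
        if status == 410 and body.get("merged_into"):
            current = body["merged_into"]  # retry against the successor
            continue
        raise RuntimeError(f"unexpected status {status} for {current}")
    raise RuntimeError(f"too many merge hops starting from {entity_id}")


def fake_fetch(entity_id: str):
    """Stand-in for an HTTP call, with invented IDs and bodies."""
    responses = {
        "ent_old": (410, {"merged_into": "ent_new"}),
        "ent_new": (200, {"id": "ent_new"}),
    }
    return responses[entity_id]


entity = fetch_entity(fake_fetch, "ent_old")
print(entity["id"])
```

Persisting the `id` from the final response avoids paying for the extra hop on every subsequent request.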
