There are no two ways about it: identifying US companies is problematic. Companies (including nonprofits) are created and registered at state level, so the obvious thing is to use state-level identifiers, and this is what OpenCorporates uses.
Having said this, these are not without problems, and certainly don’t satisfy every single one of those qualities you’d want from good identifiers:
- stable (i.e. they don’t change) – in general this is true, but not in all cases. For example, we’ve found several registers (Georgia, for one) which have changed their identifier system without notice or documentation. This causes problems for users of the identifiers, although we have come up with a transparent and effective system for handling this: listing previous numbers and performing redirects for the URIs (we are also writing public reports listing what we’ve done)
- 1:1 mapping with the entity to which they relate. This in general is true, although we have come across situations where the identifiers are not unique across the state (see this report on Minnesota) but only within a particular company type.
- well-known, i.e. they are widely used/distributed. This is a significant problem in the US – unlike the UK company number or, say, the French SIRET, the numbers are not put on letterheads or invoices, for example.
- open - no proprietary licence. This at least is not a problem, unlike for example the hated DUNS number.
On the other hand, they are issued by the government that creates the entities, and do relate to the legal entities themselves (unlike the EIN – see below – or the DUNS number).
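To make the "list previous numbers and redirect" idea concrete, here is a rough sketch of how a changed identifier could be handled. It is an illustration only, not OpenCorporates' actual implementation: the class and method names are hypothetical, and jurisdiction codes such as "us_ga" are just example labels.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class CompanyRecord:
    """One legal entity, keyed by jurisdiction plus register-issued number."""
    jurisdiction: str                      # example label, e.g. "us_ga"
    current_number: str
    previous_numbers: list[str] = field(default_factory=list)


class IdentifierRegistry:
    """Hypothetical in-memory store that keeps old numbers resolvable."""

    def __init__(self) -> None:
        self._records: dict[tuple[str, str], CompanyRecord] = {}
        self._redirects: dict[tuple[str, str], str] = {}

    def add(self, jurisdiction: str, number: str) -> CompanyRecord:
        record = CompanyRecord(jurisdiction, number)
        self._records[(jurisdiction, number)] = record
        return record

    def renumber(self, jurisdiction: str, old: str, new: str) -> None:
        """Record that the register replaced identifier `old` with `new`."""
        record = self._records.pop((jurisdiction, old))
        record.previous_numbers.append(old)
        record.current_number = new
        self._records[(jurisdiction, new)] = record
        # Repoint earlier redirects so a chain of changes resolves in one hop.
        for key, target in list(self._redirects.items()):
            if key[0] == jurisdiction and target == old:
                self._redirects[key] = new
        self._redirects[(jurisdiction, old)] = new

    def resolve(self, jurisdiction: str, number: str) -> tuple[CompanyRecord, bool]:
        """Return (record, redirected); raises KeyError for unknown numbers."""
        key = (jurisdiction, number)
        if key in self._records:
            return self._records[key], False
        current = self._redirects[key]
        return self._records[(jurisdiction, current)], True
```

A lookup by an old number then returns the current record along with a flag saying a redirect happened, which is roughly what a URI redirect would communicate to a client. Where a register's numbers are only unique within a company type, the key would need to include that type as well.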
The DUNS can be ruled out, being both proprietary and mapping not to legal entities but to records inside D&B’s proprietary black-box database. A DUNS record could be a company, a building, a division, or frankly, from an end-user’s perspective, anything at all.
EINs are superficially attractive, being issued by the IRS (i.e. at Federal level). However, in general these are not a matter of public record, and they are not necessarily persistent, nor do they necessarily map 1:1 to legal entities. In addition, not all legal entities have EINs. They are usually known by the entity itself; the challenge is mapping them to legal entities (which exist at the state level).
So there are no easy solutions, and we think in reality this will be solved by either a) us mapping different identifier systems together (we are already doing this with EINs, where we can) and creating URIs that are transparent and have well-thought-out, public principles (see our blog – sorry, won’t let me add another URL in this post) for how we handle the changes and issues, or b) a move to a universal identifier system such as the LEI, or most likely a combination of the two.
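As a very rough illustration of option a), the sketch below links EINs to state-level identifiers and turns each link into a URI. Everything here is an assumption for the sake of example: the base URL is a placeholder, the function and class names are hypothetical, and one EIN is deliberately allowed to map to several state-level entities, since the mapping is not guaranteed to be 1:1.

```python
import re
from collections import defaultdict

# Placeholder base URL, not a real endpoint.
BASE_URI = "https://example.org/companies"


def normalize_ein(raw: str) -> str:
    """Reduce an EIN to its nine digits and format it as XX-XXXXXXX."""
    digits = re.sub(r"\D", "", raw)
    if len(digits) != 9:
        raise ValueError(f"EIN must contain nine digits: {raw!r}")
    return f"{digits[:2]}-{digits[2:]}"


def company_uri(jurisdiction: str, number: str) -> str:
    """Build a URI from a jurisdiction code and a state-issued number."""
    return f"{BASE_URI}/{jurisdiction}/{number}"


class EinCrossReference:
    """Hypothetical one-to-many map from EIN to state-level identifiers."""

    def __init__(self) -> None:
        self._by_ein: dict[str, set[tuple[str, str]]] = defaultdict(set)

    def link(self, ein: str, jurisdiction: str, number: str) -> None:
        self._by_ein[normalize_ein(ein)].add((jurisdiction, number))

    def lookup(self, ein: str) -> list[str]:
        """Return the URIs of every legal entity linked to this EIN."""
        matches = self._by_ein.get(normalize_ein(ein), set())
        return sorted(company_uri(j, n) for j, n in matches)
```

Returning a list rather than a single URI keeps the ambiguity visible to the caller instead of silently picking one entity.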