Clarification: multiple RO Sector Vocabularies

IATI guidance says the following about the two Reporting Organisation codes for sectors (codes 98 and 99):

It is also recommended that if a publisher has its own classification system or systems then the vocabularies 99 or 98 (Reporting Organisation’s own vocabularies) should be used in addition to DAC codes.

When using 98 or 99, a publisher can add a vocabulary-uri attribute to point to that vocabulary online (added in 2.02).

With the Initiative for Open Ag Funding, we have started to unearth cases where there may well be a need for more than two RO vocabularies. Eventually, these may well be added to the vocabulary code list (see also #495: Make vocab codelists non-embedded), but we wanted to clarify whether publishers could feasibly express this within the current standard by:

  • using either code 98 or 99, but with different vocabulary-uri attributes, to express multiple vocabularies. This example shows four different RO vocabularies, distinguished only by their URI:

<sector code="1" vocabulary="98" vocabulary-uri="http://sectorvocabA.example.org" />
<sector code="2" vocabulary="98" vocabulary-uri="http://sectorvocabB.example.org" />
<sector code="3" vocabulary="99" vocabulary-uri="http://sectorvocabC.example.org" />
<sector code="4" vocabulary="99" vocabulary-uri="http://sectorvocabD.example.org" />
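To illustrate what this would mean for a data user, here is a minimal Python sketch. It assumes the sector elements are written as well-formed XML with the vocabulary-uri attribute, wrapped in an iati-activity element; the function name group_sectors is hypothetical. The point it shows is that the vocabulary code alone no longer identifies a vocabulary, so a consumer would have to key on the pair of code and URI.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Assumed input: the four example sectors inside a single activity.
ACTIVITY_XML = """
<iati-activity>
  <sector code="1" vocabulary="98" vocabulary-uri="http://sectorvocabA.example.org" />
  <sector code="2" vocabulary="98" vocabulary-uri="http://sectorvocabB.example.org" />
  <sector code="3" vocabulary="99" vocabulary-uri="http://sectorvocabC.example.org" />
  <sector code="4" vocabulary="99" vocabulary-uri="http://sectorvocabD.example.org" />
</iati-activity>
"""


def group_sectors(xml_text):
    """Group sector codes by the (vocabulary, vocabulary-uri) pair."""
    root = ET.fromstring(xml_text)
    groups = defaultdict(list)
    for sector in root.iter("sector"):
        # The vocabulary code alone is ambiguous here, so the URI is part of the key.
        key = (sector.get("vocabulary"), sector.get("vocabulary-uri"))
        groups[key].append(sector.get("code"))
    return dict(groups)


groups = group_sectors(ACTIVITY_XML)
for (vocabulary, uri), codes in groups.items():
    print(vocabulary, uri, codes)
```

Grouping by the vocabulary code alone would merge vocabularies A and B (and C and D), which is the ambiguity the question above is about.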

This doesn’t seem very elegant or user-friendly. One way round this could be to reserve a block of codes (e.g. 80-100?) for RO vocabularies, but that may in turn have more impact on data users. It would be useful to clarify and discuss this. @rolfkleef may have ideas!

Adding more entries to the sector vocabulary code list is best. There is no need for Ag and other well-established vocabularies to share the 98 and 99 codes in a single activity when a reporting agency is able to make its data more descriptive.

A bit similar to what Steven is suggesting: could you not just have a pattern here to allow an unlimited number of reporting organisation codes, e.g. anything beginning with 9 is an RO vocabulary?

I would not be in favor of using a pattern here. In general, hiding information in identifiers leads to problems in the long run. If there is a well-established sector classification, shouldn’t it deserve its own unique identifier? There is still lots of room in the current SectorVocabulary code list.

The key use case as I see it:

  • We have a code list that is not on the IATI list, we want to use it now
  • We want to express that we share the code list with other organisations

Adding room for more “RO” vocabularies doesn’t solve the second point.

This issue occurs with more code lists and identifiers. Two routes I see:

  1. Adding a code to a code list should become easier: IATI takes on a role like that of IANA, the Internet Assigned Numbers Authority. I’m not in favour of this myself, and it still doesn’t answer the “we want to use this now” case.

  2. Introduce “name spaces”: URIs basically solve that problem. Having a base URI with a code might be a middle ground for ease of use of finding sectors within one vocabulary. IATI could maintain a list of “recommended code list URIs”, and we could keep the existing codes as shorthand for URIs.

Route 2 could mean interpreting the data like this:

  • If there is a code: assume that code list
  • If there is no code, use the URI

But it could also be the other way around (if there is a URI, use that, if not: use the code).

(I think we need to expand the standard with such interpretation rules: how is an application or user supposed to deal with the data if both attributes are there?)
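As a rough illustration of why such rules matter, the two orders of precedence described for Route 2 can be sketched in Python. This is only a sketch under stated assumptions: the function names and the ("code", ...) / ("uri", ...) return convention are hypothetical and are not part of the IATI standard.

```python
from typing import Optional, Tuple

Resolved = Optional[Tuple[str, str]]


def resolve_code_first(code: Optional[str], uri: Optional[str]) -> Resolved:
    """If there is a code, assume that code list; if there is no code, use the URI."""
    if code:
        return ("code", code)
    if uri:
        return ("uri", uri)
    return None  # neither attribute is present


def resolve_uri_first(code: Optional[str], uri: Optional[str]) -> Resolved:
    """The other way around: if there is a URI, use that; if not, use the code."""
    if uri:
        return ("uri", uri)
    if code:
        return ("code", code)
    return None  # neither attribute is present


# A sector that carries both attributes is read differently under each rule.
code, uri = "98", "http://sectorvocabA.example.org"
print(resolve_code_first(code, uri))
print(resolve_uri_first(code, uri))
```

With both attributes present, the code-first rule identifies the vocabulary only as the generic RO code 98, while the URI-first rule identifies the specific vocabulary. Two applications applying different rules would therefore read the same data differently.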