Tech Paper: Codelist Management

We maintain or support a number of codelists. Embedded codelists belong to IATI and can only be updated through an upgrade as their modification may have functional implications. Non-embedded codelists either belong to third parties or belong to IATI but can be updated, subject to consultation, as needed.

There are currently a number of issues requiring clarification:

  • How best to handle versioning of codelists
  • How to keep non-embedded codelists up to date
  • Whether all external vocabularies should be imported into their own IATI-duplicated codelist, as has been done for the OECD DAC CRS Sector codes
  • Whether to change the status of vocabulary codelists from embedded to non-embedded
  • How to ensure that only valid lists get added as a codelist vocabulary

This paper is part of the Agenda for the IATI TAG 2016 Technical Consultation Workshop.


@bill_anderson I just want to check how this paper is progressing. Is this something that will be written by the IATI tech team in advance?

The reason I ask is that there are a number of posts already in Discuss that speak to these, and other, points. It might be useful for us to collate them (unless this is already in hand):

Hi @stevieflow, thanks for raising this. I can confirm that part of the purpose of the paper is to bring together all the various issues (from multiple sources) relating to codelists, including the ones you have highlighted. We intend to publish this and all the other papers well in advance of the TAG so that members of the IATI community have a chance to review and comment. Hope that helps.

I would like to add a particular issue to the agenda for consideration here, either within this paper, or as a distinct paper relating to the OrganisationRegistrationAgency codelist.

With support from IATI and a number of other partners, the Identify-Org project has been working to develop a shared maintenance and governance model for a list of Organisation Identifier Lists. We have now completed a research process that:

  • Added new meta-data fields to the OrganisationRegistrationAgency list;
  • Reviewed each of the current entries from the IATI list to check we could locate an actual list of organisation identifiers, or that the agency identified was an issuer of organisation identifiers;
  • Tested generation of an IATI-style codelist from the results.

Before the TAG we will be doing more work to build an improved front-end interface onto the underlying data gathered, and to agree a workflow for shared governance and updates to the list.

The implication of all this work is that:

  • We could propose that Identify-org.net becomes an external codelist, replacing the OrganisationRegistrationAgency list from IATI (conditional on securing a shared governance agreement with IATI, Open Contracting and other stakeholders for maintaining this external list);

  • We would suggest some tweaks to the language of the IATI documentation to replace the description of ‘Organisation Registration Agencies’ with discussion of ‘Organisation Identifier Lists’. This is because, in our review of the organisation identifier methodology, we have found that some relevant lists do not strictly ‘register’ organisations, and so the terminology can be misleading to users.

Some of the considerations this raises may be:

  • The research for identify-org.net has suggested deprecation of a number of current entries from the Organisation Registration Agency codelist.

Supporting notes

A view of the additional suggested descriptions, links and updated content in the IATI codelist format for the Registration Agency List is available here. This contains only the overlapping codes. Identify-org also has additional codes, which were not included here to make comparison of the basic information easier.

Suggested deprecation of current IATI Organisation Registration Agency codelist entries. Click through on each link for an explanation in the description. Under the description is a link to search OIPA for instances of each code in use.

A review of any codes added to the IATI list since October 2016 is needed to finalise synchronisation of the lists.

Next steps

Advice on whether this should be worked into a distinct paper, or handled under codelist issues would be welcome.

Happy to have discussions/calls etc. to prepare appropriately on next steps.

Can we separate out process from content here? I think we need to include stuff on management and documentation in the paper, including the principle of deprecation. What we do with particular codes can, I think, be handled by our existing (or improved) consultation process.

I think that makes sense.

So - better to have a separate paper on management of the Organisation
Registration Agency codelist? Or to include that as a section in a general
codelists paper?

Let me know how best to contribute to either option.

Do you want to write something separately which we can combine into the one paper for the day?

@TimDavies apologies but could I clarify…

All we need, as part of the Codelist Management paper, is a concrete proposal for how we transition from the current Registration Agency Codelist to identify-org.net.

How you then present the work of the project as a whole is up to you.

The draft paper is here.

I’ve added the following proposal to the paper:

Proposal: Shared governance of Organisation Identifier Lists codelist

The OrganisationRegistrationAgency codelist should be renamed to ‘OrganisationIdentifierList’ and maintenance of this list should take place through the ‘org-id project’.

The org-id project is a collaborative effort between Open Contracting, IATI, 360 Giving, NRGI and other partners to maintain a shared list of organisation identifier lists. It builds on the IATI Organisation Registration Agency list, but adds additional meta-data, and introduces a new schema and repository for code information.

See this post for details of changes brought about.

Adopting this as the replacement to IATI’s OrganisationRegistrationAgency codelist will involve:

  • Linking to, or maintaining a copy of, the org-id confirmed codelist (all codes with the ‘confirmed’ property set to true). An API output in IATI XML format will be available;
  • Pointing users to the search interface for locating an appropriate organisation identifier list from the IATI documentation. (This will shortly be available at http://org-id.guide);
  • Submitting all new requests for identifier lists to be included via the org-id project, either through pull-requests to the repository;
  • IATI Secretariat staff engaging in the review process for new identifier lists submitted.
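As a rough illustration of the first step above, the sketch below filters a set of records down to those with ‘confirmed’ set to true and writes them out as an IATI-style codelist. It is only a sketch under stated assumptions: the record layout, the field names other than ‘confirmed’, the example entries and the function names are all hypothetical, and the XML element names are loosely modelled on IATI codelist files rather than taken from the actual org-id API output, which may differ.

```python
import xml.etree.ElementTree as ET

# Hypothetical records. Only the 'confirmed' property comes from the proposal;
# the other field names and the entries themselves are illustrative.
RECORDS = [
    {"code": "GB-COH", "name": "Companies House", "confirmed": True},
    {"code": "XX-DRAFT", "name": "Example list still under review", "confirmed": False},
    {"code": "GB-CHC", "name": "Charity Commission", "confirmed": True},
]


def confirmed_only(records):
    """Keep only the records whose 'confirmed' property is exactly True."""
    return [r for r in records if r.get("confirmed") is True]


def to_codelist_xml(records, list_name="OrganisationIdentifierList"):
    """Serialise records as a codelist document (assumed element names)."""
    root = ET.Element("codelist", {"name": list_name})
    items = ET.SubElement(root, "codelist-items")
    for record in sorted(records, key=lambda r: r["code"]):
        item = ET.SubElement(items, "codelist-item")
        ET.SubElement(item, "code").text = record["code"]
        name_el = ET.SubElement(item, "name")
        ET.SubElement(name_el, "narrative").text = record["name"]
    return ET.tostring(root, encoding="unicode")


xml_text = to_codelist_xml(confirmed_only(RECORDS))
```

Whether IATI links to such an output or keeps its own copy, the filter is the same; the difference is only in who runs it and how often.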

Before this can happen:

  • A small change is needed to the org-id schema to replace the boolean ‘deprecated’ flag for codes with a date of deprecation. This will allow the org-id API to provide a backwards compatible list that only removes codes with each IATI version update;
  • The repository for org-id needs to move to its own organisation repository;
  • A governance document between the partners in the project needs to be agreed for ongoing management of the list;
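To make the first point above more concrete, here is a minimal sketch of how a date of deprecation could drive a version-specific list. It assumes one possible reading of the proposal: a code stays in the list served for a given IATI version unless it was already deprecated when that version was released. The version labels, release dates, codes and the `deprecated_on` field name are all invented for illustration and are not taken from the org-id schema.

```python
from datetime import date

# Hypothetical version labels and release dates, for illustration only.
RELEASES = {
    "v1": date(2016, 1, 1),
    "v2": date(2017, 6, 1),
}

# Hypothetical codes; 'deprecated_on' is an assumed field name that stands in
# for the proposed date of deprecation (None means the code is still current).
CODES = [
    {"code": "AA-ONE", "deprecated_on": None},
    {"code": "BB-TWO", "deprecated_on": date(2016, 9, 15)},
    {"code": "CC-THREE", "deprecated_on": date(2015, 3, 1)},
]


def codes_for_version(codes, version):
    """Return the codes that were not yet deprecated when `version` was released."""
    released = RELEASES[version]
    return [
        c["code"]
        for c in codes
        if c["deprecated_on"] is None or c["deprecated_on"] > released
    ]


v1_codes = codes_for_version(CODES, "v1")
v2_codes = codes_for_version(CODES, "v2")
```

Under these assumptions BB-TWO, deprecated between the two releases, remains in the v1 list and only drops out at v2, which is something a boolean flag cannot express.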

These changes can be made during March 2017.

I have made the following revisions to the draft proposal document on Codelist Management

  • Turned the embedded/non-embedded section into actionable proposals
  • Tidied the replicated/non-replicated section and added a metadata field for update frequency (suggested by @Herman)
  • Added a guideline on non-deletion of retired codes (suggested by @markbrough)
  • Added a new section from @TimDavies on the work of the Org-Id project

This is now available: http://org-id.guide

This is now available at http://org-id.guide/download

I’m not clear on how the determination of what is functional has been made. For example the proposal is that GeographicExactness would become NonEmbedded, but BudgetStatus would not. However, they’re both binary classifications of how an element would be interpreted, and I think both might be implemented with specific functional logic in data use systems.

There are some others I’m also not sure about, but I think that’s the clearest example to start with.

Guideline: Retired codes should be deprecated but never deleted from codelists maintained or replicated by IATI

I agree with this in principle.

However, there’s a tricky implementation issue of what to do where the source codelists re-use codes. As a very pedantic example, the ISO 2 letter country code list reserves the right to re-use codes 50 years after the country stops existing.
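One way to reconcile the guideline with code re-use is to keep every entry and attach a validity period to it, so that the same code can appear more than once without ambiguity. The sketch below is only an illustration of that idea: the class names, field names and the fictional code ‘ZZ’ are assumptions, and nothing here reflects how IATI codelists are actually stored.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class Entry:
    code: str
    name: str
    valid_from: date
    withdrawn_on: Optional[date] = None  # None means the entry is still active


class Codelist:
    """Codelist in which entries are withdrawn but never deleted."""

    def __init__(self) -> None:
        self.entries: List[Entry] = []

    def add(self, code: str, name: str, valid_from: date) -> None:
        # A code may be re-used only after its previous entry was withdrawn.
        if any(e.code == code and e.withdrawn_on is None for e in self.entries):
            raise ValueError(f"{code} is still active")
        self.entries.append(Entry(code, name, valid_from))

    def withdraw(self, code: str, on: date) -> None:
        # Mark the active entry as withdrawn; the record itself is kept.
        for e in self.entries:
            if e.code == code and e.withdrawn_on is None:
                e.withdrawn_on = on
                return
        raise KeyError(code)

    def lookup(self, code: str, on: date) -> Optional[Entry]:
        # Resolve a code to the entry that was valid on the given date.
        for e in self.entries:
            active = e.withdrawn_on is None or on < e.withdrawn_on
            if e.code == code and e.valid_from <= on and active:
                return e
        return None


countries = Codelist()
countries.add("ZZ", "Former country", date(1950, 1, 1))
countries.withdraw("ZZ", date(1990, 1, 1))
countries.add("ZZ", "New country", date(2045, 1, 1))
```

The cost of this design is that a code alone no longer identifies an entry; data users would need the date of the activity (or of publication) to resolve it.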

Additions to the description types codelist for communication purposes

Based on the conversations and direction of IATI during IATI TAG 2017 (i.e. make it easy to use), it is proposed to extend the codelist of description types (http://iatistandard.org/202/codelists/DescriptionType/) to ease and encourage the addition of data for communication purposes. This draws on the experience of Akvo RSR users with about 4000 publicly accessible project webpages.

Addition of the following description types is proposed:

• Activity plan summary, type 5
• Background, type 6
• Activity plan, type 7
• Baseline situation, type 8
• Sustainability, type 9
• Subtitle, type 10

A further explanation of the fields (up for rephrasing):

• Activity plan summary: Detailed information about the implementation of the activity: the what, how, who and when.
• Background: This should describe the geographical, political, environmental, social and/or cultural context of the activity, and any related activities that have already taken place or are underway.
• Activity plan: Enter a brief summary in order to display the summary. The summary should explain: 1) Why the activity is being carried out; 2) Where it is taking place; 3) Who will benefit and/or participate; 4) What it specifically hopes to accomplish; 5) How goals will be reached.
• Baseline situation: Describe the situation at the start of the activity.
• Sustainability: Describe how you aim to guarantee sustainability of the activity after implementation. Think about the institutional setting, capacity-building, a cost recovery plan, products used, feasible arrangements for operation and maintenance, anticipation of environmental impact and social integration.
• Subtitle: A short one-line description of the activity to typically be represented alongside the activity title.

For reference, the current mapping with Akvo RSR:

Did you accidentally switch the descriptions of Activity plan and Activity Plan Summary?

By changing Codelists from Embedded to Non-Embedded, what are they deemed to be at an earlier version of the standard?

For example:

  1. You have a Codelist. It is currently Embedded.
  2. By this change, it becomes Non-Embedded.
  3. At some point, it is decided to withdraw certain values on the Non-Embedded version of the Codelist.

Are these values deemed withdrawn against versions of the Standard at which the Codelist was Embedded?

  • If yes, is this permitted? There is nothing under Codelist management or the withdrawal discussion to indicate whether it is permitted to withdraw a value from an Embedded Codelist outside an integer upgrade (it’s backwards incompatible since Embedded Codelists are a fixed part of the Standard).
  • If no, the versions of the Standard where the Codelists are each Embedded and Non-Embedded are backwards incompatible. As such, changing Codelists from Embedded to Non-Embedded would have to be an integer change.

Also, there are similar questions about adding new values and whether they are deemed part of the Codelist for earlier versions of the Standard (but without the backwards-incompatibility problems).

@bill_anderson