New DataStore API
As everyone is aware, the new IATI DataStore is in the process of being built. There have been a few questions about the API, which this Discuss post seeks to address. We give some background on the old DataStore for context, followed by information about the new DataStore API.
History of the old Datastore
When the IATI Standard was launched there were strong arguments against IATI maintaining a database: this was seen as overlapping with the service provided by the OECD CRS, which is a curated database. The datastore was instead agreed as an uncurated view of the files held on the Registry.
The original datastore was built in 2011/12 by Open Knowledge (who also managed the IATI Registry at the time). It replicated all (readable) data on the Registry. It was sold as "everything in the Registry today will be in the datastore (DS) tomorrow".
Locations and results data were not included in this alpha: only transactions, budgets and activity-level data were imported into the DataStore (the whole activity was only available through an XML blob stored for each activity). The alpha product did not clean data; it just showed exactly what was on the Registry.
The plan had been to add these additional features in future phases of the DataStore, but for a number of reasons (including budget and contractual matters) the project never went beyond phase one.
New IATI datastore
The new datastore is based on the existing open source software ‘OIPA’, which is actively maintained by Zimmerman & Zimmerman. OIPA has been used by a variety of international organisations and governments, including UNESCO, IOM, DFID, MFA and many others. The new datastore will scrape the IATI Registry for IATI publishers and their XML data sources, validate the XML using the new IATI Validator service, and then transform and store that data and expose it through an API for anyone to use. The API has 14 endpoints, each with a specific purpose. The datastore will also allow users to export data in XML, CSV and XLSX formats. Snapshot of functionality as per original specification:
- ETL (Extract, Transform, Load) from XML to JSON
- Validation provided by new IATI Validator
- IATI Version support
- XML exports
- Range of filters available
- API output
- CSV/XLSX Serialisations
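To give a feel for the first item on that list, here is a minimal sketch of an extract-and-transform step from IATI XML to JSON. It is not how OIPA does it: the sample XML, the identifier `XX-EXAMPLE-0001`, the function names and the shape of the output dictionary are all assumptions made up for illustration, and the validation and load steps are left out.

```python
import json
import xml.etree.ElementTree as ET

# Made-up sample data; a real pipeline would fetch files listed on the Registry.
SAMPLE_XML = """\
<iati-activities version="2.03">
  <iati-activity>
    <iati-identifier>XX-EXAMPLE-0001</iati-identifier>
    <title><narrative>Example activity</narrative></title>
    <participating-org ref="XX-EXAMPLE" role="3"/>
  </iati-activity>
</iati-activities>
"""


def activity_to_dict(activity):
    """Flatten one <iati-activity> element into a plain dictionary.

    The key names are hypothetical, not the new DataStore's schema.
    """
    return {
        "iati_identifier": activity.findtext("iati-identifier"),
        "title": activity.findtext("title/narrative"),
        "participating_orgs": [
            {"ref": org.get("ref"), "role": org.get("role")}
            for org in activity.findall("participating-org")
        ],
    }


def extract_transform(xml_text):
    """Parse an IATI activities file and return one dictionary per activity."""
    root = ET.fromstring(xml_text)
    return [activity_to_dict(a) for a in root.findall("iati-activity")]


records = extract_transform(SAMPLE_XML)
print(json.dumps(records, indent=2))
```

Only three fields are picked out here; the real service covers far more of the standard.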
Timeline for delivery
The new DataStore will be launched together with the IATI Validator this summer.
The IATI technical team met with Zimmerman & Zimmerman and Data4Development earlier this month to agree how the two systems will integrate. We will share more information on the integration of the two products in early May, with an update on the timeline.
Moving from old to new API
The new OIPA-powered Datastore will differ from the current one both in API calls and in the results returned. This is because it exposes more of the IATI data, and a new structure is needed to present that data logically.
For API calls, we have limited the changes as far as possible while still delivering a product that has a different core structure and more capabilities. Mostly, what will be required is small tweaks to the URL in use so that it points to the new location. The mapping will not be 1:1, so we cannot keep the old API structure and rely on redirects.
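Because the mapping is not 1:1, one way a client could prepare is to keep its own explicit translation table and fail loudly on anything it does not cover. The sketch below is only an illustration of that idea: the `example.org` hosts, the paths and the parameter names `old-filter` and `new_filter` are placeholders, not real DataStore URLs or parameters, which will come with the documentation.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Placeholder location for the new API (hypothetical, not the real address).
NEW_BASE = "https://new-datastore.example.org/api/activities/"

# Hypothetical old-parameter -> new-parameter table, maintained by the client.
PARAM_MAP = {"old-filter": "new_filter"}


def translate_url(old_url):
    """Rebuild an old-style query URL against the new base URL.

    Raises KeyError for any parameter without a known equivalent, so that
    gaps in the mapping are noticed instead of silently dropped.
    """
    new_params = []
    for key, value in parse_qsl(urlsplit(old_url).query):
        if key not in PARAM_MAP:
            raise KeyError(f"no known equivalent for parameter {key!r}")
        new_params.append((PARAM_MAP[key], value))
    base = urlsplit(NEW_BASE)
    return urlunsplit((base.scheme, base.netloc, base.path, urlencode(new_params), ""))


old = "https://old-datastore.example.org/api/old-endpoint?old-filter=XX-EXAMPLE"
print(translate_url(old))
```

Keeping the base URL and the parameter table in one place means the switch-over should touch only a few lines of client code.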
For returned results, the underlying structure will again be mostly the same, with a few changes. The new Datastore allows a more comprehensive and precise result set to be explored. An example is the participating-org result row: in the current system, users receive a result containing participating-org.role = 3, while the new Datastore offers an expanded view, providing both participating-org.role.code = 3 and participating-org.role.name = Extending, effectively removing the need for cross-checks and extra calls.
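To make the participating-org example concrete, here is a rough sketch of the cross-check a client has to do today and that the expanded view makes unnecessary. The dotted key names simply mirror the notation used above and are an assumption about the shape of a row, not the actual response format; the lookup table follows the IATI OrganisationRole codelist.

```python
# IATI OrganisationRole codelist, copied in by hand for this sketch.
ORGANISATION_ROLE = {
    "1": "Funding",
    "2": "Accountable",
    "3": "Extending",
    "4": "Implementing",
}


def expand_role(old_row):
    """Turn an old-style row (bare role code) into the expanded code + name form.

    The key names are hypothetical and only follow the notation in the post.
    """
    code = str(old_row["participating-org.role"])
    return {
        "participating-org.role.code": code,
        "participating-org.role.name": ORGANISATION_ROLE.get(code),
    }


old_row = {"participating-org.role": 3}
new_row = expand_role(old_row)
print(new_row)
```

With the new Datastore the name arrives alongside the code, so a local lookup table like `ORGANISATION_ROLE` should no longer be needed for this field.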
Don’t panic!
The technical team are here to help. There will be documentation of all the new parameters, possible queries and outputs so that the transition can happen as smoothly as possible.
We are also making sure there is a grace period during which the old DataStore will run in parallel until the end of 2019, so that there is plenty of time to make the necessary changes.
Details of who is using the current DataStore are being collected here.