IATI XML to JSON converter?

Is anyone aware of any attempts to convert IATI data in its original XML format into JSON format?

I am experimenting with importing IATI data into MongoDB as a side project, but thought I'd check whether there were any existing tools before I have a go at building something from scratch.

Isn’t it possible to get a JSON response from the IATI Datastore? http://datastore.iatistandard.org/docs/api/#technical-formats

Or do you need the converter itself, rather than just converted data?

I experimented briefly with importing IATI data into Elasticsearch (which uses JSON). I didn’t use the datastore itself, but did use the same approach: parsing with xmltodict and then encoding the result as JSON (see https://github.com/IATI/iati-datastore/blob/master/iati_datastore/iatilib/frontend/serialize/jsonserializer.py#L27). The various arguments to xmltodict may also be worth looking into, as it can produce a few different shapes of JSON.
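To illustrate the XML to dict to JSON idea, here is a minimal standard-library sketch. It is not the datastore's code and it does not call xmltodict; it only borrows xmltodict's default convention of prefixing attributes with "@" and storing mixed text under "#text". The sample XML, the identifier and the function names are made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed IATI sample for illustration only.
SAMPLE_XML = """<iati-activities version="2.02">
  <iati-activity default-currency="USD">
    <iati-identifier>XM-EXAMPLE-1</iati-identifier>
    <recipient-country code="TZ" percentage="30"/>
    <recipient-country code="KE" percentage="70"/>
  </iati-activity>
</iati-activities>"""


def element_to_dict(elem):
    """Convert an element to a dict: '@name' for attributes, '#text' for text."""
    node = {"@" + name: value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated element: collect the values into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if text:
        if not node:
            return text  # an element holding only text collapses to a string
        node["#text"] = text
    return node


def xml_to_json(xml_string):
    """Parse an XML string and return it as a JSON string."""
    root = ET.fromstring(xml_string)
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


print(xml_to_json(SAMPLE_XML))
```

Note the shape problem this exposes: an element that occurs once comes out as a single object, while a repeated one comes out as a list, so the same key can have two different types across activities. As far as I know, xmltodict's force_list argument is the way to pin that down there.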

I’ve looked a bit at XQuery 3.1 for that. BaseX has a JSON module that lets you import and export with something like json:serialize(//iati-activities, map { 'format': 'jsonml' })

Or, from the command line, to create a JsonML file from an XML file:
basex -i iati.xml -o iati.json -q "json:serialize(//iati-activities, map { 'format': 'jsonml' })"

I haven’t done anything with it yet, though.
http://docs.basex.org/wiki/JSON_Module

Thanks for these quick responses.

@carlelmstam My thinking was to parse raw IATI data from publishers’ files into JSON, rather than use the datastore (at the moment there are instances of missing activities, which the technical team are investigating).

@bjwebb xmltodict looks like a good shout - I also came across it in this Stack Overflow post. Thanks for the link to how it is implemented in the datastore.

@rolfkleef BaseX looks interesting too; I was considering playing around with that as well.

Hi Dale,

You could also consider using the OIPA API, depending on how strictly you mean ‘in its original XML format’.

In my opinion/experience it’s not very practical to map IATI XML 1:1 to JSON. There are some considerations with regard to JSON styling practices and making the API useful. Here are the alterations we made to the IATI -> JSON mapping in OIPA.

  • Hyphens are replaced by underscores in JSON keys. Some users (and we ourselves) prefer dot notation to access properties in JavaScript, which would be troublesome with hyphens. It’s a minor thing and a debatable solution, but we prefer practical use of the API.

  • XML elements that can occur multiple times become a list in JSON. In JSON it’s common practice to make list keys plural.

  • In OIPA most code lists are nested in the JSON. When using IATI XML 1:1 as JSON, you’ll have to keep all the used code lists on your front-end. In some cases that’s desirable, in most cases it isn’t. It could also cause bugs when you don’t keep those code lists up to date (you’ll have to maintain them for every front-end).

Here’s an example that shows all three of these alterations:

"recipient_countries": [{"country": {"code": "TZ", "name": "Tanzania, united republic of"},"percentage": 30.0}]

1:1 mapping would be:

"recipient-country": [{"code": "TZ", "percentage": 30.0}]

These are all just design decisions made with practical use on front-end websites in mind. I would be interested to hear other views on these issues!
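As a rough sketch of the three alterations (this is not OIPA's actual code), the following turns the 1:1 form from the example into the restyled form. The COUNTRY_NAMES and PLURALS tables and the to_api_style function are hypothetical stand-ins; a real version would load the IATI code lists and handle every repeatable element.

```python
# Hypothetical code list excerpt; a real one would be loaded from the IATI code lists.
COUNTRY_NAMES = {"TZ": "Tanzania, united republic of"}

# Hypothetical plural forms for repeatable elements.
PLURALS = {"recipient_country": "recipient_countries"}


def to_api_style(activity):
    """Restyle a 1:1 activity dict: underscores, plural list keys, nested code lists."""
    result = {}
    for key, value in activity.items():
        # 1. Hyphens become underscores so dot notation works in JavaScript.
        new_key = key.replace("-", "_")
        # 2. Repeatable elements (lists) get a plural key.
        if isinstance(value, list):
            new_key = PLURALS.get(new_key, new_key + "s")
        # 3. Code list values are nested with their human-readable name.
        if key == "recipient-country":
            value = [
                {
                    "country": {
                        "code": item["code"],
                        "name": COUNTRY_NAMES.get(item["code"]),
                    },
                    "percentage": item["percentage"],
                }
                for item in value
            ]
        result[new_key] = value
    return result


one_to_one = {"recipient-country": [{"code": "TZ", "percentage": 30.0}]}
print(to_api_style(one_to_one))
```

The trade-off is the one described above: the output is larger, but a front-end no longer needs its own copy of the code lists.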

Thanks for this extra detail on OIPA @VincentVW - I was more looking to convert the XML from the source publisher datasets (will likely use xmltodict for this), and primarily as a back-end only in the first instance.

Nonetheless, it’s good to know some of the considerations and design approaches that were needed. I don’t work much in JSON these days, so I hadn’t yet noticed some of the nuances of the notation when applied to IATI (use of underscores and plurals). It was particularly good to have those highlighted.

Will post results here when I’ve had a chance to work something up!
