Results – allow disaggregations of results data (included 2.03)

Title
Results – allow disaggregations of results data

Standard
Activity

Schema Object
iati-activities/iati-activity/result/indicator/period/target
iati-activities/iati-activity/result/indicator/period/actual

Type of Change
Change to Schema

Issue
• Currently, you cannot have more than one target and actual per period. This means that it is not possible to disaggregate an indicator by more than one set of dimensions, other than by a technical workaround (see suggestions below).
Why is this a problem?: Many donors ask for results data to be disaggregated by, for example, gender AND disability status, or gender AND age. This is not currently allowed in the schema, as only one target and one actual is allowed per period (per indicator). “It is recognised widely that results presented as averages for entire populations will usually mask differences within that population group, for example, by gender, wealth, disability, ethnicity, etc. The new Sustainable Development Goals (SDGs) in particular have put this issue higher on the agenda, under the heading of “Leave no one behind”. In order to ensure equity and the inclusion of marginalised groups, it is vital that disaggregated data is collected (and many aid providers are increasingly requiring disaggregation by a number of dimensions - For example, DFID requires results to be disaggregated by gender and is rolling out requirements to disaggregate by disability status). For IATI data to be useful, it in turn must enable the publication of disaggregated results data.” https://www.bond.org.uk/resources/publishing-results-to-iati
Suggestion: The current workaround is to have two near identical indicators (eg through “dimension”) (or periods of time within indicators) for the same result. This leads to confusion as there is no sure way to know which values should be considered as disaggregations versus those that belong to separate indicators. It also causes duplicate information for the rest of the indicator, adding an unnecessary source of potential error and reporting burden.

Proposal
Multiple target and actual values (representing each disaggregation) should be permitted for a given period of an indicator. Change cardinality of “target” and “actual” elements from 0…1 to 0…*

(see http://iatistandard.org/202/activity-standard/iati-activities/iati-activity/result/indicator/period/target/ and http://iatistandard.org/202/activity-standard/iati-activities/iati-activity/result/indicator/period/actual/ for relevant sections of the standard)

Standards Day
Workshopped at the TAG 2017 and mentioned at the end of the Standards day as part of the results section. Although there was very little time to discuss the proposal, no criticism of the proposal was offered. Proposal has been on IATI Discuss since March 2017.

Links
• This topic has been discussed and agreed previously but missed in implementation: Disaggregation of results
• A more recent discussion is here: Results: allow disaggregation
• This topic addresses Principle 4 from a consultation driven by Monitoring and Evaluation experts from UK CSOs Jan – Mar 2017 – see Results: discussion space and TAG 2016/17 path. Technical suggestions were devised by technology specialists at the Nethope Athens conference March 2017. In all around 30 M&E and technical specialists were involved in this consultation and it builds on a previous consultation by Bond 2015-16 (https://www.bond.org.uk/resources/publishing-results-to-iati - also on discuss.iatistandard : Sharing Results using IATI data standard: will it improve learning and accountability? ).

Agree that the dimensions for disaggregation should be exactly the same for baselines, actuals and targets. Doesn’t that mean that, if dimensions are used, it is mandatory to use exactly the same dimensions for baselines, actuals and targets? In other words, the disaggregation dimensions should always be the same in order to make baselines, actuals and targets comparable. So different levels of disaggregation within the same indicator are not allowed.

Technical suggestion:

  1. Change cardinality of “target” and “actual” from 0…1 to 0…*
  2. Implement suggestions relating to Principle 7.2 (see Results: discussion space and TAG 2016/17 path ) so that baselines can also be disaggregated
  3. Update iatistandard.org to note that multiple target and actuals should only be used per period for disaggregation.
    In other words: each combination of dimension values should be unique within one period of an indicator.
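
The uniqueness rule in point 3 can be checked mechanically. Below is a minimal sketch, assuming the proposed format (several `target` and `actual` elements in one `period`) and treating the set of `dimension` name/value pairs as the identity of a disaggregation. The helper names `dimension_key` and `duplicate_disaggregations` are made up for this illustration and are not part of any IATI tooling, and the sample values are invented.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def dimension_key(element):
    """Return the element's dimensions as an order-independent, hashable key."""
    return frozenset(
        (d.get("name"), d.get("value")) for d in element.findall("dimension")
    )


def duplicate_disaggregations(period):
    """Map 'target'/'actual' to the dimension combinations used more than once."""
    duplicates = {}
    for tag in ("target", "actual"):
        counts = Counter(dimension_key(e) for e in period.findall(tag))
        repeated = [sorted(key) for key, n in counts.items() if n > 1]
        if repeated:
            duplicates[tag] = repeated
    return duplicates


# Hypothetical period: the sex=female actual is (wrongly) reported twice.
period = ET.fromstring("""
<period>
  <period-start iso-date="2016-01-01" />
  <period-end iso-date="2016-03-31" />
  <actual value="90"><dimension name="sex" value="male" /></actual>
  <actual value="70"><dimension name="sex" value="female" /></actual>
  <actual value="75"><dimension name="sex" value="female" /></actual>
  <target value="100"><dimension name="sex" value="male" /></target>
  <target value="100"><dimension name="sex" value="female" /></target>
</period>
""")

print(duplicate_disaggregations(period))
```

An empty result would mean every combination of dimension values is unique within the period, which is what point 3 asks for.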

The option for disaggregated data would be most welcome! We have currently also used the workaround of using nearly identical indicators (see https://aiddata.rvo.nl/results/): We have standardized several indicators in the result indicator title field (we intentionally left the indicator description field open for activity specific definitions or clarifications), and added - total / - female/ etc. to allow disaggregations. This does however lead to double-counting if someone were to aggregate all indicators, e.g. jobs supported.

In our system we work with 3 levels of disaggregation:

  1. the main indicator (e.g. jobs supported)
  2. disaggregation level 1 (e.g. direct & indirect jobs)
  3. disaggregation level 2 (e.g. female/male, young, etc.)

If the system could allow for disaggregation this would be most helpful, especially if more levels (at least 2) of disaggregation would be possible.

Currently, you cannot have more than one target and actual per period.

You can.

This can be done by specifying multiple <period> elements within a single <indicator> with the same <period-start> and <period-end> dates - there is no Rule stating that these must be unique.
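
For anyone consuming data published this way, the grouping has to be reconstructed from the dates. A rough sketch of that, assuming periods belong together only when both `period-start` and `period-end` match exactly (the function name `periods_by_dates` and the sample values are invented for illustration):

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def periods_by_dates(indicator):
    """Group an indicator's <period> elements by their (start, end) dates."""
    groups = defaultdict(list)
    for period in indicator.findall("period"):
        start = period.find("period-start").get("iso-date")
        end = period.find("period-end").get("iso-date")
        groups[(start, end)].append(period)
    return groups


# Hypothetical indicator: two periods share the first quarter, one stands alone.
indicator = ET.fromstring("""
<indicator>
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="90"><dimension name="sex" value="male" /></actual>
  </period>
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="70"><dimension name="sex" value="female" /></actual>
  </period>
  <period>
    <period-start iso-date="2016-04-01" />
    <period-end iso-date="2016-06-30" />
    <actual value="80"><dimension name="sex" value="male" /></actual>
  </period>
</indicator>
""")

# Date ranges that occur on more than one <period>.
shared = {
    dates: len(periods)
    for dates, periods in periods_by_dates(indicator).items()
    if len(periods) > 1
}
print(shared)
```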

Hi @hayfield - thanks for the contribution - sorry I should have written tighter text - taking the whole text that refers to this in the Issues statement:

“Currently, you cannot have more than one target and actual per period. This means that it is not possible to disaggregate an indicator by more than one set of dimensions, specify other than by a technical workaround (see suggestions below). …
Suggestion: The current workaround is to have two near identical indicators (eg through “dimension”) (or periods of time within indicators) for the same result. This leads to confusion as there is no sure way to know which values should be considered as disaggregations versus those that belong to separate indicators. It also causes duplicate information for the rest of the indicator, adding an unnecessary source of potential error and reporting burden.”

It should instead read:

“Currently, you cannot have more than one target and actual per period. This means that it is not possible to disaggregate an indicator by more than one set of dimensions, specify other than by a technical workaround (see suggestions below). …
Suggestion: The current workaround is to have two near identical indicators (eg through “dimension”) (or periods of time within indicators) for the same result. This leads to confusion as there is no sure way to know which values should be considered as disaggregations or those that belong to separate indicators. It also causes duplicate information for the rest of the indicator, adding an unnecessary source of potential error and reporting burden.”

The point of the proposal is to remove unnecessary technical workarounds, and it had already been discussed and agreed a year or so ago: Disaggregation of results

@mikesmith Thanks for this clarification, I’d missed the discussion previously.


Sooooo, with a couple of examples is the following understanding of the proposal correct? …

At present, dimensions disaggregated by indicator must be represented as:

<!-- Current format -->
<indicator>
  [...]
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="90">
      <dimension name="sex" value="male" />
    </actual>
    <target value="100">
      <dimension name="sex" value="male" />
    </target>
  </period>
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="70">
      <dimension name="sex" value="female" />
    </actual>
    <target value="100">
      <dimension name="sex" value="female" />
    </target>
  </period>
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="65">
      <dimension name="sex" value="female" />
      <dimension name="age" value="adult" />
    </actual>
    <target value="100">
      <dimension name="sex" value="female" />
      <dimension name="age" value="adult" />
    </target>
  </period>
</indicator>

The proposal is to permit the following format:

<!-- Proposed format -->
<indicator>
  [...]
  <period>
    <period-start iso-date="2016-01-01" />
    <period-end iso-date="2016-03-31" />
    <actual value="90">
      <dimension name="sex" value="male" />
    </actual>
    <actual value="70">
      <dimension name="sex" value="female" />
    </actual>
    <actual value="65">
      <dimension name="sex" value="female" />
      <dimension name="age" value="adult" />
    </actual>
    <target value="100">
      <dimension name="sex" value="male" />
    </target>
    <target value="100">
      <dimension name="sex" value="female" />
    </target>
    <target value="100">
      <dimension name="sex" value="female" />
      <dimension name="age" value="adult" />
    </target>
  </period>
</indicator>
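
To make the relationship between the two formats concrete, here is a small sketch that folds the current format into the proposed one by merging periods with identical start and end dates. It assumes matching dates always mean "same period, different disaggregation", which is exactly the convention under discussion. `merge_periods` is a hypothetical helper, and the sketch makes no attempt to enforce any element ordering the schema may require.

```python
import xml.etree.ElementTree as ET


def merge_periods(indicator):
    """Fold <period> elements that share start and end dates into one, in place."""
    first_seen = {}
    for period in list(indicator.findall("period")):
        key = (
            period.find("period-start").get("iso-date"),
            period.find("period-end").get("iso-date"),
        )
        if key not in first_seen:
            first_seen[key] = period  # keep the first period for these dates
            continue
        keeper = first_seen[key]
        for tag in ("target", "actual"):
            for value in period.findall(tag):
                keeper.append(value)  # move the disaggregated value across
        indicator.remove(period)  # drop the now-redundant period
    return indicator


# The "current format" example from above, written compactly.
indicator = ET.fromstring("""
<indicator>
  <period>
    <period-start iso-date="2016-01-01" /><period-end iso-date="2016-03-31" />
    <actual value="90"><dimension name="sex" value="male" /></actual>
    <target value="100"><dimension name="sex" value="male" /></target>
  </period>
  <period>
    <period-start iso-date="2016-01-01" /><period-end iso-date="2016-03-31" />
    <actual value="70"><dimension name="sex" value="female" /></actual>
    <target value="100"><dimension name="sex" value="female" /></target>
  </period>
  <period>
    <period-start iso-date="2016-01-01" /><period-end iso-date="2016-03-31" />
    <actual value="65">
      <dimension name="sex" value="female" /><dimension name="age" value="adult" />
    </actual>
    <target value="100">
      <dimension name="sex" value="female" /><dimension name="age" value="adult" />
    </target>
  </period>
</indicator>
""")

merged = merge_periods(indicator)
only_period = merged.find("period")
print(len(merged.findall("period")))
print([a.get("value") for a in only_period.findall("actual")])
```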

Assuming this is correct, a couple of questions (apologies for my lack of subject knowledge!)…

Background

Part of the suggestion is that:

there is no sure way to know which values should be considered as disaggregations or those that belong to separate indicators

Question 1

With the existing format, each of the <period> elements is contained within the same <indicator>. As such, in what way is it unclear whether the periods are disaggregations or something else?

Background

As part of the work on adding the Humanitarian Page to the Dashboard, it was determined that a 784 element truth table is required to answer the question of “Is this activity humanitarian?” at v2.02 (because sectors). From the perspective of data use, this status of having multiple ways to represent the same information is not good.

Question 2

In what way does the proposed format provide more information to the data user than the current format? (either based on the examples I provided or an alternative that better demonstrates the additional meaning that could be conveyed)

This topic has been included for consideration in the formal 2.03 proposal subject to agreement on outstanding questions relating to its use (see previous comment by @hayfield)

Thanks for moving the conversation on with this example @hayfield, yes that captures the proposal.

re: question 1

  1. Current practice requires everyone to know that, where the period-start and period-end dates match, the values are part of the same disaggregation. It also means duplicating data, which is asking for data quality issues.

  2. We’re frequently required to disaggregate by, for example, sex and disability status. While we can set “target” figures using two dimensions, in practice we may not be able to ascertain disability status in all cases (e.g. due to programme design, such as not using individual surveys for each data collection, available resources, sensitivities, etc.). This means we need to report both one-dimension and two-dimension values for “actual”. Without the proposal, per point 1, period cannot be used to delineate groupings of disaggregations, so how else can we tell which target disaggregations correspond to which actual disaggregations? E.g. how can I represent this before the proposal:

    [...]

In particular, for the female-only actual result: which target do I put it with, the one with or the one without disability status? How do I represent that the other target is fulfilled with the same actual (do I double count by repeating the value? Do I drop the disability status dimension?), etc. These considerations can be avoided by implementing the proposal.
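
With the proposal in place, targets and actuals inside one period could be lined up by their dimension sets. The sketch below is only an illustration under that assumption: the dimension name `disability`, its value `yes`, and all the figures are invented, and `pair_by_dimensions` is a hypothetical helper. It does not settle how a sex-only actual should count against a two-dimension target; it only shows that the mismatch becomes visible rather than ambiguous.

```python
import xml.etree.ElementTree as ET


def dimension_key(element):
    """Return the element's dimensions as an order-independent, hashable key."""
    return frozenset(
        (d.get("name"), d.get("value")) for d in element.findall("dimension")
    )


def pair_by_dimensions(period):
    """List (dimensions, target value, actual value) per dimension combination."""
    targets = {dimension_key(t): t.get("value") for t in period.findall("target")}
    actuals = {dimension_key(a): a.get("value") for a in period.findall("actual")}
    rows = []
    for key in sorted(targets.keys() | actuals.keys(), key=sorted):
        # None marks a combination with no target (or no actual) in this period.
        rows.append((sorted(key), targets.get(key), actuals.get(key)))
    return rows


# Hypothetical period: a two-dimension target, with actuals at both levels.
period = ET.fromstring("""
<period>
  <period-start iso-date="2016-01-01" />
  <period-end iso-date="2016-03-31" />
  <target value="40">
    <dimension name="sex" value="female" />
    <dimension name="disability" value="yes" />
  </target>
  <actual value="30">
    <dimension name="sex" value="female" />
    <dimension name="disability" value="yes" />
  </actual>
  <actual value="70">
    <dimension name="sex" value="female" />
  </actual>
</period>
""")

rows = pair_by_dimensions(period)
for dimensions, target, actual in rows:
    print(dimensions, target, actual)
```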

re: question 2
  3. If the proposal is adopted, then period delineates the disaggregation in example 2; otherwise you can’t, as there’s then ambiguity over period usage with point 1.
  4. The proposal reduces the potential for data quality issues by reducing the need to repeat information, or to lose information.

If I’ve understood properly I think this answers your questions @hayfield ?

This proposal has been included in the 2.03 upgrade. It can be viewed in the following two Discuss posts: