11.2. Categories and custom dimensions

Certain data capture requirements necessitate a fine-grained breakdown of the dimension describing the event being counted. For instance, one would want to collect the number of “Malaria cases” broken down by gender and age group, such as “female”, “male”, “< 5 years” and “> 5 years”. What characterizes this is that the breakdown is typically repeated for a number of “base” data elements: one would, for instance, like to reuse the same breakdown for other data elements such as “TB” and “HIV”. In order to make the meta-data more dynamic, reusable and suitable for analysis, it makes sense to define the mentioned diseases as data elements and to create a separate model for the breakdown attributes. This can be achieved by using the category model, which is described in the following.

The category model has three main elements, which are best described using the above example:

  1. The category option, which corresponds to “female”, “male”, “< 5 years” and “> 5 years”.

  2. The category, which corresponds to “gender” and “age group”.

  3. The category combination, which should in the above example be named “gender and age group” and be assigned both categories mentioned above.
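
The three elements can be illustrated with a minimal Python sketch. The class and method names below are hypothetical and chosen for illustration only; they are not the DHIS2 API or its internal data model. The sketch assumes that a category option combination is simply one option picked from each category of the combination.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Category:
    """A category groups related category options (hypothetical class)."""
    name: str
    options: tuple  # names of the category options


@dataclass(frozen=True)
class CategoryCombination:
    """A category combination is assigned one or more categories (hypothetical class)."""
    name: str
    categories: tuple  # Category instances

    def option_combinations(self):
        """Return every combination of exactly one option per category."""
        return list(product(*(c.options for c in self.categories)))


gender = Category("gender", ("female", "male"))
age_group = Category("age group", ("< 5 years", "> 5 years"))
combo = CategoryCombination("gender and age group", (gender, age_group))

# Two categories with two options each give four option combinations.
for option_combination in combo.option_combinations():
    print(option_combination)
```

With two categories of two options each, the sketch yields four category option combinations, for instance ("female", "< 5 years").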

This category model is in fact stand-alone, but in DHIS2 it is loosely coupled to the data element. Loosely coupled in this regard means that there is an association between data element and category combination, but this association may be changed at any time without losing any data. It is, however, not recommended to change it often, since doing so reduces the continuity of the data and thereby makes the database less valuable. Note that there is no hard limit on the number of category options in a category or on the number of categories in a category combination; there is, however, a natural limit beyond which the structure becomes messy and unwieldy.

A pair of data element and category combination can now be used to represent any level of breakdown. It is important to understand that what is actually happening is that a number of custom dimensions are assigned to the data. Just as the data element represents a mandatory dimension of the data values, the categories add custom dimensions to them. In the above example we can now, through the DHIS2 output tools, perform analysis based on both “gender” and “age group” for those data elements, in the same way as one can perform analysis based on data elements, organisation units and periods.

This category model can be utilized both in data entry form designs and in analysis and tabular reports. For analysis purposes, DHIS2 will automatically produce sub-totals and totals for each data element associated with a category combination. The rule for this calculation is that all category options should sum up to a meaningful total. The above example shows such a meaningful total: when summing up the “Malaria cases” captured for “female < 5 years”, “male < 5 years”, “female > 5 years” and “male > 5 years”, one gets the total number of “Malaria cases”.
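
The idea of sub-totals and totals can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not of how DHIS2 computes or stores aggregates; the values are made-up sample numbers for a single organisation unit and period.

```python
from collections import defaultdict

# Made-up sample values, keyed by (data element, gender, age group).
values = {
    ("Malaria cases", "female", "< 5 years"): 12,
    ("Malaria cases", "male", "< 5 years"): 9,
    ("Malaria cases", "female", "> 5 years"): 20,
    ("Malaria cases", "male", "> 5 years"): 17,
}

totals = defaultdict(int)        # total per data element
by_gender = defaultdict(int)     # sub-total per data element and gender
by_age_group = defaultdict(int)  # sub-total per data element and age group

for (data_element, gender, age_group), value in values.items():
    totals[data_element] += value
    by_gender[(data_element, gender)] += value
    by_age_group[(data_element, age_group)] += value

print(dict(totals))
print(dict(by_gender))
print(dict(by_age_group))
```

Because every case falls into exactly one option of each category, the sub-totals per gender and the sub-totals per age group both add up to the same overall total. This is what the rule of a meaningful total requires.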

For data capture purposes, DHIS2 can automatically generate tabular data entry forms where the data elements are represented as rows and the category option combinations are represented as columns. This will in many situations lead to compelling forms with minimal effort. It is necessary to note, however, that this represents a dilemma, since the two concerns of data entry and data analysis are sometimes not compatible. For instance, one might want to quickly create data entry forms by using categories which do not adhere to the rule of a meaningful total. We do, however, consider this a better alternative than maintaining two independent and separate models for data entry and data analysis.
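
The layout of such a tabular form can be sketched as follows. The function render_form is hypothetical and only prints a plain-text grid; it does not reflect how DHIS2 actually renders its data entry forms.

```python
from itertools import product

data_elements = ["Malaria cases", "TB", "HIV"]
genders = ["female", "male"]
age_groups = ["< 5 years", "> 5 years"]

# One column per category option combination.
columns = [f"{gender} {age}" for gender, age in product(genders, age_groups)]


def render_form(rows, columns):
    """Return the lines of a text grid: one header line plus one line per row."""
    label_width = max(len(row) for row in rows) + 2
    cell_width = max(len(column) for column in columns) + 2
    header = " " * label_width + "".join(c.ljust(cell_width) for c in columns)
    lines = [header]
    for row in rows:
        cells = "".join("[ ]".ljust(cell_width) for _ in columns)
        lines.append(row.ljust(label_width) + cells)
    return lines


for line in render_form(data_elements, columns):
    print(line)
```

Each data element becomes a row with one input cell per category option combination, so three data elements and four combinations give twelve cells.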

An important point about the category model is that data values are persisted and associated with a category option combination. This implies that adding or removing categories from a category combination renders these combinations invalid, and a low-level database operation must be done to correct it. It is hence recommended to thoughtfully consider which breakdowns are required and to not change them too often.
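
Why such a change invalidates existing combinations can be illustrated with a short Python sketch. It assumes, for illustration only, that stored data values are keyed by a tuple holding one option per category; the third category “residence” is hypothetical and does not appear in the example above.

```python
from itertools import product


def option_combinations(categories):
    """Return the set of combinations holding one option per category."""
    return set(product(*categories.values()))


before = {
    "gender": ["female", "male"],
    "age group": ["< 5 years", "> 5 years"],
}
# Keys under which data values were stored before the change.
stored_keys = option_combinations(before)

# Add a hypothetical third category to the same category combination.
after = dict(before)
after["residence"] = ["urban", "rural"]
valid_keys = option_combinations(after)

# Stored keys that no longer correspond to a valid combination.
orphaned = stored_keys - valid_keys
print(len(stored_keys), len(valid_keys), len(orphaned))
```

After the change every valid combination has three options, so none of the four previously stored two-option combinations matches any of them. The existing values would have to be remapped before they could be used again.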