4.4 Minimum maximum outlier analysis

4.4.1 About minimum maximum value based outlier analysis

You can verify the data quality at the point of data entry by setting a minimum maximum value range for each data element. You create the value ranges manually or generate them automatically.

The auto-generated minimum maximum value range is suitable only for normally distributed data. DHIS2 will determine the arithmetic mean and standard deviation of all values for a given data element, category option, organisation unit and attribute combination. Then the system will calculate the minimum maximum value range based on the Data analysis std dev factor specified in the System Settings app.
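As a rough illustration of the calculation described above, the following sketch derives a range from a list of historical values. It assumes the range is the mean minus and plus the standard deviation multiplied by the factor, and that the population standard deviation is used; the function name `min_max_range` and the sample values are hypothetical and are not part of DHIS2.

```python
from statistics import mean, pstdev


def min_max_range(values, std_dev_factor=2.0):
    """Return (minimum, maximum) as mean -/+ std_dev_factor * standard deviation."""
    centre = mean(values)
    spread = pstdev(values)  # assumption: population standard deviation
    return centre - std_dev_factor * spread, centre + std_dev_factor * spread


# Hypothetical historical values for one data element, category option
# and organisation unit.
history = [10, 12, 14, 16, 18]
low, high = min_max_range(history)
print(f"min={low:.2f} max={high:.2f}")
```

The ranges that DHIS2 stores may be rounded or bounded differently, so treat the result only as an approximation of the idea.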

For data that is highly skewed or zero-inflated (as is often the case with aggregate data), the auto-generated minimum maximum value range may not be accurate. This can lead to excessive false violations, for example if you analyse values related to seasonal diseases.
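The sketch below shows why such data is problematic, using hypothetical monthly counts for a seasonal disease and an assumed factor of 2. It applies the same mean and standard deviation approach (an assumption about the exact formula) and shows two symptoms: a negative lower limit, which a count can never violate, and a genuine seasonal peak that is flagged as an outlier.

```python
from statistics import mean, pstdev

# Hypothetical monthly case counts: mostly zero, with a short seasonal peak.
counts = [0, 0, 1, 0, 2, 0, 0, 1, 0, 0, 30, 45]

factor = 2.0  # assumed value of the Data analysis std dev factor
centre, spread = mean(counts), pstdev(counts)
low, high = centre - factor * spread, centre + factor * spread

# Values outside the generated range would be reported as outliers.
flagged = [c for c in counts if c < low or c > high]
print(f"range: {low:.1f} to {high:.1f}, flagged: {flagged}")
```

For data like this, setting the range manually is likely to give more useful results.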

Note

Minimum maximum value ranges are calculated across all attribute combination options for a given data element, category option and organisation unit combination.

4.4.2 Workflow

  1. Create a minimum maximum value range, either automatically or manually.

    • In the Data Administration app, you generate value ranges automatically.

    • In the Data Entry app, you set value ranges manually for each field.

  2. In the Data Quality app, run the Min-max outlier analysis.

4.4.3 Configure a minimum maximum outlier analysis

4.4.3.1 Create minimum maximum value range automatically

Note

Auto-generated minimum maximum value ranges can be useful for many situations, but it’s recommended to verify that the data is actually normally distributed prior to using this function.

In the Data Administration app, you generate minimum maximum value ranges for the data sets that you select. The new value ranges override any value ranges that the system has calculated previously.

  1. Set the Data analysis std dev factor:

    1. Open the System Settings app, and click General.

    2. In the Data analysis std dev factor field, enter a value.

      This sets the number of standard deviations to use in the outlier analysis. The default value is 2. A high value flags fewer values as outliers than a low value.

  2. Open the Data Administration app and click Min-max value generation.

  3. Select data set(s).

  4. Select an Organisation unit.

  5. Click Generate.

    DHIS2 generates new minimum maximum value ranges for all data elements in the selected data sets, for the selected organisation unit and all of its descendants.
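To see how the Data analysis std dev factor set in step 1 affects the result, the following sketch counts how many values fall outside the range for three factors. The values are hypothetical, and the formula (mean minus and plus the factor times the population standard deviation) is an assumption made for illustration.

```python
from statistics import mean, pstdev


def count_outliers(values, std_dev_factor):
    """Count values outside mean -/+ std_dev_factor * standard deviation."""
    centre, spread = mean(values), pstdev(values)
    low = centre - std_dev_factor * spread
    high = centre + std_dev_factor * spread
    return sum(1 for v in values if v < low or v > high)


# Hypothetical reported values with a few unusually low and high entries.
values = [50, 48, 52, 50, 49, 51, 50, 50, 47, 53, 44, 56, 38, 62, 30, 70]

outliers_by_factor = {f: count_outliers(values, f) for f in (1.0, 2.0, 3.0)}
print(outliers_by_factor)
```

In this sample the number of flagged values drops as the factor increases, which matches the behaviour described for the setting.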

4.4.3.2 Create minimum maximum value range manually

  1. In the Data Entry app, open a data entry form.

  2. Double-click the field for which you want to set the minimum maximum value range.

  3. Enter Min limit and Max limit.

  4. Click Save.

    If values don’t fall within the new value range the next time you enter data, the data entry cell will appear with an orange background.

  5. (Optional) Type a comment to explain the reason for the discrepancy, for example an event at a facility which may have generated a large number of clients.

  6. (Optional) Click Save comment.

Tip

Click the star icon to mark the value for further follow-up.

4.4.3.3 Delete minimum maximum value range

You can permanently delete all minimum maximum value ranges for selected data sets and organisation units in the Data Administration app.

  1. Open the Data Administration app and click Min-max value generation.

  2. Select data set(s).

  3. Select an Organisation unit.

  4. Click Remove.

4.4.4 Run a minimum maximum outlier analysis

  1. Verify that you’ve created minimum maximum value ranges.

  2. Open the Data Quality app and click Min-max outlier analysis.

  3. Select From date and To date.

  4. Select which data set(s) you want to include in the analysis.

  5. Select Parent organisation unit.

    All children of the organisation unit will be included. The analysis runs on the raw data “under” the parent organisation unit, not on aggregated data.

  6. Click Start.

    How long the analysis takes depends on the amount of data being analysed. If there are validation violations, they are presented in a list.

  7. (Optional) Click Download as PDF, Download as Excel or Download as CSV to download the list in the corresponding format.

Tip

Click the star icon to mark the value for further follow-up.
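Conceptually, the analysis compares each raw data value with the minimum maximum value range that applies to it. The sketch below illustrates that comparison under simplifying assumptions: ranges are keyed only by data element and organisation unit, limits are treated as inclusive, and values without a range are skipped. The `DataValue` class, the `find_outliers` function and all sample data are hypothetical and do not reflect the DHIS2 implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataValue:
    data_element: str
    org_unit: str
    period: str
    value: float


def find_outliers(data_values, ranges):
    """Return the data values that fall outside their (min, max) range.

    `ranges` maps (data_element, org_unit) to a (min, max) tuple.
    Values without a configured range are skipped.
    """
    found = []
    for dv in data_values:
        limits = ranges.get((dv.data_element, dv.org_unit))
        if limits is None:
            continue
        low, high = limits
        # Assumption: a value equal to a limit is still within the range.
        if dv.value < low or dv.value > high:
            found.append(dv)
    return found


# Hypothetical ranges and raw values for facilities under one parent unit.
ranges = {
    ("ANC 1st visit", "Facility A"): (5, 40),
    ("ANC 1st visit", "Facility B"): (0, 15),
}
values = [
    DataValue("ANC 1st visit", "Facility A", "202401", 32),
    DataValue("ANC 1st visit", "Facility A", "202402", 58),
    DataValue("ANC 1st visit", "Facility B", "202401", 15),
    DataValue("ANC 1st visit", "Facility B", "202402", 21),
    DataValue("ANC 1st visit", "Facility C", "202401", 400),
]

outliers = find_outliers(values, ranges)
for dv in outliers:
    print(dv.org_unit, dv.period, dv.value)
```

Note that the value for Facility C is not reported because no range exists for it, which is why step 1 asks you to verify that ranges have been created before you run the analysis.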