21.12 Manage predictors

21.12.1 About predictors

A predictor tells DHIS2 how to generate a data value based on data values from past periods and/or the period of the data value. It defines which past periods to sample, and how to combine the data to produce a predicted value. A predictor always generates an aggregate data value, but the past data values used to calculate the predicted value may come from aggregate data, event data, or both.

A simple use of predictors would be to copy a past period data value into a new period, for example into the next month, or into the same quarter in the next year. A more complex use of predictors would be for disease surveillance, to predict what value would be expected in a given week or month of the year, based on previous data values. A validation rule could then be used to see how the actual value compares with the expected (predicted) value.

You can specify the organisation unit level(s) for which a predictor will generate values. For example in disease surveillance you can use one predictor to give the expected value at each local facility, given the amount of variation you would expect at a single facility, while using a different predictor to estimate the value you would expect summed over all facilities in a district, given the (smaller) proportional variation that you would expect when adding up the values for all facilities in the district. You could also define additional predictors at any higher levels of the organisation unit hierarchy, where you might expect different proportions of variation. Alternatively, you can define a single predictor for all these levels and use the standard deviation function to determine what amounts of deviation were measured at each level.

In the Maintenance app, you manage the following predictor objects:

Predictor objects in the Maintenance app

Object type  Available functions
Predictor    Create, edit, clone, delete, show details and translate

21.12.2 Sampling past periods

Predictors can generate data values for periods that are in the past, present, or future. These values are based on data sampled from periods before the predicted period, and/or data from the predicted period. When you use data sampled from past periods (periods before the predicted period), several parameters determine the choice of which past periods to sample from:

21.12.2.1 Sequential sample count

A predictor’s Sequential sample count gives the number of immediate prior periods to sample. For example, if a predictor’s period type is Weekly and the Sequential sample count is 4, the predictor samples the four weeks immediately preceding the predicted value week. So the predicted value for week 9 would use samples from weeks 5, 6, 7, and 8.

If a predictor’s period type is Monthly and the Sequential sample count is 4, the predictor samples the four months immediately preceding the predicted value month. So the predicted value for May would use samples from January, February, March, and April.

The Sequential sample count can be greater than the number of periods in a year. For example, if you want to sample the 24 months immediately preceding the predicted value month, set the Sequential sample count to 24.
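As a rough illustration of the rule above, the following Python sketch lists the periods that a given Sequential sample count would select. It assumes periods are numbered with a running integer index (for example, consecutive week numbers); the function name `sequential_samples` is hypothetical and is not part of DHIS2.

```python
def sequential_samples(predicted_period, sample_count):
    """Return the indexes of the periods immediately before the predicted one.

    Assumption: periods are identified by a running integer index, so the
    period just before period 9 is period 8, and so on.
    """
    first = predicted_period - sample_count
    return list(range(first, predicted_period))


# Weekly predictor, Sequential sample count 4, predicting week 9
print(sequential_samples(9, 4))
```

The same function covers counts larger than a year: with a monthly index, a count of 24 returns the 24 months before the predicted month.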

21.12.2.2 Sequential skip count

A predictor’s Sequential skip count tells how many periods should be skipped immediately prior to the predicted value period, within the Sequential sample count. This could be used, for instance, in outbreak detection to skip one or more immediately preceding samples that might in fact contain values from the beginning of an outbreak that you are trying to detect.

For example, if the Sequential sample count is 4, but the Sequential skip count is 2, then the two samples immediately preceding the predicted period will be skipped, resulting in only two periods being sampled.
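The effect of the skip count can be sketched in Python as follows. This is only an illustration under the assumption that periods carry a running integer index; `sequential_samples_with_skip` is a hypothetical helper, not a DHIS2 function.

```python
def sequential_samples_with_skip(predicted_period, sample_count, skip_count):
    """Periods sampled before predicted_period after applying the skip count.

    The skipped periods are taken from within the sample count, so a sample
    count of 4 with a skip count of 2 leaves two sampled periods.
    """
    first = predicted_period - sample_count
    end = predicted_period - skip_count  # exclusive upper bound
    return list(range(first, end))


# Sample count 4, skip count 2, predicting period 9
print(sequential_samples_with_skip(9, 4, 2))
```

When the skip count is equal to or greater than the sample count, the range is empty and no sequential periods are sampled.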

21.12.2.3 Annual sample count

A predictor’s Annual sample count gives the number of prior years for which samples should be collected at the same time of year. This could be used, for instance, for disease surveillance in cases where the expected incidence of the disease varies during the year and can best be compared with the same relative period in previous years. For example, if the Annual sample count is 2 (and the Sequential sample count is zero), then samples would be collected from periods in the immediately preceding two years, at the same time of year.
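A minimal Python sketch of the annual sampling rule, assuming monthly periods written as `(year, month)` pairs. The function name `annual_samples` is hypothetical and only illustrates the description above.

```python
def annual_samples(year, month, annual_sample_count):
    """Same month in each of the preceding annual_sample_count years."""
    return [(year - k, month) for k in range(annual_sample_count, 0, -1)]


# Annual sample count 2, Sequential sample count 0, predicting May 2024
print(annual_samples(2024, 5, 2))
```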

21.12.2.4 Sequential and annual sample counts together

You can use the sequential and annual sample counts together to collect samples from a number of sequential periods over a number of past years. When you do this, samples will be collected in prior years during the period at the same time of year as the predicted value period, and also in previous years both before and after the same time of year, as determined by the Sequential sample count number.

For example, if the Sequential sample count is 4 and the Annual sample count is 2, samples will be collected from the 4 periods immediately preceding the predicted value period. In addition, samples will be collected in the prior 2 years for the corresponding period, as well as 4 periods on either side.

21.12.2.5 Sequential, annual, and skip sample counts together

You can use the Sequential skip count together with the sequential and annual sample counts. When you do this, the Sequential skip count tells how many periods to skip in the same year as the predicted value period. For example, if the Sequential sample count is 4 and the Sequential skip count is 2, then the two periods immediately preceding the predicted value period will be skipped, but the two periods before that will be sampled.

If the Sequential skip count is equal to or greater than the Sequential sample count, then no samples will be collected for the year containing the predicted value period; only periods from past years will be sampled.
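The combined behaviour of the three counts can be sketched in Python as below. This is an interpretation of the description above, not DHIS2 code: it assumes monthly periods, a Sequential sample count small enough that the windows in prior years do not reach the predicted period, and hypothetical helper names (`to_index`, `from_index`, `sampled_periods`).

```python
def to_index(year, month):
    """Map (year, month) to a running month index."""
    return year * 12 + (month - 1)


def from_index(index):
    """Inverse of to_index."""
    year, month0 = divmod(index, 12)
    return (year, month0 + 1)


def sampled_periods(year, month, sequential, annual, skip=0):
    """Months sampled for a monthly predictor, oldest first."""
    target = to_index(year, month)
    samples = set()
    # Same year: periods just before the predicted period, minus skipped ones.
    for offset in range(skip + 1, sequential + 1):
        samples.add(target - offset)
    # Prior years: the corresponding period plus `sequential` on either side.
    # The skip count is not applied here, as described in the text.
    for years_back in range(1, annual + 1):
        centre = target - 12 * years_back
        for offset in range(-sequential, sequential + 1):
            samples.add(centre + offset)
    return [from_index(i) for i in sorted(samples)]


# Sequential 4, annual 2, skip 2, predicting May 2024
for period in sampled_periods(2024, 5, sequential=4, annual=2, skip=2):
    print(period)
```

With `skip` equal to or greater than `sequential`, the first loop adds nothing, so only periods from past years are returned.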

21.12.2.6 Sample skip test

You can use the Sample skip test to skip samples from certain periods that would otherwise be included, based on the results of testing an expression within those periods. This could be used, for instance, in disease outbreak detection, where the sample skip test could identify previous disease outbreaks, to exclude those samples from the prediction of a non-outbreak baseline expected value.

The Sample skip test is an expression that should return a value of true or false, to indicate whether or not the period should be skipped. It can be an expression that tests any values in the previous period. For example, it could test for a data value that was explicitly entered to indicate that a previous period should be skipped. Or it could compare a previously predicted value for a period with the actual value recorded for that period, to determine if that period should be skipped.

Any periods for which the Sample skip test is true will not be sampled.
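The filtering step can be pictured with a short Python sketch. The data layout (a dictionary of period index to values) and the names `apply_skip_test`, `cases` and `outbreak_flag` are assumptions made for illustration only; in DHIS2 the test is written as an expression, not as Python.

```python
def apply_skip_test(samples, skip_test):
    """Drop every sampled period for which skip_test returns True."""
    return {period: values
            for period, values in samples.items()
            if not skip_test(values)}


# Hypothetical sampled data: period index -> data values in that period
samples = {
    5: {"cases": 12, "outbreak_flag": 0},
    6: {"cases": 90, "outbreak_flag": 1},  # flagged as a past outbreak
    7: {"cases": 15, "outbreak_flag": 0},
    8: {"cases": 11, "outbreak_flag": 0},
}

# Skip any period where the outbreak flag was entered as greater than zero
kept = apply_skip_test(samples, lambda values: values["outbreak_flag"] > 0)
print(sorted(kept))
```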

21.12.3 Create or edit a predictor

  1. Open the Maintenance app and click Other > Predictor.

  2. Click the add button.

  3. In the Name field, type the predictor name.

  4. (Optional) In the Code field, assign a code.

  5. (Optional) Type a Description.

  6. Select an Output data element. Any value generated by this predictor is stored as a value of this data element. The value is rounded according to the value type of the data element: if the value type is an integer type, the predicted value is rounded to the nearest integer. For all other value types, the number is rounded to four significant digits. (However, if there are more than four digits to the left of the decimal point, they are not replaced with zeros.)

  7. Select a Period type.

  8. Assign one or more organisation unit levels. The output value will be assigned to an organisation unit at this level (or these levels). The input values will come from the organisation unit to which the output is assigned, or from any lower level under the output organisation unit.

  9. Create a Generator. The generator is the expression that is used to calculate the predicted value.

    1. Type a Description of the generator expression.

    2. Enter the generator expression. You can build the expression by selecting data elements for aggregate data, or program data elements, attributes or indicators. Organisation unit counts are not yet supported.

      To use sampled, past period data, you should enclose any items you select in one of the following aggregate functions:

      Aggregate function  Means
      AVG                 Average (mean) value
      COUNT               Count of the data values
      MAX                 Maximum value
      MEDIAN              Median value
      MIN                 Minimum value
      STDDEV              Standard deviation
      SUM                 Sum of the values

      Any items inside an aggregate function will be evaluated for all sampled past periods, and then combined according to the formula inside the aggregate function. Any items outside an aggregate function will be evaluated for the period in which the prediction is being made.

      You can build more complex expressions by clicking on (or typing) any of the elements below the expression field: ( ) * / + - Days. Constant numbers may be added by typing them. The Days option inserts [days] into the expression, which resolves to the number of days in the period from which the data came.

      You can also use the following functions in your expression, either inside or containing aggregate functions, or independent of them:

      Function                             Means
      IF(test, valueIfTrue, valueIfFalse)  Evaluates test, which is an expression that evaluates to a boolean value (see Boolean expression notes below). If the test is true, returns the valueIfTrue expression. If it is false, returns the valueIfFalse expression.
      ISNULL(item)                         Returns the boolean value true if the item is null (missing), otherwise returns false. The item can be any selected item from the right (data element, program data element, etc.).

      Boolean expression notes: A boolean expression must evaluate to true or false. The following operators may be used to compare two values resulting in a boolean expression: <, >, !=, ==, >=, and <=. The following operators may be used to combine two boolean expressions: && (logical and), and || (logical or). The unary operator ! may be used to negate a boolean expression.

      Generator expression examples:

      Generator expression                              Means
      SUM(#{FTRrcoaog83.tMwM3ZBd7BN})                   Sum of the sampled values of data element FTRrcoaog83 and category option combination (disaggregation) tMwM3ZBd7BN
      AVG(#{FTRrcoaog83}) + 2 * STDDEV(#{FTRrcoaog83})  Average of the sampled values of data element FTRrcoaog83 (sum of all disaggregations) plus twice its standard deviation
      SUM(#{FTRrcoaog83}) / SUM([days])                 Sum of all sampled values of data element FTRrcoaog83 (sum of all disaggregations) divided by the number of days in all sample periods (resulting in the overall average daily value)
      SUM(#{FTRrcoaog83}) + #{T7OyqQpUpNd}              Sum of all sampled values of data element FTRrcoaog83 plus the value of data element T7OyqQpUpNd in the period being predicted for
      1.2 * #{T7OyqQpUpNd}                              1.2 times the value of data element T7OyqQpUpNd in the period being predicted for
      IF(ISNULL(#{T7OyqQpUpNd}), 10, 20)                If the data element T7OyqQpUpNd is null, then 10, otherwise 20

  10. (Optional) Create a Sample skip test. The sample skip test specifies which previous periods, if any, to exclude from the sample.

    1. Type a Description of the skip test.

    2. Enter the sample skip test expression. You can build the expression by selecting data elements for aggregate data, or program data elements, attributes or indicators. Organisation unit counts are not yet supported. As with the generator expression, you may click on (or type) any of the elements below the expression field: ( ) * / + - Days. The functions IF() and ISNULL() as described above may also be used.

      The expression must evaluate to a boolean value of true or false. See Boolean expression notes above.

      Skip test expression examples:

      Skip test expression                                   Means
      #{FTRrcoaog83} > #{M62VHgYT2n0}                        The value of data element FTRrcoaog83 (sum of all disaggregations) is greater than the value of data element M62VHgYT2n0 (sum of all disaggregations)
      #{uF1DLnZNlWe} > 0                                     The value of data element uF1DLnZNlWe (sum of all disaggregations) is greater than zero
      #{FTRrcoaog83} > #{M62VHgYT2n0} || #{uF1DLnZNlWe} > 0  The value of data element FTRrcoaog83 (sum of all disaggregations) is greater than the value of data element M62VHgYT2n0 (sum of all disaggregations), or the value of data element uF1DLnZNlWe (sum of all disaggregations) is greater than zero

  11. Enter a Sequential sample count value.

    This is the number of sequential periods that the calculation goes back in time to sample data.

  12. Enter an Annual sample count value.

    This is the number of years that the calculation goes back in time to sample data.

  13. (Optional) Enter a Sequential skip count value.

    This is the number of sequential periods, immediately preceding the predicted value period, that are skipped before sampling the data.

  14. Click Save.
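To make the generator and rounding steps above more concrete, here is a rough Python sketch of two of the example expressions and of the rounding rule for the output data element. It is an interpretation of the text, not DHIS2 code. All function names are hypothetical, the standard deviation is assumed to be the sample standard deviation (the variant DHIS2 uses is not stated here), and ties in integer rounding follow Python's `round`.

```python
import math
import statistics


def avg_plus_two_stddev(samples):
    """Mimic AVG(x) + 2 * STDDEV(x) over the sampled past-period values."""
    # Assumption: sample standard deviation; DHIS2 may use another variant.
    return statistics.mean(samples) + 2 * statistics.stdev(samples)


def daily_average(values, days):
    """Mimic SUM(x) / SUM([days]): total value divided by total days sampled."""
    return sum(values) / sum(days)


def round_predicted(value, integer_type):
    """Round a predicted value according to the output value type."""
    if integer_type:
        return float(round(value))
    if value == 0:
        return 0.0
    # Decimal places needed for four significant digits. Never negative, so
    # digits to the left of the decimal point are kept rather than zeroed.
    decimals = max(3 - math.floor(math.log10(abs(value))), 0)
    return round(value, decimals)


# Four sampled past periods for one data element
threshold = avg_plus_two_stddev([10, 12, 14, 16])
print(round_predicted(threshold, integer_type=False))

# Three sampled months with their lengths in days
print(daily_average([310, 280, 310], [31, 28, 31]))
```

Items outside an aggregate function, such as the second data element in `SUM(#{FTRrcoaog83}) + #{T7OyqQpUpNd}`, would simply be added as a single value from the period being predicted.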

21.12.4 Clone metadata objects

Cloning a data element or other object can save time when you create many similar objects.

  1. Open the Maintenance app and find the type of metadata object you want to clone.

  2. In the object list, click the options menu and select Clone.

  3. Modify the options you want.

  4. Click Save.

21.12.5 Delete metadata objects

Note

You can only delete a data element and other data element objects if no data is associated with the data element itself.

Warning

Any data set that you delete from the system is irrevocably lost. All data entry forms and section forms that may have been developed will also be removed. Make sure that you have made a backup of your database before deleting any data set, in case you need to restore it at some point in time.

  1. Open the Maintenance app and find the type of metadata object you want to delete.

  2. In the object list, click the options menu and select Delete.

  3. Click Confirm.

21.12.6 Display details of metadata objects

  1. Open the Maintenance app and find the type of metadata object you want to view.

  2. In the object list, click the options menu and select Show details.

21.12.7 Translate metadata objects

DHIS2 provides functionality for translations of database content, for example data elements, data element groups, indicators, indicator groups or organisation units. You can translate these elements to any number of locales. A locale represents a specific geographical, political, or cultural region.

Tip

To activate a translation, open the System Settings app, click Appearance and select a language.

  1. Open the Maintenance app and find the type of metadata object you want to translate.

  2. In the object list, click the options menu and select Translate.

    Tip

    If you want to translate an organisation unit level, click directly on the Translate icon next to each list item.

  3. Select a locale.

  4. Type a Name, Short name and Description.

  5. Click Save.