Ola Hodne Titlestad wrote:
Hi,
I've added a new blueprint here:
https://blueprints.launchpad.net/dhis2/+spec/improve-minmax-value-functionality
which is about improving the min/max validation functionality. The
current solution is very basic and not sufficient in many ways. Here
are my thoughts on how to improve this. We can use this list for
discussion and then update the blueprint when we settle on something
concrete.
This is what I wrote in the blueprint:
A few improvements are needed to the min/max value functionality:
1) Generation of min/max values should be available from the data
administration module
Currently you need to generate min/max ranges for each orgunit/dataset
combination one by one in the data entry module. Sometimes you want to
generate ranges for all orgunits and datasets at once, and data entry
is not the place for that. In Data Administration we can add a new menu
heading called "Min/Max validation", where we allow min/max generation
for any combination of orgunit/dataset and make it easy to select all
combinations. It may also be a good idea to include "from" and "to"
fields to indicate which periods to use as the basis for the
generation, e.g. from 2008-01-01 to 2008-12-31 would indicate that all
12 months of 2008 will be used if the dataset has a monthly period
type, or the 4 quarters of 2008 if it is a quarterly dataset, etc.
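A rough sketch of how the "from"/"to" fields could be turned into a
list of periods, only to illustrate the idea. The function name, the
period labels and the rule that a period must lie entirely inside the
date range are assumptions made for this example, not existing DHIS 2
behaviour.

```python
from datetime import date, timedelta

# Number of calendar months covered by one period of each type
# (only the two types mentioned above are sketched here).
MONTHS_PER_PERIOD = {"monthly": 1, "quarterly": 3}


def periods_in_range(start, end, period_type):
    """Return labels of all periods lying entirely within [start, end]."""
    step = MONTHS_PER_PERIOD[period_type]
    result = []
    for year in range(start.year, end.year + 1):
        for first_month in range(1, 13, step):
            first_day = date(year, first_month, 1)
            # The first day of the following period marks the end of this one.
            next_month = first_month + step
            if next_month > 12:
                next_start = date(year + 1, 1, 1)
            else:
                next_start = date(year, next_month, 1)
            last_day = next_start - timedelta(days=1)
            if first_day >= start and last_day <= end:
                if step == 1:
                    label = f"{year}-{first_month:02d}"
                else:
                    label = f"{year}Q{(first_month - 1) // 3 + 1}"
                result.append(label)
    return result


monthly = periods_in_range(date(2008, 1, 1), date(2008, 12, 31), "monthly")
quarterly = periods_in_range(date(2008, 1, 1), date(2008, 12, 31), "quarterly")
```

With the 2008 example above, `monthly` holds 12 labels and `quarterly`
holds 4.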
Not sure which is the best way to do this, but one way could be to have
"Data quality" as an item under maintenance, where you can set ranges,
and also define and keep track of validation rules. Then, the "data
quality" page currently under services could be split, so that you take
the definition-side of it to maintenance, and the report-side of it as
a subitem in the reports menu.
2) User-defined parameters that control how the generation is done.
Currently the range values are set to 10% lower than the lowest value
and 10% higher than the highest value, which is a very crude method.
This does not take care of outliers that might already be in the system.
Any suggestions for a better statistical method for this, and on how
to make it user-defined?
Use some factor of the standard deviation. That will take care of the
spread. +/- 10% will not work for malaria, for instance, as it
fluctuates naturally over the year due to the rainy season. I don't
have my copy of the infamous "Statistical concepts and methods" by
Bhattacharyya and Johnson here, arguably the most boring book in the
world, but this would do for an explanation:
http://en.wikipedia.org/wiki/Standard_deviation.
Then, as I think it is in DHIS 1.4, you can set the factor to calculate
from, for instance 1.5, making the min and max the mean - 1.5 x st.dev
and the mean + 1.5 x st.dev, respectively.
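To make the two approaches easier to compare, here is a minimal sketch
of both: the current "10% beyond the lowest and highest value" rule and
the mean +/- factor x st.dev rule. The function names and the sample
values are made up for illustration, and using the sample standard
deviation (rather than the population one) is an assumption.

```python
from statistics import mean, stdev


def range_from_extremes(values, margin=0.10):
    """Current approach: 10% below the lowest and 10% above the highest value."""
    return min(values) * (1 - margin), max(values) * (1 + margin)


def range_from_stddev(values, factor=1.5):
    """Suggested approach: mean - factor x st.dev and mean + factor x st.dev."""
    centre = mean(values)
    spread = stdev(values)  # sample standard deviation (an assumption)
    return centre - factor * spread, centre + factor * spread


# Hypothetical monthly values with one obvious typo (110 instead of 11).
history = [10, 12, 11, 13, 12, 110]

extremes_min, extremes_max = range_from_extremes(history)
stddev_min, stddev_max = range_from_stddev(history, factor=1.5)
```

With these numbers the typo 110 stays inside the range produced by the
first rule but falls above the max produced by the second. Note that
the lower bound of the second rule can go negative, so a real
implementation would probably want to cut it off at zero for counts.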
3) I assume we would like to keep the generate min/max option in
data entry, which can be useful for users who deal with only a limited
number of orgunits and know that a new round of generation would
correct the min/max ranges. But this generation should then be
configured in a setting, especially how many periods to use. So we
could add another property in Data Administration->min/max validation
that defines how many periods to use as the basis for the generation,
for monthly, weekly, yearly etc. period types. Do we need one property
per period type? Currently this property is hard-coded to 6 in the
source code.
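One possible answer to the "one property per period type?" question,
sketched under assumptions: keep a single default (the current 6) and
let individual period types override it. The names and the override
value below are hypothetical and only show the shape of such a setting.

```python
# Fallback used when no per-type value has been configured;
# 6 is the value currently hard-coded in the source.
DEFAULT_PERIODS = 6


def periods_to_use(period_type, overrides):
    """Return how many past periods to use as the basis for generation."""
    return overrides.get(period_type, DEFAULT_PERIODS)


# Hypothetical configuration: only weekly data gets its own value.
configured = {"weekly": 12}

weekly_count = periods_to_use("weekly", configured)
monthly_count = periods_to_use("monthly", configured)
```

This way nothing has to be configured for the common case, and a period
type only needs its own property when the default does not fit.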
4) Default min/max range per data element
Normally a min/max range is linked to an orgunit/dataelement
combination, but sometimes, e.g. when there is very little data or very
poor data quality in the system, it is useful to have a default range
that can be used for all orgunits as a first level of validation to
avoid typos and crazy outliers. These default values need to be set
somewhere, and maybe data set management is the best suited place for
this, at least that is where it is located in DHIS 1.4. Here we need
some functionality to quickly set these ranges, even as quick as
setting the same range for all data elements in a dataset, and then
also the possibility to adjust individual data elements in the data
(set) element list.
In Data entry the procedure will be to first check whether a min/max
range exists for the orgunit/data element (the best option) and if not
then load the default range for the data element (the next best
option), and if nothing is set then leave it blank (the worst option).
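The lookup order described above could be sketched like this. The
function name, the dictionary-based storage and the example
identifiers are assumptions for illustration only; the real thing
would read from the database.

```python
def resolve_range(orgunit, data_element, specific_ranges, default_ranges):
    """Pick the min/max range to show in data entry.

    1. A range set for this orgunit/data element combination (best).
    2. Otherwise the default range for the data element (next best).
    3. Otherwise None, meaning the fields are left blank (worst).
    """
    specific = specific_ranges.get((orgunit, data_element))
    if specific is not None:
        return specific
    return default_ranges.get(data_element)


# Hypothetical data: one facility-specific range and one default range.
specific_ranges = {("Facility A", "Malaria cases"): (5, 40)}
default_ranges = {"Malaria cases": (0, 500)}

best = resolve_range("Facility A", "Malaria cases", specific_ranges, default_ranges)
fallback = resolve_range("Facility B", "Malaria cases", specific_ranges, default_ranges)
blank = resolve_range("Facility B", "ANC visits", specific_ranges, default_ranges)
```

Here `best` is the facility-specific range, `fallback` is the default
range for the data element, and `blank` is `None`.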
I concur. In both Sierra Leone and Botswana, setting ranges for
individual facilities, for all data elements, has just created a lot of
extra work for the districts, which are not really aware of how the
process works. So this has so far been skipped in Sierra Leone. As we
want some kind of warning (colour coding and/or pop-ups), this can
create a great deal of frustration until the ranges are correctly set,
and also there are some wild typos where it looks like people have
fallen asleep on the keyboard, which we want to avoid. It would then
make sense to be able to set some global range default.
Johan
best regards,
Ola Hodne Titlestad
HISP
University of Oslo