Hi "Calle is right here - we do average, then calculate std dev and set the upper and lower bounds for each value. We use data from ALL available time periods to calculate this (period org unit, data element, option combo)."
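For readers following along, the calculation described above can be sketched roughly as below. This is only an illustration in Python, not the DHIS2 code: the function name `min_max_bounds`, the `num_std_dev` factor (the thread does not say how many standard deviations are used) and the optional `window` argument are all assumptions, as is the choice of the population standard deviation.

```python
from statistics import mean, pstdev


def min_max_bounds(values, num_std_dev=2.0, window=None):
    """Return (lower, upper) bounds as mean -/+ num_std_dev * std dev.

    values      -- reported values for one org unit / data element /
                   option combo, ordered oldest to newest
    num_std_dev -- hypothetical factor; not stated in the thread
    window      -- if given, only the most recent `window` periods are used
                   (hypothetical parameter); None means all periods
    """
    if window is not None:
        values = values[-window:]
    if not values:
        raise ValueError("need at least one value")
    avg = mean(values)
    sd = pstdev(values)  # population std dev; an assumption of this sketch
    return avg - num_std_dev * sd, avg + num_std_dev * sd


# Example with made-up monthly values: all periods, then the last 4 only.
monthly = [2, 4, 4, 4, 5, 5, 7, 9]
print(min_max_bounds(monthly))
print(min_max_bounds(monthly, window=4))
```

With `window=None` this uses every available period; passing for example `window=12` restricts it to the most recent twelve values.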
Here and there and back again :-) So I wasn't off the reservation, then.

We have used the normal distribution like this in DHIS 1.x for around 17 years, and it fits the majority of data elements. In general, this distribution model handles random outbreaks and disruptions reasonably well, since the impact of such outliers is dampened.

Data elements representing conditions or services with strong seasonal variation do not fit so well, and some very particular cases like "Male condoms distributed" tend to vary so much that the min/max is generally disregarded (outliers here also matter a lot less - when you distribute 1-2 billion condoms annually, an error of a few thousand does not matter).

In DHIS 1.4 there is also a function for setting absolute min-max values - most typically used for data elements where e.g. only 0 and 1 are valid values. For such cases, statistically calculating min-max is obviously irrelevant.

I don't like the use of ALL available time periods, though, since a large number of health facilities will see significant changes in their patient mix and patient numbers over, let us say, a 10-year period. We have found that 12-18 months provides a good compromise.

So there is still some room for improvement.

Regards
Calle

On 20 April 2015 at 16:15, Jason Pickering <jason.p.picker...@gmail.com> wrote:

> Good. I probably should have known that already, thus why I had to do some
> statistical analysis outside of DHIS2 to actually calculate reasonable min
> max. A quick check of the validity of a normal distribution can be done with
> the skewness and kurtosis, which provide an idea of how "tilted" a given
> distribution is.
>
> https://www.dhis2.org/doc/snapshot/en/developer/html/apas06.html
>
> Anyway, support for import via the API would be good.
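As a rough illustration of the skewness and kurtosis check mentioned in the quoted message, here is a minimal Python sketch using the standard moment-based definitions. The function name is hypothetical and this is not taken from DHIS2. For a normal distribution both numbers are close to 0, so values far from 0 suggest the mean and std dev bounds may fit poorly.

```python
from statistics import mean, pstdev


def skewness_and_excess_kurtosis(values):
    """Return (skewness, excess kurtosis) from population moments.

    Both are near 0 for normally distributed data; a large positive
    skewness indicates a long right tail.
    """
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    avg = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("all values are identical; moments are undefined")
    skew = sum(((v - avg) / sd) ** 3 for v in values) / n
    kurt = sum(((v - avg) / sd) ** 4 for v in values) / n - 3.0
    return skew, kurt


# Made-up example: one large value gives a clearly right-skewed series.
print(skewness_and_excess_kurtosis([1, 1, 1, 1, 10]))
```

What threshold counts as "too skewed" is a judgment call and is not specified anywhere in this thread.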
>
> Regards,
> Jason
>
> On Mon, Apr 20, 2015, 16:06 Lars Helge Øverland <larshe...@gmail.com>
> wrote:
>
>> Hi there,
>>
>> Calle is right here - we do average, then calculate std dev and set the
>> upper and lower bounds for each value.
>>
>> We use data from ALL available time periods to calculate this (period org
>> unit, data element, option combo)
>>
>> Mind you we should not really debate whether to use standard deviations
>> or not, rather if we should support additional _distributions_ to better
>> handle different kinds of data. We currently use the normal distribution
>> <http://en.wikipedia.org/wiki/Normal_distribution>.
>>
>> Rodolfo - supporting min-max in the Web API is a good idea to allow for
>> third-party tools - feel free to write a blueprint.
>>
>> regards,
>>
>> Lars

--
*******************************************
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19274
Email: calle.hedb...@gmail.com
Skype: calle_hedberg
*******************************************