Jason, "with all the truncation of data going on" - ??
Not sure what you mean by that, but users don't regard min-max values as a kind of "hard" range - they were never intended to be that, except (at least in DHIS 1.4) where you specify a min-max to be ABSOLUTE. For everything else, it is a simple method to highlight possible outliers - for typing mistakes an easy fix, for collection/collation/transcribing mistakes often a more involved process (query sent back to staff etc).

The complexity of correcting mistakes made during the manual data collection and collation process - often made worse by people tending to be stubborn about not admitting mistakes - is one reason for moving electronic data capture closer to the actual patient encounters. A typical example is South Africa's move from capturing monthly data per facility to capturing data into the DHIS on a daily basis per consulting room.

Regards
Calle

On 19 September 2015 at 17:26, Jason Pickering <jason.p.picker...@gmail.com> wrote:

> Hi Calle,
> The problem is that the premise upon which this algorithm is based is
> flawed, I would say. There is really no reason to believe that the data is
> normally distributed, or should be, unless of course that has been shown to
> be a reliable and appropriate model. What we are seeking to do is to
> eliminate outliers, based on a certain statistical model (i.e. the normal
> distribution). The problem is that the data is often not normally
> distributed. Just as a quick example, I prepared a density plot of the
> skewness of all OU/DE/COC combinations for a real database with a
> significant amount of data over time, which should be fairly representative
> of a "real" DHIS database. As a very trivial test of normality, we can
> examine the skewness and see that the database in fact tends towards
> positive skew, which is somewhat expected, as there are probably going to
> be fewer "high" values than "low" values for many data elements. A
> perfectly normal distribution would have zero skewness.
>
> I still think we need to carefully document what the min-max generation
> function is actually doing. If it works for people, great, but with all of
> the truncation of data going on, it may not really be clear to people how
> these values are actually generated, or what their limitations may be. We
> should also introduce an API endpoint for the min-max values to allow
> people to generate these outside of the system, based on perhaps more
> appropriate models than the normal distribution.
>
> Regards,
> Jason
>
>
> On Fri, Sep 18, 2015 at 9:02 PM, Calle Hedberg <calle.hedb...@gmail.com>
> wrote:
>
>> Hi,
>>
>> Ah - bugger, I completely forgot about the zero or positive type, which
>> provides the same effect (if set). My bad.
>>
>> Jason's point is correct, but in my opinion less important for most types
>> of routine data where the primary function of the min-max values is to
>> highlight likely data capturing mistakes.
>>
>> Regards
>> Calle
>>
>> On 18 September 2015 at 13:10, jason.p.pickering <
>> 1065...@bugs.launchpad.net> wrote:
>>
>>> Hi there. The current design is to take the mean, and calculate
>>> n standard deviations away from the mean, for a given data
>>> element/orgunit/catcombo set of data values. If the data value is set
>>> to be zero or positive integer, and can never have a negative value and
>>> does not follow a normal distribution, then flooring the projected
>>> min/max at zero makes little sense. Another distribution would be
>>> required to determine what the accepted min/max actually are (logistic,
>>> zero-inflated model, etc.) if the actual distribution is not normal.
>>>
>>> But per the bug report, the application does what it is supposed to do,
>>> namely calculate the theoretical min/max based on a statistical routine,
>>> which itself may not be valid without confirming that the distribution
>>> in question actually is normal.
>>>
>>> Regards,
>>> Jason
>>>
>>>
>>> On Fri, Sep 18, 2015 at 11:57 AM, Lars Helge Øverland <
>>> larshe...@gmail.com>
>>> wrote:
>>>
>>> > This is not a design flaw. It depends on the data element value type
>>> > property. The default value type is "number", for which negative values
>>> > are perfectly valid. One can set the value type to "Positive number";
>>> > in this case the min-max values will never be less than zero.
>>> >
>>> > ** Changed in: dhis2
>>> >        Status: Opinion => Invalid
>>> >
>>> > --
>>> > You received this bug notification because you are a member of DHIS 2
>>> > developers, which is subscribed to DHIS.
>>> > https://bugs.launchpad.net/bugs/1065014
>>> >
>>> > Title:
>>> >   Min/Max generation goes into negative
>>> >
>>> > Status in DHIS:
>>> >   Invalid
>>> >
>>> > Bug description:
>>> >   A very minor bug, but the min/max generation algorithm (which I
>>> >   assume is some std. dev) sometimes leads the minimum to be a negative
>>> >   number. Probably not an issue per se for data quality, as the
>>> >   alternative would be to set it to 0 (unless there is a reason why you
>>> >   would enter negative numbers), but the chart you get when you
>>> >   double-click a data entry field is then skewed and does not look very
>>> >   sensible. In extreme cases, with a few very high values and a few
>>> >   months with very low (as when you have campaigns or hand-outs), the
>>> >   minimum can be down to minus a lot.
>>> >
>>> > To manage notifications about this bug go to:
>>> > https://bugs.launchpad.net/dhis2/+bug/1065014/+subscriptions
>>> >
>>> > _______________________________________________
>>> > Mailing list: https://launchpad.net/~dhis2-devs
>>> > Post to     : dhis2-devs@lists.launchpad.net
>>> > Unsubscribe : https://launchpad.net/~dhis2-devs
>>> > More help   : https://help.launchpad.net/ListHelp
>>> >
>>>
>>>
>>> --
>>> Jason P. Pickering
>>> email: jason.p.picker...@gmail.com
>>> tel:+46764147049
>>
>>
>> --
>> *******************************************
>> Calle Hedberg
>> 46D Alma Road, 7700 Rosebank, SOUTH AFRICA
>> Tel/fax (home): +27-21-685-6472
>> Cell: +27-82-853-5352
>> Iridium SatPhone: +8816-315-19119
>> Email: calle.hedb...@gmail.com
>> Skype: calle_hedberg
>> *******************************************
>
>
> --
> Jason P. Pickering
> email: jason.p.picker...@gmail.com
> tel:+46764147049
>

--
*******************************************
Calle Hedberg
46D Alma Road, 7700 Rosebank, SOUTH AFRICA
Tel/fax (home): +27-21-685-6472
Cell: +27-82-853-5352
Iridium SatPhone: +8816-315-19119
Email: calle.hedb...@gmail.com
Skype: calle_hedberg
*******************************************
_______________________________________________
Mailing list: https://launchpad.net/~dhis2-devs
Post to     : dhis2-devs@lists.launchpad.net
Unsubscribe : https://launchpad.net/~dhis2-devs
More help   : https://help.launchpad.net/ListHelp
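
For reference, the calculation discussed in this thread (mean plus or minus n standard deviations, optionally floored at zero for "zero or positive" value types) and the skewness check can be sketched roughly as below. This is only an illustration under assumptions, not the DHIS 2 implementation: the function names, the choice of population standard deviation, the default of 2 standard deviations, and the sample data are all hypothetical.

```python
from statistics import mean, pstdev


def min_max_bounds(values, n_std=2.0, floor_at_zero=False):
    """Return (low, high) as mean -/+ n_std standard deviations.

    Assumption: population standard deviation; the thread does not say
    which variant the application uses.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    centre = mean(values)
    spread = pstdev(values)
    low = centre - n_std * spread
    high = centre + n_std * spread
    if floor_at_zero:
        # Mimics the "zero or positive" value type: never go below zero.
        low = max(0.0, low)
    return low, high


def skewness(values):
    """Moment-based skewness g1 = m3 / m2**1.5 (0 for symmetric data)."""
    n = len(values)
    centre = mean(values)
    m2 = sum((v - centre) ** 2 for v in values) / n
    m3 = sum((v - centre) ** 3 for v in values) / n
    if m2 == 0:
        # Skewness is undefined for constant data; returning 0.0 is a
        # convenience chosen for this sketch.
        return 0.0
    return m3 / m2 ** 1.5


# Hypothetical monthly counts with one campaign month (the value 60).
monthly = [3, 5, 4, 6, 2, 4, 5, 3, 60, 4, 5, 3]

low, high = min_max_bounds(monthly)
floored_low, _ = min_max_bounds(monthly, floor_at_zero=True)
print("unfloored min:", low)
print("floored min:", floored_low)
print("max:", high)
print("skewness:", skewness(monthly))
```

With data like this, a single campaign month pulls the unfloored minimum below zero and gives a clearly positive skewness, which is the situation described in the bug report and in Jason's comments.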