On 18/10/2019 23:57, DL Neil wrote:
> On 17/10/19 7:52 AM, MRAB wrote:
>> On 2019-10-16 19:43, duncan smith wrote:
>>> On 16/10/2019 04:41, DL Neil wrote:
>>>> On 16/10/19 1:55 PM, duncan smith wrote:
>>>>> On 15/10/2019 21:36, DL Neil wrote:
>>>>>> On 16/10/19 12:38 AM, Rhodri James wrote:
>>>>>>> On 14/10/2019 21:55, DL Neil via Python-list wrote:
>>>>>> ...
>>>>>> So, yes, the "label" is unimportant - except to politicians and
>>>>>> statisticians, who want precise answers from vague collections of
>>>>>> data... (sigh!)
>>>>>>
>>>>>
>>>>> [snip]
>>>>>
>>>>> No not (real) statisticians. People often want us to provide precise
>>>>> answers, but they don't often get them.
>>>>>
>>>>> "It ain’t what you don’t know that gets you into trouble. It’s what you
>>>>> know for sure that just ain’t so." (Mark Twain - perhaps)
>>>>
>>>> +1
>>>>
>>>> Although, you've undoubtedly heard people attempt to make claims of
>>>> having 'accurate figures' (even, "that came from Stats") when you told
>>>> them that the limitations and variations rendered the exercise
>>>> laughable...
>>>>
>>>> My favorite (of the moment) is a local computer store who regularly
>>>> offer such gems as: (underneath the sales (web-) page for an upmarket
>>>> *desktop* computer) "people who bought this also bought" followed by at
>>>> least two portable PC carry cases. They must be rather large carry-bags!
>>>> (along with such surprises as keyboard, mouse, ...)
>>>>
>>>> This morning I turned-down a study for a political group. One study has
>>>> already been completed and presented. The antagonist wanted an A/B
>>>> comparison (backing his 'side', of course).
>>>> I mildly suggested that I
>>>> would do it, if he'd also pay me to do an A/B/C study, where 'C' was a
>>>> costing - the economic opportunity cost of 'the people' waiting for 'the
>>>> government' to make a decision - (and delaying that decision by waiting
>>>> for "study" after "study" - The UK and their (MPs') inability to decide
>>>> "Brexit" a particularly disastrous illustration of such)
>>>>
>>>> Sorry, don't want to incur the anger of the list-gods - such
>>>> calculations would be performed in Python (of course)
>>>
>>> Clearly, all such analyses should be done in Python. Thank God for rpy2,
>>> otherwise I'd have to write R code. It's bad enough having to read it
>>> occasionally to figure out what's going on under the hood (I like
>>> everything about R - except the syntax).
>>>
>>> I have too many examples of people ignoring random variation, testing
>>> hypotheses on the data that generated the hypotheses, shifting the
>>> goalposts, using cum / post hoc ergo propter hoc reasoning, assuming
>>> monocausality etc. In some areas these things have become almost
>>> standard practice (and they don't really hinder publication as long as
>>> they are even moderately well hidden). Of course, it's often about
>>> policy promotion, and the economic analyses can be just as bad (e.g.
>>> comparing the negative impacts of a policy on the individual with the
>>> positive impacts aggregated over a very large population). And if it's
>>> about policy promotion a press release is inevitable. So we just need to
>>> survey the news media for specific examples. Unfortunately there's no
>>> reliable service for telling us what's crap and what isn't. (Go on,
>>> somebody pay me, all my data processing / re-analysis will be in Python
>>> ;-).)
>>>
>> Even when using Python, you have to be careful:
>>
>> Researchers find bug in Python script may have affected hundreds of studies
>> https://arstechnica.com/information-technology/2019/10/chemists-discover-cross-platform-python-scripts-not-so-cross-platform/
>
> I think both of our 'Python' comments were made tongue-in-cheek. Sadly
> the tool won't guarantee the result...
>
> At my first research project, before I'd even completed my first degree,
> I noticed a similar fault in some code*. There was I, the youngest,
> newest, least-est member of staff, telling the prof/boss and all the
> other researchers that they'd made a serious error, upon which various
> papers had been based plus a white-paper for government consideration.
> Oops!
>
> (Basic-Plus on DEC PDP/Vax-en introduced a 'virtual storage array', ie
> on-disk cf in-RAM. However, it did not wipe the disk-space prior to use
> (whereas arrays were zero-ed, IIRC). Thus, random data purporting to be
> valid data-entered. Once corrected and re-run "my results" (as they were
> termed - not sure if insult or compliment) were not hugely different
> from the originals).
>
> All we can do, is add some checks-and-balances rather than relying on
> 'the computer'.
>
> Upon which point: those of us who learned 'complicated math' with the
> aid of a slide-rule, employ a technique of mentally estimating the
> result in both the first few digits and scale - and thus noticing
> any completely incongruous 'result'. Even with lessons in "The
> Scientific Approach" am not aware that the 'calculator' or 'computer
> generations' were/are taught such 'common sense'...
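
That sort of slide-rule check is easy enough to bolt onto a script. A
minimal sketch, assuming you have a rough hand-made estimate to compare
against; the name roughly_agrees and the factor-of-two tolerance are my
own illustrative choices, nothing standard:

```python
def roughly_agrees(result, estimate, max_factor=2.0):
    """Return True if result has the same sign as estimate and lies
    within a factor of max_factor of it (a crude scale check)."""
    if result == 0 or estimate == 0:
        # A ratio is undefined here, so only exact agreement passes.
        return result == estimate
    if (result > 0) != (estimate > 0):
        return False
    ratio = abs(result) / abs(estimate)
    return 1 / max_factor <= ratio <= max_factor


# Mental estimate: 4187 * 0.0312 is roughly 4000 * 0.03 = 120.
computed = 4187 * 0.0312
assert roughly_agrees(computed, 120)

# A slipped decimal point is flagged as incongruous.
assert not roughly_agrees(computed * 10, 120)
```

It will not catch subtle errors, only the completely incongruous ones,
which is all the mental estimate was ever meant to do.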

I always remember the Hubble mirror fiasco, where the problem could have
been detected using a tape measure. Also, from my days in the building
industry: "measure twice, cut once". As far as claims in (social)
scientific publications are concerned, I always tell people (PhD
students, for example) to assume nothing and check everything. I only
recently discovered that this is pretty much the same as the ABC of
policing: "assume nothing, believe no-one, challenge everything".

Duncan