My thanks to all contributors. Although this was not the right place to ask, I 
certainly got the answers I needed. My students will benefit, so thank you all.

Regards,
Michael Thompson M.Prof.Studies Data Science
09 975 4678
Senior Lecturer, Digital Technologies
Manukau Campus
We all, like sheep, have gone astray Isaiah 53
Personal profile: 
https://www.manukau.ac.nz/about/faculties-schools/business-and-information-technology/more-information-for-students/lecturer-profiles/michael-thompson

From: Bert Gunter [mailto:bgunter.4...@gmail.com]
Sent: Wednesday, 18 July 2018 3:02 AM
To: Roger Koenker <rkoen...@illinois.edu>
Cc: Michael Thompson <michael.thomp...@manukau.ac.nz>; r-help@r-project.org
Subject: Re: [R] Scaling - does it get any better results than not scaling?

Prof. Koenker's response probably settles the matter, but if not, this thread 
should really be taken offlist, as it is primarily about statistics and not R 
programming.
stats.stackexchange.com might be an alternative 
place to post; indeed, I suspect the issue has already been addressed in their 
archives.

Cheers,
Bert



Bert Gunter

"The trouble with having an open mind is that people keep coming along and 
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip)

On Tue, Jul 17, 2018 at 1:02 AM, Roger Koenker <rkoen...@illinois.edu> wrote:
In certain fields this sort of standardization has become customary, based on 
the (misguided) notion that it induces “normality.” For example, in 
anthropometric studies based on the international Demographic and Health 
Surveys (DHS), children’s heights are often transformed to Z-scores prior to 
subsequent analysis, under the dubious presumption that variability around the 
Z-scores at various ages will be Gaussian. In my experience this is rarely 
justified, and analysts would be better off modeling the original data rather 
than doing the preliminary transformation. This is discussed in further detail 
here: https://projecteuclid.org/euclid.bjps/1313973394

> On Jul 17, 2018, at 5:53 AM, Michael Thompson 
> <michael.thomp...@manukau.ac.nz> wrote:
>
> Hi,
> I seem to remember from classes that one effect of scaling / standardising 
> data was to get better results in any analysis. But what I'm seeing when I 
> study various explanations of scaling is that we get exactly the same 
> results; standardised data just makes it easier to see proportionate 
> effects.
> This is all very well for the data scientist who wants to investigate 
> further, but from a practical point of view (especially if it doesn't 
> improve the accuracy of the result), surely it complicates 'telling the 
> story' of the model to non-DS people?
> So, is scaling a technique for the data scientist to use to find effects, 
> while eventually delivering a non-scaled version to the users?
> I'd like to be able to give the true story to my students, not some fairy 
> story based on my misunderstanding. I hope you can help with this.
> Michael
>
> ______________________________________________
> R-help@r-project.org mailing list -- To 
> UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
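
The observation in the original question, that standardising gives "exactly 
the same results", can be checked directly for ordinary least squares. The 
following is a minimal sketch, not anything posted in the thread: it is 
written in Python with only the standard library, the height/weight numbers 
are made up for illustration, and the helper names fit_line and r_squared are 
hypothetical. It fits the same simple regression on the raw predictor and on 
its Z-scores and compares the fits.

```python
from statistics import mean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(y, fitted):
    """Proportion of variance in y explained by the fitted values."""
    my = mean(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1 - ss_res / ss_tot


# Made-up illustrative data (an assumption, not taken from the thread).
height_cm = [150.0, 160.0, 165.0, 172.0, 180.0, 191.0]
weight_kg = [52.0, 58.0, 63.0, 70.0, 77.0, 88.0]

# Standardise the predictor: subtract the mean, divide by the sample sd.
m, s = mean(height_cm), stdev(height_cm)
height_z = [(h - m) / s for h in height_cm]

a_raw, b_raw = fit_line(height_cm, weight_kg)
a_std, b_std = fit_line(height_z, weight_kg)

fitted_raw = [a_raw + b_raw * h for h in height_cm]
fitted_std = [a_std + b_std * z for z in height_z]

# Largest disagreement between the two sets of fitted values.
max_diff = max(abs(r - t) for r, t in zip(fitted_raw, fitted_std))
r2_raw = r_squared(weight_kg, fitted_raw)
r2_std = r_squared(weight_kg, fitted_std)

print("raw fit:          intercept, slope =", a_raw, b_raw)
print("standardised fit: intercept, slope =", a_std, b_std)
print("max difference in fitted values    =", max_diff)
print("R^2 (raw, standardised)            =", r2_raw, r2_std)
```

The fitted values and R^2 agree up to floating-point rounding; only the 
coefficients change units. The standardised slope equals the raw slope times 
the standard deviation of the predictor, and the standardised intercept is 
the mean of the response. This invariance is a property of ordinary least 
squares; scale-sensitive methods such as penalised regression, 
nearest-neighbour methods, or principal components can give different 
results after scaling.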
