>>>>> Rich Shepard >>>>> on Mon, 22 Jan 2024 07:45:31 -0800 (PST) writes:
> A statistical question, not specific to R. I'm asking for
> a pointer for a source of definitive descriptions of what
> types of data are best summarized by the arithmetic,
> geometric, and harmonic means.

In spite of being off-topic: I think it is a good question, not
really only about geochemistry, but about statistics in applied
sciences (and engineering, for that matter).

Something I am sure good applied statisticians in the 1980s and
1990s would all have known the answer to:

To use the geometric mean instead of the arithmetic mean is
basically *equivalent* to first log-transforming the data and then
working with the transformed data: not just for computing
averages, but for more relevant modelling, inference, etc.

John W. Tukey (and several others of the greats of the time) had
the log transform among the "First aid transformations":

If the data for a continuous variable must all be positive, it is
also typically the case that the distribution is considerably
skewed to the right. In such a case, behave as a good human who
sees another human in health distress: apply First Aid -- do the
things you learned to do quickly without too much thought,
because things must happen fast -- to hopefully save the other's
life.

Here: Do log-transform all such variables without further ado,
and only afterwards start your (exploratory and more) data
analysis.

Now,    mean(log(y)) = log(geometricmean(y)),
where mean() is the arithmetic mean as in R
{mathematically; on the computer you need all.equal(), not '==' !!}

I.e., according to Tukey and all the other experienced applied
statisticians of the past, the geometric mean is the "best thing"
to do for such positive right-skewed data, in the same sense that
the log transform is the best "a priori" transformation for such
data -- with even the one advantage that you need to fiddle with
zeroes when log-transforming, whereas the geometric mean already
works with zeroes (it is then simply 0).
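
To make the identity above concrete, here is a minimal sketch in R.
Note that geometric_mean() is a hypothetical helper written for this
illustration (base R has no function of that name), and the data are
simulated log-normal values, not real concentrations.

```r
## Hypothetical helper (not part of base R): geometric mean via the log scale.
geometric_mean <- function(y) {
  stopifnot(is.numeric(y), all(y >= 0))
  exp(mean(log(y)))  # log(0) is -Inf, so any zero yields exp(-Inf) = 0
}

set.seed(1)                              # arbitrary seed, reproducibility only
y <- rlnorm(50, meanlog = 0, sdlog = 1)  # simulated positive, right-skewed data

gm <- geometric_mean(y)
am <- mean(y)

## Mathematically identical; in floating point compare with all.equal()
all.equal(mean(log(y)), log(gm))         # TRUE
all.equal(gm, prod(y)^(1 / length(y)))   # TRUE: agrees with the n-th root of the product

gm <= am                                 # TRUE (AM-GM inequality)
geometric_mean(c(0, 1, 4))               # 0: defined, but a single zero forces it to 0
```

Going through exp(mean(log(y))) rather than prod(y)^(1/n) is a design
choice of this sketch: the product can overflow or underflow for long
vectors, whereas the mean of the logs stays on a moderate scale.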

Martin

> As an aquatic ecologist I see regulators apply the
> geometric mean to geochemical concentrations rather than
> using the arithmetic mean. I want to know whether the
> geometric mean of a set of chemical concentrations (e.g.,
> in mg/L) is an appropriate representation of the expected
> value. If not, I want to explain this to non-technical
> decision-makers; if so, I want to understand why my
> assumption is wrong.

> TIA,

> Rich

> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and
> more, see https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html and provide
> commented, minimal, self-contained, reproducible code.