Re: [R] Statistical analysis of olive dataset

Michael Friendly Sun, 13 Mar 2016 08:27:40 -0700

On 3/12/2016 12:39 PM, Axel wrote:

The main goal of my analysis is to
determine which fatty acids characterize the origin of an oil. As
a secondary goal, I would like to insert the results of the chemical analysis
of an oil that I analyzed (I am a Chemistry student) in order to determine its
region of production. I do not know if this last thing is possible.

There are already plenty of tools for this; don't bother trying to re-invent an already well-working wheel.

* PCA + a biplot will give you a good overview. With groups, I recommend ggbiplot, with data ellipses for the groups.

This shows clear separation along PC1

data(olive, package="tourr")
library(ggbiplot)
olivenum <- olive[,c(3:10)]

olive.pca <- prcomp(olivenum, scale.=TRUE)
summary(olive.pca)

# region should be a factor (area has 9 levels, maybe too confusing)
# the raw coding is 1 = South, 2 = Sardinia, 3 = North
olive$region <- factor(olive$region, labels=c("South", "Sardinia", "North"))

ggbiplot(olive.pca, obs.scale = 1, var.scale = 1,
         groups = olive$region, ellipse = TRUE, varname.size=4,
         circle = TRUE) +
         theme_bw() +
         theme(legend.direction = 'horizontal',
               legend.position = 'top')


* Discrimination among regions by chemical composition:
A canonical discriminant analysis will show you this in
a low-rank view.  The biggest difference is between the South
and the other two regions.


# MLM
# refer to region by name so the model term is simply "region"
olive.mlm <- lm(as.matrix(olive[, 3:10]) ~ region, data=olive)

# Canonical discriminant analysis

# (need devel. version for ellipses)
# install.packages("candisc", repos="http://R-Forge.R-project.org")
library(candisc)
olive.can <- candisc(olive.mlm)
olive.can
plot(olive.can, ellipse=TRUE)
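
To see which fatty acids drive the separation, one option is to look at the
canonical structure coefficients, i.e. the correlations between each acid and
the canonical scores. This is only a sketch: it refits the model so that it
runs on its own, and it assumes the fatty acids sit in columns 3:10 as above.

```r
library(candisc)

# load the data only if it is not already in the workspace
if (!exists("olive")) data(olive, package = "tourr")
olive$region <- factor(olive$region)   # keeps any labels already assigned

olive.mlm <- lm(as.matrix(olive[, 3:10]) ~ region, data = olive)
olive.can <- candisc(olive.mlm)

# correlations of each fatty acid with the canonical dimensions;
# large absolute values mark the acids that best separate the regions
round(olive.can$structure, 2)
```

Acids with structure coefficients near +1 or -1 on the first dimension are the
ones to look at first.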

* You can probably use the predict() method for MASS::lda() to predict
the class for new samples.
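
Something along these lines should work, though I have not tuned it. The
names train, new.oil and cv.acc are just for illustration, and new.oil is a
stand-in (the first row of the data); replace it with your own measurements,
using the same column names and the same units as the olive data.

```r
library(MASS)

# load the data only if it is not already in the workspace
if (!exists("olive")) data(olive, package = "tourr")

# training data: region as a factor plus the 8 fatty acids (columns 3:10)
train <- data.frame(region = factor(olive$region), olive[, 3:10])

olive.lda <- lda(region ~ ., data = train)

# stand-in for a newly analyzed oil: one row with the same 8 columns
new.oil <- train[1, -1]

pred <- predict(olive.lda, newdata = new.oil)
pred$class                 # predicted region
round(pred$posterior, 3)   # posterior probability for each region

# leave-one-out cross-validation gives a rough idea of how reliable this is
olive.cv <- lda(region ~ ., data = train, CV = TRUE)
cv.acc <- mean(olive.cv$class == train$region)
cv.acc
```

If the posterior probability for the predicted region is not close to 1, I
would treat the assignment with caution.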

hope this helps,
-Michael

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
