This is more a question about principal components analysis than about R. You
have 4 variables and they are moderately correlated with one another (the
lowest correlation, between weight and hole, is only .2). When the data consist
of measurements, this usually suggests that the overall size of the object is
being partly measured by each variable. In your case object size is measured by
the first principal component (PC1), with larger objects having more negative
scores, so larger objects are on the left and smaller ones are on the right of
the biplot.
The biplot can only display 2 of the 4 dimensions of your data at one time. In
the first 2 dimensions, diam and height are close together, but in the 3rd
dimension (PC3), they are on opposite sides of the component. Arrows that lie
close together in one pair of dimensions therefore do not guarantee a strong
correlation; the remaining dimensions can pull the two variables apart. If you
plot different pairs of dimensions (e.g. 1 with 3 or 2 with 3, see below), the
arrows will look different because you are looking from different directions.
> pca
Standard deviations:
[1] 1.5264292 0.8950379 0.7233671 0.5879295

Rotation:
              PC1         PC2         PC3        PC4
height -0.5210224 -0.06545193  0.80018012 -0.2897646
diam   -0.5473677  0.06309163 -0.57146893 -0.6081376
hole   -0.4598646 -0.70952862 -0.17476677  0.5045297
weight -0.4663141  0.69878797 -0.05090785  0.5400508
> biplot(pca, choices=c(1, 3))
> biplot(pca, choices=c(2, 3))
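
To make the point about dimensions concrete, here is a rough sketch (not part of the original analysis) that rebuilds the height/diam correlation from the printed loadings and standard deviations. It assumes the PCA was run on scaled variables, as in your prcomp() call, so that the correlation matrix equals rotation %*% diag(sdev^2) %*% t(rotation). The object names sdev, rotation, contrib and R_hat are my own, and the numbers are simply copied from the output above.

```r
# Values copied from the printed prcomp output (scaled variables assumed)
sdev <- c(1.5264292, 0.8950379, 0.7233671, 0.5879295)
rotation <- matrix(
  c(-0.5210224, -0.06545193,  0.80018012, -0.2897646,
    -0.5473677,  0.06309163, -0.57146893, -0.6081376,
    -0.4598646, -0.70952862, -0.17476677,  0.5045297,
    -0.4663141,  0.69878797, -0.05090785,  0.5400508),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("height", "diam", "hole", "weight"),
                  paste0("PC", 1:4)))

# Contribution of each component to cor(height, diam):
# loading(height) * loading(diam) * variance of that component
contrib <- rotation["height", ] * rotation["diam", ] * sdev^2
print(round(contrib, 3))

# Part of the correlation carried by PC1 and PC2 (what the first biplot shows)
print(sum(contrib[1:2]))

# All four components together; should be close to cor(mx_fus[,1], mx_fus[,2])
print(sum(contrib))

# Full correlation matrix rebuilt from the PCA
R_hat <- rotation %*% diag(sdev^2) %*% t(rotation)
print(round(R_hat, 2))
```

The PC1 term is large and positive while the PC3 term is negative, so the total over all four components ends up well below what the first two components alone would suggest.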
-------------------------------------
David L Carlson
Department of Anthropology
Texas A&M University
College Station, TX 77840-4352
-----Original Message-----
From: R-help [mailto:[email protected]] On Behalf Of Denis Francisci
Sent: Friday, March 10, 2017 4:45 AM
To: R-help Mailing List <[email protected]>
Subject: [R] problem with PCA
Hi all.
I'm a newbie in PCA and I don't understand a behaviour of R.
I have this data matrix:
>mx_fus
height diam hole weight
1 2.3 3.5 1.1 18
2 2.0 3.5 0.9 17
3 3.8 4.3 0.7 34
4 2.1 3.4 0.9 15
5 2.3 3.8 1.0 19
6 2.2 3.8 1.0 19
7 3.2 4.4 0.9 34
8 3.0 4.3 1.0 30
9 2.8 3.9 0.9 21
10 3.3 4.2 1.1 33
11 2.3 3.9 0.9 25
12 2.3 3.3 0.5 17
13 0.9 2.4 0.4 10
14 1.4 2.4 0.5 10
15 2.2 3.6 0.7 22
16 2.9 3.8 0.8 30
17 2.9 3.5 0.6 27
18 2.3 3.5 0.5 24
19 1.8 2.3 0.5 29
20 1.4 2.5 0.6 34
21 0.8 2.3 0.6 21
22 1.8 2.4 0.6 23
23 1.5 2.2 0.6 7
24 0.9 1.7 0.4 14
25 2.1 2.2 0.5 25
26 1.3 2.4 0.6 33
27 1.3 2.7 0.4 39
28 0.5 2.2 0.5 13
29 1.4 4.2 0.8 23
30 1.6 2.0 0.4 30
31 1.4 2.2 0.6 25
32 1.8 2.5 0.6 28
33 1.4 2.6 0.6 41
34 1.6 2.3 0.3 32
35 1.6 2.5 0.5 41
36 2.8 2.9 0.8 47
37 0.6 2.5 0.8 21
38 1.6 2.8 0.7 13
39 1.7 3.3 0.8 17
40 1.6 3.9 1.9 20
41 1.4 4.7 0.9 26
42 1.2 4.2 0.7 21
43 3.5 4.2 0.9 47
44 2.3 3.6 0.7 24
45 2.3 3.4 0.4 21
46 1.9 2.6 0.7 14
47 1.9 3.0 0.7 15
48 2.7 3.7 0.9 26
49 3.0 3.8 0.7 35
50 1.2 2.0 0.7 5
51 1.6 2.5 0.5 15
52 1.3 2.6 0.5 16
53 2.5 3.9 0.9 32
54 0.9 3.3 0.6 9
55 1.8 2.4 0.5 17
56 2.4 3.7 1.1 30
57 2.1 3.5 1.1 22
58 2.6 3.9 1.0 38
59 2.6 3.6 1.0 27
60 2.6 4.1 1.0 34
61 2.9 3.6 0.8 32
62 2.6 3.3 0.7 22
63 1.8 2.5 0.7 26
64 3.0 2.8 1.3 2
65 0.5 2.2 0.4 3
66 1.9 3.4 0.7 14
67 1.4 3.8 0.9 18
68 2.0 4.0 1.0 30
69 3.1 4.0 1.3 21
70 2.5 4.0 0.8 19
71 2.5 4.5 1.0 20
72 1.8 3.5 1.4 18
73 2.1 3.5 1.4 25
74 1.5 2.6 0.5 9
75 2.8 3.2 1.2 16
76 1.0 5.0 0.3 32
77 0.3 5.8 0.5 56
78 0.5 1.5 0.2 1
79 0.7 1.4 0.2 1
80 0.5 1.3 0.2 1
81 0.7 3.3 0.4 7
82 1.9 4.7 1.0 24
83 3.1 4.2 0.9 49
84 2.8 3.6 0.7 28
85 2.7 3.2 0.7 29
86 3.0 4.0 0.9 36
87 1.7 2.7 0.7 14
88 1.5 2.9 0.7 18
89 2.9 3.5 0.7 30
90 3.0 3.4 0.8 30
91 2.0 2.8 0.5 14
92 2.4 3.5 0.7 24
93 0.8 4.1 0.6 12
94 1.7 2.5 0.5 23
95 1.4 2.4 0.8 31
96 1.5 2.7 0.4 20
97 2.6 3.7 0.6 31
98 2.6 3.0 0.6 18
99 2.5 5.0 0.7 40
100 2.5 3.7 0.5 30
101 2.4 2.9 0.7 17
102 2.3 3.0 0.5 15
103 2.2 3.3 0.6 19
104 1.5 2.1 0.5 5
105 2.0 2.2 0.5 10
106 2.6 3.5 0.6 26
107 2.3 3.0 0.6 15
108 2.5 4.5 0.7 40
109 2.1 3.1 0.5 15
110 1.3 2.1 0.8 14
111 0.8 2.5 0.2 5
112 0.6 3.1 0.7 8
I perform a PCA in R:
>pca<-prcomp(mx_fus,scale=TRUE)
>biplot(pca, choices = c(1,2), cex=0.7)
The biplot puts the arrows of diam and height very close together, near the
first component axis.
So I understand that these 2 variables are well represented by PC1 and that
they are correlated with each other.
But if I test the correlation, the value of the correlation coefficient is low:
>cor(mx_fus[,1],mx_fus[,2])
0.4828185
Why does the plot say one thing and the correlation function say the opposite?
Do two nearby arrows not represent a strong correlation between the 2
variables (as I read in some manuals), but only with the component axis?
Thanks in advance
______________________________________________
[email protected] mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.