rm(list=ls())
yx.df <- read.csv("c:/MK-2-72.csv", sep=',', header=T, dec='.')
dim(yx.df)

# get the response y and the X matrix
y <- yx.df[,1]
x <- yx.df[,2:643]

# convert to matrix
mat <- as.matrix(x)

# get the number of rows
rownum <- nrow(mat)

# remove the constant columns
mat1 <- mat[, apply(mat, 2, function(.col) !(all(.col[1] == .col[2:rownum])))]
dim(yx.df)
dim(mat1)

# remove columns in which more than 95% of the values are zero
mat2 <- mat1[, apply(mat1, 2, function(.col) !(sum(.col == 0)/rownum > 0.95))]
dim(yx.df)
dim(mat2)

# remove columns with sd < 0.5
mat3 <- mat2[, apply(mat2, 2, function(.col) !all(sd(.col) < 0.5))]
dim(yx.df)
dim(mat3)

# PCA on the filtered matrix
mat3.pr <- prcomp(mat3, cor=T)
summary(mat3.pr, loading=T)
pre.cmp <- predict(mat3.pr)
cmp <- pre.cmp[,1:3]
cmp

# regress y on the first three PCs
DF <- cbind(y, cmp)
DF <- as.data.frame(DF)
names(DF) <- c('y','p1','p2','p3')
DF
summary(lm(y~p1+p2+p3, data=DF))

# second PCA, this time on DF (y plus the three PCs)
mat3.pr <- prcomp(DF, cor=T)
summary(mat3.pr)
pre <- predict(mat3.pr)
pre1 <- pre[,1:3]
pre1
colnames(pre1) <- c("x1","x2","x3")
pre1
pc <- cbind(y, pre1)
pc <- as.data.frame(pc)
lm.pc <- lm(y~x1+x2+x3, data=pc)
summary(lm.pc)

Above is my code for the PCA. After running it, the values of the first three PCs are very large. Why? The fit (R-squared) is also poor. Below are my values on the first three PCs.

> pre1
              PC1          PC2          PC3
 [1,] -15181.5190  1944.392700 -1074.326182
 [2,] -32152.4533  1007.113729  3201.361408
 [3,] -15836.5362  2117.988273  -555.799383
 [4,]  -1618.5561  1481.020337   255.530132
 [5,]  -5407.5030  1975.779398   -84.646283
 [6,]  -9662.1949  2611.220928  -417.435782
 [7,] -30488.2102   577.385588  1853.420297
 [8,]  -2135.2563 -4506.112873  1382.413284
 [9,]  -1584.2796 -4645.142062   929.146895
[10,]   -668.7664 -4876.250486   177.691446
[11,]  -2188.5914 -4495.203080  1432.428127
[12,] -19633.9581  2159.000138 -1598.710872
[13,] -26849.1088  -515.574085 -2683.552623
[14,]  -9492.9503 -4868.648205  1236.986097
[15,] -13857.6517 -4810.228193  1296.342199
[16,] -11596.5097 -8181.631403   462.913210
[17,] -25948.6564  -746.442386 -3415.426682
[18,]  15386.4477   709.974524   555.160973
[19,]  21642.7516  1163.456075  -609.437740
[20,]  22236.7094   675.562564  -136.992578
[21,]  14354.9927   611.996274    -4.867054
[22,]  12569.9493  1111.842240   585.540985
[23,]  20739.0219  3078.679745  1662.902248
[24,]   9472.0249   648.769910   381.487034
[25,]  17299.5307  1424.712428  1522.311676
[26,]  13231.2735   587.761915   170.448061
[27,]  10843.5590   705.485396   -79.931518
[28,]   9402.8803 -1978.216853 -1534.244078
[29,]  13094.9525   212.042937  -363.941664
[30,]   9337.3522   537.885230   189.558999
[31,]   7747.1347  -141.004825 -1664.082447
[32,]   4640.1161 -1489.652284 -3584.574135
[33,]  13241.5054   175.630689  -486.250927
[34,]   3867.2204   814.830143  1584.358007
[35,]   8614.5030   708.274447   814.295587
[36,] -18815.6774  -480.311541  1248.369916
[37,]  -1860.0810  1195.557861   269.322703
[38,]   7172.0057     4.216905 -1191.448702
[39,]  -7233.2271 -2361.951658  -235.293358
[40,]   1841.3548  1187.225488   632.116420
[41,]  12465.2336   367.822405   160.751014
[42,] -39021.7259  1972.333778  3167.504098
[43,]  13098.7736  -424.152058  -567.846037
[44,]   9793.7729  -559.084900  -210.696126
[45,]  13111.1861    22.772626  -318.242722
[46,]  13169.0604     7.808885  -363.995563
[47,]   3306.6293  -694.908211  -642.996604
[48,]  10779.8582  -989.175596 -1619.861931
[49,]  10872.6913  -747.979343 -1375.317959
[50,]  -3057.5633  1838.449143  1454.886518
[51,]  -6854.9316  2338.753165  1113.510561
[52,] -15077.1823  1917.776905 -1158.158633
[53,] -45862.8305  1173.157521 -1707.293955
[54,] -14294.1553  1716.708462 -1794.064434
[55,]  24645.0508  2519.904889  1424.233563
[56,]  23303.5998  2250.088386   839.587354
[57,]  18865.5231   897.566446    36.240598
[58,]    227.2659 -6582.661199  -712.892569
[59,]  15336.8371   722.953549   593.903314
[60,]  13030.8715   228.509670  -312.933654
[61,]   5826.0388   331.077814   -53.417878
[62,]  13150.4446  -437.612023  -608.342969
[63,]  11728.3897   -83.151510   569.007995
[64,]  11021.5720  -869.425283 -1216.724017
[65,]   9625.3142   137.388994   138.735249
[66,] -15905.2704  3735.547166   421.846379
[67,] -15539.7628  3331.399648   104.886572
[68,]  -2294.9924  1648.164750   822.075221
[69,] -10120.0153  1558.766306  -333.378256
[70,] -24241.4554  -533.700229  1516.603088
[71,]  -1036.6022 -4782.136067   475.195011
[72,] -24575.2244  2655.599986 -1965.946921

The fit result is below:

Call:
lm(formula = y ~ x1 + x2 + x3, data = pc)

Residuals:
     Min       1Q   Median       3Q      Max
-1.29638 -0.47622  0.01059  0.49268  1.69335

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)  5.613e+00  8.143e-02  68.932  < 2e-16 ***
x1          -3.089e-05  5.150e-06  -5.998 8.58e-08 ***
x2          -4.095e-05  3.448e-05  -1.188    0.239
x3          -8.106e-05  6.412e-05  -1.264    0.210
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.691 on 68 degrees of freedom
Multiple R-squared: 0.3644, Adjusted R-squared: 0.3364
F-statistic: 12.99 on 3 and 68 DF, p-value: 8.368e-07

x2 and x3 are not significant. In principle, the PCs should be significant after PCA, but with my data they are not. Why?

--
View this message in context: http://old.nabble.com/after-PCA%2C-the-pc-values-are-so-large%2C-wrong--tp26240926p26240926.html
Sent from the R help mailing list archive at Nabble.com.
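
One thing that may be worth checking, offered as a guess rather than a diagnosis: `cor` is an argument of `princomp()`, not of `prcomp()`, so `prcomp(mat3, cor=T)` would run on the unscaled data and the scores would carry the units of the largest columns. The sketch below uses made-up data (the matrix `X` and its column scales are assumptions, not the data from the post) to compare unscaled scores with those from `scale. = TRUE`.

```r
# Hypothetical data: three columns on very different scales
# (an assumption for illustration, not the data from the post).
set.seed(1)
n <- 72
X <- cbind(a = rnorm(n, sd = 10000),
           b = rnorm(n, sd = 100),
           c = rnorm(n, sd = 1))

# Unscaled PCA: the default is scale. = FALSE.
pr_raw <- prcomp(X)

# 'cor' is not a prcomp() argument; it is passed through '...' and
# disregarded (some R versions warn), so the result matches pr_raw.
pr_cor <- suppressWarnings(prcomp(X, cor = TRUE))
isTRUE(all.equal(pr_cor$x, pr_raw$x))

# Scaled PCA: each column is standardised first,
# which corresponds to PCA on the correlation matrix.
pr_scaled <- prcomp(X, scale. = TRUE)

# Compare the size of the PC1 scores under the two settings.
range(pr_raw$x[, 1])
range(pr_scaled$x[, 1])
```

Separately, PCA picks components by the variance of the X columns alone and never looks at y, so a high-variance component is not guaranteed to be a significant predictor in the regression.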