This is exactly what I was looking for. Thank you.
Jonathan Thayn On Oct 3, 2014, at 10:32 AM, David L Carlson wrote: > You can reconstruct the data from the first component. Here's an example > using singular value decomposition on the original data matrix: > >> d <- cbind(d1, d2, d3, d4) >> d.svd <- svd(d) >> new <- d.svd$u[,1] * d.svd$d[1] > > new is basically your cp1. If we multiply it by each of the loadings, we can > create reconstructed values based on the first component: > >> dnew <- sapply(d.svd$v[,1], function(x) new * x) >> round(head(dnew), 1) > [,1] [,2] [,3] [,4] > [1,] 119.3 134.1 135.7 134.6 > [2,] 104.2 117.2 118.6 117.6 > [3,] 109.7 123.3 124.8 123.8 > [4,] 109.3 122.9 124.3 123.3 > [5,] 105.8 119.0 120.4 119.4 > [6,] 111.5 125.4 126.9 125.8 >> head(d) > d1 d2 d3 d4 > [1,] 113 138 138 134 > [2,] 108 115 120 115 > [3,] 105 127 129 120 > [4,] 103 127 129 120 > [5,] 109 119 120 117 > [6,] 115 126 126 123 > >> diag(cor(d, dnew)) > [1] 0.9233742 0.9921703 0.9890085 0.9910287 > > Since you want a single variable to stand for all four, you could scale new > to the mean: > >> newd <- new*mean(d.svd$v[,1]) >> head(newd) > [1] 130.9300 114.3972 120.3884 119.9340 116.1588 122.3983 > > ------------------------------------- > David L Carlson > Department of Anthropology > Texas A&M University > College Station, TX 77840-4352 > > > > -----Original Message----- > From: Jonathan Thayn [mailto:jth...@ilstu.edu] > Sent: Thursday, October 2, 2014 11:11 PM > To: David L Carlson > Cc: r-help@r-project.org > Subject: Re: [R] Using PCA to filter a series > > I suppose I could calculate the eigenvectors directly and not worry about > centering the time-series, since they essentially the same range to begin > with: > > vec <- eigen(cor(cbind(d1,d2,d3,d4)))$vector > cp <- cbind(d1,d2,d3,d4)%*%vec > cp1 <- cp[,1] > > I guess there is no way to reconstruct the original input data using just the > first component, though, is there? Not the original data in it entirety, just > one time-series that we representative of the general pattern. Possibly > something like the following, but with just the first component: > > o <- cp%*%solve(vec) > > Thanks for your help. It's been a long time since I've played with PCA. > > Jonathan Thayn > > > > > On Oct 2, 2014, at 4:59 PM, David L Carlson wrote: > >> I think you want to convert your principal component to the same scale as >> d1, d2, d3, and d4. But the "original space" is a 4-dimensional space in >> which d1, d2, d3, and d4 are the axes, each with its own mean and standard >> deviation. Here are a couple of possibilities >> >> # plot original values for comparison >>> matplot(cbind(d1, d2, d3, d4), pch=20, col=2:5) >> # standardize the pc scores to the grand mean and sd >>> new1 <- scale(pca$scores[,1])*sd(c(d1, d2, d3, d4)) + mean(c(d1, d2, d3, >>> d4)) >>> lines(new1) >> # Use least squares regression to predict the row means for the original >> four variables >>> new2 <- predict(lm(rowMeans(cbind(d1, d2, d3, d4))~pca$scores[,1])) >>> lines(new2, col="red") >> >> ------------------------------------- >> David L Carlson >> Department of Anthropology >> Texas A&M University >> College Station, TX 77840-4352 >> >> >> >> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On >> Behalf Of Don McKenzie >> Sent: Thursday, October 2, 2014 4:39 PM >> To: Jonathan Thayn >> Cc: r-help@r-project.org >> Subject: Re: [R] Using PCA to filter a series >> >> >> On Oct 2, 2014, at 2:29 PM, Jonathan Thayn <jth...@ilstu.edu> wrote: >> >>> Hi Don. I would like to "de-rotate� the first component back to its >>> original state so that it aligns with the original time-series. My goal is >>> to create a �cleaned�, or a �model� time-series from which noise has been >>> removed. >> >> Please cc the list with replies. It�s considered courtesy plus you�ll get >> more help that way than just from me. >> >> Your goal sounds almost metaphorical, at least to me. Your first axis >> �aligns� with the original time series already in that it captures the >> dominant variation >> across all four. Beyond that, there are many approaches to signal/noise >> relations within time-series analysis. I am not a good source of help on >> these, and you probably need a statistical consult (locally?), which is not >> the function of this list. >> >>> >>> >>> Jonathan Thayn >>> >>> >>> >>> On Oct 2, 2014, at 2:33 PM, Don McKenzie <d...@u.washington.edu> wrote: >>> >>>> >>>> On Oct 2, 2014, at 12:18 PM, Jonathan Thayn <jth...@ilstu.edu> wrote: >>>> >>>>> I have four time-series of similar data. I would like to combine these >>>>> into a single, clean time-series. I could simply find the mean of each >>>>> time period, but I think that using principal components analysis should >>>>> extract the most salient pattern and ignore some of the noise. I can >>>>> compute components using princomp >>>>> >>>>> >>>>> d1 <- c(113, 108, 105, 103, 109, 115, 115, 102, 102, 111, 122, 122, 110, >>>>> 110, 104, 121, 121, 120, 120, 137, 137, 138, 138, 136, 172, 172, 157, >>>>> 165, 173, 173, 174, 174, 119, 167, 167, 144, 170, 173, 173, 169, 155, >>>>> 116, 101, 114, 114, 107, 108, 108, 131, 131, 117, 113) >>>>> d2 <- c(138, 115, 127, 127, 119, 126, 126, 124, 124, 119, 119, 120, 120, >>>>> 115, 109, 137, 142, 142, 143, 145, 145, 163, 169, 169, 180, 180, 174, >>>>> 181, 181, 179, 173, 185, 185, 183, 183, 178, 182, 182, 181, 178, 171, >>>>> 154, 145, 147, 147, 124, 124, 120, 128, 141, 141, 138) >>>>> d3 <- c(138, 120, 129, 129, 120, 126, 126, 125, 125, 119, 119, 122, 122, >>>>> 115, 109, 141, 144, 144, 148, 149, 149, 163, 172, 172, 183, 183, 180, >>>>> 181, 181, 181, 173, 185, 185, 183, 183, 184, 182, 182, 181, 179, 172, >>>>> 154, 149, 156, 156, 125, 125, 115, 139, 140, 140, 138) >>>>> d4 <- c(134, 115, 120, 120, 117, 123, 123, 128, 128, 119, 119, 121, 121, >>>>> 114, 114, 142, 145, 145, 144, 145, 145, 167, 172, 172, 179, 179, 179, >>>>> 182, 182, 182, 182, 182, 184, 184, 182, 184, 183, 183, 181, 179, 172, >>>>> 149, 149, 149, 149, 124, 124, 119, 131, 135, 135, 134) >>>>> >>>>> >>>>> pca <- princomp(cbind(d1,d2,d3,d4)) >>>>> plot(pca$scores[,1]) >>>>> >>>>> This seems to have created the clean pattern I want, but I would like to >>>>> project the first component back into the original axes? Is there a >>>>> simple way to do that? >>>> >>>> Do you mean that you want to scale the scores on Axis 1 to the mean and >>>> range of your raw data? Or their mean and variance? >>>> >>>> See >>>> >>>> ?scale >>>>> >>>>> >>>>> >>>>> >>>>> Jonathan B. Thayn >>>>> >>>>> >>>>> ______________________________________________ >>>>> R-help@r-project.org mailing list >>>>> https://stat.ethz.ch/mailman/listinfo/r-help >>>>> PLEASE do read the posting guide >>>>> http://www.R-project.org/posting-guide.html >>>>> and provide commented, minimal, self-contained, reproducible code. >>>> >>>> Don McKenzie >>>> Research Ecologist >>>> Pacific WIldland Fire Sciences Lab >>>> US Forest Service >>>> >>>> Affiliate Professor >>>> School of Environmental and Forest Sciences >>>> College of the Environment >>>> University of Washington >>>> d...@uw.edu >>> >> >> Don McKenzie >> Research Ecologist >> Pacific WIldland Fire Sciences Lab >> US Forest Service >> >> Affiliate Professor >> School of Environmental and Forest Sciences >> College of the Environment >> University of Washington >> d...@uw.edu >> >> >> [[alternative HTML version deleted]] >> ______________________________________________ >> R-help@r-project.org mailing list >> https://stat.ethz.ch/mailman/listinfo/r-help >> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html >> and provide commented, minimal, self-contained, reproducible code. > ______________________________________________ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.