Hi

r-help-boun...@r-project.org wrote on 03.02.2009 09:43:04:
> Dear R-helpers,
>
> I've been thinking about this for some time; maybe someone can help. I have
> a fairly large dataset with thousands of firms, call them A, B, C, etc.,
> such as
>
>      [,1] [,2]
> [1,] "A"  0.5
> [2,] ""   0.2
> [3,] ""   0.3
> [4,] "B"  0.1
> [5,] ""   0.9
> [6,] "C"  0.4
>
> Or, to put it differently, two vectors such as
>
> y <- c("A", "", "", "B", "", "C")
> x <- c(0.5, 0.2, 0.3, 0.1, 0.9, 0.4)
>
> The empty entries "" always belong to the firm above. Now I want to collapse
> the dataset so that each firm (A, B, C, etc.) has one line only, using
> summation.
>
> So what I would like is
>
> yNew <- c("A", "B", "C")
> xNew <- c(1, 1, 0.4)

That is what NA values are for. There are quite useful functions for
handling them.

y <- c("A", "", "", "B", "", "C")
x <- c(0.5, 0.2, 0.3, 0.1, 0.9, 0.4)

# mark the empty labels as missing
y[y == ""] <- NA

# na.locf() is from package zoo; it carries the last non-NA value forward
library(zoo)
y.na <- na.locf(y)

# sum x within each firm
tapply(x, y.na, sum)
  A   B   C
1.0 1.0 0.4

or aggregate(...)

Regards
Petr

> The problem I'm having is that each firm has a different number of entries
> for x, so some like C have just one and others have ten or more, so I have
> difficulty imagining how to use a loop in this case.
> I'd be grateful for any suggestions.
> Karina
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
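
For completeness, here is a minimal base-R sketch of the aggregate() route mentioned above, without the zoo dependency. It is only one possible way to do it, not necessarily what was meant by "aggregate(...)". The names grp, firm and res are made up for illustration, and the sketch assumes the first entry of y is never empty. Note that aggregate() returns the firms sorted by label, which happens to match the original order here.

```r
# Same example data as in the thread
y <- c("A", "", "", "B", "", "C")
x <- c(0.5, 0.2, 0.3, 0.1, 0.9, 0.4)

# Each non-empty label starts a new firm, so the running count of
# non-empty labels is a group index: 1 1 1 2 2 3
# (assumption: y[1] is non-empty, otherwise the index would start at 0)
grp <- cumsum(y != "")

# Fill the blanks by looking up the label that opened each group
firm <- y[y != ""][grp]

# One row per firm, summing x within each firm
res <- aggregate(x, by = list(firm = firm), FUN = sum)

yNew <- as.character(res$firm)
xNew <- res$x
```

The cumsum() trick does the same job as na.locf() for this particular layout: both attach every blank row to the most recent labelled row, after which any grouped sum (tapply, aggregate, rowsum) collapses the data regardless of how many entries each firm has.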