Dear useRs,

Thanks to Joris Meys for the advice. I will now try to make this work for a less specific case, to state the problem more generally. The result should then be displayed for every group between non-empty strings in c2, i.e. not only the result for:

#mean:
c1        c3      c4     c5
35        Start1  Stop1  Start1-Stop1
25.48585  Start2  Stop2  Start2-Stop2

but also for every group formed by the span between two neighbouring strings in c2, where c2 contains only runs of NA, NA, NA interrupted from time to time by a single string, i.e.:

#mean:
c1        c3      c4      c5
35        Start1  Stop1   Start1-Stop1
..        Stop1   Start2  Stop1-Start2
25.48585  Start2  Stop2   Start2-Stop2

Put more simply, suppose c2 is

A, NA, NA, NA, NA, B, NA, NA, NA, C, NA, NA, NA, NA, D, NA, NA

Then the result should be:

#mean:
c1        c3  c4  c5
35        A   B   A-B
..        B   C   B-C
25.48585  C   D   C-D
...

I am looking for a more general method (function) that groups between these letters in c2. I will now study the solution proposed by Joris Meys.

Thanks for the immediate answer,
Kaluza

-----Original message-----
From: Joris Meys [mailto:jorism...@gmail.com]
Sent: Thu 2010-06-24 15:14
To: Eugeniusz Kałuża
Cc: r-help@r-project.org
Subject: Re: [R] ?to calculate sth for groups defined between points in one variable (string), / value separating/ spliting variable into groups by i.e. between start, NA, NA, stop1, start2, NA, stop2

On Thu, Jun 24, 2010 at 1:18 PM, Eugeniusz Kaluza <eugeniusz.kal...@polsl.pl> wrote:
>
> Dear useRs,
>
> Thanks for any advice
>
> # I do not know where to find examples of how to mark groups
> # based on signal occurrence in an additional variable: cf.
> # variable c2,
> # How to calculate different statistics for groups defined by (split by
> # occurrences of characteristic values in c2)
>
>
> # First example of simple data
> #example   1  2   3  4   5  6  7  8  9 10 11  12 13 14 15 16 17
> c0<-rbind( 1, 2 , 3, 4, 5, 6, 7, 8, 9,10,11, 12,13,14,15,16,17 )
> c0
> c1<-rbind(10, 20 ,30,40, 50,10,60,20,30,40,50, 30,10, 0,NA,20,10.3444)
> c1
> c2<-rbind(NA,"Start1",NA,NA,"Stop1",NA,NA,NA,NA,NA,NA,"Start2",NA,NA,NA,NA,"Stop2")
> c2
> C.df<-data.frame(cbind(c0,c1,c2))
> colnames(C.df)<-c("c0","c1","c2")
> C.df
>
> # preparation of a form to explain the needed result (the next 3 lines are
> # not actually needed; they only explain how to obtain the final result)
> c3<-rbind(NA,"Start1","Start1","Start1","Start1","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2","Start2")
> c4<-rbind(NA, "Stop1", "Stop1", "Stop1", "Stop1", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2", "Stop2")
> C.df<-data.frame(cbind(c0,c1,c2,c3,c4))
> colnames(C.df)<-c("c0","c1","c2","c3","c4")
> C.df$c5<-paste(C.df$c3,C.df$c4,sep="-")
> C.df
>

Now this is something I don't get. The list "Start2-Stop2" starts way before Start2, actually at Stop1. Are you sure that's what you want?

I took the liberty of showing how to get the data between start and stop for every entry, and how to apply functions to it. If you don't get the code, look at
?lapply
?apply
?grep

I also adjusted your example, as you caused all variables to be factors by using cbind inside the data.frame function. Never do this unless you're really sure you have to. But I can't think of a case where that would be beneficial...
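The cbind() remark can be checked with a small sketch. The vectors `num` and `lab` below are shortened, made-up stand-ins (not names from the thread). cbind() on a numeric and a character vector produces one character matrix, so no column of the resulting data frame is numeric any more (factor in R versions before 4.0.0, character from 4.0.0 on), whereas passing the vectors to data.frame() separately keeps each type.

```r
# Hypothetical shortened vectors, for illustration only
num <- c(10, 20, 30, NA)
lab <- c(NA, "Start1", NA, "Stop1")

# cbind() builds a single matrix; mixing numbers and strings makes it character
m <- cbind(num, lab)
bad <- data.frame(m)

# Passing the vectors separately keeps num numeric
good <- data.frame(num, lab, stringsAsFactors = FALSE)

is.character(m)        # TRUE
is.numeric(bad$num)    # FALSE
is.numeric(good$num)   # TRUE
mean(good$num, na.rm = TRUE)
```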
C.df<-data.frame(c0,c1,c2)
C.df

# find positions
Start <- grep("Start",C.df$c2)
Stop <- grep("Stop",C.df$c2)

# create indices; Map() always returns a list, whereas apply() would
# simplify to a matrix if all intervals had the same length
idx <- Map(seq, Start, Stop)
names(idx) <- paste("Start",seq_along(Start),"-Stop",seq_along(Start),sep="")

# Apply the function summary and get a list back named by the interval.
out <- lapply(idx,function(i) summary(C.df[i,1:2]))
out

If you really need to start Start2 right after Stop1, you can use a similar approach.

Cheers
Joris

> # NEEDED RESULTS
> # needed result
> # for Start1-Stop1: mean(c(20,30,40,50))
> # for Start2-Stop2: mean(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
> #mean:
> c1        c3      c4     c5
> 35        Start1  Stop1  Start1-Stop1
> 25.48585  Start2  Stop2  Start2-Stop2
>
> #sum
> # for Start1-Stop1: sum(20,30,40,50)
> # for Start2-Stop2: sum(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
> #sum:
> c1        c3      c4     c5
> 140       Start1  Stop1  Start1-Stop1
> 280.3444  Start2  Stop2  Start2-Stop2
>
> # for Start1-Stop1: max(20,30,40,50)
> # for Start2-Stop2: max(c(10,60,20,30,40,50,30,10,0,NA,20,10.3444), na.rm=T)
> #max:
> c1  c3      c4     c5
> 50  Start1  Stop1  Start1-Stop1
> 60  Start2  Stop2  Start2-Stop2
>
> # place of max (in Start1-Stop1: 4th element in group Start1-Stop1)
> # place of max (in Start2-Stop2: 2nd element in group Start2-Stop2)
>
> c0  c3      c4     c5
> 4   Start1  Stop1  Start1-Stop1
> 2   Start2  Stop2  Start2-Stop2
>
>
> Thanks for any suggestion,
> Kaluza
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

--
Joris Meys
Statistical consultant

Ghent University
Faculty of Bioscience Engineering
Department of Applied mathematics, biometrics and process control

tel : +32 9 264 59 87
joris.m...@ugent.be
-------------------------------
Disclaimer : http://helpdesk.ugent.be/e-maildisclaimer.php
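The statistics listed under NEEDED RESULTS in the quoted question can be sketched along the same lines as the reply. This is one possible approach, not code from the thread. It assumes the reading the question asks for: the first group begins at Start1 and every later group begins on the row right after the previous Stop. The helper name `interval_stats` is made up for this example. Note that R's mean() averages only its first argument, so the values are passed as one vector.

```r
# Data from the thread, as plain vectors
c1 <- c(10, 20, 30, 40, 50, 10, 60, 20, 30, 40, 50, 30, 10, 0, NA, 20, 10.3444)
c2 <- c(NA, "Start1", NA, NA, "Stop1", NA, NA, NA, NA, NA, NA,
        "Start2", NA, NA, NA, NA, "Stop2")

Start <- grep("Start", c2)
Stop  <- grep("Stop", c2)

# Assumption: the first group begins at Start1, every later group
# begins on the row after the previous Stop
from <- c(Start[1], head(Stop, -1) + 1)
idx  <- Map(seq, from, Stop)

# Hypothetical helper: one row of statistics per group
interval_stats <- function(i) {
  x <- c1[i]
  data.frame(mean = mean(x, na.rm = TRUE),
             sum = sum(x, na.rm = TRUE),
             max = max(x, na.rm = TRUE),
             place_of_max = which.max(x))  # position within the group
}

res <- do.call(rbind, lapply(idx, interval_stats))
res$c5 <- paste(c2[Start], c2[Stop], sep = "-")
res
```

For the thread's data this gives means 35 and 25.48585, sums 140 and 280.3444, and maxima 50 and 60 at positions 4 and 2. A call such as mean(20, 30, 40, 50) would return 20, because only the first argument is averaged.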
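For the more general request at the top of this message, grouping between every pair of neighbouring non-NA strings in c2, one possible sketch follows. The c1 values and the function name `between_markers` are invented for illustration. Whether the marker rows themselves belong to a group is also an assumption: here both end rows are included, so a marker row shared by two groups is counted in both. Rows after the last marker are ignored.

```r
# Marker pattern from the message; the c1 values are made up for illustration
c2 <- c("A", NA, NA, NA, NA, "B", NA, NA, NA, "C", NA, NA, NA, NA, "D", NA, NA)
c1 <- seq_along(c2) * 10

# Hypothetical helper: apply FUN to the values between each pair of
# neighbouring markers (both marker rows included)
between_markers <- function(values, marks, FUN = mean, ...) {
  pos <- which(!is.na(marks))             # rows that carry a marker
  stopifnot(length(pos) >= 2)
  from <- head(pos, -1)                   # A, B, C
  to   <- tail(pos, -1)                   # B, C, D
  data.frame(
    c1 = mapply(function(a, b) FUN(values[a:b], ...), from, to),
    c3 = marks[from],
    c4 = marks[to],
    c5 = paste(marks[from], marks[to], sep = "-"),
    stringsAsFactors = FALSE
  )
}

out <- between_markers(c1, c2, mean, na.rm = TRUE)
out
```

Passing sum, max or which.max as FUN gives the other statistics from the question. To exclude the opening marker row from each group, `values[a:b]` could be changed to `values[(a + 1):b]`.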