#or
library(plyr)
res<-ddply(df1,.(INDX),summarize,Debut=head(Debut,1),Fin=tail(Fin,1))
res$INDX<-factor(res$INDX,levels=unique(df1$INDX))
res[order(res$INDX),-1]
#       Debut        Fin
#3 24/01/1995 31/12/1997
#4 02/02/1995 12/03/1995
#1 13/03/1995 30/06/1995
#2 01/01/1996 31/01/1996
A.K.

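For readers without plyr, the same "first Debut, last Fin per INDX" reduction can be sketched in base R. This is only an illustration, not code from the thread: the names `grp`, `parts` and `res` are made up here, and `stringsAsFactors = FALSE` is passed explicitly so the dates stay character regardless of R version.

```r
# Sample data as posted in the thread, kept as character columns.
df1 <- data.frame(
  Debut = c("24/01/1995", "01/05/1997", "31/12/1997", "02/02/1995", "28/02/1995",
            "01/03/1995", "13/03/1995", "01/01/1996", "31/01/1996"),
  Fin   = c("30/04/1997", "30/12/1997", "31/12/1997", "27/02/1995", "28/02/1995",
            "12/03/1995", "30/06/1995", "30/01/1996", "31/01/1996"),
  INDX  = c(6, 6, 6, 11, 11, 11, 4, 5, 5),
  stringsAsFactors = FALSE
)

# Fix the factor levels to the order of first appearance (6, 11, 4, 5),
# so the groups are not sorted numerically.
grp <- factor(df1$INDX, levels = unique(df1$INDX))

# split() returns one data frame per level, in level order.
parts <- split(df1, grp)

# Take the first Debut and the last Fin of each piece.
res <- data.frame(
  Deb = unname(vapply(parts, function(d) d$Debut[1], character(1))),
  Fin = unname(vapply(parts, function(d) d$Fin[nrow(d)], character(1))),
  stringsAsFactors = FALSE
)
res
```

Setting the factor levels up front avoids the reorder step needed after ddply, which sorts its groups by INDX.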
----- Original Message -----
From: arun <smartpink...@yahoo.com>
To: Arnaud Michel <michel.arn...@cirad.fr>
Cc: R help <r-help@r-project.org>; Rui Barradas <ruipbarra...@sapo.pt>
Sent: Wednesday, July 17, 2013 4:14 PM
Subject: Re: [R] simplify a dataframe

Hi,
You could try:
df1[,1:2]<-lapply(df1[,1:2],as.character)
df2New<- data.frame(Deb=unique(with(df1,ave(Debut,INDX,FUN=function(x) head(x,1)))),
                    Fin=unique(with(df1,ave(Fin,INDX,FUN=function(x) tail(x,1)))))
identical(df2New,df2)
#[1] TRUE
A.K.

----- Original Message -----
From: Arnaud Michel <michel.arn...@cirad.fr>
To: Rui Barradas <ruipbarra...@sapo.pt>; R help <r-help@r-project.org>; arun <smartpink...@yahoo.com>
Cc:
Sent: Wednesday, July 17, 2013 4:03 PM
Subject: Re: [R] simplify a dataframe

Thank you for the answer to question (1).
Sorry for the imprecision in question (2).
Suppose the data frame df1:

df1 <- data.frame(
Debut = c("24/01/1995", "01/05/1997", "31/12/1997", "02/02/1995", "28/02/1995",
          "01/03/1995", "13/03/1995", "01/01/1996", "31/01/1996"),
Fin = c("30/04/1997", "30/12/1997", "31/12/1997", "27/02/1995", "28/02/1995",
        "12/03/1995", "30/06/1995", "30/01/1996", "31/01/1996"),
INDX = c(6,6,6, 11,11,11, 4, 5,5)
)

I would like to replace df1 with df2:

df2 <- data.frame(
Deb = c("24/01/1995", "02/02/1995", "13/03/1995", "01/01/1996"),
Fin = c("31/12/1997", "12/03/1995", "30/06/1995", "31/01/1996")
)

Explanation:
Lines 1, 2 and 3 of df1 (which share INDX = 6) are replaced by a single line with
Deb of df2 = Debut of line 1 of df1 and Fin of df2 = Fin of line 3 of df1.

Lines 4, 5 and 6 of df1 (which share INDX = 11) are replaced by a single line with
Deb of df2 = Debut of line 4 of df1 and Fin of df2 = Fin of line 6 of df1.

Line 7 of df1 (the only line with INDX = 4) is replaced by a single line with
Deb of df2 = Debut of line 7 of df1 and Fin of df2 = Fin of line 7 of df1
==> No change

Lines 8 and 9 of df1 (which share INDX = 5) are replaced by a single line with
Deb of df2 = Debut of line 8 of df1 and Fin of df2 = Fin of line 9 of df1.

df1
       Debut        Fin INDX
1 24/01/1995 30/04/1997    6
2 01/05/1997 30/12/1997    6
3 31/12/1997 31/12/1997    6
4 02/02/1995 27/02/1995   11
5 28/02/1995 28/02/1995   11
6 01/03/1995 12/03/1995   11
7 13/03/1995 30/06/1995    4
8 01/01/1996 30/01/1996    5
9 31/01/1996 31/01/1996    5

df2
         Deb        Fin
1 24/01/1995 31/12/1997
2 02/02/1995 12/03/1995
3 13/03/1995 30/06/1995
4 01/01/1996 31/01/1996

Thank you for your help
Michel

On 17/07/2013 19:57, Rui Barradas wrote:
> Hello,
>
> As for question (1), try the following.
>
>
> y2 <- cumsum(c(TRUE, diff(x1) > 0))
> identical(as.integer(y1), y2)  # y1 is of class "numeric"
>
>
> As for question (2), I don't understand it.
>
> Hope this helps,
>
> Rui Barradas
>
> On 17-07-2013 18:21, Arnaud Michel wrote:
>> Hi Arun
>>
>> I have two more questions, still about simplifying a dataframe
>>
>> I would like
>> 1) to transform the vector x1 into the vector y1
>> x1 <- c(1,1,1,-1000, 1,-1000, 1,1,1,1,1,1,-1000)
>> y1 <- c(1,1,1,1, 2,2, 3,3,3,3,3,3,3)
>>
>>
>> 2) to transform the vectors Debut and Fin by taking into account INDX
>> into the two vectors Deb and Fin
>> Debut <- c (
>> "24/01/1995", "01/05/1997" ,"31/12/1997", "02/02/1995" ,"28/02/1995"
>> ,"01/03/1995",
>> "13/03/1995", "01/01/1996", "31/01/1996", "24/01/1995", "01/07/1995"
>> ,"01/09/1995",
>> "01/07/1997", "01/01/1998", "01/08/1998", "01/01/2000",
>> "17/01/2000","29/02/2000")
>>
>> Fin <- c (
>> "30/04/1997", "30/12/1997" ,"31/12/1997", "27/02/1995", "28/02/1995",
>> "12/03/1995",
>> "30/06/1995", "30/01/1996", "31/01/1996", "30/06/1995", "31/08/1995",
>> "30/06/1997",
>> "31/12/1997", "31/07/1998", "31/12/1999", "16/01/2000", "28/02/2000",
>> "29/02/2000")
>>
>> INDX <- c(6,6,6, 11,11,11, 4, 5,5)
>>
>>
>> Deb <- c("24/01/1995", "02/02/1995", "13/03/1995",
>> "01/01/1996")
>> Fin <- c("31/12/1997", "12/03/1995", "30/06/1995",
>> "31/01/1996")
>>
>>
>> Debut          Fin            INDX
>> *24/01/1995*   30/04/1997     6
>> 01/05/1997     30/12/1997     6
>> 31/12/1997     *31/12/1997*   6
>> *02/02/1995*   27/02/1995     11
>> 28/02/1995     28/02/1995     11
>> 01/03/1995     *12/03/1995*   11
>> *13/03/1995*   *30/06/1995*   4
>> *01/01/1996*   30/01/1996     5
>> 31/01/1996     *31/01/1996*   5
>> ................
>>
>> Thanks for your help
>>
>>
>>
>> ______________________________________________
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>

-- 
Michel ARNAUD
Chargé de mission auprès du DRH
DGDRD-Drh - TA 174/04
Av Agropolis
34398 Montpellier cedex 5
tel : 04.67.61.75.38
fax : 04.67.61.57.87
port: 06.47.43.55.31
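
For question (1), it may help to see why the cumsum() one-liner works. The sketch below breaks it into steps using the x1 from the thread. The second half applies the same idiom to INDX to number consecutive runs. That part is an assumption about how one might extend it (for example if the same INDX value could reappear later in the data) and was not proposed by anyone in the thread. The names `starts` and `run_id` are made up for illustration.

```r
x1 <- c(1, 1, 1, -1000, 1, -1000, 1, 1, 1, 1, 1, 1, -1000)

# diff(x1) > 0 is TRUE exactly where the series climbs back from -1000 to 1,
# i.e. at the first element of each new block. The leading TRUE opens block 1.
starts <- c(TRUE, diff(x1) > 0)

# Each TRUE adds 1 to the running total, so cumsum() yields the block number.
y2 <- cumsum(starts)
y2

# Same idiom on a grouping column (hypothetical extension): a new run starts
# wherever the value differs from the previous one.
INDX <- c(6, 6, 6, 11, 11, 11, 4, 5, 5)
run_id <- cumsum(c(TRUE, diff(INDX) != 0))
run_id
```

Because cumsum() on a logical vector returns integers, y2 is of class "integer", which is why the reply compares it against as.integer(y1).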