Hi,

Try this:

which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379
#[20] 413 415 417 441 459 461 477 479 505
res10PercentSub1<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1) #most of the duplicated are dummy==1
res10PercentSub0<-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)
#for a duplicated dummy==1 row, take that row and the row after it
indx1<-as.numeric(row.names(res10PercentSub1))
indx11<-sort(c(indx1,indx1+1))
#for a duplicated dummy==0 row, take that row and the row before it
indx0<- as.numeric(row.names(res10PercentSub0))
indx00<- sort(c(indx0,indx0-1))
#all rows to drop
indx10<- sort(c(indx11,indx00))
nrow(res10Percent[-indx10,])
#[1] 452
res10PercentNew<-res10Percent[-indx10,]
nrow(subset(res10PercentNew,dummy==1))
#[1] 226
nrow(subset(res10PercentNew,dummy==0))
#[1] 226
nrow(unique(res10PercentNew))
#[1] 452

A.K.

----- Original Message -----
From: Cecilia Carmo <cecilia.ca...@ua.pt>
To: arun <smartpink...@yahoo.com>
Cc:
Sent: Monday, June 10, 2013 10:19 AM
Subject: RE: please check this

But I don't want it like this. Once a firm is paired with another, these two firms should not be paired again. Could you solve this?

Thanks,
Cecília

________________________________________
From: arun [smartpink...@yahoo.com]
Sent: Monday, June 10, 2013 15:12
To: Cecilia Carmo
Subject: Re: please check this

I did look into that. If you look at nrow() in each category, it will be different. That means the duplicates are not within a pair, but across the whole `result`. The explanation is, again, the multiple matches. Here we selected the row with dummy==0 whose dimension most closely matches that of a row with dummy==1. Suppose a row with dummy==1 has dimension `2554` and is matched with a dummy==0 row with dimension `2580`. Now consider another row with dummy==1 and dimension `2570` (which also falls within the same split group). It is then also matched with the dummy==0 row with `2580`. I guess it comes down to the way in which it was tested.

________________________________
From: Cecilia Carmo <cecilia.ca...@ua.pt>
To: arun <smartpink...@yahoo.com>
Sent: Monday, June 10, 2013 10:02 AM
Subject: please check this

When I do

res10Percent<- fun1(final3New,0.1,200)
dim(res10Percent)
#[1] 508 5
nrow(subset(res10Percent,dummy==0))
#[1] 254
nrow(subset(res10Percent,dummy==1))
#[1] 254
testingDuplicates<-unique(res10Percent)
nrow(testingDuplicates)
#[1] 480
#this should be 508; if it is not, there are duplicated rows, or not?
Thanks,
Cecilia

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
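
The explanation given in the thread can be reproduced on a toy table. The sketch below is only an illustration: the data frame `res`, its `firm` column, and the firm labels are made up, and it assumes (as the index arithmetic in the reply suggests) that each pair is stored as two consecutive rows, the dummy==1 firm first and its dummy==0 match second. Two dummy==1 firms (2554 and 2570) both get the same dummy==0 firm (2580), so that row appears twice and nrow(unique()) comes out smaller than nrow().

```r
# Toy stand-in for the output of fun1(); all values are hypothetical.
res <- data.frame(
  firm      = c("A", "X", "B", "X", "C", "Y"),
  dimension = c(2554, 2580, 2570, 2580, 3000, 3010),
  dummy     = c(1, 0, 1, 0, 1, 0)
)

nrow(res)          # 6 rows
nrow(unique(res))  # 5 rows: firm X was used as a match twice

# Positions of the second and later occurrences of a repeated row
dup <- which(duplicated(res))

# Partner row: the next row for a dummy==1 duplicate,
# the previous row for a dummy==0 duplicate
partner <- ifelse(res$dummy[dup] == 1, dup + 1, dup - 1)
drop <- sort(unique(c(dup, partner)))

# Guard the empty case: res[-integer(0), ] would return zero rows
resNew <- if (length(drop) > 0) res[-drop, ] else res

nrow(resNew)          # 4 rows
nrow(unique(resNew))  # 4 rows: no firm is used twice any more
```

This removes the later pair that reuses an already matched firm, which is the same idea as the indx10 step in the reply. It does not look for a replacement match for the firm that loses its partner (firm B here).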