My dataframe looks like this one:

SightingID <- c(2001,2002,2003,2004,2005,2006,2007,2008,2009,2010,2011,2012,2013)
PA1 <- c(0,1,0,0,1,1,1,1,0,0,-99,1,1)
PA2 <- c(1,NA,1,1,NA,-99,-99,NA,1,1,1,NA,NA)
PlotID <- c(1,1,2,2,2,3,3,3,4,4,4,4,5)
Area <- c(0.2,0.3,0.25,0.2,0.3,0.4,0.3,0.35,0.4,0.4,0.5,0.3,0.2)
DF <- cbind(SightingID,PA1,PA2,PlotID,Area)

There are several SightingID values for the same PlotID value, and I need to select only one SightingID for each PlotID value.

The SightingID selected for a given PlotID value needs to be:
- the one with PA1=0
- if there are several SightingID with PA1=0, the one with the highest Area value
- if there are several SightingID with PA1=0 and the same highest Area value, one of them chosen at random
- if there is no SightingID with PA1=0 for a PlotID value, the one with the highest Area value (chosen at random if several share the same highest Area value)

I would like to have this kind of result:

SightingID2 <- c(2001,2003,2006,2009,2013)
PA12 <- c(0,0,1,0,1)
PA22 <- c(1,1,-99,1,NA)
PlotID2 <- c(1,2,3,4,5)
Area2 <- c(0.2,0.25,0.4,0.4,0.2)
DF2 <- cbind(SightingID2,PA12,PA22,PlotID2,Area2)

Can someone help me?

Thanks,
Sarah
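For what it is worth, the selection rules described above could be sketched in base R roughly as follows. This is only a sketch under assumptions, not a definitive solution: the data are rebuilt with data.frame() (cbind() on vectors gives a numeric matrix), pick_one() is a hypothetical helper name, a missing PA1 is treated as "not 0", and Area is assumed to contain no NA values.

```r
# Data from the post, built as a data frame rather than a cbind() matrix
DF <- data.frame(
  SightingID = 2001:2013,
  PA1    = c(0, 1, 0, 0, 1, 1, 1, 1, 0, 0, -99, 1, 1),
  PA2    = c(1, NA, 1, 1, NA, -99, -99, NA, 1, 1, 1, NA, NA),
  PlotID = c(1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5),
  Area   = c(0.2, 0.3, 0.25, 0.2, 0.3, 0.4, 0.3, 0.35, 0.4, 0.4, 0.5, 0.3, 0.2)
)

# pick_one() is a hypothetical helper: given the sightings of a single plot,
# it returns exactly one row according to the rules in the post.
pick_one <- function(d) {
  zero <- !is.na(d$PA1) & d$PA1 == 0             # assumption: NA counts as "not 0"
  if (any(zero)) d <- d[zero, , drop = FALSE]    # prefer PA1 == 0 when available
  best <- which(d$Area == max(d$Area))           # rows tied for the largest Area
  if (length(best) > 1) best <- sample(best, 1)  # break ties at random
  d[best, , drop = FALSE]
}

set.seed(1)  # only to make the random tie-break repeatable
DF2 <- do.call(rbind, lapply(split(DF, DF$PlotID), pick_one))
rownames(DF2) <- NULL
DF2
```

With the example data, PlotID 4 has two candidates with PA1=0 and the same Area (2009 and 2010), so either one may be returned depending on the random draw; the other plots each have a single possible answer.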
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.