Hi all. I have a data frame with multiple columns/variables. The first variable is a major sample name, and each sample can have several sub-samples, so the same name can appear in more than one row. Currently I use the following command to remove the duplicates:
Samps_working <- Samps[!duplicated(Samps$ESR_Ref_edit), ]
This removes all of the duplicated sample rows. However, I just realised that, of course, it always keeps the first observation of each duplicated set and drops the rest, whatever the other columns contain. I wish to retain any rows that have the code "Y" in another variable, Samps$Loaded. I'm at a bit of a loss as to how best to approach this problem. Just to reiterate: I want to remove all duplicate rows based on sample name, but with preference given to removing the rows that do not have a "Y" in the Loaded column.
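One possible approach, sketched here under some assumptions: reorder the rows so that, within each sample name, any row with Loaded equal to "Y" comes first, and then drop duplicates as before. The toy data below is invented purely for illustration; only the column names ESR_Ref_edit and Loaded come from the description above. This sketch also assumes that keeping a single row per sample name is what is wanted, so if a name has several "Y" rows only the first of them survives.

```r
# Toy data frame; the values are made up for illustration only.
Samps <- data.frame(
  ESR_Ref_edit = c("A1", "A1", "B2", "B2", "B2", "C3"),
  Loaded       = c("N",  "Y",  "N",  "N",  "Y",  "N"),
  stringsAsFactors = FALSE
)

# Order by sample name, then by "is not loaded" (FALSE sorts before TRUE),
# so rows with Loaded == "Y" come first within each sample name.
# %in% is used instead of == so that NA in Loaded counts as "not Y".
ord <- order(Samps$ESR_Ref_edit, !(Samps$Loaded %in% "Y"))
Samps_sorted <- Samps[ord, ]

# duplicated() flags the second and later occurrences of each name,
# so negating it keeps the first row per name, which is now a "Y" row
# whenever one exists.
Samps_working <- Samps_sorted[!duplicated(Samps_sorted$ESR_Ref_edit), ]

print(Samps_working)
```

Using !duplicated(...) rather than -which(duplicated(...)) also avoids a corner case: when there are no duplicates at all, negative indexing with an empty vector returns zero rows instead of the full data frame.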