My understanding was that the discordant names had already been identified. So in the example the OP gave, removing the rows with first = "Alex" is done by:

    df[df$first != "Alex", ]

If that is not the case, as others have pointed out, various forms of
tapply() (by, ave, etc.) can be used. I agree that that is not so
"basic," so I apologize if my understanding was incorrect.

Cheers,
Bert

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Sat, Feb 11, 2017 at 10:04 PM, Rolf Turner <r.tur...@auckland.ac.nz> wrote:
>
> On 12/02/17 18:36, Bert Gunter wrote:
>>
>> Basic stuff!
>>
>> Either subscripting or ?subset.
>>
>> There are many good R tutorials on the web. You should spend some
>> (more?) time with some.
>
>
> Uh, Bert, perhaps I'm being obtuse (a common occurrence) but it doesn't seem
> basic to me. The only way that I can see how to go at it is via
> a for loop:
>
> rdln <- function(X) {
>     # Remove discordant last names.
>     ok <- logical(nrow(X))
>     for(nm in unique(X$first)) {
>         xxx <- unique(X$last[X$first==nm])
>         if(length(xxx)==1) ok[X$first==nm] <- TRUE
>     }
>     Y <- X[ok,]
>     Y <- Y[order(Y$first),]
>     rownames(Y) <- 1:nrow(Y)
>     Y
> }
>
> Calling the toy data frame "melvin" rather than "df" (since "df" is the name
> of the built in F density function, it is bad form to use it as the name of
> another object) I get:
>
> > rdln(melvin)
>   first week last
> 1   Bob    1 John
> 2   Bob    2 John
> 3   Bob    3 John
> 4  Cory    1 Jack
> 5  Cory    2 Jack
>
> which is the desired output. If there is a "basic stuff" way to do this
> I'd like to see it. Perhaps I will then be toadally embarrassed, but they
> say that this is good for one.
>
> cheers,
>
> Rolf
>
> --
> Technical Editor ANZJS
> Department of Statistics
> University of Auckland
> Phone: +64-9-373-7599 ext. 88276
>
>> On Sat, Feb 11, 2017 at 9:02 PM, Val <valkr...@gmail.com> wrote:
>>>
>>> Hi all,
>>> I have a big data set and want to remove rows conditionally.
>>> In my data file each person were recorded for several weeks.
>>> Somehow during the recording periods, their last name was misreported. For
>>> each person, the last name should be the same. Otherwise remove from
>>> the data. Example, in the following data set, Alex was found to have
>>> two last names .
>>>
>>> Alex West
>>> Alex Joseph
>>>
>>> Alex should be removed from the data. if this happens then I want
>>> remove all rows with Alex. Here is my data set
>>>
>>> df <- read.table(header=TRUE, text='first week last
>>> Alex 1 West
>>> Bob 1 John
>>> Cory 1 Jack
>>> Cory 2 Jack
>>> Bob 2 John
>>> Bob 3 John
>>> Alex 2 Joseph
>>> Alex 3 West
>>> Alex 4 West ')
>>>
>>> Desired output
>>>
>>>   first week last
>>> 1   Bob    1 John
>>> 2   Bob    2 John
>>> 3   Bob    3 John
>>> 4  Cory    1 Jack
>>> 5  Cory    2 Jack

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
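
P.S. For the case where the discordant names are not known in advance, here is a minimal sketch of the tapply()/ave() route mentioned at the top of this message. It is only one way to do it, not necessarily what the other posters had in mind. The object names dat, n_by_first, keep_names, n_last, res and res2 are made up for illustration, and the data frame is called "dat" rather than "df" to follow Rolf's point about masking the built-in df().

```r
# Toy data from the original post (stored as "dat", a hypothetical name).
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "first week last
Alex 1 West
Bob 1 John
Cory 1 Jack
Cory 2 Jack
Bob 2 John
Bob 3 John
Alex 2 Joseph
Alex 3 West
Alex 4 West")

# tapply(): number of distinct last names recorded for each first name.
n_by_first <- tapply(dat$last, dat$first, function(x) length(unique(x)))

# Keep only the first names that have exactly one last name.
keep_names <- names(n_by_first)[n_by_first == 1]
res <- dat[dat$first %in% keep_names, ]

# Sort and renumber the rows to match the layout of the desired output.
res <- res[order(res$first, res$week), ]
rownames(res) <- NULL

# ave(): the same count, but returned once per row, so it can be used
# directly as a row filter. Row indices are passed in so the result
# stays an integer vector.
n_last <- ave(seq_len(nrow(dat)), dat$first,
              FUN = function(i) length(unique(dat$last[i])))
res2 <- dat[n_last == 1, ]
res2 <- res2[order(res2$first, res2$week), ]
rownames(res2) <- NULL

print(res)
```

Both variants drop every row for a first name that has more than one last name. The ave() form skips the intermediate lookup of names because it returns one count per row.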