Thank you very much guys!

On Fri, Feb 27, 2015 at 11:04 AM, William Dunlap <wdun...@tibco.com> wrote:
> You could define functions like
>    is.true <- function(x) !is.na(x) & x
>    is.false <- function(x) !is.na(x) & !x
> and use them in your selections.  E.g.,
>    > x <- data.frame(a=1:10,b=2:11,c=c(1,NA,3,NA,5,NA,7,NA,NA,10))
>    > x[is.true(x$c >= 6), ]
>        a  b  c
>    7   7  8  7
>    10 10 11 10
>
>
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Fri, Feb 27, 2015 at 7:27 AM, Dimitri Liakhovitski
> <dimitri.liakhovit...@gmail.com> wrote:
>>
>> Thank you very much, Duncan.
>> All this being said:
>>
>> What would you say is the most elegant and most safe way to solve such
>> a seemingly simple task?
>>
>> Thank you!
>>
>> On Fri, Feb 27, 2015 at 10:02 AM, Duncan Murdoch
>> <murdoch.dun...@gmail.com> wrote:
>> > On 27/02/2015 9:49 AM, Dimitri Liakhovitski wrote:
>> >> So, Duncan, do I understand you correctly:
>> >>
>> >> When I use x$x<6, R doesn't know if it's TRUE or FALSE, so it returns
>> >> a logical value of NA.
>> >
>> > Yes, when x$x is NA.  (Though I think you meant x$c.)
>> >
>> >> When this logical value is applied to a row, the R says: hell, I don't
>> >> know if I should keep it or not, so, just in case, I am going to keep
>> >> it, but I'll replace all the values in this row with NAs?
>> >
>> > Yes.  Indexing with a logical NA is probably a mistake, and this is one
>> > way to signal it without actually triggering a warning or error.
>> >
>> > BTW, I should have mentioned that the example where you indexed using
>> > -which(x$c>=6) is a bad idea:  if none of the entries were 6 or more,
>> > this would be indexing with an empty vector, and you'd get nothing, not
>> > everything.
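The -which() pitfall described above is easy to reproduce with the data frame from this thread. What follows is only a minimal sketch: the threshold of 100 is an assumption, picked so that no value of c qualifies, and the names idx, kept and none are illustrative.

```r
# Same data frame as in the thread
x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))

# Usual case: some entries are >= 6, so -which() drops exactly those rows
idx <- which(x$c >= 6)        # positions 7 and 10
kept <- x[-idx, ]
nrow(kept)                    # 8 rows, the NA rows are still present

# Edge case: nothing meets the condition (100 is an illustrative threshold)
none <- which(x$c >= 100)     # integer(0)
nrow(x[-none, ])              # 0 rows: -integer(0) is still integer(0)

# One alternative that does not rely on which(): drop only rows where the
# test is definitely TRUE. Note the parentheses, %in% binds tighter than >=.
nrow(x[!((x$c >= 100) %in% TRUE), ])   # 10 rows
nrow(x[!((x$c >= 6) %in% TRUE), ])     # 8 rows
```

The %in% TRUE form is just one possible choice; the is.true() helper quoted above should give the same selection.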
>> >
>> > Duncan Murdoch
>> >
>> >>
>> >> On Fri, Feb 27, 2015 at 9:13 AM, Duncan Murdoch
>> >> <murdoch.dun...@gmail.com> wrote:
>> >>> On 27/02/2015 9:04 AM, Dimitri Liakhovitski wrote:
>> >>>> I know how to get the output I need, but I would benefit from an
>> >>>> explanation why R behaves the way it does.
>> >>>>
>> >>>> # I have a data frame x:
>> >>>> x = data.frame(a=1:10,b=2:11,c=c(1,NA,3,NA,5,NA,7,NA,NA,10))
>> >>>> x
>> >>>> # I want to toss rows in x that contain values >=6. But I don't
>> >>>> # want to toss my NAs there.
>> >>>>
>> >>>> subset(x,c<6)      # Works correctly, but removes NAs in c, understand why
>> >>>> x[which(x$c<6),]   # Works correctly, but removes NAs in c, understand why
>> >>>> x[-which(x$c>=6),] # output I need
>> >>>>
>> >>>> # Here is my question: why does the following line replace the
>> >>>> # values of all rows that contain an NA in x$c with NAs?
>> >>>>
>> >>>> x[x$c<6,] # Leaves rows with c=NA, but makes the whole row an NA. Why???
>> >>>> x[(x$c<6) | is.na(x$c),] # output I need - I have to be super-explicit
>> >>>>
>> >>>> Thank you very much!
>> >>>
>> >>> Most of your examples (except the ones using which()) are doing
>> >>> logical indexing.  In logical indexing, TRUE keeps a line, FALSE
>> >>> drops the line, and NA returns NA.  Since "x$c < 6" is NA if x$c is
>> >>> NA, you get the third kind of indexing.
>> >>>
>> >>> Your last example works because in the cases where x$c is NA, it
>> >>> evaluates NA | TRUE, and that evaluates to TRUE.  In the cases where
>> >>> x$c is not NA, you get x$c < 6 | FALSE, and that's the same as
>> >>> x$c < 6, which will be either TRUE or FALSE.
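The three kinds of logical indexing described above can be checked directly on the data frame from this thread. This is a minimal sketch only; the names keep, res and fixed are illustrative and not from the original posts.

```r
# Same data frame as in the thread
x <- data.frame(a = 1:10, b = 2:11, c = c(1, NA, 3, NA, 5, NA, 7, NA, NA, 10))

# The comparison is NA wherever x$c is NA
keep <- x$c < 6
sum(keep, na.rm = TRUE)   # 3 TRUE values (rows 1, 3, 5)
sum(is.na(keep))          # 5 NA values (rows 2, 4, 6, 8, 9)

# An NA in a logical index yields an all-NA row instead of dropping the row
res <- x[keep, ]
nrow(res)                 # 8: three real rows plus five all-NA rows
sum(is.na(res$a))         # 5

# NA | TRUE is TRUE while NA | FALSE stays NA, so OR-ing with is.na()
# turns every NA in the index into TRUE
NA | TRUE
NA | FALSE
fixed <- x[keep | is.na(x$c), ]
nrow(fixed)               # 8 rows, all with real values in a and b
anyNA(fixed$a)            # FALSE
```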
>> >>>
>> >>> Duncan Murdoch
>> >>>
>> >>
>> >
>>
>> --
>> Dimitri Liakhovitski
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
-- 
Dimitri Liakhovitski