Dear Karl,

Since you compare a character with a numeric, R silently converts the numeric to character before comparing. And then you're in trouble.
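A minimal sketch of that coercion, using the three group values from your example. The vector name `group` is taken from your code; `same_chr` and `same_num` are illustrative names only, not anything from your script.

```r
group <- c("20000", "99999", "100000")

# Character vs. numeric: the number is turned into a string first,
# and the comparison is then done on strings.
group == 99999     # matches, because 99999 becomes "99999"
group == 100000    # no match, because 100000 becomes "1e+05"

# Same type on both sides, option 1: compare strings with strings.
same_chr <- group == "100000"

# Same type on both sides, option 2: convert once, compare numbers.
same_num <- as.numeric(group) == 100000
```

Both `same_chr` and `same_num` pick out the third element only, which is what the subset on group 100000 was meant to do.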
as.character(99999)   # "99999"
as.character(100000)  # "1e+05"

Bottom line: use the same type on both sides of the binary operator.

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

2015-11-17 20:14 GMT+01:00 Karl Schilling <karl.schill...@uni-bonn.de>:

> Dear all,
>
> I have one observation that I do not quite understand. Maybe someone
> can clarify this issue for me.
>
> I have a data frame which I want to subset based on a grouping variable,
> say "group". Actually, "group" is a numeric value, but it is saved as a
> character. I give some code to generate an exemplary data frame below.
>
> Now, if I use
>
> MySubset <- subset(Data, Data$group == "..")
>
> everything works fine, as expected. ".." stands here for the value of
> group given as a character string.
>
> Surprisingly, I also get a correct subsetting if I simply give the plain
> numeric value of group (like MySubset <- subset(Data, Data$group == ..)),
> AS LONG AS this numeric value is less than 100000.
>
> If the numeric value is 100000 or larger, I get an empty subset.
>
> OK, I know how to avoid this situation, but I wonder what the explanation
> for this (to me rather strange) behavior might be.
>
> Thank you so much for your suggestions.
>
> Karl Schilling
>
> #####
> Exemplary code for reproducing the above described problem:
>
> options(stringsAsFactors = F)
>
> # set up some data frame
> value <- c(1:6)
> group <- rep(c("20000", "99999", "100000"), each = 2)
> Data <- data.frame(value = value, group = group)
> str(Data)
>
> # subset data frame based on the value of the variable "group",
> # treating this value once as a character, and once as a number:
>
> Data20 <- subset(Data, Data$group == "20000")
> str(Data20)
> Data20N <- subset(Data, Data$group == 20000)
> str(Data20N)
>
> Data99 <- subset(Data, Data$group == "99999")
> str(Data99)
> Data99N <- subset(Data, Data$group == 99999)
> str(Data99N)
>
> Data100 <- subset(Data, Data$group == "100000")
> str(Data100)
> Data100N <- subset(Data, Data$group == 100000)
> str(Data100N)
>
> --
> Karl Schilling
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.