Thanks for your hints, but I'm still stuck... In the dataset I mentioned (N=134) there are only 3 NAs in the variable, and a 41% : 59% distribution of the two values. It doesn't look like the data is the cause...
I changed and simplified my function; it now prints the number of levels before doing the rest. Here's a "funny" error result:

> myfun(data, 'varname')

 Levels = 2

Error in t.test.formula(data[[nam[v]]] ~ data[[g]]) :
  grouping factor must have exactly 2 levels

...

I'll paste the simplified code; maybe it will give someone a clue about what is going wrong:

myfun <- function(data, g) {
    require(stats)
    data <- as.data.frame(data)
    nam <- names(data)
    res <- matrix(NA, ncol(data))
    cat("\n Levels =", nlevels(factor(data[[g]])), "\n\n")
    for (v in 1:ncol(data)) {
        if (nam[v] != g) {
            res[v] <- list(t.test(data[[nam[v]]] ~ data[[g]]))
        }
    }
    res
}

What is going wrong here?

Greetz,
Timo

2009/7/10 Marc Schwartz <marc_schwa...@me.com>:
> On Jul 9, 2009, at 5:04 PM, Tymek W wrote:
>
>> Hi,
>>
>> Could anyone tell me what is wrong:
>>
>>> length(unique(mydata$myvariable))
>>
>> [1] 2
>>>
>>
>> and in t-test:
>>
>> (...)
>> Error in t.test.formula(othervariable ~ myvariable, mydata) :
>>   grouping factor must have exactly 2 levels
>>>
>>
>> I re-checked the code and still don't get what is wrong.
>>
>> Moreover, there is some strange behavior:
>>
>> /1 It seems that the error is sensitive to NAs, because it affects
>> some variables in the data set with NAs and doesn't affect the same
>> ones in the dataset with NAs removed.
>>
>> /2 It seems it works differently with different ways of using
>> variables in t.test:
>>
>> e.g. it happens here: t.test(x~y, dataset) and does not here:
>> t.test(dataset[['x']]~dataset[['y']])
>>
>> Does anyone have any ideas?
>>
>> Greetz,
>> Timo
>
>
> Check the output of:
>
>   na.omit(cbind(mydata$othervariable, mydata$myvariable))
>
> which will give you some insight into what data is actually available to be
> used in the t test. This will remove any rows that have missing data. Your
> first test above, checking the number of levels, is before missing data is
> removed.
>
> The likelihood is that once missing values have been removed, you are only
> left with one unique grouping value in mydata$myvariable.
>
> For your note number 2, it should be the same for both examples, as in both
> cases, the same basic approach is used. For example:
>
> DF <- data.frame(x = c(1:3, NA, NA, NA), y = rep(1:2, each = 3))
>
>> DF
>    x y
> 1  1 1
> 2  2 1
> 3  3 1
> 4 NA 2
> 5 NA 2
> 6 NA 2
>
> # Remove missing data
>> na.omit(DF)
>   x y
> 1 1 1
> 2 2 1
> 3 3 1
>
>> t.test(x ~ y, data = DF)
> Error in t.test.formula(x ~ y, data = DF) :
>   grouping factor must have exactly 2 levels
>
>> t.test(DF$x ~ DF$y)
> Error in t.test.formula(DF$x ~ DF$y) :
>   grouping factor must have exactly 2 levels
>
>
> If you have a small reproducible example where the two function calls behave
> differently, please post back with it.
>
> HTH,
>
> Marc Schwartz
>
>

-- 
Regards,
Tymek W

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
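
The check suggested in the quoted reply can be run for every variable at once. What follows is only a minimal sketch, not code from the thread: levels_after_na is a hypothetical helper name, z is an extra illustrative column added to the reply's DF example, and it assumes the default na.action (na.omit), under which the formula method of t.test drops every row that has an NA in either the response or the grouping variable. Under that assumption, the number of levels printed for the grouping variable as a whole can differ from the number left for a particular response.

```r
# Hypothetical helper (not from the thread): for each non-grouping column,
# count how many levels of the grouping variable g still have data once
# rows with an NA in either column are dropped.
levels_after_na <- function(data, g) {
  data <- as.data.frame(data)
  others <- setdiff(names(data), g)
  sapply(others, function(v) {
    # Rows where both the response and the grouping value are present
    ok <- complete.cases(data[[v]], data[[g]])
    nlevels(factor(data[[g]][ok]))
  })
}

# The example from the reply, plus an illustrative complete column z
DF <- data.frame(x = c(1:3, NA, NA, NA),
                 z = c(2, 4, 6, 1, 3, 5),
                 y = rep(1:2, each = 3))

# y has 2 levels overall, but only 1 is left for x after NA removal
levels_after_na(DF, "y")
```

Any variable for which this returns something other than 2 would be expected to trigger the "grouping factor must have exactly 2 levels" error in the loop, even though the grouping variable itself has two levels.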