I'm a student working on a research project using the statistical program R 2.15.1. Here is my problem: how can I fit a regression using only the values above a certain limit? For example, taking the "Workinghours" dataset from the "Ecdat" package, is it possible to build a predictive model that expresses the probability that a wife works more than 8 hours per day? The dataset contains 3382 observations of the number of hours worked per year by wives in the USA.

    hoursday = hours/240
    index <- which(hoursday >= 8)
    hoursday[index]

As you can see, I am able to extract the values of 'hoursday' (hours divided by 240 working days per year) that are 8 or more, but obviously I can't run a regression on them, because the extracted data are only a subset of the dataset (955 observations), while the other variables, such as age, occupation, income, etc., are still complete (3382 observations). So I can't do:

    lm = lm(hoursday[index] ~ income+age+education+unemp+child5+child13+child17+nonwhite+owned+mortgage+occupation)

In fact, R gives me:

    Error in model.frame.default(formula = hoursday[index] ~ income, drop.unused.levels = TRUE) :
      variable lengths differ (found for 'income')

Can you help me? Thank you.

Giorgio
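One possible way around the length mismatch, given only as a minimal sketch: keep the derived variable inside the same data frame as the predictors and let the model function do the filtering, so the response and the predictors are always subset together. The data frame `wh` below is a simulated stand-in with hypothetical values (only `hours`, `income`, `age` and `education` are mimicked), and the names `wh`, `long_day`, `fit_lm` and `fit_glm` are assumptions made for illustration. Variant (a) restricts a linear model to the rows with at least 8 hours per day; variant (b) is closer to the question about a probability, using a binary indicator on all rows and a logistic regression.

```r
# Simulated stand-in for the real data (hypothetical values only);
# replace `wh` with the data frame that holds `hours` and the predictors.
set.seed(1)
n  <- 500
wh <- data.frame(
  hours     = round(runif(n, 0, 3000)),
  income    = round(runif(n, 50, 900)),
  age       = sample(18:64, n, replace = TRUE),
  education = sample(6:17, n, replace = TRUE)
)

# Keep the derived variable inside the data frame so all columns stay aligned.
wh$hoursday <- wh$hours / 240   # 240 working days per year, as in the post

# (a) Linear model restricted to rows with hoursday >= 8. The `subset`
#     argument filters response and predictors together, so the variable
#     lengths cannot differ.
fit_lm <- lm(hoursday ~ income + age + education,
             data = wh, subset = hoursday >= 8)

# (b) Probability of working at least 8 hours per day: a 0/1 response
#     defined on all rows, fitted as a logistic regression.
wh$long_day <- as.integer(wh$hoursday >= 8)
fit_glm <- glm(long_day ~ income + age + education,
               data = wh, family = binomial)

# Fitted probabilities for the first few rows.
head(predict(fit_glm, type = "response"))
```

Note that variant (a) selects observations on the response itself, so its coefficients describe only the truncated sample of women who already work 8 or more hours per day; variant (b) uses all observations.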