[R] Problem with a regression - Dataset Workinghours

Giorgio Monti Sat, 28 Jul 2012 21:32:18 -0700

I'm a student working on a research project using the statistical program "R
2.15.1".
Here's my problem: how can I run a regression that considers only values above a
certain limit?
For example, using the dataset "Workinghours" from the "Ecdat" package,
is it possible to build a predictive model that expresses the probability that a
wife works more than 8 hours per day?
The dataset includes 3382 observations on the number of hours per year that
wives in the USA spend working.


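library(Ecdat)                   # package that provides Workinghours
data(Workinghours)
attach(Workinghours)             # makes 'hours', 'income', etc. visible by name
hoursday <- hours/240            # yearly hours / 240 working days
index <- which(hoursday >= 8)    # positions with at least 8 hours per day
hoursday[index]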

As you can see, I'm able to extract the values of 'hoursday' (which is
hours/240 working days in one year) that are at least 8.0, but obviously I
can't run a regression, because the extracted data are a subset of the entire
dataset (955 observations), while the other variables, like age, occupation,
income, etc., still have their full length (3382).
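
The length mismatch can be reproduced on a small made-up data frame. This is
only a sketch: the data frame `toy` and its values are invented for
illustration and are not taken from Workinghours.

```r
# Toy data (invented values): yearly hours and one covariate, 6 rows
toy <- data.frame(hours  = c(2400, 1000, 1920, 500, 2600, 1500),
                  income = c(30, 22, 41, 18, 55, 27))

hoursday <- toy$hours / 240        # hours per working day
index    <- which(hoursday >= 8)   # row numbers with at least 8 hours/day

length(hoursday[index])            # only the selected rows
length(toy$income)                 # still the full column

# Indexing the covariate with the same index makes the lengths agree
length(toy$income[index])
```

Here the response has been shortened but the covariate has not, which is the
situation the error message below complains about.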

So I can't do:
lm = lm(hoursday[index] ~
income+age+education+unemp+child5+child13+child17+nonwhite+owned+mortgage+occupation)
In fact "R" gives me: Error in model.frame.default(formula =
hoursday[index] ~ income, drop.unused.levels = TRUE) : variable lengths
differ (found for 'income').
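
One possible way around this error, sketched under assumptions rather than
offered as the definitive approach: keep all variables in one data frame and
let `lm()` do the filtering through its `subset` argument, so that response
and predictors are restricted together. For the probability question itself, a
binary indicator fitted with `glm(..., family = binomial)` on all rows may be
closer to what is wanted. The data below are simulated, and the names `d`,
`long_day`, `fit_sub` and `fit_prob` are placeholders; with the real data one
would presumably pass `data = Workinghours` and the full list of predictors.

```r
set.seed(1)
# Simulated stand-in for the real data (all values are invented)
n <- 200
d <- data.frame(hours  = round(runif(n, 0, 3000)),
                income = round(rnorm(n, 40, 10)),
                age    = sample(20:60, n, replace = TRUE))
d$hoursday <- d$hours / 240

# (a) Linear regression restricted to rows with at least 8 hours/day;
#     subset= filters the response and the predictors together
fit_sub <- lm(hoursday ~ income + age, data = d, subset = hoursday >= 8)

# (b) Logistic regression for P(hoursday >= 8), using every row
d$long_day <- as.integer(d$hoursday >= 8)
fit_prob <- glm(long_day ~ income + age, data = d, family = binomial)

# Predicted probability for a hypothetical person (invented values)
p <- predict(fit_prob, newdata = data.frame(income = 45, age = 35),
             type = "response")
p
```

Variant (a) models hours per day among those who already work at least 8
hours, while variant (b) models the chance of reaching that threshold, so the
two answer different questions.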

Can you help me?

Thank you.

Giorgio

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
