Have you considered the implications of that solution?

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Kum-Hoe Hwang
> Sent: Wednesday, February 17, 2010 1:41 AM
> To: r-help@r-project.org
> Subject: Re: [R] Error of Stepwise Regression with number of rows in use has changed: remove missing values?
>
> I thank those who helped to solve an error in stepwise regression with
> missing values.
>
> Kum
>
> A good solution that I have tried was Andreas's advice.
>
> =====================================================================
>
> Try
>
> data<-na.omit(original database) before you run step() or stepAIC()
>
> On Tue, Feb 16, 2010 at 8:09 PM, Peter Ehlers <ehl...@ucalgary.ca> wrote:
>
> > On 2010-02-16 1:24, Kum-Hoe Hwang wrote:
> >
> >> Howdy, R Gurus
> >>
> >> I have enjoyed R, but I cannot solve one problem easily. Please help
> >> with my problem.
> >> When I tried the R script, I got the following error. The error
> >> arises with an input data file exported from Excel spreadsheet
> >> software.
> >>
> >> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> >> as.numeric(nation.grant) + :
> >> number of rows in use has changed: remove missing values?
> >>
> >> Could you direct me to solve the error?
> >> Thanks in advance,
> >>
> >
> > This is a common situation when you use step() on data where
> > the predictors have missing values.
> >
> > A case (row) is included in the model only if all the
> > predictors for that model are non-missing for the case.
> >
> > As you vary which predictors are to be in the model, the
> > included cases will vary, resulting in models based on
> > different data. (Think of your cases as subjects; you want
> > all your models to be based on the same set of subjects.)
> >
> > Finally: (Re-)read the help page and note the 'warning'.
> >
> > -Peter Ehlers
> >
> >>
> >> ############### outputs from R console ###############
> >> > pop<- step(
> >> + lm(pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant)
> >> + + as.numeric(do.grant) + as.numeric(city.grant) +
> >> + as.numeric(DMZ.dist) + as.numeric(Seoul.dist), data=borderI.data,
> >> + na.action = na.omit)
> >> + )
> >> Start: AIC=494.27
> >> pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
> >>     as.numeric(do.grant) + as.numeric(city.grant) + as.numeric(DMZ.dist) +
> >>     as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(do.grant)      1      0.71 6622.9 492.28
> >> - as.factor(policy)         1      1.21 6623.4 492.29
> >> - as.numeric(DMZ.dist)      1      1.91 6624.1 492.30
> >> - as.numeric(city.grant)    1      5.07 6627.3 492.36
> >> - as.numeric(nation.grant)  1     11.51 6633.7 492.47
> >> - as.numeric(year)          1     29.58 6651.8 492.80
> >> <none>                                  6622.2 494.27
> >> - as.numeric(Seoul.dist)    1    673.22 7295.4 503.79
> >>
> >> Step: AIC=492.28
> >> pop.rate ~ as.numeric(year) + as.factor(policy) + as.numeric(nation.grant) +
> >>     as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.factor(policy)         1      1.99 6624.9 490.32
> >> - as.numeric(DMZ.dist)      1      2.09 6625.0 490.32
> >> - as.numeric(city.grant)    1      7.18 6630.1 490.41
> >> - as.numeric(nation.grant)  1     20.08 6643.0 490.64
> >> - as.numeric(year)          1     28.89 6651.8 490.80
> >> <none>                                  6622.9 492.28
> >> - as.numeric(Seoul.dist)    1    697.46 7320.4 502.20
> >>
> >> Step: AIC=490.32
> >> pop.rate ~ as.numeric(year) + as.numeric(nation.grant) +
> >>     as.numeric(city.grant) + as.numeric(DMZ.dist) + as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(DMZ.dist)      1      2.08 6627.0 488.35
> >> - as.numeric(city.grant)    1     10.65 6635.6 488.51
> >> - as.numeric(nation.grant)  1     31.30 6656.2 488.88
> >> - as.numeric(year)          1     31.44 6656.4 488.88
> >> <none>                                  6624.9 490.32
> >> - as.numeric(Seoul.dist)    1    732.88 7357.8 500.80
> >>
> >> Step: AIC=488.35
> >> pop.rate ~ as.numeric(year) + as.numeric(nation.grant) +
> >>     as.numeric(city.grant) + as.numeric(Seoul.dist)
> >>
> >>                            Df Sum of Sq    RSS    AIC
> >> - as.numeric(city.grant)    1      9.86 6636.9 486.53
> >> - as.numeric(year)          1     31.42 6658.4 486.92
> >> - as.numeric(nation.grant)  1     33.33 6660.3 486.95
> >> <none>                                  6627.0 488.35
> >> - as.numeric(Seoul.dist)    1    754.40 7381.4 499.18
> >>
> >> Error in step(lm(pop.rate ~ as.numeric(year) + as.factor(policy) +
> >> as.numeric(nation.grant) + :
> >> --------------------------------------------------------------------
> >> number of rows in use has changed: remove missing values?
> >> --------------------------------------------------------------------
> >>
> >> --
> >> Kum-Hoe Hwang, Ph.D.
> >>
> >> Phone : 82-31-250-3516
> >> Email : phdhw...@gmail.com
> >>
> > --
> > Peter Ehlers
> > University of Calgary
> >
>
> --
> Kum-Hoe Hwang, Ph.D.
>
> Phone : 82-31-250-3516
> Email : phdhw...@gmail.com
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
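
For readers following the quoted thread, here is a minimal sketch of the behaviour Peter Ehlers describes and of the na.omit() workaround. The data frame `toy` and its columns (y, x1, x2, x3) are made up for illustration and are not the poster's borderI.data. The sketch assumes a recent R in which nobs() has a method for lm fits. Because na.action is applied separately at each fit, passing na.action = na.omit to lm() does not keep the rows fixed once step() drops a predictor that has missing values.

```r
# Hypothetical toy data standing in for the poster's borderI.data
set.seed(1)
n <- 60
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
toy$y <- 1 + 2 * toy$x1 + rnorm(n)
toy$x2[1:5] <- NA  # missing values in one predictor only

# The rows used by lm() depend on which predictors are in the formula:
# rows with a missing x2 are dropped only when x2 is in the model.
n_full    <- nobs(lm(y ~ x1 + x2 + x3, data = toy))
n_reduced <- nobs(lm(y ~ x1 + x3, data = toy))

# Workaround from the thread: restrict to the candidate variables,
# drop incomplete rows once, then run step() on that fixed set of rows.
toy_complete <- na.omit(toy[, c("y", "x1", "x2", "x3")])
n_dropped    <- nrow(toy) - nrow(toy_complete)  # rows lost to missingness

fit  <- lm(y ~ x1 + x2 + x3, data = toy_complete)
pick <- step(fit, trace = 0)  # trace = 0 suppresses the per-step tables
```

Here n_full and n_reduced differ, which is the condition step() complains about. Checking n_dropped before accepting the result is one simple way to see how much data the workaround discards, since every candidate model is then fitted only to the complete cases.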