Re: [R] Predictions with missing inputs

Bill.Venables Fri, 11 Feb 2011 20:42:35 -0800

With R it is always possible to shoot yourself squarely in the foot, as you 
seem keen to do, but R does at least often make it difficult.


When you predict, you need to have values for ALL variables used in the model.  
Just leaving out the coefficients corresponding to absent predictors is 
equivalent to assuming that those coefficients are zero, and there is no basis 
whatever for so assuming.  (In this constructed example things are different 
because the missing variable is a nonsense variable and the coefficient should 
be roughly zero, as it is, but in general that is not going to be the case.)
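
To make the point above concrete, here is a minimal sketch (not part of the original exchange) showing that dropping the X3 coefficient by hand gives exactly the predictions you would get by setting X3 = 0 for every new case. The seed and the names newdata, p_drop and p_zero are illustrative assumptions only.

```r
library(splines)
set.seed(1)                                   ## arbitrary seed, for reproducibility only
x <- matrix(rnorm(100*3), 100, 3)
y <- sample(1:2, 100, replace = TRUE)
mydata <- data.frame(y, x)
mymodel <- lm(y ~ ns(X1, df = 3) + X2 + X3, data = mydata)

newdata <- data.frame(matrix(rnorm(100*2), 100, 2))   ## X1 and X2 only

## build the design matrix with a dummy X3, then drop its column and coefficient
X <- model.matrix(delete.response(terms(mymodel)), transform(newdata, X3 = 1))
keep <- colnames(X) != "X3"
p_drop <- drop(X[, keep] %*% coef(mymodel)[keep])

## the same thing via predict(), with X3 silently assumed to be zero
p_zero <- predict(mymodel, transform(newdata, X3 = 0))

all.equal(unname(p_drop), unname(p_zero))
```

Whether zero is a sensible value for X3 is a separate question from whether the arithmetic goes through.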

So you need to supply some value for each of the missing predictors if you are 
going to use the standard prediction tools.  An obvious plug is the mean of 
that variable in the training data, though more sophisticated alternatives 
would often be available.
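
As one possible example of a more sophisticated alternative, the sketch below imputes X3 from the predictors that are available, using a second regression fitted to the training data. This is an assumption about what might suit a real problem, not a recommendation from the thread; impmodel is a hypothetical helper, and with purely random data like this it will give values close to the training mean anyway.

```r
library(splines)
set.seed(2)                                   ## arbitrary seed, for reproducibility only
x <- matrix(rnorm(100*3), 100, 3)
y <- sample(1:2, 100, replace = TRUE)
mydata <- data.frame(y, x)
mymodel <- lm(y ~ ns(X1, df = 3) + X2 + X3, data = mydata)

## hypothetical helper model: predict the missing input from the available ones
impmodel <- lm(X3 ~ X1 + X2, data = mydata)

mynewdata <- data.frame(matrix(rnorm(100*2), 100, 2))   ## X1 and X2 only
mynewdata$X3 <- predict(impmodel, mynewdata)            ## imputed X3
mypred <- predict(mymodel, mynewdata)
```

Note that a single imputed value ignores the uncertainty in X3, so any standard errors from predict() will be too optimistic.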

Here is a suggestion for your case.

## fit some linear model to random data

x <- matrix(rnorm(100*3),100,3)
y <- sample(1:2, 100, replace = TRUE)
mydata <- data.frame(y, x)
library(splines)                            ## missing from your code.
mymodel <- lm(y ~ ns(X1, df = 3) + X2 + X3, data = mydata)
summary(mymodel)

## create new data with 1 missing input

mynewdata <- within(data.frame(matrix(rnorm(100*2), 100, 2)),
                    X3 <- mean(mydata$X3))              ## add in an X3
mypred <- predict(mymodel, mynewdata)
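
A quick way to check the behaviour described above, as a sketch: predict() stops with an error when X3 is absent from the new data, and returns one prediction per row once a value is plugged in. The names nox3, withx3 and res are illustrative, and the check assumes no object called X3 exists in your workspace (otherwise R may pick that one up instead of failing).

```r
library(splines)
set.seed(3)                                   ## arbitrary seed, for reproducibility only
x <- matrix(rnorm(100*3), 100, 3)
y <- sample(1:2, 100, replace = TRUE)
mydata <- data.frame(y, x)
mymodel <- lm(y ~ ns(X1, df = 3) + X2 + X3, data = mydata)

nox3 <- data.frame(matrix(rnorm(100*2), 100, 2))      ## X1 and X2 only

## without X3, predict() raises an error rather than guessing a value;
## capture the error message as a character string
res <- tryCatch(predict(mymodel, nox3),
                error = function(e) conditionMessage(e))
is.character(res)

## with the training mean plugged in, prediction goes through
withx3 <- within(nox3, X3 <- mean(mydata$X3))
mypred <- predict(mymodel, withx3)
length(mypred)                                        ## 100
```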

________________________________________
From: r-help-boun...@r-project.org [r-help-boun...@r-project.org] On Behalf Of 
Axel Urbiz [axel.ur...@gmail.com]
Sent: 12 February 2011 11:51
To: R-help@r-project.org
Subject: [R] Predictions with missing inputs

Dear users,

I'd appreciate your help with this (hopefully) simple problem.

I have a model object which was fitted to inputs X1, X2, X3. Now, I'd like
to use this object to make predictions on a new data set where only X1 and
X2 are available (just use the estimated coefficients for these variables in
making predictions and ignoring the coefficient on X3). Here's my attempt
but, of course, it didn't work.

#fit some linear model to random data

x=matrix(rnorm(100*3),100,3)
y=sample(1:2,100,replace=TRUE)
mydata <- data.frame(y,x)
mymodel <- lm(y ~ ns(X1, df=3) + X2 + X3, data=mydata)
summary(mymodel)

#create new data with 1 missing input

mynewdata <- data.frame(matrix(rnorm(100*2),100,2))
mypred <- predict(mymodel, mynewdata)

Thanks in advance for your help!

Axel.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
