Hi All,

I'm a newbie and have two questions.  Please pardon me if they are very basic.


1.  I'm using a regression tree to predict the selling prices of 10 new records 
(homes).  The following line of code produces an error message:

    pred <- predict(model, newdata = outOfSample[, -6])

The error message is:

Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = attr(object,  : 
  factor Sq. Feet has new levels 1375, 1421, 1547, 1621, 1868, 2211, 2265, 2530, 2672, 3365


Does anybody know what is causing this?  I've pasted a snippet of my original 
dataset (Crankshaw) and my out-of-sample dataset below.  After those comes all 
of the code I entered up to that point, with the error message at the end.
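
A possible line of investigation, offered as a guess rather than a confirmed diagnosis: the error calls Sq. Feet a "factor", which suggests the column was read in as text rather than as numbers. That can happen when a spreadsheet column contains a non-numeric cell (for instance a label row such as "NEW RECORDS:", which is an assumption about the sheet layout). The sketch below uses a made-up toy data frame named sheet, not the real Crankshaw file, to show how the column types could be checked and converted:

```r
# Toy stand-in for the spreadsheet (hypothetical data, not the Crankshaw file):
# a stray text cell forces the whole column to be stored as character.
sheet <- data.frame(
  `Sq. Feet` = c("1620", "1864", "NEW RECORDS:", "3365"),
  Age        = c("17", "28", NA, "8"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Inspect how each column was read in
col_classes <- sapply(sheet, class)
print(col_classes)

# Keep only rows whose Sq. Feet value parses as a number,
# then convert every column to numeric
is_data_row <- !is.na(suppressWarnings(as.numeric(sheet$`Sq. Feet`)))
clean <- sheet[is_data_row, ]
clean[] <- lapply(clean, as.numeric)
str(clean)
```

If the real data shows "character" for columns that should be numeric, converting them before splitting into training and out-of-sample sets would be the analogous step.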


2.  How can I get the regression tree to display in a more "friendly" way?  
Unfortunately I cannot paste a picture of it in this email, but each node shows 
the values of individual records instead of a decision rule (e.g., Age >= 28).  
I'm using the command fancyRpartPlot(model) to display the tree.
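
One thing that might explain this display, again as an assumption rather than a verified cause: if the predictors are stored as text, rpart treats them as categorical and splits on sets of individual values instead of numeric thresholds. The sketch below uses simulated toy data (the names toy, sqft, age and price are made up for illustration) to compare a tree fitted on numeric predictors with one fitted on the same values stored as text:

```r
library(rpart)  # recommended package, included with standard R installations

set.seed(1)
# Hypothetical toy data, not the Crankshaw set
toy <- data.frame(
  sqft = round(runif(200, 1200, 3500)),
  age  = sample(1:60, 200, replace = TRUE)
)
toy$price <- 100000 + 40 * toy$sqft - 300 * toy$age + rnorm(200, sd = 5000)

# Same predictors stored as text, as they might arrive from a spreadsheet
toy_text <- toy
toy_text$sqft <- as.character(toy$sqft)
toy_text$age  <- as.character(toy$age)

fit_num  <- rpart(price ~ sqft + age, data = toy,      method = "anova")
fit_text <- rpart(price ~ sqft + age, data = toy_text, method = "anova")

# Numeric predictors record no factor levels; text predictors do
num_levels  <- attr(fit_num, "xlevels")
text_levels <- attr(fit_text, "xlevels")

# Printing the numeric fit shows threshold-style rules on sqft and age
print(fit_num)
```

If the text-based fit resembles the unfriendly plot, converting the predictors to numeric before calling rpart would be the thing to try.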


Thank you!
Gary

------------------------------------------------------------------------


Original Data (Crankshaw):

Sq. Feet   Age   Bedrm   Bathrm   Garage   Sell Price ($)
1620       17    3       2        2        185500
1864       28    3       2        2        195250
1628       15    3       2        2        190750
1670       1     4       3        2        195750
1762       23    3       4        2        197250
1520       1     3       3        2        192900


Out-of-Sample Data:

NEW RECORDS:
Sq. Feet   Age   Bedrm   Bathrm   Garage   Sell Price ($)
3365       8     4       4        3
1547       28    3       2        2
1375       36    2       1        1
1621       53    3       1        2
2530       23    4       3        2
1868       42    3       2        2
2211       23    3       2        2
1421       39    2       1        1
2672       3     4       2        3
2265       7     3       2        2


All Code Entered:

> Crankshaw <- read_excel("C:/Data/Excel/Crankshaw.xlsx")
> View(Crankshaw)
> outOfSample <- Crankshaw[305:nrow(Crankshaw), ]
> Crankshaw <- Crankshaw[1:300, ]
> install.packages("caret")
Installing package into ‘C:/Users/Jason/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/caret_6.0-78.zip'
Content type 'application/zip' length 5155836 bytes (4.9 MB)
downloaded 4.9 MB

package ‘caret’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\Jason\AppData\Local\Temp\RtmpmAxrJR\downloaded_packages
> install.packages("rattle")
Installing package into ‘C:/Users/Jason/Documents/R/win-library/3.4’
(as ‘lib’ is unspecified)
trying URL 'https://cran.rstudio.com/bin/windows/contrib/3.4/rattle_5.1.0.zip'
Content type 'application/zip' length 1287407 bytes (1.2 MB)
downloaded 1.2 MB

package ‘rattle’ successfully unpacked and MD5 sums checked

The downloaded binary packages are in
        C:\Users\Jason\AppData\Local\Temp\RtmpmAxrJR\downloaded_packages
> library(rpart)
> library(caret)
Loading required package: lattice
Loading required package: ggplot2
Warning messages:
1: package ‘caret’ was built under R version 3.4.3 
2: package ‘ggplot2’ was built under R version 3.4.3 
> library(rattle)
> n <- nrow(Crankshaw)
> train <- sample(1:n, size = 0.5 * n, replace = FALSE)
> CrankshawTrain <- Crankshaw[train, ]
> temp <- (1:n)[-train]
> val <- sample(temp, size = (0.3 / 0.5) * length(temp), replace = FALSE)
> CrankshawVal <- Crankshaw[val, ]
> test <- (1:n)[-c(train, val)]
> CrankshawTest <- Crankshaw[test, ]
> model <- rpart(`Selling Price ($)` ~ ., method = "anova", data = CrankshawTrain)
> fancyRpartPlot(model)
> pred <- predict(model, newdata = outOfSample[, -6])
Error in model.frame.default(Terms, newdata, na.action = na.action, xlev = attr(object,  : 
  factor Sq. Feet has new levels 1375, 1421, 1547, 1621, 1868, 2211, 2265, 2530, 2672, 3365



______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
