You need to define the factor levels in the training set so that they include every level you might see later, for example in the test data. Something like this:
> A <- factor(letters[1:5])
> B <- factor(letters[c(1,3,5,7,9)])
> A
[1] a b c d e
Levels: a b c d e
> B
[1] a c e g i
Levels: a c e g i
> training <- factor(A, levels=unique(c(levels(A), levels(B))))
> training
[1] a b c d e
Levels: a b c d e g i

In the future please "provide commented, minimal, self-contained, reproducible code."

On Mon, Jan 12, 2015 at 9:00 PM, HelponR <suncert...@gmail.com> wrote:
> It looks like gbm, glm all has this issue
>
> I wonder if any R package is immune of this?
>
> In reality, it is very normal that test data has data unseen in training
> data. It looks like I have to give up R?
>
> Thanks!
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
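
P.S. The same idea carries over to data frames. This is only a minimal sketch: the data frames `train` and `test` and the column name `grp` are made up for illustration, and it assumes the test data is available when the training factor is built. It gives both columns one shared level set before any model is fit.

```r
# Hypothetical training and test data; the test set contains levels
# ("g", "i") that never occur in the training set.
train <- data.frame(grp = letters[1:5], stringsAsFactors = FALSE)
test  <- data.frame(grp = letters[c(1, 3, 5, 7, 9)], stringsAsFactors = FALSE)

# Collect every level seen in either set, then apply the same
# level set to both columns.
all_levels <- unique(c(train$grp, test$grp))
train$grp <- factor(train$grp, levels = all_levels)
test$grp  <- factor(test$grp,  levels = all_levels)

levels(train$grp)   # same level set as levels(test$grp)
table(train$grp)    # "g" and "i" are present as levels with a count of zero
```

Keep in mind that levels with zero training observations carry no information for the model, so this addresses the mismatch in levels, not the lack of data for those levels.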