Dear R-help members,

When building a CART model (specifically a classification tree) with rpart,
it is sometimes obvious that some predictors (X's) are informative for
certain classes of the outcome (y), while other predictors are relevant
only for other classes.

*How can one estimate which explanatory variable is "used" for predicting
which class of the outcome variable?*

Here is example code in which x2 is the only important variable for
predicting "b" (one of the y classes). No variable predicts "c", and x1
predicts "a" provided that x2 permits it (that is, when x2 >= .1).

How can this situation be shown using a fitted rpart model?

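library(rpart)

N <- 200
set.seed(5123)

x1 <- runif(N)
x2 <- runif(N)
x3 <- runif(N)  # pure noise; not included in the model below

# Start from random labels, then let x1 and x2 override them
y <- sample(letters[1:3], N, TRUE)
y[x1 < .5] <- "a"
y[x2 < .1] <- "b"  # applied last, so x2 takes precedence over x1

fit <- rpart(y ~ x1 + x2, method = "class")
fit2 <- prune(fit, cp = 0.07)

plot(fit2)
text(fit2, use.n = TRUE)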
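
To make the question concrete, here is a rough sketch of one thing I could imagine doing: for every leaf of the pruned tree, list the predicted class, the class counts in that leaf, and the variables that were split on along the path from the root. It relies on the rpart convention that the parent of node k is node k %/% 2, and on the layout of fit$frame$yval2 (fitted class first, then the per-class counts). The helper vars_on_path and the table leaf_summary are names I made up for illustration, and I am not claiming this is the recommended approach.

```r
library(rpart)

# Same simulated data as above
set.seed(5123)
N  <- 200
x1 <- runif(N)
x2 <- runif(N)
y  <- sample(letters[1:3], N, TRUE)
y[x1 < .5] <- "a"
y[x2 < .1] <- "b"

fit  <- rpart(y ~ x1 + x2, method = "class")
fit2 <- prune(fit, cp = 0.07)

frame   <- fit2$frame
nodes   <- as.integer(rownames(frame))    # rpart node numbers
leaves  <- nodes[frame$var == "<leaf>"]   # terminal nodes only
ylevels <- attr(fit2, "ylevels")          # class labels of y

# Variables split on between the root and a given node
# (hypothetical helper; uses parent(k) = k %/% 2)
vars_on_path <- function(node) {
  used <- character(0)
  while (node > 1) {
    node <- node %/% 2
    used <- c(as.character(frame[as.character(node), "var"]), used)
  }
  unique(used)
}

# Per-class observation counts in each node:
# columns 2 .. (1 + number of classes) of yval2
counts <- frame$yval2[, 1 + seq_along(ylevels), drop = FALSE]
dimnames(counts) <- list(rownames(frame), ylevels)

leaf_summary <- data.frame(
  node      = leaves,
  predicted = ylevels[frame[as.character(leaves), "yval"]],
  n         = frame[as.character(leaves), "n"],
  variables = vapply(leaves,
                     function(k) paste(vars_on_path(k), collapse = ", "),
                     character(1)),
  stringsAsFactors = FALSE
)
leaf_summary <- cbind(leaf_summary, counts[as.character(leaves), , drop = FALSE])
print(leaf_summary)
```

Reading the "predicted" column against the "variables" column gives a crude picture of which predictors lie on the path to each predicted class, but it says nothing about how much each variable contributes, which is what I would really like to quantify.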

Thanks,

Tal



----------------Contact Details:-------------------------------------------------------
Contact me: tal.gal...@gmail.com |  972-52-7275845
Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
www.r-statistics.com (English)
----------------------------------------------------------------------------------------------

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
