Hi dear R-help members,

When building a CART model (specifically a classification tree) using rpart, it is sometimes obvious that some predictors (X's) are meaningful for predicting certain values of the outcome variable (y), while other predictors are relevant only for other values of y.
*How can one determine which explanatory variable is "used" for predicting which value of the outcome variable?*

Here is example code in which x2 is the only important variable for predicting "b" (one of the y outcomes). There is no predictor for "c", and x1 is a predictor for "a", provided that x2 permits it (that is, x2 >= .1). How can this situation be shown using an rpart fitted model?

library(rpart)

N <- 200
set.seed(5123)
x1 <- runif(N)
x2 <- runif(N)
x3 <- runif(N)                      # generated but not used in the model
y <- sample(letters[1:3], N, TRUE)  # start from random labels
y[x1 < .5] <- "a"                   # x1 drives "a"
y[x2 < .1] <- "b"                   # x2 drives "b" and overrides x1
fit <- rpart(y ~ x1 + x2)
fit2 <- prune(fit, cp = 0.07)
plot(fit2)
text(fit2, use.n = TRUE)

Thanks,
Tal
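For the question above, one possible starting point (a rough sketch, not an established method) is to list the variables that appear on the path from the root to each leaf and tabulate them by the class that leaf predicts. The sketch assumes that path.rpart() labels numeric splits in the form "x1< 0.5" or "x1>=0.5", so that the variable name is whatever precedes the first <, > or =. The names leaf_ids, vars_on_path, used and usage are made up for illustration.

```r
library(rpart)

# Same simulated data as in the example (x3 omitted, since the model ignores it)
N <- 200
set.seed(5123)
x1 <- runif(N)
x2 <- runif(N)
y <- sample(letters[1:3], N, TRUE)
y[x1 < .5] <- "a"
y[x2 < .1] <- "b"
dat <- data.frame(y = factor(y), x1 = x1, x2 = x2)

fit  <- rpart(y ~ x1 + x2, data = dat, method = "class")
fit2 <- prune(fit, cp = 0.07)

# Leaves are the rows of the frame whose split variable is "<leaf>"
frame    <- fit2$frame
is_leaf  <- frame$var == "<leaf>"
leaf_ids <- as.numeric(rownames(frame)[is_leaf])

# Predicted class of each leaf (yval indexes the response levels)
classes <- attr(fit2, "ylevels")[frame$yval[is_leaf]]

# Split labels on the way from the root to each leaf
paths <- path.rpart(fit2, nodes = leaf_ids, print.it = FALSE)

# Assumption: the variable name is the text before the first <, > or =
vars_on_path <- lapply(paths, function(p) {
  unique(trimws(sub("[<>=].*$", "", p[p != "root"])))
})

# One row per (predicted class, variable on the path to that leaf)
used <- data.frame(
  class    = rep(classes, lengths(vars_on_path)),
  variable = as.character(unlist(vars_on_path, use.names = FALSE)),
  stringsAsFactors = FALSE
)
usage <- table(used$class, used$variable)
print(usage)
```

Each row of usage is a predicted class and each column a variable. A zero count means that no leaf predicting that class has that variable on its path, and a class that no leaf predicts has no row at all. This only shows which variables the fitted tree splits on before reaching each prediction; it is not a formal importance measure.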