Hi, I am using rpart decision trees to analyze customer churn. The trees it builds are not useful because they fail to pick up factors that clearly influence churn. I have created an example below in which rpart returns only the root node, even though the cancellation rate differs sharply between the two levels of experience. What do I need to do for rpart to build a tree that splits on the variable experience? My guess is that this would happen if rpart used the loss matrix while creating the tree.
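A minimal sketch of the arithmetic behind that guess, using the same data as the example that follows. It assumes rpart's convention that loss[i, j] is the cost of predicting class j when the true class is i, with classes in factor-level order ("no", "yes"). The names `churn`, `node_risk`, `loss01` and `loss5` are hypothetical helpers introduced only for illustration, and the 5:1 cost ratio is an arbitrary assumption, not a recommendation. Under 0/1 loss both child nodes would still be labelled "no" (or tie), so the split leaves the total risk unchanged, which may be why it is not kept. A moderate extra cost on missed cancellations changes the label of the "bad" node and lowers the risk.

```r
library(rpart)

# Data from the post: 90 "good" and 10 "bad" experiences.
experience <- factor(c(rep("good", 90), rep("bad", 10)))
cancel <- factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))
churn <- data.frame(experience, cancel)   # hypothetical name

# Hypothetical helper: risk of a node = total loss of its cheapest prediction.
# Assumes loss[i, j] = cost of predicting class j when the true class is i.
node_risk <- function(y, loss) {
  counts <- as.vector(table(y))   # true-class counts in factor-level order
  min(counts %*% loss)            # one total cost per candidate prediction
}

# Default 0/1 loss: compare the root risk with the summed risk of the children.
loss01 <- matrix(c(0, 1, 1, 0), nrow = 2)
risk_root_01  <- node_risk(churn$cancel, loss01)
risk_split_01 <- sum(tapply(churn$cancel, churn$experience,
                            node_risk, loss = loss01))

# Assumed loss: a missed cancellation (true "yes", predicted "no") costs 5,
# a false alarm costs 1. matrix() fills by column, so 5 lands in loss5[2, 1].
loss5 <- matrix(c(0, 5, 1, 0), nrow = 2)
risk_root_5  <- node_risk(churn$cancel, loss5)
risk_split_5 <- sum(tapply(churn$cancel, churn$experience,
                           node_risk, loss = loss5))

# Risk before and after the split under each loss.
print(c(root_01 = risk_root_01, split_01 = risk_split_01,
        root_5 = risk_root_5, split_5 = risk_split_5))

# Fit with the assumed loss matrix and inspect the resulting tree.
fit <- rpart(cancel ~ experience, data = churn, parms = list(loss = loss5))
print(fit)
```

Under the same convention, `matrix(c(0,1,10000,0), nrow=2, ncol=2)` places the 10000 in position [1, 2], which penalizes false alarms rather than missed cancellations. Moving it to [2, 1] may not help either: with a cost that extreme every node would be labelled "yes", and the split would again leave the risk unchanged.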
> experience <- as.factor(c(rep("good",90), rep("bad",10)))
> cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5), rep("yes",5)))
> table(experience, cancel)
          cancel
experience no yes
      bad   5   5
      good 85   5
> rpart(cancel ~ experience)
n= 100

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 100 10 no (0.9000000 0.1000000) *

I tried the following commands with no success.

rpart(cancel ~ experience, control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(split='information'))
rpart(cancel ~ experience, parms=list(split='information'), control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,10000,0), nrow=2, ncol=2)))

Thanks a lot for your help.

Best regards,
Robert

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.