As some additional information: I re-ran the model across the range n = 50
to 150 (n being the 'top n' attributes returned by chi.squared), and this
time used a completely different subset of the data for both training and
test. The results were nearly identical, with the typical train AUC about
0.98 and the typical test AUC about 0.56. The other change I made: 30k
records (instances) for training this time and 20k for test.
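For reference, the loop I'm describing looks roughly like the sketch below. It assumes the FSelector, e1071, and pROC packages; `train`, `test`, and the column name `class` are placeholders for my actual data, not the real names:

```r
library(FSelector)  # chi.squared(), cutoff.k(), as.simple.formula()
library(e1071)      # naiveBayes()
library(pROC)       # roc(), auc()

# Rank all attributes once by chi-squared importance
weights <- chi.squared(class ~ ., data = train)

for (n in seq(50, 150, by = 10)) {
  feats <- cutoff.k(weights, n)                 # keep the top-n attributes
  f     <- as.simple.formula(feats, "class")
  fit   <- naiveBayes(f, data = train)

  # Posterior probability of the positive class for each record
  p_tr <- predict(fit, train, type = "raw")[, 2]
  p_te <- predict(fit, test,  type = "raw")[, 2]

  cat(n,
      "train AUC:", auc(roc(train$class, p_tr, quiet = TRUE)),
      "test AUC:",  auc(roc(test$class,  p_te, quiet = TRUE)), "\n")
}
```

The pattern I keep seeing is a large train/test AUC gap at every value of n, which is what makes me suspect something beyond just feature count.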

I'll check whether the set of class labels I'm using (I'm currently only
running one of the three sets) is the least balanced, and if so, I'll grab
the most balanced. However, I don't think any of the three sets is much
better than 90/10.
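Checking the balance of each label set is a one-liner per set; `labels1` here is a placeholder for whichever label vector is in play:

```r
# Class proportions for one label set; values near 0.90 / 0.10
# would confirm the roughly 90/10 imbalance.
prop.table(table(labels1))
```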



--
View this message in context: 
http://r.789695.n4.nabble.com/Analyzing-Poor-Performance-Using-naiveBayes-tp4639825p4639985.html
Sent from the R help mailing list archive at Nabble.com.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
