Dear list,
I have a general question about how to tune an SVM.
I'm trying to classify population data from a case-control study with the
caret package. Each column of my matrix holds the genotypes for one SNP,
and each row is an individual.
I split the full dataset into a training set (132 individuals) and a test
set (50 individuals), preserving the overall observed genotype frequencies
and the proportion of cases and controls.
After training (with the radial basis function, RBF, kernel), the best
model has an accuracy of 76%, but when I apply the model to my test dataset
the accuracy drops to 52%.
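To make the setup concrete, here is a minimal sketch of the kind of workflow I mean, run on simulated data rather than my real genotypes. Everything in it is an assumption made for illustration: the object names, the 20 simulated SNPs coded 0/1/2, the repeated 10-fold cross-validation, and the tuneLength value. It also stratifies the split on case/control status only, not on genotype frequencies.

```r
library(caret)   # assumes caret and kernlab (for svmRadial) are installed

set.seed(1)
# Simulated stand-in for the real data: 182 individuals x 20 SNPs coded 0/1/2
n <- 182
p <- 20
snps <- as.data.frame(matrix(rbinom(n * p, size = 2, prob = 0.3), nrow = n))
names(snps) <- paste0("snp", seq_len(p))
status <- factor(sample(rep(c("case", "control"), length.out = n)))

# Split stratified on case/control status (roughly 132 training / 50 test)
in_train <- createDataPartition(status, p = 132 / 182, list = FALSE)
x_train <- snps[in_train, ]
y_train <- status[in_train]
x_test  <- snps[-in_train, ]
y_test  <- status[-in_train]

# Radial-kernel SVM; sigma is estimated and C is tuned by resampling
ctrl <- trainControl(method = "repeatedcv", number = 10, repeats = 5)
fit <- train(x = x_train, y = y_train, method = "svmRadial",
             tuneLength = 8, trControl = ctrl)

best_cv_acc <- max(fit$results$Accuracy)   # resampled accuracy of best model
pred <- predict(fit, newdata = x_test)     # predictions for held-out rows
test_acc <- mean(pred == y_test)           # accuracy on the test set
confusionMatrix(pred, y_test)
```

Because the simulated labels are random, the accuracies from this sketch mean nothing in themselves; it is only meant to show which two numbers I am comparing.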
Obviously I expected a decrease, but this one appears quite large in my
opinion.
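As a rough check on how much of that drop could be sampling noise with only 50 test individuals, a confidence interval for the test accuracy can be computed in base R. The sketch below assumes that 52% corresponds to 26 correct predictions out of 50; the variable names are mine.

```r
# Assumption: 52% accuracy on 50 test individuals = 26 correct predictions
n_test    <- 50
n_correct <- 26

# Wilson score interval (prop.test without continuity correction)
wilson_ci <- prop.test(n_correct, n_test, correct = FALSE)$conf.int

# Exact (Clopper-Pearson) interval for comparison
exact_ci <- binom.test(n_correct, n_test)$conf.int

print(wilson_ci)
print(exact_ci)
```

Comparing these intervals with the 76% obtained during training gives a rough idea of whether the gap is larger than chance alone would explain, although the 76% is itself an estimate with its own uncertainty.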
I manually checked the predictions for my test dataset, and some cases that
carry no risk allele are misclassified. Similar cases in my training
dataset are classified correctly.
Could you please suggest which parameters I should modify to improve the
classification of the test dataset? Or, better, what could be causing such
a large discrepancy?
I know that my question is very general, but I'm new to this kind of
analysis, so any suggestion is welcome.
Thank you very much,
Guido

-- 
Guido Leoni
National Research Institute on Food and Nutrition
(I.N.R.A.N.)
via Ardeatina 546
00178 Rome
Italy

tel     + 39 06 51 49 41 (operator)
        + 39 06 51 49 4498 (direct)

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.