[ https://issues.apache.org/jira/browse/SPARK-16098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sean Owen resolved SPARK-16098.
-------------------------------
    Resolution: Won't Fix

> Multiclass SVM Learning
> -----------------------
>
>                 Key: SPARK-16098
>                 URL: https://issues.apache.org/jira/browse/SPARK-16098
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 2.0.0
>         Environment: Spark MLLib and ML 1.6.1
>            Reporter: Hayri Volkan Agun
>            Priority: Minor
>   Original Estimate: 1,512h
>  Remaining Estimate: 1,512h
>
> A OneVsRest classifier exists for using any binary classifier in
> multi-class classification. However, for linear SVM, using OneVsRest may
> create imbalanced dataset scenarios in which Spark's SVM certainly fails.
> I verified this by creating a LinearSVM classifier and implementing the
> predictRaw method of the ClassificationModel class. In all experiments the
> results were very poor in terms of F-measure. The only explanation is that
> SVM is very sensitive to imbalanced datasets, and the OneVsRest classifier
> naturally creates an imbalanced dataset.
> For multi-class classification, linear SVM can be optimized by taking
> imbalanced datasets into account.
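
The imbalance described in the issue can be illustrated with a small, self-contained sketch. It does not use Spark and is not the reporter's implementation. The helper names (`one_vs_rest_splits`, `balancing_weights`) and the 10-class dataset are hypothetical, and the weighting rule shown is only one common heuristic, not something the issue prescribes. It shows that even a perfectly balanced multi-class dataset yields skewed binary sub-problems under one-vs-rest.

```python
from collections import Counter


def one_vs_rest_splits(labels):
    """For each class, return (positives, negatives) of its one-vs-rest sub-problem."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: (n, total - n) for cls, n in counts.items()}


def balancing_weights(pos, neg):
    """Per-example weights that give both sides the same total weight.

    This is a common reweighting heuristic, assumed here for illustration.
    """
    total = pos + neg
    return total / (2.0 * pos), total / (2.0 * neg)


# Hypothetical, perfectly balanced dataset: 10 classes, 100 examples each.
labels = [c for c in range(10) for _ in range(100)]

splits = one_vs_rest_splits(labels)
pos, neg = splits[0]
print(f"class 0 vs rest: {pos} positives, {neg} negatives")

w_pos, w_neg = balancing_weights(pos, neg)
print(f"weights: positive={w_pos:.3f}, negative={w_neg:.3f}")
```

With K equally sized classes, each binary sub-problem has a 1:(K-1) positive-to-negative ratio, so the skew grows with the number of classes. Reweighting examples as above is one way a linear classifier might compensate, assuming the underlying learner accepts per-example weights.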