[
https://issues.apache.org/jira/browse/SPARK-17987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sean Owen resolved SPARK-17987.
-------------------------------
Resolution: Not A Problem
> ML Evaluator fails to handle null values in the dataset
> -------------------------------------------------------
>
> Key: SPARK-17987
> URL: https://issues.apache.org/jira/browse/SPARK-17987
> Project: Spark
> Issue Type: Improvement
> Components: ML
> Affects Versions: 1.6.2, 2.0.1
> Reporter: bo song
>
> Take RegressionEvaluator as an example: when predictionCol is null in a row,
> a "scala.MatchError" exception is thrown. Null predictions are a common
> case: when a predictor is missing or its value is out of bounds, most
> machine learning models cannot produce a correct prediction and return
> null instead. Evaluators should handle null values rather than throw an
> exception; the common way to handle missing values is to ignore them.
> Besides null values, NaN values need to be handled correctly too.
> The three evaluators RegressionEvaluator, BinaryClassificationEvaluator and
> MulticlassClassificationEvaluator all have the same problem.
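
For illustration only, the "ignore them" behaviour requested in the report can be sketched in plain Python. This is not Spark code and not how any Spark evaluator is implemented; the function name rmse_ignoring_missing and the sample rows are hypothetical. It computes RMSE over (prediction, label) pairs and skips any pair in which either value is None or NaN.

```python
import math


def rmse_ignoring_missing(rows):
    """RMSE over (prediction, label) pairs, skipping pairs with None or NaN.

    Hypothetical helper for illustration; not part of Spark.
    """
    def usable(value):
        # A value is usable only if it is neither None nor NaN.
        return value is not None and not math.isnan(value)

    kept = [(p, y) for p, y in rows if usable(p) and usable(y)]
    if not kept:
        # No usable rows: the metric is undefined.
        return float("nan")
    return math.sqrt(sum((p - y) ** 2 for p, y in kept) / len(kept))


# Made-up sample data: two usable rows, one null and one NaN prediction.
rows = [(2.0, 1.0), (None, 3.0), (float("nan"), 4.0), (5.0, 5.0)]
result = rmse_ignoring_missing(rows)
```

In Spark itself, a comparable effect might be obtained by dropping rows with null or NaN in the relevant columns (for example via the DataFrame na.drop functions) before calling the evaluator; that is an assumption about a possible workaround, not something stated in this issue.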