Hello,
I am reading a file with spark-csv, and some lines have missing (not invalid)
values. I apply map() to create a LabeledPoint with a dense vector, reading
each column with map( row => row.getDouble(col_index) ).
Up to this point everything looks fine:
res173: org.apache.spark.mllib.regression.LabeledPoint =
(-1.530132691E9,[162.89431,13.55811,18.3346818,-1.6653182])
However, when I run the following code:
val model = new LogisticRegressionWithLBFGS().
  setNumClasses(2).
  setValidateData(true).
  run(data_map)
it fails with:
java.lang.RuntimeException: Failed to check null bit for primitive double value.
After some debugging, I am pretty sure the cause is rows with empty fields,
which look like this:
-2.593849123898,392.293891,,,,
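
A minimal sketch of one way to guard against such rows, assuming the empty fields arrive as nulls in the Row and that skipping incomplete rows is acceptable. The helper name toLabeledPoint and the column indices are hypothetical, not part of my actual code. It checks Row.isNullAt before calling Row.getDouble:

```scala
import org.apache.spark.sql.Row
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

// Hypothetical helper: returns None when the label or any feature column is null,
// so getDouble is never called on a missing value.
def toLabeledPoint(row: Row, labelIdx: Int, featureIdx: Seq[Int]): Option[LabeledPoint] = {
  val needed = labelIdx +: featureIdx
  if (needed.exists(i => row.isNullAt(i))) {
    None
  } else {
    val features = featureIdx.map(i => row.getDouble(i)).toArray
    Some(LabeledPoint(row.getDouble(labelIdx), Vectors.dense(features)))
  }
}
```

With a DataFrame named df (an assumption), something like df.rdd.flatMap(row => toLabeledPoint(row, 0, 1 to 4)) would then keep only the complete rows. Whether dropping rows is better than filling in a default value depends on the data.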