Is it just me, or does the MSE tend to increase with more iterations of linear
regression?

Using Flink 1.0.2 (or 1.1)

%flink
import org.apache.flink.ml.optimization.SimpleGradientDescent
import org.apache.flink.ml.optimization.LearningRateMethod
import org.apache.flink.ml.regression.MultipleLinearRegression
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector

// Read all four columns as strings; the first three are features, the fourth is the label
val survival = env.readCsvFile[(String, String, String, String)](
  "file:///home/trevor/gits/datasets/haberman/haberman.data")
val survivalLV = survival
  .map { tuple =>
    val list = tuple.productIterator.toList
    val numList = list.map(_.asInstanceOf[String].toDouble)
    LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
  }


// Fit with 5 iterations
val mlr5 = MultipleLinearRegression()
  .setIterations(5)

mlr5.fit(survivalLV)

// squaredResidualSum returns the summed squared residuals on the given data
val mse1 = mlr5.squaredResidualSum(survivalLV).collect()


// Fit a fresh model with 10 iterations
val mlr10 = MultipleLinearRegression()
  .setIterations(10)

mlr10.fit(survivalLV)

val mse2 = mlr10.squaredResidualSum(survivalLV).collect()
println(mse1, mse2)


Results in:

(Buffer(4.047910100612734E28),Buffer(2.6223205846507677E52))



Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org

*"Fortunate is he, who is able to know the causes of things."  -Virgil*
