Is it just me, or does the MSE tend to increase with more iterations of
linear regression (MultipleLinearRegression)?
Using Flink 1.0.2 (or 1.1):
%flink
import org.apache.flink.ml.regression.MultipleLinearRegression
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.ml.math.DenseVector

// Read every column as a String; the first three are features, the fourth is the label
val survival = env.readCsvFile[(String, String, String, String)](
  "file:///home/trevor/gits/datasets/haberman/haberman.data")

val survivalLV = survival.map { tuple =>
  val numList = tuple.productIterator.toList.map(_.asInstanceOf[String].toDouble)
  LabeledVector(numList(3), DenseVector(numList.take(3).toArray))
}

// Fit with 5 iterations
val mlr5 = MultipleLinearRegression().setIterations(5)
mlr5.fit(survivalLV)
// Note: squaredResidualSum is the sum of squared residuals, not divided by the row count
val mse1 = mlr5.squaredResidualSum(survivalLV).collect()

// Fit a fresh model with 10 iterations
val mlr10 = MultipleLinearRegression().setIterations(10)
mlr10.fit(survivalLV)
val mse2 = mlr10.squaredResidualSum(survivalLV).collect()

println(mse1, mse2)
Results in:
(Buffer(4.047910100612734E28),Buffer(2.6223205846507677E52))
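
One possible explanation (an assumption, not something confirmed in this message) is that the gradient descent step size is too large for unscaled features, so every iteration overshoots the minimum and the error grows instead of shrinking. Here is a minimal sketch of that effect in plain Scala with no Flink dependency. The toy data, the names `StepsizeDemo`, `srs` and `fit`, and the two step sizes are all made up for illustration; this is not FlinkML's implementation.

```scala
object StepsizeDemo {
  // Toy data (hypothetical): one unscaled feature x and a label y.
  val data: Seq[(Double, Double)] = Seq((30.0, 1.0), (50.0, 1.0), (70.0, 2.0))

  // Sum of squared residuals for the one-parameter model y ~ w * x.
  def srs(w: Double): Double =
    data.map { case (x, y) => math.pow(w * x - y, 2) }.sum

  // Plain batch gradient descent on the mean squared error, starting at w = 0.
  def fit(stepsize: Double, iterations: Int): Double =
    (1 to iterations).foldLeft(0.0) { (w, _) =>
      val grad = data.map { case (x, y) => 2 * x * (w * x - y) }.sum / data.size
      w - stepsize * grad
    }

  def main(args: Array[String]): Unit =
    for (stepsize <- Seq(0.1, 1e-4); iterations <- Seq(5, 10)) {
      val total = srs(fit(stepsize, iterations))
      // MSE is the squared residual sum divided by the number of rows.
      println(s"stepsize=$stepsize iterations=$iterations SRS=$total MSE=${total / data.size}")
    }
}
```

In this toy setup the squared residual sum grows between 5 and 10 iterations with a step size of 0.1 and shrinks with 1e-4. If the same thing is happening with the Haberman data, lowering the step size with `setStepsize` on `MultipleLinearRegression`, or scaling the features before fitting, would be the first things to try.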
Trevor Grant
Data Scientist
https://github.com/rawkintrevo
http://stackexchange.com/users/3002022/rawkintrevo
http://trevorgrant.org
*"Fortunate is he, who is able to know the causes of things." -Virgil*