*Thanks so much, Aljoscha* :)
I was stuck at this point. I didn't know that the print and collect methods
gather all the data in one place.

The execution time has dropped a lot.
However, Flink is still slower, though only by about 7 seconds.

I really think I'm not getting all the performance out of Flink.
Flink represents the execution as a cyclic dependency graph, whereas
Spark uses a DAG,
so I would expect Flink's approach to give better scalability and
performance than the DAG approach.

So... what is the problem with my code?

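import org.apache.flink.api.scala._
import org.apache.flink.ml.MLUtils
import org.apache.flink.ml.regression.MultipleLinearRegression

// Read data (benv is the batch environment provided by the Scala shell)
val data: DataSet[org.apache.flink.ml.common.LabeledVector] =
  MLUtils.readLibSVM(benv, "/inputPath/_.libsvm")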

// Create multiple linear regression learner
val mlr = MultipleLinearRegression()

val model = mlr.fit(data)

data.writeAsText("file:///outputPath") 

benv.execute()



--
View this message in context: 
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Methods-that-trigger-execution-tp12972p13537.html
Sent from the Apache Flink User Mailing List archive at Nabble.com.
