Re: What is the equivalent of Spark RDD in Flink

2015-12-28 Thread Chiwan Park
Hi Filip, Spark also executes jobs lazily, but it is slightly different from Flink. Flink can lazily execute a whole job, which Spark cannot. One example is an iterative job. In Spark, each stage of the iteration is submitted, scheduled as a job, and executed because of calling actions…
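A minimal plain-Python sketch of the idea in this reply (not the actual Flink or Spark APIs; all names are illustrative): transformations are only recorded in a plan, and the whole plan — including every loop iteration — runs in a single execution.

```python
# Conceptual sketch: a deferred "plan" of transformations that does no work
# until explicitly executed, mirroring how Flink can hold an entire
# iterative job as one lazy plan.

class Plan:
    def __init__(self, data, ops=None):
        self.data = data
        self.ops = ops or []          # recorded, not yet applied

    def map(self, f):
        # Transformation: record the function, do no work yet.
        return Plan(self.data, self.ops + [f])

    def execute(self):
        # Only here is any work actually done.
        out = self.data
        for f in self.ops:
            out = [f(x) for x in out]
        return out

# Three "iterations" are folded into one plan and run in one execution.
plan = Plan([1, 2, 3])
for _ in range(3):
    plan = plan.map(lambda x: x * 2)
print(plan.execute())  # [8, 16, 24]
```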

Re: Explanation of the output of timeWindowAll(Time.milliseconds(3))

2015-12-28 Thread Fabian Hueske
Hi Nirmalya, event time triggers (such as an event time trigger to compute a window) fire when a watermark is received that is larger than the trigger's timestamp. By default, watermarks are emitted at a fixed time interval, i.e., every x milliseconds. When a new watermark is emitted, Flink…
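A simplified sketch of the firing rule described above (plain Python, an assumed model rather than the Flink API): events are bucketed into tumbling windows, and a window is emitted only once a watermark at or past its end timestamp has arrived.

```python
# Conceptual model: an event time window fires when a watermark >= the
# window's end timestamp is received.

def windows_fired(events, watermarks, window_ms=3):
    """events: list of (timestamp, value) pairs. Returns {window_end: values}
    for every tumbling window whose end the largest watermark has passed."""
    pending = {}
    for ts, value in events:
        # Bucket each event into its tumbling window of size window_ms.
        window_end = (ts // window_ms + 1) * window_ms
        pending.setdefault(window_end, []).append(value)
    max_wm = max(watermarks)
    # A window's trigger fires only once a watermark >= its end arrives.
    return {end: vals for end, vals in sorted(pending.items()) if max_wm >= end}

print(windows_fired([(0, "a"), (1, "b"), (4, "c"), (7, "d")], [3, 6]))
# {3: ['a', 'b'], 6: ['c']}  -- the window ending at 9 has not fired yet
```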

Re: Understanding Kmeans in Flink

2015-12-28 Thread Márton Balassi
Hey Hajira, Basically lines 2) to 5) determine the "mean" (centroid) of the new clusters that we have just defined by assigning the points in line 1). As calculating the mean is a non-associative function, we break it down into two associative parts: summation and counting, which is followed by division…
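A small plain-Python sketch of the decomposition Márton describes (not the Flink code itself): the mean is not associative, but (sum, count) pairs combine associatively, so partial aggregates can be merged in any grouping and a single division at the end recovers the mean.

```python
def combine(a, b):
    # Associative combiner over partial (sum, count) aggregates.
    return (a[0] + b[0], a[1] + b[1])

def mean_of(parts):
    # Merge all partial aggregates, then divide once at the end.
    total, count = (0.0, 0)
    for p in parts:
        total, count = combine((total, count), p)
    return total / count

# The same data gives the same mean regardless of how partials are grouped.
left = combine((1.0, 1), (2.0, 1))   # points 1 and 2 pre-aggregated
right = (3.0, 1)                     # point 3
print(mean_of([left, right]))  # 2.0
```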

Re: What is the equivalent of Spark RDD in Flink

2015-12-28 Thread Filip Łęczycki
Hi Aljoscha, Sorry for going a little off-topic, but I wanted to clarify whether my understanding is right. You said that "Contrary to Spark, a Flink job is executed lazily", however, as I read in available sources, for example http://spark.apache.org/docs/latest/programming-guide.html, chapter "RDD Operations…
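A toy sketch of the distinction Filip is asking about (plain Python, illustrative names only): Spark is also lazy per job, but an action inside a loop submits one job per iteration, whereas deferring the action until after the loop submits everything as a single job.

```python
execution_count = 0

def run_job(data, ops):
    """Materialize data through ops; counts how often a job is launched."""
    global execution_count
    execution_count += 1
    for f in ops:
        data = [f(x) for x in data]
    return data

# "Spark-style" iteration: an action per loop step -> one job per iteration.
data = [1, 2, 3]
for _ in range(3):
    data = run_job(data, [lambda x: x + 1])   # action inside the loop
assert execution_count == 3

# "Flink-style" iteration: the whole loop folded into one deferred plan.
execution_count = 0
ops = [lambda x: x + 1] * 3
result = run_job([1, 2, 3], ops)              # a single job submission
assert execution_count == 1
print(result)  # [4, 5, 6]
```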

Understanding Kmeans in Flink

2015-12-28 Thread Hajira Jabeen
Hello everyone, I am trying to understand KMeans in Flink, Scala. I can see that the attached KMeans snippet (taken from the Flink examples) updates centroids: in (1) the map function assigns points to centroids, in (3) centroids are grouped by their ids, in (4) the x and y coordinates are added…
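A minimal plain-Python sketch of step (1) described above, the assignment of each point to its nearest centroid (not the Flink DataSet API; names are illustrative):

```python
import math

def nearest_centroid(point, centroids):
    """centroids: {id: (x, y)}; returns the id with minimum Euclidean
    distance to the given point."""
    return min(centroids, key=lambda cid: math.dist(point, centroids[cid]))

centroids = {0: (0.0, 0.0), 1: (10.0, 10.0)}
points = [(1.0, 1.0), (9.0, 8.0), (0.0, 2.0)]

# Step (1): tag every point with the id of its closest centroid; steps (3)
# onward would then group by this id and average the coordinates.
assigned = [(nearest_centroid(p, centroids), p) for p in points]
print(assigned)  # [(0, (1.0, 1.0)), (1, (9.0, 8.0)), (0, (0.0, 2.0))]
```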