RE: TimeWindow not getting last elements any longer with flink 1.0 vs 0.10.1

LINZ, Arnaud Tue, 15 Mar 2016 04:03:08 -0700

Hi,

All right… I find this new behavior dangerous: if you use processing time 
windows, you will always miss the last elements of a source that does not 
last forever.
I’ve created a source wrapper that sleeps after emitting the last element so 
that unit tests that use processing time work.
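
For readers who want to try the same workaround, here is a minimal sketch of the idea in plain Java. It is not the wrapper mentioned above and it is not a Flink API: the class name LingeringIterator and the lingerMillis parameter are hypothetical, and wiring it into a Flink source is left out. It only shows the principle, which is to pause once after the last element so that the wall clock can reach the end of the open window before the source reports that it is finished.

```java
import java.util.Iterator;

/**
 * Hypothetical helper (not part of Flink): wraps an iterator and sleeps once
 * after the last element before reporting that it is exhausted.
 */
final class LingeringIterator<T> implements Iterator<T> {
    private final Iterator<T> inner;
    private final long lingerMillis;
    private boolean lingered = false;

    LingeringIterator(Iterator<T> inner, long lingerMillis) {
        this.inner = inner;
        this.lingerMillis = lingerMillis;
    }

    @Override
    public boolean hasNext() {
        if (inner.hasNext()) {
            return true;
        }
        if (!lingered) {
            lingered = true;
            try {
                // Give pending processing-time windows a chance to fire.
                Thread.sleep(lingerMillis);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and stop waiting.
                Thread.currentThread().interrupt();
            }
        }
        return false;
    }

    @Override
    public T next() {
        return inner.next();
    }
}
```

In a test, the linger time would have to be longer than the window size (for example a bit more than 3 seconds for a 3-second window), which makes the test slower but deterministic.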


Cheers,
Arnaud


From: Till Rohrmann [mailto:trohrm...@apache.org]
Sent: Monday, 14 March 2016 15:11
To: user@flink.apache.org
Subject: Re: TimeWindow not getting last elements any longer with flink 1.0 vs 
0.10.1


Hi Arnaud,

with version 1.0 the behaviour of window triggering for finite streams was 
slightly changed. If you use event time, then all unfinished windows are 
triggered when your stream ends. The motivation is that the end of a stream 
is equivalent to knowing that no more elements will arrive until the maximum 
time (infinity) has been reached. This knowledge allows you to emit a 
Long.MaxValue watermark when an event time stream is finished, which will 
trigger all lingering windows.
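
To make the event time case concrete, here is a small plain-Java model of it. This is a sketch under assumptions, not Flink code: the class EventTimeWindowModel and its methods are hypothetical, and the rule "a window fires once the watermark reaches its last timestamp" is an assumption of the model. It shows why a final watermark of Long.MAX_VALUE flushes every window that is still open.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical model of tumbling event time windows (not the Flink API). */
final class EventTimeWindowModel {
    private final long sizeMillis;
    // Open windows, keyed by window start timestamp.
    private final TreeMap<Long, List<String>> windows = new TreeMap<>();
    private final List<List<String>> fired = new ArrayList<>();

    EventTimeWindowModel(long sizeMillis) {
        this.sizeMillis = sizeMillis;
    }

    /** Assigns the element to its tumbling window (timestamps assumed non-negative). */
    void onElement(String value, long timestamp) {
        long start = timestamp - (timestamp % sizeMillis);
        windows.computeIfAbsent(start, k -> new ArrayList<>()).add(value);
    }

    /** Fires every open window whose last timestamp is covered by the watermark. */
    void onWatermark(long watermark) {
        while (!windows.isEmpty()) {
            Map.Entry<Long, List<String>> first = windows.firstEntry();
            long lastTimestamp = first.getKey() + sizeMillis - 1;
            if (lastTimestamp > watermark) {
                break;
            }
            fired.add(windows.pollFirstEntry().getValue());
        }
    }

    /** End of a finite stream: no element can arrive before "infinity". */
    void onEndOfStream() {
        onWatermark(Long.MAX_VALUE);
    }

    List<List<String>> fired() {
        return fired;
    }
}
```

With 3-second windows, elements at timestamps 0, 1000 and 4000 sit in two open windows; an intermediate watermark of 1500 fires nothing, while the end-of-stream watermark fires both.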

In contrast to event time, you cannot say the same about a finished processing 
time stream. There we do not reason about windows with a logical time but with 
the actual processing time. When a stream finishes, we cannot fast forward the 
processing time to a point where the windows will fire. This can only happen 
if we keep the operators alive until the wall clock tells us that it’s time 
to fire the windows. However, there is no such feature implemented yet in 
Flink.
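
The difference can be illustrated with a plain-Java model of a wall-clock timer. Again, this is only a sketch under assumptions and not how Flink is implemented: ProcessingTimeWindowModel and its keepAlive flag are hypothetical, and a ScheduledExecutorService stands in for the operator's processing time timers. Shutting the executor down immediately drops the pending timer, whereas keeping it alive until the wall clock reaches the end of the window lets the window fire.

```java
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical model of one processing time window (not the Flink API). */
final class ProcessingTimeWindowModel {

    /**
     * Buffers all elements in one window, then ends the "job".
     * Returns the number of elements the window emitted.
     */
    static int run(List<String> elements, long windowMillis, boolean keepAlive) {
        ScheduledExecutorService timers = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger emitted = new AtomicInteger();

        // The window can only fire when the wall clock reaches its end.
        timers.schedule(() -> {
            emitted.addAndGet(elements.size());
        }, windowMillis, TimeUnit.MILLISECONDS);

        if (keepAlive) {
            // Stay alive until the pending timer has run.
            timers.shutdown();
        } else {
            // The source finished: pending timers are dropped.
            timers.shutdownNow();
        }
        try {
            timers.awaitTermination(windowMillis + 5000L, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return emitted.get();
    }
}
```

Calling run with keepAlive set to false models the 1.0 behaviour described above, where the buffered elements are never emitted; keepAlive set to true models the feature that is described as not implemented yet.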

I hope this helps you to understand the failing test cases.

Cheers,
Till


On Mon, Mar 14, 2016 at 1:14 PM, LINZ, Arnaud 
<al...@bouyguestelecom.fr> wrote:
Hello,

I’ve switched my Flink version from 0.10.1 to 1.0 and I have a regression in 
some of my unit tests.

To narrow down the problem, here is what I’ve figured out:


- I use a simple streaming application with a source defined as 
  fromElements("Element 1", "Element 2", "Element 3")

- I use a simple time window function with a 3-second window: 
  timeWindowAll(Time.seconds(3))

- I use an apply() function that counts the total number of elements I 
  get with a global counter

With the previous version, I got all three elements, not because the window 
was triggered within 3 seconds, but because the source ended.
With the 1.0 version, I don’t get any elements, and that’s annoying because 
when the source ends the application ends too, even if I sleep 5 seconds 
after the execute() method.

(If I replace fromElements with fromCollection and a 10000-element list, and 
Time.seconds(3) with Time.milliseconds(1), I get a random number of elements.)

Is this behavior intended? If so, how do I get my last elements now?

Best regards,
Arnaud




