😊
https://issues.apache.org/jira/browse/FLINK-7473 ________________________________ From: Steve Jerman <st...@kloudspot.com> Sent: Thursday, August 17, 2017 11:34:09 AM To: Nico Kruber; user@flink.apache.org Subject: Re: Question about Global Windows. Thank you Nico. I *think* I should have one stream per key... the stream I get is pretty fast and there may be some corner cases I'm not aware of. However, I really need to process as a single window per key. I am worried about the cardinality of the key ... I wanted to use a timeout to remove the window for a key. If not the memory requirements would grow quickly (which I think is what is happening). The stream has 60K unique keys per 5 minutes window (maybe 1/2 million total unique per day...). Anyway I'll write a test to investigate further... Thanks for your thoughts Steve ________________________________ From: Nico Kruber <n...@data-artisans.com> Sent: Wednesday, August 16, 2017 3:22:41 AM To: user@flink.apache.org Cc: Steve Jerman Subject: Re: Question about Global Windows. Hi Steve, are you sure a GlobalWindows assigner fits your needs? This may be the case if all your events always come in order and you do not ever have overlapping sessions since a GlobalWindows assigner simply puts all events (per key) into a single window (per key). If you have overlapping sessions, you may need your own window assigner that handles multiple windows (see the EventTimeSessionWindows assigner for our take on event-time session windows). Regarding the timer: if you set it via `#registerEventTimeTimer()`, it only fires if a watermark passes the given timestamp, so you need to make sure your sources create them (see [1] and its sub-topics). Depending on your further constraints in your application, it may be ok to use `registerProcessingTimeTimer()` instead. Does this help already? If not, we'd need some (minimal) example of how your using these things to debug further into your memory issues. Nico [1] https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/ Apache Flink 1.3 Documentation: Application Development<https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/> ci.apache.org Application Development; Application Development event_time.html On Monday, 14 August 2017 06:32:01 CEST Steve Jerman wrote: > Hi Folks, > > I have a question regarding Global Windows. > > I have a stream with a large number of records. The records have a key which > has a very high cardinality. They also have a state ( start, status, > finish). > I need to do some processing where I look at the records separated into > windows using the ‘state’ property. > From the documentation, I believe I should be using a Global Window with a > custom trigger to identify the windows…. > I have this implemented.. the Trigger returns ‘CONTINUE” for ‘start', and > FIRE_AND_PURGE for ‘finish'. > I also need to avoid running out of memory since sometimes I don’t get > ‘finish’ records… so I added a timer to the Trigger which PURGE’s if it > fires.. > Is this the correct approach? > > I say this since I do in fact see a memory leak … is there anything else I > need to be aware of? > Thanks > > Steve