Yes, please go ahead. That would be helpful.

On Mon, 2 May 2016 at 21:56 Christopher Santiago <ch...@ninjametrics.com> wrote:
> Hi Aljoscha,
>
> Yes, there is still a high partition/window count since I have to keyBy
> the userid so that I get unique users. I believe what I see happening is
> that the second window with the timeWindowAll is not getting all the
> results, or the results from the previous window are changing while the
> second window is running. I can see the date/unique user count increase
> and decrease as it is running for a particular day.
>
> I can share the Eclipse project and the sample data file I am working off
> of with you if that would be helpful.
>
> Thanks,
> Chris
>
> On Mon, May 2, 2016 at 12:55 AM, Aljoscha Krettek [via Apache Flink User
> Mailing List archive.] <[hidden email]> wrote:
>
>> Hi,
>> what do you mean by "still experiencing the same issues"? Is the key
>> count still very high, i.e. 500k windows?
>>
>> For the watermark generation, specifying a lag of 2 days is very
>> conservative. If the watermark is this conservative I guess there will
>> never arrive elements that are behind the watermark, thus you wouldn't need
>> the late-element handling in your triggers. The late-element handling in
>> Triggers is only required to compensate for the fact that the watermark can
>> be a heuristic and not always correct.
>>
>> Cheers,
>> Aljoscha
>>
>> On Thu, 28 Apr 2016 at 21:24 Christopher Santiago <[hidden email]> wrote:
>>
>>> Hi Aljoscha,
>>>
>>> Aljoscha Krettek wrote
>>> >>Is there a reason for keying on both the "date only" field and the
>>> >>"userid"? I think you should be fine by just specifying that you want
>>> >>1-day windows on your timestamps.
>>>
>>> My mistake, this was from earlier tests that I had performed. I removed it
>>> and went to keyBy(2) and I am still experiencing the same issues.
>>>
>>> Aljoscha Krettek wrote
>>> >>Also, do you have a timestamp extractor in place that takes the timestamp
>>> >>from your data and sets it as the internal timestamp field.
>>>
>>> Yes there is, it is from the BoundedOutOfOrdernessGenerator example:
>>>
>>> public static class BoundedOutOfOrdernessGenerator implements
>>>         AssignerWithPeriodicWatermarks<Tuple3<DateTime, String, String>> {
>>>     private static final long serialVersionUID = 1L;
>>>     private final long maxOutOfOrderness = Time.days(2).toMilliseconds();
>>>     private long currentMaxTimestamp;
>>>
>>>     @Override
>>>     public long extractTimestamp(Tuple3<DateTime, String, String> element,
>>>             long previousElementTimestamp) {
>>>         long timestamp = element.f0.getMillis();
>>>         currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp);
>>>         return timestamp;
>>>     }
>>>
>>>     @Override
>>>     public Watermark getCurrentWatermark() {
>>>         return new Watermark(currentMaxTimestamp - maxOutOfOrderness);
>>>     }
>>> }
>>>
>>> Thanks,
>>> Chris
>>>
>>> --
>>> View this message in context:
>>> http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Multiple-windows-with-large-number-of-partitions-tp6521p6562.html
>>> Sent from the Apache Flink User Mailing List archive at Nabble.com.
>
> ------------------------------
> View this message in context: Re: Multiple windows with large number of
> partitions
> <http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Multiple-windows-with-large-number-of-partitions-tp6521p6626.html>
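
For reference, the effect of the 2-day lag discussed in the quoted messages can be worked through with plain arithmetic. The sketch below does not use Flink at all; the class and method names are made up for illustration. It assumes epoch-aligned 1-day tumbling windows, non-negative timestamps, and that an event-time window counts as complete once the watermark reaches the last timestamp inside it (window end minus 1 ms). Under those assumptions, the window for a given day only completes after an element at least two days past the end of that day has been seen.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Flink: mirrors the arithmetic of the
// quoted BoundedOutOfOrdernessGenerator with a 2-day lag.
public class WatermarkLagSketch {
    public static final long DAY_MS = TimeUnit.DAYS.toMillis(1);
    public static final long MAX_OUT_OF_ORDERNESS_MS = 2 * DAY_MS;

    // Same formula as getCurrentWatermark() in the quoted generator.
    public static long watermark(long currentMaxTimestamp) {
        return currentMaxTimestamp - MAX_OUT_OF_ORDERNESS_MS;
    }

    // Start of the epoch-aligned 1-day window containing the timestamp
    // (assumes timestamp >= 0).
    public static long windowStart(long timestamp) {
        return timestamp - (timestamp % DAY_MS);
    }

    // Assumption: a window [start, start + 1 day) is complete once the
    // watermark has reached its last timestamp, start + 1 day - 1 ms.
    public static boolean windowComplete(long windowStart, long watermark) {
        long lastTimestampInWindow = windowStart + DAY_MS - 1;
        return watermark >= lastTimestampInWindow;
    }

    public static void main(String[] args) {
        long day0 = windowStart(0L);
        // Walk the maximum seen timestamp forward one day at a time and
        // report whether the day-0 window would be complete.
        for (int d = 0; d <= 4; d++) {
            long maxSeen = d * DAY_MS;
            System.out.println("max seen = day " + d
                    + ", day-0 window complete: "
                    + windowComplete(day0, watermark(maxSeen)));
        }
    }
}
```

The point of the sketch is only the delay: with this lag, results for a day are held back until data from roughly two days later has arrived, which is why late-element handling should rarely be exercised.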