Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-18 Thread John Stone
Fabian, I believe so, yes. Many thanks, John

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-18 Thread Fabian Hueske
Hi John, Just to clarify, this missing data is due to the starting overhead and not due to a bug? Best, Fabian 2018-09-18 15:35 GMT+02:00 John Stone : > Thank you all for your assistance. I believe I've found the root cause of > the behavior I am seeing. > > If I just use "SELECT * FROM MyEven

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-18 Thread John Stone
Thank you all for your assistance. I believe I've found the root cause of the behavior I am seeing. If I just use "SELECT * FROM MyEventTable" (Fabian's question), I find that events received in the first 3 seconds are ignored, as opposed to the original 5. What I'm seeing seems to suggest that

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Xingcan Cui
Hi John, I suppose that was caused by the groupBy field “timestamp”. You were actually grouping on two time fields simultaneously, the processing time and the time from your producer. As @Rong suggested, try removing the additional groupBy field “timestamp” and check the result again. Best, Xi
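
The effect Xingcan describes can be illustrated with a minimal Python sketch. It assumes a windowed count keyed by processing time; the helper `count_per_group` and the field name `proctime` are hypothetical, while `timestamp` and `userId` are the field names mentioned in the thread. It is not Flink code, only an illustration of what an extra GROUP BY key does to the result.

```python
from collections import Counter

def count_per_group(events, window_ms, extra_keys=()):
    """Count events per (window start, extra key values) group."""
    counts = Counter()
    for event in events:
        # Align the processing time to the start of its tumbling window.
        window_start = event["proctime"] - event["proctime"] % window_ms
        key = (window_start,) + tuple(event[k] for k in extra_keys)
        counts[key] += 1
    return dict(counts)

# Hypothetical events: one per second, each with its own producer timestamp.
events = [
    {"proctime": 1000, "timestamp": 990, "userId": "a"},
    {"proctime": 2000, "timestamp": 1990, "userId": "a"},
    {"proctime": 3000, "timestamp": 2990, "userId": "a"},
]

by_window = count_per_group(events, 5000)
by_window_and_ts = count_per_group(events, 5000, extra_keys=("timestamp",))
```

Grouping by the window alone yields a single group holding all three events. Adding the per-event `timestamp` as a key splits the same window into one group per distinct timestamp, each with a count of one.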

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Rong Rong
This is in fact a very strange behavior. To add to the discussion, when you mentioned: "raw Flink (windowed or not) nor when using Flink CEP", how were the comparisons being done? Also, were you able to get the results correct without the additional GROUP BY term of "foo" or "userId"? -- Rong On

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Fabian Hueske
Hmm, that's interesting. HOP and TUMBLE window aggregations are directly translated into their corresponding DataStream counterparts (Sliding, Tumble). There should be no filtering of records. I assume you tried a simple query like "SELECT * FROM MyEventTable" and received all expected data? Fabi
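
Fabian's point that window assignment does not filter records can be illustrated with a minimal Python sketch of tumbling and sliding window assignment. The function names are hypothetical, and the start-of-window arithmetic is an assumption modelled on epoch-aligned time windows, not a copy of Flink's implementation.

```python
def window_start(timestamp_ms, size_ms, offset_ms=0):
    """Start of the epoch-aligned window of length size_ms containing timestamp_ms."""
    return timestamp_ms - (timestamp_ms - offset_ms + size_ms) % size_ms

def tumble_window(timestamp_ms, size_ms):
    """A tumbling window assigns each timestamp to exactly one window."""
    start = window_start(timestamp_ms, size_ms)
    return (start, start + size_ms)

def hop_windows(timestamp_ms, size_ms, slide_ms):
    """A sliding (hopping) window assigns each timestamp to every overlapping window."""
    windows = []
    start = window_start(timestamp_ms, slide_ms)
    # Walk backwards one slide at a time while the window still covers the timestamp.
    while start > timestamp_ms - size_ms:
        windows.append((start, start + size_ms))
        start -= slide_ms
    return windows
```

Under these assumptions every timestamp, including those in the first few seconds of a stream, lands in at least one window, so any missing records would have to be lost somewhere other than in the assignment step.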

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread elliotstone
Yes, I am certain events are being ignored or dropped during the first five seconds. Further investigation on my part reveals that the "ignore" period is exactly the first five seconds of the stream, regardless of the size of the window. Situation: I have a script which pushes an event into K

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Xingcan Cui
Hi John, I’ve not dug into this yet, but IMO, it shouldn’t be the case. I just wonder how you determined that the data in the first five seconds is not processed by the system? Best, Xingcan > On Sep 17, 2018, at 11:21 PM, John Stone wrote: > > Hello, > > I'm checking if this is intentional

Re: Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread Fabian Hueske
Hi John, Are you sure that the first rows of the first window are dropped? When a query with processing time windows is terminated, the last window is not computed. This is in fact intentional and does not apply to event-time windows. Best, Fabian 2018-09-17 17:21 GMT+02:00 John Stone : > Hello,

Potential bug in Flink SQL HOP and TUMBLE operators

2018-09-17 Thread John Stone
Hello, I'm checking if this is intentional or a bug in Apache Flink SQL (Flink 1.6.0). I am using processing time with a RocksDB backend. I have not checked if this issue is also occurring in the Table API. I have not checked if this issue also exists for event time (although I suspect it does)