Re: Event time issues when Kinesis consumer receives batched messages

2020-12-09 Thread Randal Pitt
Thanks Roman, I'll look into how I go about doing that.

Re: Event time issues when Kinesis consumer receives batched messages

2020-12-08 Thread Khachatryan Roman
Thanks, Randal. Yes, I think the only way is to partition the stream the same way as Kinesis does (as I wrote before). Regards, Roman On Tue, Dec 8, 2020 at 1:38 PM Randal Pitt wrote: > Hi Roman, > > We're using a custom watermarker that uses a histogram to calculate a "best > fit" event time

Re: Event time issues when Kinesis consumer receives batched messages

2020-12-08 Thread Randal Pitt
Hi Roman, We're using a custom watermarker that uses a histogram to calculate a "best fit" event time, as the data we receive can be very unordered. As you can see, we're using the timestamp from the first event in the batch, so we're essentially sampling the timestamps rather than using them all.
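
For readers following the thread, the idea of a histogram-based "best fit" watermark can be sketched in plain Python. This is not Flink API code and not Randal's actual implementation: the class name `HistogramWatermark`, the bucket width, the quantile, and the `ts` field are all assumptions made for illustration.

```python
from collections import Counter


class HistogramWatermark:
    """Estimates a watermark as a low quantile of the sampled timestamps.

    Hypothetical sketch: bucket_ms and quantile are illustrative defaults.
    """

    def __init__(self, bucket_ms=1000, quantile=0.1):
        self.bucket_ms = bucket_ms
        self.quantile = quantile
        self.buckets = Counter()  # bucket index -> number of samples
        self.total = 0

    def observe(self, timestamp_ms):
        # Record one sampled timestamp in its histogram bucket.
        self.buckets[timestamp_ms // self.bucket_ms] += 1
        self.total += 1

    def current_watermark(self):
        # Return the start of the first bucket at which the cumulative
        # count reaches the requested quantile, or None if nothing was seen.
        if self.total == 0:
            return None
        threshold = self.quantile * self.total
        seen = 0
        for bucket in sorted(self.buckets):
            seen += self.buckets[bucket]
            if seen >= threshold:
                return bucket * self.bucket_ms
        return None  # only reachable if quantile > 1


def sample_batch(watermark, batch):
    # Feed only the first event's timestamp of a batch, as described above.
    if batch:
        watermark.observe(batch[0]["ts"])
```

With this kind of sampling, an event later in the batch can carry a timestamp older than the estimated watermark. For example, after `sample_batch(wm, [{"ts": 5000}, {"ts": 100}])` the estimate is based on 5000 alone, so the second event may be treated as late.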

Re: Event time issues when Kinesis consumer receives batched messages

2020-12-08 Thread Khachatryan Roman
Hi Randal, Can you share the code for the 1st approach (FlinkKinesisConsumer.setPeriodicWatermarkAssigner)? I think the 2nd approach (flatMap) can be improved by partitioning the stream the same way Kinesis does (i.e. same partition key). Regards, Roman On Mon, Dec 7, 2020 at 2:44 PM Randal Pi
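
As background for "the same way Kinesis does": Kinesis maps a partition key to a 128-bit integer with MD5 and routes the record to the shard whose hash key range contains that value. The plain-Python sketch below shows that mapping. The helper names and the assumption of evenly split shard ranges are illustrative only, and Flink's own `keyBy` uses a different hash, so this shows the Kinesis side of the picture rather than a Flink implementation.

```python
import hashlib

HASH_SPACE = 2 ** 128  # Kinesis hash keys are 128-bit integers


def kinesis_hash_key(partition_key):
    # MD5 of the UTF-8 partition key, read as a big-endian integer.
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_index(partition_key, shard_count):
    # Assumption: the shards split the hash space into equal ranges.
    # Real streams can have uneven ranges after splits and merges.
    return kinesis_hash_key(partition_key) * shard_count // HASH_SPACE


def group_by_partition_key(events):
    # Keep events with the same partition key together, in arrival order.
    groups = {}
    for event in events:
        groups.setdefault(event["key"], []).append(event)
    return groups
```

The point of keying the flat-mapped stream by the same partition key is that all events for one key stay together, so their relative order and timestamps are handled by a single downstream instance.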

Event time issues when Kinesis consumer receives batched messages

2020-12-07 Thread Randal Pitt
Hi there, We're using Flink to read from a Kinesis stream. The stream contains messages that themselves contain lists of events and we want our Flink jobs (using the event time characteristic) to process those events individually. We have this working using flatMap in the DataStream but we're havi
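
The flatMap step described here, turning one batched message into individual events, can be sketched in plain Python. This is only a conceptual illustration, not the job's actual code: the message layout (an `events` list whose items carry `id` and `ts` fields) is an assumption.

```python
def split_batch(message):
    """Yields each event of a batched message as its own record."""
    for event in message.get("events", []):
        yield event


def flat_map(messages):
    # Equivalent in spirit to a flatMap: zero or more outputs per input.
    return [event for message in messages for event in split_batch(message)]


# Hypothetical batched messages as they might arrive from the stream.
messages = [
    {"events": [{"id": 1, "ts": 10}, {"id": 2, "ts": 5}]},
    {"events": []},
    {"events": [{"id": 3, "ts": 7}]},
]
events = flat_map(messages)
```

Note that the events come out in batch order, not timestamp order, which is why the choice of timestamp and watermark for each batch matters downstream.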