Re: Using event time with Python DataStreamAPI

2021-08-04 Thread Ignacio Taranto
That's what I thought Dian. The problem is that setting the watermark strategy like that didn't work either, the method on_event_time is never called. I did some reading of https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/event-time/generating_watermarks/#watermark

Re: Using event time with Python DataStreamAPI

2021-08-03 Thread Dian Fu
Hi Ignacio, Yes, you are right that you need to define the watermark strategy explicitly in case of event time processing. Regarding to *with_timestamp_assigner*, this is optional. If you don’t define it, it will generate watermark according to the timestamp extracted from the Kafka record (Co

Re: Using event time with Python DataStreamAPI

2021-08-03 Thread Ignacio Taranto
I assumed that the event time and watermarks were already handled by the Kafka connector. So, basically, I need to do something like: stream.assign_timestamps_and_watermarks( WatermarkStrategy.for_bounded_out_of_orderness(Duration.of_seconds(30)) ) Do I also need to set the timestamps myself by

Re: Using event time with Python DataStreamAPI

2021-08-03 Thread Ignacio Taranto
I'm trying to so a simple word count example, here's the main function: def main(): parser = argparse.ArgumentParser() parser.add_argument('--kafka-clients-jar', required=True) parser.add_argument('--flink-connector-kafka-jar', required=True) args = parser.parse_args() env =

Re: Using event time with Python DataStreamAPI

2021-08-02 Thread Dian Fu
Regarding "Kafka consumer doesn’t read any message”: I’m wondering about this. Usually the processing logic should not affect the Kafka consumer. Did you judge this as there is no output for the job? If so, I’m guessing that it’s because the window wasn’t triggered in case of event-time. Could