Re: PyFlink Performance

2021-11-17 Thread Dian Fu
Hi, is it possible to benchmark the first map on its own (not the whole job)? That would give you a basic understanding of whether the map implementation is the problem. Besides the map implementation, there is also some overhead introduced by the framework, e.g. the Java and Python process com
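A minimal sketch of benchmarking a map function in isolation, outside of Flink, using only the Python standard library. The `first_map` body is a hypothetical stand-in (the real map logic is not shown in this thread), and `benchmark` is an illustrative helper, not a PyFlink API. Comparing the per-record time measured here with the per-record time of the running job could give a rough idea of how much is spent in the map itself versus framework overhead.

```python
import time


def first_map(record):
    # Hypothetical stand-in for the real first map; replace with the actual logic.
    return record.upper()


def benchmark(fn, records, repeats=5):
    """Return the best observed average time per record, in seconds."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for record in records:
            fn(record)
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return best / len(records)
```

For example, `benchmark(first_map, sample_records)` with a list of representative input records returns seconds per record; its reciprocal is an upper bound on the throughput that a single parallel instance of that map could reach.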

Re: PyFlink SQL window aggregation data not written out to file

2021-11-17 Thread Guoqin Zheng
Hi Roman, Thanks for the detailed explanation. I did try 1.13 and 1.14, but it still didn't work. I explicitly enabled checkpointing with `env.enable_checkpointing(10)`. Are there any other configurations I need to set? Thanks, -Guoqin On Wed, Nov 17, 2021 at 4:30 AM Roman Khachatryan wrote: > Hi Gu

PyFlink Performance

2021-11-17 Thread Thomas Portugal
Hello community, My team is developing an application using PyFlink. We are using the DataStream API. Basically, we read from a Kafka topic, apply some maps, and write to another Kafka topic. One restriction is the first map, which has to be serialized and run with parallelism equal to one. This

Re: Fabric8 does not support EC keys

2021-11-17 Thread Nicolás Ferrario
Hi Yang, after looking at the source code I tried this other env and this time it worked! It's not failing because of a missing jar but I can add it manually. *export KUBERNETES_CERTS_CLIENT_KEY_ALGO=EC* [ec2-user@ip-10-150-120-176 ~]$ export KUBERNETES_CERTS_CLIENT_KEY_ALGO=EC > > [ec2-user@ip-

Re: PyFlink SQL window aggregation data not written out to file

2021-11-17 Thread Roman Khachatryan
Hi Guoqin, Thanks for the clarification. Processing time windows actually don't need watermarks: they fire when the window end time arrives. But the job will likely finish earlier because of the bounded input. Handling of this case was improved in 1.14 as part of FLIP-147, as well as in previous versi
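The interaction described above can be illustrated with a toy model in plain Python. This is a sketch under stated assumptions, not Flink's implementation: windows are assumed to be tumbling and aligned to multiples of the window size, and `window_bounds` and `fired_windows` are hypothetical helper names. A window only produces output if its end time is reached before the job ends, so a bounded job that finishes quickly may emit nothing.

```python
def window_bounds(ts_ms, size_ms):
    """Return (start, end) of the tumbling window containing ts_ms (toy model)."""
    start = ts_ms - (ts_ms % size_ms)
    return start, start + size_ms


def fired_windows(arrival_times_ms, size_ms, job_end_ms):
    """Count records per window, keeping only windows whose end time
    was reached before the job finished (assumption: no final flush)."""
    counts = {}
    for ts in arrival_times_ms:
        bounds = window_bounds(ts, size_ms)
        counts[bounds] = counts.get(bounds, 0) + 1
    return {w: n for w, n in counts.items() if w[1] <= job_end_ms}
```

In this model, records arriving at 100 ms and 200 ms with a 5000 ms window yield no output if the job ends at 300 ms, because the window end (5000 ms) is never reached.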