Re: PyFlink Performance

2021-12-21 Thread Francis Conroy
Hi Dian,

I'll build up something similar and post it; my current test code contains proprietary information.

On Wed, 22 Dec 2021 at 14:49, Dian Fu wrote:
> Hi Francis,
>
> Could you share the benchmark code you use?
>
> Regards,
> Dian
>
> On Wed, Dec 22, 2021 at 11:31 AM Francis Conroy <
> fran

Re: PyFlink Performance

2021-12-21 Thread Dian Fu
Hi Francis,

Could you share the benchmark code you use?

Regards,
Dian

On Wed, Dec 22, 2021 at 11:31 AM Francis Conroy <francis.con...@switchdin.com> wrote:
> I've just run an analysis using a similar example which involves a single
> python flatmap operator and we're getting 100x less through

Re: PyFlink Performance

2021-12-21 Thread Francis Conroy
I've just run an analysis using a similar example, which involves a single Python flatmap operator, and we're getting 100x less throughput using Python than Java. I'm interested to know if you can do such a comparison. I'm using Flink 1.14.0.

Thanks,
Francis

On Thu, 18 Nov 2021 at 02:20, Thomas Portu

Re: PyFlink Performance

2021-11-17 Thread Dian Fu
Hi,

Is it possible to benchmark just the first map (not the whole job)? That would give you a basic understanding of whether the map implementation is the problem. Besides the map implementation, there is also some overhead introduced by the framework, e.g. the Java and Python process com
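One way to act on this suggestion is to time the map function on its own, outside Flink, so that its cost can be separated from framework overhead. The following is only a minimal sketch in plain Python: `first_map` is a hypothetical stand-in for the job's real first map, and `measure_throughput` and the sample records are illustrative names that do not come from the thread.

```python
import time


def first_map(record: str) -> str:
    # Hypothetical placeholder: substitute the job's real first map here.
    return record.upper()


def measure_throughput(func, records, repeats=3):
    """Return the best observed rate, in records per second, over `repeats` runs."""
    best_elapsed = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for record in records:
            func(record)
        elapsed = time.perf_counter() - start
        best_elapsed = min(best_elapsed, elapsed)
    # Guard against a zero reading on very small inputs or coarse timers.
    if best_elapsed <= 0:
        return float("inf")
    return len(records) / best_elapsed


if __name__ == "__main__":
    # Synthetic input; real records from the source topic would be more representative.
    sample = [f"event-{i}" for i in range(100_000)]
    rate = measure_throughput(first_map, sample)
    print(f"first_map alone: {rate:,.0f} records/s")
```

If the standalone rate is far above what the job achieves, the bottleneck is more likely in the framework path than in the map itself.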

PyFlink Performance

2021-11-17 Thread Thomas Portugal
Hello community,

My team is developing an application using PyFlink with the DataStream API. Basically, we read from a Kafka topic, apply some maps, and write to another Kafka topic. One restriction is that the first map has to be serialized, with parallelism equal to one. This