Hi Xingbo, all,

That is good to know, thank you. Is there any Jira issue I can track? I'm
curious to follow this progress! Do you have any recommendations with
regard to these two configuration values, to get somewhat reasonable
performance?

Thanks a lot!
Wouter

On Thu, 8 Jul 2021 at 10:26, Xingbo Huang <hxbks...@gmail.com> wrote:

> Hi Wouter,
>
> In fact, our users have encountered the same problem. Whenever the `bundle
> size` or `bundle time` is reached, the data in the buffer needs to be sent
> from the jvm to the pvm, and then waits for the pym to be processed and
> sent back to the jvm to send all the results to the downstream operator,
> which leads to a large delay, especially when it is a small size event as
> small messages are hard to be processed in pipeline.
>
> I have been solving this problem recently and I plan to make this
> optimization to release-1.14.
>
> Best,
> Xingbo
>
> Wouter Zorgdrager <zorgdrag...@gmail.com> 于2021年7月8日周四 下午3:41写道:
>
>> Hi Dian, all,
>>
>>  I will come back to the other points asap. However, I’m still confused
>> about this performance. Is this what I can expect in PyFlink in terms of
>> performance? ~ 1000ms latency for single events? I also had a very simple
>> setup where I send 1000 events to Kafka per second and response
>> times/latencies was around 15 seconds for single events. I understand there
>> is some Python/JVM overhead but since Flink is so performant, I would
>> expect much better numbers. In the current situation, PyFlink would just be
>> unusable if you care about latency. Is this something that you expect to be
>> improved in the future?
>>
>> I will verify how this works out for Beam in a remote environment.
>>
>> Thanks again!
>> Wouter
>>
>>
>> On Thu, 8 Jul 2021 at 08:28, Dian Fu <dian0511...@gmail.com> wrote:
>>
>>> Hi Wouter,
>>>
>>> 1) Regarding the performance difference between Beam and PyFlink, I
>>> guess it’s because you are using an in-memory runner when running it
>>> locally in Beam. In that case, the code path is totally differently
>>> compared to running in a remote cluster.
>>> 2) Regarding to `flink run`, I’m surprising that it’s running locally.
>>> Could you submit a java job with similar commands to see how it runs?
>>> 3) Regarding to `flink run-application`, could you share the exception
>>> stack?
>>>
>>> Regards,
>>> Dian
>>>
>>> 2021年7月6日 下午4:58,Wouter Zorgdrager <zorgdrag...@gmail.com> 写道:
>>>
>>> uses
>>>
>>>
>>>

Reply via email to