Hi Jozef,
Yes, there is potential for overhead when running Beam pipelines on
different Runners. The Beam model has an execution framework which each
Runner uses in a slightly different way.
Timers in Flink, for example, are uniquely identified by a namespace and
a timestamp. In Beam, they are identified only by a namespace, and any
pending timer with that namespace is overwritten when a new timer for
the same namespace is set. To implement this on top of Flink, we have to
maintain a table of timers keyed by namespace; though it seems this did
not cause a slowdown in your case.
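To sketch the bookkeeping involved (a simplified illustration, not the
actual runner code; the class and method names below are made up):

    import java.util.HashMap;
    import java.util.Map;

    // One pending timer per namespace (Beam semantics), maintained on top of
    // a Flink-style timer service that keys timers by (namespace, timestamp).
    class TimerTable {
      private final Map<String, Long> pendingByNamespace = new HashMap<>();

      /** Returns the previously pending timestamp to delete in Flink, or null. */
      Long set(String namespace, long timestamp) {
        return pendingByNamespace.put(namespace, timestamp);
      }

      void fired(String namespace, long timestamp) {
        // clear only if this is still the pending timer for the namespace
        pendingByNamespace.remove(namespace, timestamp);
      }
    }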
I think it would be very helpful to compile a list of issues that could
slow down pipelines. How about filing JIRA issues for what you
discovered during profiling? We could use a "performance" tag for
discoverability. I'd be eager to investigate some of those.
Thanks,
Max
PS: We have performance regression tests:
https://apache-beam-testing.appspot.com/explore?dashboard=5699257587728384
On 29.04.19 12:47, Jozef Vilcek wrote:
Hello,
I am interested in any knowledge or thoughts on what the overhead of
running Beam pipelines is, or should be, compared to pipelines written
on the "bare runner". Is this something which is being tested or
investigated by the community? Is there a consensus on what bounds the
overhead should typically fall within? I realise this is very runner
specific, but certain things are also imposed by the SDK model itself.
I tested a simple streaming pipeline on Flink vs Beam-on-Flink and found
very noticeable differences. I want to stress that it was not a
performance test. The job does the following:
Read Kafka -> Deserialize to Proto -> Filter deserialisation errors ->
Reshuffle -> Report counter.inc() to metrics for throughput
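In Beam terms, the job looks roughly like the sketch below. This is only
an illustration: the bootstrap servers, topic name and the Event proto
class are placeholders, not the actual job's values.

    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.io.kafka.KafkaIO;
    import org.apache.beam.sdk.metrics.Counter;
    import org.apache.beam.sdk.metrics.Metrics;
    import org.apache.beam.sdk.options.PipelineOptionsFactory;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.transforms.Reshuffle;
    import org.apache.beam.sdk.transforms.Values;
    import org.apache.kafka.common.serialization.ByteArrayDeserializer;

    public class TestJob {
      public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        p.apply(KafkaIO.<byte[], byte[]>read()
                .withBootstrapServers("kafka:9092")   // placeholder
                .withTopic("events")                  // placeholder
                .withKeyDeserializer(ByteArrayDeserializer.class)
                .withValueDeserializer(ByteArrayDeserializer.class)
                .withoutMetadata())
         .apply(Values.<byte[]>create())
         .apply("DeserializeAndFilter", ParDo.of(new DoFn<byte[], Event>() {
             @ProcessElement
             public void process(ProcessContext c) {
               try {
                 c.output(Event.parseFrom(c.element()));  // Event: placeholder proto
               } catch (Exception e) {
                 // drop deserialization errors
               }
             }
           }))
         .apply(Reshuffle.viaRandomKey())
         .apply("CountThroughput", ParDo.of(new DoFn<Event, Void>() {
             private final Counter throughput = Metrics.counter("test", "throughput");
             @ProcessElement
             public void process(ProcessContext c) {
               throughput.inc();
             }
           }));
        p.run();
      }
    }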
Both jobs had the same configuration and the same state backend with the
same checkpointing strategy. What I noticed from a few simple test runs:
* first run was on Flink 1.5.0. From CPU profiles on one worker I found
that ~50% of the time was spent either on removing timers
from HeapInternalTimerService or in java.io.ByteArrayOutputStream from
CoderUtils.clone()
* the problem with timer deletes was addressed by FLINK-9423. I retested
on Flink 1.7.2 and not much time is spent on timer deletes now, but the
root cause was not removed. Timers are still frequently registered and
removed (I believe from ReduceFnRunner.scheduleGarbageCollectionTimer(),
in which case it is called per processed element?), which is noticeable
in GC activity, heap and state ...
* in Flink I use the FileSystem state backend, which keeps state in an
in-memory CopyOnWriteStateTable that after some time is full of PaneInfo
objects. Maybe they come from PaneInfoTracker activity
* Coder clone is painful. A pure Flink job copies between operators too,
in my case via Kryo.copy(), but this is not noticeable in the CPU
profile. Kryo.copy() copies on the object level, not object -> bytes ->
object, which is cheaper (see the sketch after this list)
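To illustrate the coder clone point: Beam's clone goes through a
serialization round trip, while Kryo copies the object graph directly.
A simplified sketch (coder, kryo and value stand in for the actual
instances used by the jobs):

    import com.esotericsoftware.kryo.Kryo;
    import org.apache.beam.sdk.coders.Coder;
    import org.apache.beam.sdk.coders.CoderException;
    import org.apache.beam.sdk.util.CoderUtils;

    class CloneSketch {
      // Beam-on-Flink: clone an element via a serialization round trip
      static <T> T beamStyleClone(Coder<T> coder, T value) throws CoderException {
        byte[] bytes = CoderUtils.encodeToByteArray(coder, value);  // object -> bytes
        return CoderUtils.decodeFromByteArray(coder, bytes);        // bytes -> object
      }

      // Plain Flink with Kryo: copy on the object graph, no byte[] detour
      static <T> T flinkStyleClone(Kryo kryo, T value) {
        return kryo.copy(value);
      }
    }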
Overall, my observation is that pure Flink can be roughly 3x faster.
I do not know exactly what I am trying to achieve here :) Probably just
to start a discussion and collect thoughts and other experiences on the
cost of running data processing on Beam with a particular runner.