Re: CoFlatMap has high back pressure

2020-05-20 Thread sundar
Thanks a lot for all the help! I was able to figure out the bug just now. I had some extra code in the coFlatMap function(emitting stats) which was inefficient and causing high GC usage. Fixing that fixed the issue. -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble

Re: CoFlatMap has high back pressure

2020-05-19 Thread Arvid Heise
Hi Sundar, in general, you wouldn't load the static data in the driver, but upon opening the map on the processing nodes. If your processing nodes could hold the data, it might be the easiest to switch to this pattern. You usually load it once per node across all subtasks by using some kind of sta

Re: CoFlatMap has high back pressure

2020-05-19 Thread sundar
Hi Guaowei, Here is what the code for my pipeline looks like. Class CoFlatMapFunc extends CoFlatMapFunction { ValueState cache; public void open(Configuration parameters){ //initialize cache } //read element from file and update cache. public void

Re: CoFlatMap has high back pressure

2020-05-19 Thread Guowei Ma
Hi Sundar, 1. Could you check the GC status of the process? or you could increase the memory size of your TM. (I find that you use the value state and I assume that you use the MemoryStatebackend) 2. AFAIK there is no performance limitation in using the `connect` operator for mixing the bounded/unb

CoFlatMap has high back pressure

2020-05-19 Thread sundar
Initially, we had a flink pipeline which did the following - Kafka source -> KeyBy ID -> Map -> Kafka Sink. We now want to enrich this pipeline with a side input from a file. Similar to the recommendation here

Re: CoFlatMap has high back pressure

2020-05-19 Thread Guowei Ma
Hi, Sundar 1. I think you might need to jstack the java process that is a bottleneck and find where the task is stuck. 2. Could you share the code that your job looked like? I think maybe it could help people to know what exactly happened. Best, Guowei sundar 于2020年5月20日周三 上午5:24写道: > Initial