Hi,
Yes as you correctly analysed parallelism 1 was causing problems, because it
meant that all of the records must been gathered over the network from all of
the task managers. Keep in mind that even if you increase parallelism to āpā,
every change in parallelism can slow down your application
Hello!
I found out that the cause of the problem was the map that I have after the
parallel join with parallelism 1.
When I changed it to .map(new MyMapMeter).setParallelism(p) then when I
increase the number of parallelism p the completion time decreases, which is
reasonable. Somehow it was a bot
Hello everyone!
I have implemented a custom parallel hashjoin algorithm (without windows
feature) in order to calculate the join of two input streams on a common
attribute using the CoFlatMap function and the state. After the join
operator (which has parallelism p = #processors) operator I have a