Hi all, I added a CoGroup to my batch job, and it’s now running much slower, primarily due to back pressure from the CoGroup operator.
I assume it’s because this operator has to sort and buffer to disk all incoming data. It looks like about 1TB from one side of the join; currently very little from the other, but it will be up to 2TB in the future.

Some observations:

- I don’t see lots of GC.
- I’m using about 60% of available network buffers.
- Per-TM server load (for all 8 servers) is about 40% on average.
- Both SSDs on each TM are being used for …/flink-io-xxx/yyy.channel files.

What are techniques for improving the performance of a CoGroup?

Thanks!

-- Ken

--------------------------
Ken Krugler
http://www.scaleunlimited.com
custom big data solutions & training
Hadoop, Cascading, Cassandra & Solr