Hello Flink Community, The same problem with record amplification as described in FLIP-415: Introduce a new join operator to support minibatch[1] exists for most of implementations of AbstractTopNFunction. Especially when the rank is provided to output. For example, when calculating Top100 with rank output, every input record might produce 100 -U records and 100 +U records.
According to my POC (which is similar to FLIP-415) the record amplification could be significantly reduced by using input or output buffer. What do you think if we implement such optimization for TopNFunctions? [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-415 %3A+Introduce+a+new+join+operator+to+support+minibatch -- Best regards, Roman Boyko e.: ro.v.bo...@gmail.com m.: +79059592443 telegram: @rboyko