Hi Peter,
The performance drops probably be due to de/serialization.
When tasks are chained, records are simply forwarded as Java objects via
method calls.
When a task chain in broken into multiple operators, the records (Java
objects) are serialized by the sending task, possibly shipped over the
Hi all
We have a pipeline (runs on YARN, Flink v1.7.1) which consumes a union of
Kafka and
HDFS sources. We remarked that the throughput is 10 times higher if only
one of these sources is consumed. While trying to identify the problem I
implemented a no-op source which was unioned with one of the