Dear all, I was recently investigating why the chaining behavior of a Flink SQL job containing union ops is a bit surprising. The SQL, simplified to the extreme, is as below:
CREATE TABLE datagen_source (word VARCHAR) WITH ('connector' = 'datagen', 'rows-per-second' = '5'); CREATE TABLE blackhole_sink (word VARCHAR) WITH ('connector' = 'blackhole'); INSERT INTO blackhole_sink SELECT word FROM ( SELECT word FROM datagen_source WHERE word = '1' UNION ALL SELECT word FROM datagen_source WHERE word = '1' ) With all the operators having the same parallelism, I thought all the ops should be chained, but it turns out that the sink is not chained. I found the following comment in the code piece for checking the eligibility of chaining in JobGraphGenerator::createSingleInputVertex: "first op after union is stand-alone, because union is merged" that could be relevant, but I'm not sure what it means. Could anyone enlighten me how to understand this? Best, Zhanghao Chen