Thanks Guowei! I'll check it out.

Best,
Zhanghao Chen
________________________________
From: Guowei Ma <guowei....@gmail.com>
Sent: Wednesday, April 6, 2022 16:01
To: Zhanghao Chen <zhanghao.c...@outlook.com>
Cc: user@flink.apache.org <user@flink.apache.org>
Subject: Re: Why first op after union cannot be chained?

Hi Zhanghao,

AFAIK, you might want to look at the `StreamingJobGraphGenerator` rather than the `JobGraphGenerator`, which is only used by the old Flink stream SQL stack. According to the comment in `StreamingJobGraphGenerator::isChainableInput`, a union operator does not currently support chaining.

Best,
Guowei

On Wed, Apr 6, 2022 at 12:11 AM Zhanghao Chen <zhanghao.c...@outlook.com> wrote:

Dear all,

I was recently investigating why the chaining behavior of a Flink SQL job containing union ops is a bit surprising. The SQL, simplified to the extreme, is as below:

    CREATE TABLE datagen_source (word VARCHAR)
    WITH ('connector' = 'datagen', 'rows-per-second' = '5');

    CREATE TABLE blackhole_sink (word VARCHAR)
    WITH ('connector' = 'blackhole');

    INSERT INTO blackhole_sink
    SELECT word FROM (
        SELECT word FROM datagen_source WHERE word = '1'
        UNION ALL
        SELECT word FROM datagen_source WHERE word = '1'
    );

With all the operators having the same parallelism, I expected all the ops to be chained, but it turns out that the sink is not chained. I found the following comment in the code that checks chaining eligibility in JobGraphGenerator::createSingleInputVertex: "first op after union is stand-alone, because union is merged". It could be relevant, but I'm not sure what it means. Could anyone help me understand it?

Best,
Zhanghao Chen
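
For illustration only, here is a toy model in Python of the kind of check Guowei points to. It is not Flink code. The `Edge` class and the `is_chainable_input` function are hypothetical stand-ins, and the sketch assumes that a union is merged into the stream graph, so the operator after it receives several input edges at the same input position. Under that assumption, an edge is rejected for chaining whenever another edge feeds the same input position of the same downstream operator. All other chaining conditions (parallelism, slot sharing group, partitioner) are assumed to be satisfied already.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Edge:
    """Hypothetical stand-in for an edge between two operators."""
    source: str
    target: str
    type_number: int = 0  # input position on the target operator


def is_chainable_input(edge: Edge, in_edges_of_target: List[Edge]) -> bool:
    """Toy version of the union check only.

    Assumption: a union shows up as several edges entering the target
    at the same input position, and such an edge cannot be chained.
    """
    for other in in_edges_of_target:
        if other is not edge and other.type_number == edge.type_number:
            return False
    return True


# The job from the thread: two filtered branches are unioned into one sink.
calc_a = Edge("calc_a", "sink")
calc_b = Edge("calc_b", "sink")
sink_inputs = [calc_a, calc_b]

# The same sink fed by a single branch, i.e. no union.
single = Edge("calc", "sink")

print(is_chainable_input(calc_a, sink_inputs))  # False
print(is_chainable_input(single, [single]))     # True
```

In this model the sink in the query above has two incoming edges at input position 0, one per UNION ALL branch, so neither branch can be chained with it even though the parallelism matches.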