Hi Jiabao, Thanks for clarifying this. While by "scan.filter-push-down.enabled takes a higher priority" I meant that this value should be respected whenever it is set explicitly.
The conclusion that 2. "table.optimizer.source.predicate" = "true" and > "scan.filter-push-down.enabled" = "false" > Allow the planner to perform predicate pushdown, but individual sources do > not enable filter pushdown. > This indicates that the option "scan.filter-push-down.enabled = false" for an individual source connector does indeed override the global-level planner settings to make a difference. And thus "has a higher priority". While for 3. "table.optimizer.source.predicate" = "false" > Predicate pushdown is not allowed for the planner. > Regardless of the value of the "scan.filter-push-down.enabled" > configuration, filter pushdown is disabled. > In this scenario, the behavior remains consistent with the old version as > well. > I still think "scan.filter-push-down.enabled" should also be respected if it is enabled for individual connectors. WDYT? Best, Jane On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote: > Thanks Benchao for the feedback. > > For the current proposal, we recommend keeping the default value of > "table.optimizer.source.predicate" as true, > and setting the the default value of newly introduced option > "scan.filter-push-down.enabled" to true as well. > > The main purpose of doing this is to maintain consistency with previous > versions, as whether to perform > filter pushdown in the old version solely depends on the > "table.optimizer.source.predicate" option. > That means by default, as long as a TableSource implements the > SupportsFilterPushDown interface, filter pushdown is allowed. > And it seems that we don't have much benefit in changing the default value > of "table.optimizer.source.predicate" to false. > > Regarding the priority of these two configurations, I believe that > "table.optimizer.source.predicate" > takes precedence over "scan.filter-push-down.enabled" and it exhibits the > following behavior. > > 1. "table.optimizer.source.predicate" = "true" and > "scan.filter-push-down.enabled" = "true" > This is the default behavior, allowing filter pushdown for sources. > > 2. "table.optimizer.source.predicate" = "true" and > "scan.filter-push-down.enabled" = "false" > Allow the planner to perform predicate pushdown, but individual sources do > not enable filter pushdown. > > 3. "table.optimizer.source.predicate" = "false" > Predicate pushdown is not allowed for the planner. > Regardless of the value of the "scan.filter-push-down.enabled" > configuration, filter pushdown is disabled. > In this scenario, the behavior remains consistent with the old version as > well. > > > From an implementation perspective, setting the priority of > "scan.filter-push-down.enabled" higher than > "table.optimizer.source.predicate" is difficult to achieve now. > Because the PushFilterIntoSourceScanRuleBase at the planner level takes > precedence over the source-level FilterPushDownSpec. > Only when the PushFilterIntoSourceScanRuleBase is enabled, will the > Source-level filter pushdown be performed. > > Additionally, in my opinion, there doesn't seem to be much benefit in > setting a higher priority for "scan.filter-push-down.enabled". > It may instead affect compatibility and increase implementation complexity. > > WDYT? > > Best, > Jiabao > > > > 2023年10月25日 11:56,Benchao Li <libenc...@apache.org> 写道: > > > > I agree with Jane that fine-grained configurations should have higher > > priority than job level configurations. > > > > For current proposal, we can achieve that: > > - Set "table.optimizer.source.predicate" = "true" to enable by > > default, and set ""scan.filter-push-down.enabled" = "false" to disable > > it per table source > > - Set "table.optimizer.source.predicate" = "false" to disable by > > default, and set ""scan.filter-push-down.enabled" = "true" to enable > > it per table source > > > > Jane Chan <qingyue....@gmail.com> 于2023年10月24日周二 23:55写道: > >> > >>> > >>> I believe that the configuration "table.optimizer.source.predicate" > has a > >>> higher priority at the planner level than the configuration at the > source > >>> level, > >>> and it seems easy to implement now. > >>> > >> > >> Correct me if I'm wrong, but I think the fine-grained configuration > >> "scan.filter-push-down.enabled" should have a higher priority because > the > >> default value of "table.optimizer.source.predicate" is true. As a > result, > >> turning off filter push-down for a specific source will not take effect > >> unless the default value of "table.optimizer.source.predicate" is > changed > >> to false, or, alternatively, let users manually set > >> "table.optimizer.source.predicate" to false first and then selectively > >> enable filter push-down for the desired sources, which is less > intuitive. > >> WDYT? > >> > >> Best, > >> Jane > >> > >> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun <jiabao....@xtransfer.cn > .invalid> > >> wrote: > >> > >>> Thanks Jane, > >>> > >>> I believe that the configuration "table.optimizer.source.predicate" > has a > >>> higher priority at the planner level than the configuration at the > source > >>> level, > >>> and it seems easy to implement now. > >>> > >>> Best, > >>> Jiabao > >>> > >>> > >>>> 2023年10月24日 17:36,Jane Chan <qingyue....@gmail.com> 写道: > >>>> > >>>> Hi Jiabao, > >>>> > >>>> Thanks for driving this discussion. I have a small question that will > >>>> "scan.filter-push-down.enabled" take precedence over > >>>> "table.optimizer.source.predicate" when the two parameters might > conflict > >>>> each other? > >>>> > >>>> Best, > >>>> Jane > >>>> > >>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun <jiabao....@xtransfer.cn > >>> .invalid> > >>>> wrote: > >>>> > >>>>> Thanks Jark, > >>>>> > >>>>> If we only add configuration without adding the enableFilterPushDown > >>>>> method in the SupportsFilterPushDown interface, > >>>>> each connector would have to handle the same logic in the > applyFilters > >>>>> method to determine whether filter pushdown is needed. > >>>>> This would increase complexity and violate the original behavior of > the > >>>>> applyFilters method. > >>>>> > >>>>> On the contrary, we only need to pass the configuration parameter in > the > >>>>> newly added enableFilterPushDown method > >>>>> to decide whether to perform predicate pushdown. > >>>>> > >>>>> I think this approach would be clearer and simpler. > >>>>> WDYT? > >>>>> > >>>>> Best, > >>>>> Jiabao > >>>>> > >>>>> > >>>>>> 2023年10月24日 16:58,Jark Wu <imj...@gmail.com> 写道: > >>>>>> > >>>>>> Hi JIabao, > >>>>>> > >>>>>> I think the current interface can already satisfy your requirements. > >>>>>> The connector can reject all the filters by returning the input > filters > >>>>>> as `Result#remainingFilters`. > >>>>>> > >>>>>> So maybe we don't need to introduce a new method to disable > >>>>>> pushdown, but just introduce an option for the specific connector. > >>>>>> > >>>>>> Best, > >>>>>> Jark > >>>>>> > >>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu <xbjt...@gmail.com> wrote: > >>>>>> > >>>>>>> Thanks @Jiabao for kicking off this discussion. > >>>>>>> > >>>>>>> Could you add a section to explain the difference between proposed > >>>>>>> connector level config `scan.filter-push-down.enabled` and existing > >>>>> query > >>>>>>> level config `table.optimizer.source.predicate-pushdown-enabled` ? > >>>>>>> > >>>>>>> Best, > >>>>>>> Leonard > >>>>>>> > >>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> > 写道: > >>>>>>>> > >>>>>>>> Hi Devs, > >>>>>>>> > >>>>>>>> I would like to start a discussion on FLIP-377: support > configuration > >>>>> to > >>>>>>> disable filter pushdown for Table/SQL Sources[1]. > >>>>>>>> > >>>>>>>> Currently, Flink Table/SQL does not expose fine-grained control > for > >>>>>>> users to enable or disable filter pushdown. > >>>>>>>> However, filter pushdown has some side effects, such as additional > >>>>>>> computational pressure on external systems. > >>>>>>>> Moreover, Improper queries can lead to issues such as full table > >>> scans, > >>>>>>> which in turn can impact the stability of external systems. > >>>>>>>> > >>>>>>>> Suppose we have an SQL query with two sources: Kafka and a > database. > >>>>>>>> The database is sensitive to pressure, and we want to configure > it to > >>>>>>> not perform filter pushdown to the database source. > >>>>>>>> However, we still want to perform filter pushdown to the Kafka > source > >>>>> to > >>>>>>> decrease network IO. > >>>>>>>> > >>>>>>>> I propose to support configuration to disable filter push down for > >>>>>>> Table/SQL sources to let user decide whether to perform filter > >>> pushdown. > >>>>>>>> > >>>>>>>> Looking forward to your feedback. > >>>>>>>> > >>>>>>>> [1] > >>>>>>> > >>>>> > >>> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768 > >>>>>>>> > >>>>>>>> Best, > >>>>>>>> Jiabao > >>>>>>> > >>>>>>> > >>>>> > >>>>> > >>> > >>> > > > > > > > > -- > > > > Best, > > Benchao Li > >