Thanks Hang for the suggestion.
I think the configuration of TableSource is not closely related to SourceReader, so I prefer to introduce a independent configuration class TableSourceOptions in the flink-table-common module, similar to LookupOptions. For the second point, I suggest adding Java doc to the SupportsXXXPushDown interfaces, providing detailed information on these options that needs to be supported. I have made updates in the FLIP document. Please help check it again. Best, Jiabao > 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com> 写道: > > Thanks for the improvements, Jiabao. > > There are some details that I am not sure about. > 1. The new option `source.filter-push-down.enabled` will be added to which > class? I think it should be `SourceReaderOptions`. > 2. How are the connector developers able to know and follow the FLIP? Do we > need an abstract base class or provide a default method? > > Best, > Hang > > Jiabao Sun <jiabao....@xtransfer.cn.invalid> 于2023年10月30日周一 14:45写道: > >> Hi, all, >> >> Thanks for the lively discussion. >> >> Based on the discussion, I have made some adjustments to the FLIP document: >> >> 1. The name of the newly added option has been changed to >> "source.filter-push-down.enabled". >> 2. Considering compatibility with older versions, the newly added >> "source.filter-push-down.enabled" option needs to respect the optimizer's >> "table.optimizer.source.predicate-pushdown-enabled" option. >> But there is a consideration to remove the old option in Flink 2.0. >> 3. We can provide more options to disable other source abilities with side >> effects, such as “source.aggregate.enabled” and “source.projection.enabled" >> This is not urgent and can be continuously introduced. >> >> Looking forward to your feedback again. >> >> Best, >> Jiabao >> >> >>> 2023年10月29日 08:45,Becket Qin <becket....@gmail.com> 写道: >>> >>> Thanks for digging into the git history, Jark. I agree it makes sense to >>> deprecate this API in 2.0. >>> >>> Cheers, >>> >>> Jiangjie (Becket) Qin >>> >>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <imj...@gmail.com> wrote: >>> >>>> Hi Becket, >>>> >>>> I checked the history of " >>>> *table.optimizer.source.predicate-pushdown-enabled*", >>>> it seems it was introduced since the legacy FilterableTableSource >>>> interface >>>> which might be an experiential feature at that time. I don't see the >>>> necessity >>>> of this option at the moment. Maybe we can deprecate this option and >> drop >>>> it >>>> in Flink 2.0[1] if it is not necessary anymore. This may help to >>>> simplify this discussion. >>>> >>>> >>>> Best, >>>> Jark >>>> >>>> [1]: https://issues.apache.org/jira/browse/FLINK-32383 >>>> >>>> >>>> >>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <becket....@gmail.com> wrote: >>>> >>>>> Thanks for the proposal, Jiabao. My two cents below: >>>>> >>>>> 1. If I understand correctly, the motivation of the FLIP is mainly to >>>>> make predicate pushdown optional on SOME of the Sources. If so, >> intuitively >>>>> the configuration should be Source specific instead of general. >> Otherwise, >>>>> we will end up with general configurations that may not take effect for >>>>> some of the Source implementations. This violates the basic rule of a >>>>> configuration - it does what it says, regardless of the implementation. >>>>> While configuration standardization is usually a good thing, it should >> not >>>>> break the basic rules. >>>>> If we really want to have this general configuration, for the sources >>>>> this configuration does not apply, they should throw an exception to >> make >>>>> it clear that this configuration is not supported. However, that seems >> ugly. >>>>> >>>>> 2. I think the actual motivation of this FLIP is about "how a source >>>>> should implement predicate pushdown efficiently", not "whether >> predicate >>>>> pushdown should be applied to the source." For example, if a source >> wants >>>>> to avoid additional computing load in the external system, it can >> always >>>>> read the entire record and apply the predicates by itself. However, >> from >>>>> the Flink perspective, the predicate pushdown is applied, it is just >>>>> implemented differently by the source. So the design principle here is >> that >>>>> Flink only cares about whether a source supports predicate pushdown or >> not, >>>>> it does not care about the implementation efficiency / side effect of >> the >>>>> predicates pushdown. It is the Source implementation's responsibility >> to >>>>> ensure the predicates pushdown is implemented efficiently and does not >>>>> impose excessive pressure on the external system. And it is OK to have >>>>> additional configurations to achieve this goal. Obviously, such >>>>> configurations will be source specific in this case. >>>>> >>>>> 3. Regarding the existing configurations of >> *table.optimizer.source.predicate-pushdown-enabled. >>>>> *I am not sure why we need it. Supposedly, if a source implements a >>>>> SupportsXXXPushDown interface, the optimizer should push the >> corresponding >>>>> predicates to the Source. I am not sure in which case this >> configuration >>>>> would be used. Any ideas @Jark Wu <imj...@gmail.com>? >>>>> >>>>> Thanks, >>>>> >>>>> Jiangjie (Becket) Qin >>>>> >>>>> >>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun >>>>> <jiabao....@xtransfer.cn.invalid> wrote: >>>>> >>>>>> Thanks Jane for the detailed explanation. >>>>>> >>>>>> I think that for users, we should respect conventions over >>>>>> configurations. >>>>>> Conventions can be default values explicitly specified in >>>>>> configurations, or they can be behaviors that follow previous >> versions. >>>>>> If the same code has different behaviors in different versions, it >> would >>>>>> be a very bad thing. >>>>>> >>>>>> I agree that for regular users, it is not necessary to understand all >>>>>> the configurations related to Flink. >>>>>> By following conventions, they can have a good experience. >>>>>> >>>>>> Let's get back to the practical situation and consider it. >>>>>> >>>>>> Case 1: >>>>>> The user is not familiar with the purpose of the >>>>>> table.optimizer.source.predicate-pushdown-enabled configuration but >> follows >>>>>> the convention of allowing predicate pushdown to the source by >> default. >>>>>> Just understanding the source.predicate-pushdown-enabled configuration >>>>>> and performing fine-grained toggle control will work well. >>>>>> >>>>>> Case 2: >>>>>> The user understands the meaning of the >>>>>> table.optimizer.source.predicate-pushdown-enabled configuration and >> has set >>>>>> its value to false. >>>>>> We have reason to believe that the user understands the meaning of the >>>>>> predicate pushdown configuration and the intention is to disable >> predicate >>>>>> pushdown (rather than whether or not to allow it). >>>>>> The previous choice of globally disabling it is likely because it >>>>>> couldn't be disabled on individual sources. >>>>>> From this perspective, if we provide more fine-grained configuration >>>>>> support and provide detailed explanations of the configuration >> behaviors in >>>>>> the documentation, >>>>>> users can clearly understand the differences between these two >>>>>> configurations and use them correctly. >>>>>> >>>>>> Also, I don't agree that >>>>>> table.optimizer.source.predicate-pushdown-enabled = true and >>>>>> source.predicate-pushdown-enabled = false means that the local >>>>>> configuration overrides the global configuration. >>>>>> On the contrary, both configurations are functioning correctly. >>>>>> The optimizer allows predicate pushdown to all sources, but some >> sources >>>>>> can reject the filters pushed down by the optimizer. >>>>>> This is natural, just like different components at different levels >> are >>>>>> responsible for different tasks. >>>>>> >>>>>> The more serious issue is that if "source.predicate-pushdown-enabled" >>>>>> does not respect "table.optimizer.source.predicate-pushdown-enabled”, >>>>>> the "table.optimizer.source.predicate-pushdown-enabled" configuration >>>>>> will be invalidated. >>>>>> This means that regardless of whether >>>>>> "table.optimizer.source.predicate-pushdown-enabled" is set to true or >>>>>> false, it will have no effect. >>>>>> >>>>>> Best, >>>>>> Jiabao >>>>>> >>>>>> >>>>>>> 2023年10月25日 22:24,Jane Chan <qingyue....@gmail.com> 写道: >>>>>>> >>>>>>> Hi Jiabao, >>>>>>> >>>>>>> Thanks for the in-depth clarification. Here are my cents >>>>>>> >>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and >>>>>>>> "scan.filter-push-down.enabled" are configurations for different >>>>>>>> components(optimizer and source operator). >>>>>>>> >>>>>>> >>>>>>> We cannot assume that every user would be interested in understanding >>>>>> the >>>>>>> internal components of Flink, such as the optimizer or connectors, >> and >>>>>> the >>>>>>> specific configurations associated with each component. Instead, >> users >>>>>>> might be more concerned about knowing which configuration enables or >>>>>>> disables the filter push-down feature for all source connectors, and >>>>>> which >>>>>>> parameter provides the flexibility to override this behavior for a >>>>>> single >>>>>>> source if needed. >>>>>>> >>>>>>> So, from this perspective, I am inclined to divide these two >> parameters >>>>>>> based on the scope of their impact from the user's perspective (i.e. >>>>>>> global-level or operator-level), rather than categorizing them based >>>>>> on the >>>>>>> component hierarchy from a developer's point of view. Therefore, >> based >>>>>> on >>>>>>> this premise, it is intuitive and natural for users to >>>>>>> understand fine-grained configuration options can override global >>>>>>> configurations. >>>>>>> >>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't respect to >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the default >>>>>> value >>>>>>>> of "scan.filter-push-down.enabled" is defined as true, >>>>>>>> it means that just modifying >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as false will >>>>>> have no >>>>>>>> effect, and filter pushdown will still be performed. >>>>>>>> >>>>>>>> If we define the default value of "scan.filter-push-down.enabled" as >>>>>>>> false, it would introduce a difference in behavior compared to the >>>>>> previous >>>>>>>> version. >>>>>>>> >>>>>>> >>>>>>> <1>If I understand correctly, "scan.filter-push-down.enabled" is a >>>>>>> connector option, which means the only way to configure it is to >>>>>> explicitly >>>>>>> specify it in DDL (no matter whether disable or enable), and the SET >>>>>>> command is not applicable, so I think it's natural to still respect >>>>>> user's >>>>>>> specification here. Otherwise, users might be more confused about why >>>>>> the >>>>>>> DDL does not work as expected, and the reason is just because some >>>>>> other >>>>>>> "optimizer" configuration is set to a different value. >>>>>>> >>>>>>> <2> From the implementation side, I am inclined to keep the >> parameter's >>>>>>> priority consistent for all conditions. >>>>>>> >>>>>>> Let "global" denote >>>>>> "table.optimizer.source.predicate-pushdown-enabled", >>>>>>> and let "per-source" denote "scan.filter-push-down.enabled" for >>>>>> specific >>>>>>> source T, the following Truth table (based on the current design) >>>>>>> indicates the inconsistent behavior for "per-source override global". >>>>>>> >>>>>>> .------------.---------------.------------------- >>>>>>> ----.-------------------------------------. >>>>>>> | global | per-source | push-down for T | per-source override >> global >>>>>> | >>>>>>> >>>>>> >> :-----------+--------------+-----------------------+------------------------------------: >>>>>>> | true | false | false | Y >>>>>>> | >>>>>>> >>>>>> >> :-----------+--------------+-----------------------+------------------------------------: >>>>>>> | false | true | false | N >>>>>>> | >>>>>>> >>>>>> >> .------------.---------------.-----------------------.-------------------------------------. >>>>>>> >>>>>>> Best, >>>>>>> Jane >>>>>>> >>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <jiabao....@xtransfer.cn >>>>>> .invalid> >>>>>>> wrote: >>>>>>> >>>>>>>> Thanks Benchao for the feedback. >>>>>>>> >>>>>>>> I understand that the configuration of global parallelism and task >>>>>>>> parallelism is at different granularities but with the same >>>>>> configuration. >>>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and >>>>>>>> "scan.filter-push-down.enabled" are configurations for different >>>>>>>> components(optimizer and source operator). >>>>>>>> >>>>>>>> From a user's perspective, there are two scenarios: >>>>>>>> >>>>>>>> 1. Disabling all filter pushdown >>>>>>>> In this case, setting >>>>>> "table.optimizer.source.predicate-pushdown-enabled" >>>>>>>> to false is sufficient to meet the requirement. >>>>>>>> >>>>>>>> 2. Disabling filter pushdown for specific sources >>>>>>>> In this scenario, there is no need to adjust the value of >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled". >>>>>>>> Instead, the focus should be on the configuration of >>>>>>>> "scan.filter-push-down.enabled" to meet the requirement. >>>>>>>> In this case, users do not need to set >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false and >>>>>> manually >>>>>>>> enable filter pushdown for specific sources. >>>>>>>> >>>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't respect to >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the default >>>>>> value >>>>>>>> of "scan.filter-push-down.enabled" is defined as true, >>>>>>>> it means that just modifying >>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as false will >>>>>> have no >>>>>>>> effect, and filter pushdown will still be performed. >>>>>>>> >>>>>>>> If we define the default value of "scan.filter-push-down.enabled" as >>>>>>>> false, it would introduce a difference in behavior compared to the >>>>>> previous >>>>>>>> version. >>>>>>>> The same SQL query that could successfully push down filters in the >>>>>> old >>>>>>>> version but would no longer do so after the upgrade. >>>>>>>> >>>>>>>> Best, >>>>>>>> Jiabao >>>>>>>> >>>>>>>> >>>>>>>>> 2023年10月25日 17:10,Benchao Li <libenc...@apache.org> 写道: >>>>>>>>> >>>>>>>>> Thanks Jiabao for the detailed explanations, that helps a lot, I >>>>>>>>> understand your rationale now. >>>>>>>>> >>>>>>>>> Correct me if I'm wrong. Your perspective is from "developer", >> which >>>>>>>>> means there is an optimizer and connector component, and if we want >>>>>> to >>>>>>>>> enable this feature (pushing filters down into connectors), you >> must >>>>>>>>> enable it firstly in optimizer, and only then connector has the >>>>>> chance >>>>>>>>> to decide to use it or not. >>>>>>>>> >>>>>>>>> My perspective is from "user" that (Why a user should care about >> the >>>>>>>>> difference of optimizer/connector) , this is a feature, and has two >>>>>>>>> way to control it, one way is to config it job-level, the other one >>>>>> is >>>>>>>>> in table properties. What a user expects is that they can control a >>>>>>>>> feature in a tiered way, that setting it per job, and then >>>>>>>>> fine-grained tune it per table. >>>>>>>>> >>>>>>>>> This is some kind of similar to other concepts, such as >> parallelism, >>>>>>>>> users can set a job level default parallelism, and then >> fine-grained >>>>>>>>> tune it per operator. There may be more such debate in the future >>>>>>>>> e.g., we can have a job level config about adding key-by before >>>>>> lookup >>>>>>>>> join, and also a hint/table property way to fine-grained control it >>>>>>>>> per lookup operator. Hence we'd better find a unified way for all >>>>>>>>> those similar kind of features. >>>>>>>>> >>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> 于2023年10月25日周三 >> 15:27写道: >>>>>>>>>> >>>>>>>>>> Thanks Jane for further explanation. >>>>>>>>>> >>>>>>>>>> These two configurations correspond to different levels. >>>>>>>> "scan.filter-push-down.enabled" does not make >>>>>>>> "table.optimizer.source.predicate" invalid. >>>>>>>>>> The planner will still push down predicates to all sources. >>>>>>>>>> Whether filter pushdown is allowed or not is determined by the >>>>>> specific >>>>>>>> source's "scan.filter-push-down.enabled" configuration. >>>>>>>>>> >>>>>>>>>> However, "table.optimizer.source.predicate" does directly affect >>>>>>>> "scan.filter-push-down.enabled”. >>>>>>>>>> When the planner disables predicate pushdown, the source-level >>>>>> filter >>>>>>>> pushdown will also not be executed, even if the source allows filter >>>>>>>> pushdown. >>>>>>>>>> >>>>>>>>>> Whatever, in point 1 and 2, our expectation is consistent. >>>>>>>>>> For the 3rd point, I still think that the planner-level >>>>>> configuration >>>>>>>> takes precedence over the source-level configuration. >>>>>>>>>> It may seem counterintuitive when we globally disable predicate >>>>>>>> pushdown but allow filter pushdown at the source level. >>>>>>>>>> >>>>>>>>>> Best, >>>>>>>>>> Jiabao >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> 2023年10月25日 14:35,Jane Chan <qingyue....@gmail.com> 写道: >>>>>>>>>>> >>>>>>>>>>> Hi Jiabao, >>>>>>>>>>> >>>>>>>>>>> Thanks for clarifying this. While by >> "scan.filter-push-down.enabled >>>>>>>> takes a >>>>>>>>>>> higher priority" I meant that this value should be respected >>>>>> whenever >>>>>>>> it is >>>>>>>>>>> set explicitly. >>>>>>>>>>> >>>>>>>>>>> The conclusion that >>>>>>>>>>> >>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and >>>>>>>>>>>> "scan.filter-push-down.enabled" = "false" >>>>>>>>>>>> Allow the planner to perform predicate pushdown, but individual >>>>>>>> sources do >>>>>>>>>>>> not enable filter pushdown. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> This indicates that the option "scan.filter-push-down.enabled = >>>>>> false" >>>>>>>> for >>>>>>>>>>> an individual source connector does indeed override the >>>>>> global-level >>>>>>>>>>> planner settings to make a difference. And thus "has a higher >>>>>>>> priority". >>>>>>>>>>> >>>>>>>>>>> While for >>>>>>>>>>> >>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false" >>>>>>>>>>>> Predicate pushdown is not allowed for the planner. >>>>>>>>>>>> Regardless of the value of the "scan.filter-push-down.enabled" >>>>>>>>>>>> configuration, filter pushdown is disabled. >>>>>>>>>>>> In this scenario, the behavior remains consistent with the old >>>>>>>> version as >>>>>>>>>>>> well. >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> I still think "scan.filter-push-down.enabled" should also be >>>>>> respected >>>>>>>> if >>>>>>>>>>> it is enabled for individual connectors. WDYT? >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Jane >>>>>>>>>>> >>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun < >>>>>> jiabao....@xtransfer.cn >>>>>>>> .invalid> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Thanks Benchao for the feedback. >>>>>>>>>>>> >>>>>>>>>>>> For the current proposal, we recommend keeping the default value >>>>>> of >>>>>>>>>>>> "table.optimizer.source.predicate" as true, >>>>>>>>>>>> and setting the the default value of newly introduced option >>>>>>>>>>>> "scan.filter-push-down.enabled" to true as well. >>>>>>>>>>>> >>>>>>>>>>>> The main purpose of doing this is to maintain consistency with >>>>>>>> previous >>>>>>>>>>>> versions, as whether to perform >>>>>>>>>>>> filter pushdown in the old version solely depends on the >>>>>>>>>>>> "table.optimizer.source.predicate" option. >>>>>>>>>>>> That means by default, as long as a TableSource implements the >>>>>>>>>>>> SupportsFilterPushDown interface, filter pushdown is allowed. >>>>>>>>>>>> And it seems that we don't have much benefit in changing the >>>>>> default >>>>>>>> value >>>>>>>>>>>> of "table.optimizer.source.predicate" to false. >>>>>>>>>>>> >>>>>>>>>>>> Regarding the priority of these two configurations, I believe >> that >>>>>>>>>>>> "table.optimizer.source.predicate" >>>>>>>>>>>> takes precedence over "scan.filter-push-down.enabled" and it >>>>>> exhibits >>>>>>>> the >>>>>>>>>>>> following behavior. >>>>>>>>>>>> >>>>>>>>>>>> 1. "table.optimizer.source.predicate" = "true" and >>>>>>>>>>>> "scan.filter-push-down.enabled" = "true" >>>>>>>>>>>> This is the default behavior, allowing filter pushdown for >>>>>> sources. >>>>>>>>>>>> >>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and >>>>>>>>>>>> "scan.filter-push-down.enabled" = "false" >>>>>>>>>>>> Allow the planner to perform predicate pushdown, but individual >>>>>>>> sources do >>>>>>>>>>>> not enable filter pushdown. >>>>>>>>>>>> >>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false" >>>>>>>>>>>> Predicate pushdown is not allowed for the planner. >>>>>>>>>>>> Regardless of the value of the "scan.filter-push-down.enabled" >>>>>>>>>>>> configuration, filter pushdown is disabled. >>>>>>>>>>>> In this scenario, the behavior remains consistent with the old >>>>>>>> version as >>>>>>>>>>>> well. >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> From an implementation perspective, setting the priority of >>>>>>>>>>>> "scan.filter-push-down.enabled" higher than >>>>>>>>>>>> "table.optimizer.source.predicate" is difficult to achieve now. >>>>>>>>>>>> Because the PushFilterIntoSourceScanRuleBase at the planner >> level >>>>>>>> takes >>>>>>>>>>>> precedence over the source-level FilterPushDownSpec. >>>>>>>>>>>> Only when the PushFilterIntoSourceScanRuleBase is enabled, will >>>>>> the >>>>>>>>>>>> Source-level filter pushdown be performed. >>>>>>>>>>>> >>>>>>>>>>>> Additionally, in my opinion, there doesn't seem to be much >>>>>> benefit in >>>>>>>>>>>> setting a higher priority for "scan.filter-push-down.enabled". >>>>>>>>>>>> It may instead affect compatibility and increase implementation >>>>>>>> complexity. >>>>>>>>>>>> >>>>>>>>>>>> WDYT? >>>>>>>>>>>> >>>>>>>>>>>> Best, >>>>>>>>>>>> Jiabao >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> 2023年10月25日 11:56,Benchao Li <libenc...@apache.org> 写道: >>>>>>>>>>>>> >>>>>>>>>>>>> I agree with Jane that fine-grained configurations should have >>>>>> higher >>>>>>>>>>>>> priority than job level configurations. >>>>>>>>>>>>> >>>>>>>>>>>>> For current proposal, we can achieve that: >>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "true" to enable by >>>>>>>>>>>>> default, and set ""scan.filter-push-down.enabled" = "false" to >>>>>>>> disable >>>>>>>>>>>>> it per table source >>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "false" to disable >> by >>>>>>>>>>>>> default, and set ""scan.filter-push-down.enabled" = "true" to >>>>>> enable >>>>>>>>>>>>> it per table source >>>>>>>>>>>>> >>>>>>>>>>>>> Jane Chan <qingyue....@gmail.com> 于2023年10月24日周二 23:55写道: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I believe that the configuration >>>>>> "table.optimizer.source.predicate" >>>>>>>>>>>> has a >>>>>>>>>>>>>>> higher priority at the planner level than the configuration >> at >>>>>> the >>>>>>>>>>>> source >>>>>>>>>>>>>>> level, >>>>>>>>>>>>>>> and it seems easy to implement now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> Correct me if I'm wrong, but I think the fine-grained >>>>>> configuration >>>>>>>>>>>>>> "scan.filter-push-down.enabled" should have a higher priority >>>>>>>> because >>>>>>>>>>>> the >>>>>>>>>>>>>> default value of "table.optimizer.source.predicate" is true. >> As >>>>>> a >>>>>>>>>>>> result, >>>>>>>>>>>>>> turning off filter push-down for a specific source will not >> take >>>>>>>> effect >>>>>>>>>>>>>> unless the default value of "table.optimizer.source.predicate" >>>>>> is >>>>>>>>>>>> changed >>>>>>>>>>>>>> to false, or, alternatively, let users manually set >>>>>>>>>>>>>> "table.optimizer.source.predicate" to false first and then >>>>>>>> selectively >>>>>>>>>>>>>> enable filter push-down for the desired sources, which is less >>>>>>>>>>>> intuitive. >>>>>>>>>>>>>> WDYT? >>>>>>>>>>>>>> >>>>>>>>>>>>>> Best, >>>>>>>>>>>>>> Jane >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun < >>>>>> jiabao....@xtransfer.cn >>>>>>>>>>>> .invalid> >>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks Jane, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> I believe that the configuration >>>>>> "table.optimizer.source.predicate" >>>>>>>>>>>> has a >>>>>>>>>>>>>>> higher priority at the planner level than the configuration >> at >>>>>> the >>>>>>>>>>>> source >>>>>>>>>>>>>>> level, >>>>>>>>>>>>>>> and it seems easy to implement now. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Jiabao >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2023年10月24日 17:36,Jane Chan <qingyue....@gmail.com> 写道: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Hi Jiabao, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks for driving this discussion. I have a small question >>>>>> that >>>>>>>> will >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take precedence over >>>>>>>>>>>>>>>> "table.optimizer.source.predicate" when the two parameters >>>>>> might >>>>>>>>>>>> conflict >>>>>>>>>>>>>>>> each other? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>> Jane >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun < >>>>>>>> jiabao....@xtransfer.cn >>>>>>>>>>>>>>> .invalid> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Thanks Jark, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> If we only add configuration without adding the >>>>>>>> enableFilterPushDown >>>>>>>>>>>>>>>>> method in the SupportsFilterPushDown interface, >>>>>>>>>>>>>>>>> each connector would have to handle the same logic in the >>>>>>>>>>>> applyFilters >>>>>>>>>>>>>>>>> method to determine whether filter pushdown is needed. >>>>>>>>>>>>>>>>> This would increase complexity and violate the original >>>>>> behavior >>>>>>>> of >>>>>>>>>>>> the >>>>>>>>>>>>>>>>> applyFilters method. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On the contrary, we only need to pass the configuration >>>>>>>> parameter in >>>>>>>>>>>> the >>>>>>>>>>>>>>>>> newly added enableFilterPushDown method >>>>>>>>>>>>>>>>> to decide whether to perform predicate pushdown. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I think this approach would be clearer and simpler. >>>>>>>>>>>>>>>>> WDYT? >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>> Jiabao >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2023年10月24日 16:58,Jark Wu <imj...@gmail.com> 写道: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Hi JIabao, >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I think the current interface can already satisfy your >>>>>>>> requirements. >>>>>>>>>>>>>>>>>> The connector can reject all the filters by returning the >>>>>> input >>>>>>>>>>>> filters >>>>>>>>>>>>>>>>>> as `Result#remainingFilters`. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> So maybe we don't need to introduce a new method to >> disable >>>>>>>>>>>>>>>>>> pushdown, but just introduce an option for the specific >>>>>>>> connector. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>> Jark >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu < >> xbjt...@gmail.com >>>>>>> >>>>>>>> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this discussion. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Could you add a section to explain the difference between >>>>>>>> proposed >>>>>>>>>>>>>>>>>>> connector level config `scan.filter-push-down.enabled` >> and >>>>>>>> existing >>>>>>>>>>>>>>>>> query >>>>>>>>>>>>>>>>>>> level config >>>>>>>> `table.optimizer.source.predicate-pushdown-enabled` ? >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>> Leonard >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun <jiabao....@xtransfer.cn >>>>>>>> .INVALID> >>>>>>>>>>>> 写道: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Hi Devs, >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I would like to start a discussion on FLIP-377: support >>>>>>>>>>>> configuration >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> disable filter pushdown for Table/SQL Sources[1]. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not expose fine-grained >>>>>>>> control >>>>>>>>>>>> for >>>>>>>>>>>>>>>>>>> users to enable or disable filter pushdown. >>>>>>>>>>>>>>>>>>>> However, filter pushdown has some side effects, such as >>>>>>>> additional >>>>>>>>>>>>>>>>>>> computational pressure on external systems. >>>>>>>>>>>>>>>>>>>> Moreover, Improper queries can lead to issues such as >> full >>>>>>>> table >>>>>>>>>>>>>>> scans, >>>>>>>>>>>>>>>>>>> which in turn can impact the stability of external >> systems. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with two sources: Kafka >> and a >>>>>>>>>>>> database. >>>>>>>>>>>>>>>>>>>> The database is sensitive to pressure, and we want to >>>>>>>> configure >>>>>>>>>>>> it to >>>>>>>>>>>>>>>>>>> not perform filter pushdown to the database source. >>>>>>>>>>>>>>>>>>>> However, we still want to perform filter pushdown to the >>>>>> Kafka >>>>>>>>>>>> source >>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>> decrease network IO. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I propose to support configuration to disable filter >> push >>>>>>>> down for >>>>>>>>>>>>>>>>>>> Table/SQL sources to let user decide whether to perform >>>>>> filter >>>>>>>>>>>>>>> pushdown. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Looking forward to your feedback. >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> [1] >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>> >>>>>> >> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>>>>>>> Jiabao >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> -- >>>>>>>>>>>>> >>>>>>>>>>>>> Best, >>>>>>>>>>>>> Benchao Li >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> -- >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Benchao Li >>>>>>>> >>>>>>>> >>>>>> >>>>>> >> >>