Hi Becket,

Actually, for FileSystemSource, filter pushdown is not always desired: only ORC file formats support filter pushdown.
We can disable predicate pushdown for FileSystemSource by setting
'table.optimizer.source.predicate-pushdown-enabled' to false.
I think we can also disable filter pushdown at a more granular level
through fine-grained configuration.

Best,
Jiabao

> On Oct 31, 2023, at 16:50, Becket Qin <becket....@gmail.com> wrote:
>
> Hi Jiabao,
>
> Thanks for the explanation. Maybe it's easier to explain with an example.
>
> Let's take FileSystemTableSource as an example. Currently it implements the
> SupportsFilterPushDown interface. With your proposal, does it have to
> support `source.filter-push-down.enabled` as well? But this configuration
> does not quite make sense for the FileSystemTableSource because filter
> pushdown is always desired. However, because this configuration is a part
> of the SupportsFilterPushDown interface (which sounds confusing to begin
> with), the FileSystemTableSource can only do one of the following:
>
> 1. Ignore the user configuration to always apply the pushed down filters -
> this is an apparent anti-pattern because a configuration should always do
> what it says.
> 2. Throw an exception telling users that this configuration is not
> applicable to the FileSystemTableSource.
> 3. Implement this configuration to push back the pushed down filters, even
> though this is never desired.
>
> All of the above options look awkward. I am curious what your solution is
> here?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
> wrote:
>
>> Thanks Becket for the further explanation.
>>
>> Perhaps I didn't explain it clearly.
>>
>> 1. If a source does not implement the SupportsFilterPushDown interface,
>> the newly added configurations do not need to be added to either the
>> requiredOptions or optionalOptions.
>> Similar to LookupOptions, if a source does not implement
>> LookupTableSource, there is no need to add LookupOptions to either
>> requiredOptions or optionalOptions.
>>
>> 2.
"And these configs are specific to those sources, instead of common >> configs." >> The newly introduced configurations define standardized names and default >> values. >> They still belong to the configuration at the individual source level. >> The purpose is to avoid scattered configuration items when different >> sources implement the same logic. >> Whether a source should accept these configurations is determined by the >> source's Factory. >> >> Best, >> Jiabao >> >> >>> 2023年10月31日 13:47,Becket Qin <becket....@gmail.com> 写道: >>> >>> Hi Jiabao, >>> >>> Please see the replies inline. >>> >>> Introducing common configurations does not mean that all sources must >>>> accept these configuration options. >>>> The configuration options supported by a source are determined by the >>>> requiredOptions and optionalOptions in the Factory interface. >>> >>> This is not true. Both required and optional options are SUPPORTED. That >>> means they are implemented and if one specifies an optional config it >> will >>> still take effect. An OptionalConfig is "Optional" because this >>> configuration has a default value. Hence, it is OK that users do not >>> specify their own value. In another word, it is "optional" for the end >>> users to set the config, but the implementation and support for that >> config >>> is NOT optional. In case a source does not support a common config, an >>> exception must be thrown when the config is provided by the end users. >>> However, the config we are talking about in this FLIP is a common config >>> optional to implement, meaning that sometimes the claimed behavior won't >> be >>> there even if users specify that config. >>> >>> Similar to sources that do not implement the LookupTableSource interface, >>>> sources that do not implement the SupportsFilterPushDown interface also >> do >>>> not need to accept newly introduced options. >>> >>> First of all, filter pushdown is a behavior of the query optimizer, not >> the >>> behavior of Sources. 
The Sources tells the optimizer that it has the >>> ability to accept pushed down filters by implementing the >>> SupportsFilterPushDown interface. And this is the only contract between >> the >>> Source and Optimizer regarding whether filters should be pushed down. As >>> long as a specific source implements this decorative interface, filter >>> pushdown should always take place, i.e. >>> *SupportsFilterPushDown.applyFilters()* will be called. There should be >> no >>> other config to disable that call. However, Sources can decide how to >>> behave based on their own configurations after *applyFilters()* is >> called. >>> And these configs are specific to those sources, instead of common >> configs. >>> Please see the examples I mentioned in the previous email. >>> >>> Thanks, >>> >>> Jiangjie (Becket) Qin >>> >>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun <jiabao....@xtransfer.cn >> .invalid> >>> wrote: >>> >>>> Hi Becket, >>>> >>>> Sorry, there was a typo in the second point. Let me correct it: >>>> >>>> Introducing common configurations does not mean that all sources must >>>> accept these configuration options. >>>> The configuration options supported by a source are determined by the >>>> requiredOptions and optionalOptions in the Factory interface. >>>> >>>> Similar to sources that do not implement the LookupTableSource >> interface, >>>> sources that do not implement the SupportsFilterPushDown interface also >> do >>>> not need to accept newly introduced options. >>>> >>>> Best, >>>> Jiabao >>>> >>>> >>>>> 2023年10月31日 10:13,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道: >>>>> >>>>> Thanks Becket for the feedback. >>>>> >>>>> 1. Currently, the SupportsFilterPushDown#applyFilters method returns a >>>> result that includes acceptedFilters and remainingFilters. The source >> can >>>> decide to push down some filters or not accept any of them. >>>>> 2. 
Introducing common configuration options does not mean that a source >>>> that supports the SupportsFilterPushDown capability must accept this >>>> configuration. Similar to LookupOptions, only sources that implement the >>>> LookupTableSource interface are necessary to accept these configuration >>>> options. >>>>> >>>>> Best, >>>>> Jiabao >>>>> >>>>> >>>>>> 2023年10月31日 07:49,Becket Qin <becket....@gmail.com> 写道: >>>>>> >>>>>> Hi Jiabao and Ruanhang, >>>>>> >>>>>> Adding a configuration of source.filter-push-down.enabled as a common >>>>>> source configuration seems problematic. >>>>>> 1. The config name is misleading. filter pushdown should only be >>>> determined >>>>>> by whether the SupportsFilterPushdown interface is implemented or not. >>>>>> 2. The behavior of this configuration is only applicable to some >> source >>>>>> implementations. Why is it a common configuration? >>>>>> >>>>>> Here's my suggestion for design principles: >>>>>> 1. Only add source impl specific configuration to corresponding >> sources. >>>>>> 2. The configuration name should not overrule existing common >> contracts. >>>>>> >>>>>> For example, in the case of MySql source. There are several options: >>>>>> 1. Have a configuration of `*mysql.avoid.remote.full.table.scan`*. If >>>> this >>>>>> configuration is set, and a filter pushdown does not hit an index, the >>>>>> MySql source impl would not further pushdown the filter to MySql >>>> servers. >>>>>> Note that this assumes the MySql source can retrieve the index >>>> information >>>>>> from the MySql servers. >>>>>> 2. If the MySql index information is not available to the MySql >> source, >>>> the >>>>>> configuration could be something like >>>> *`mysql.pushback.pushed.down.filters`*. >>>>>> Once set to true, MySql source would just add all the filters to the >>>>>> RemainingFilters in the Result returned by >>>>>> *SupportsFilterPushdown.applyFilters().* >>>>>> 3. 
An alternative to option 2 is to have a ` >>>>>> *mysql.apply.predicates.after.scan*`. When it is set to true, MySql >>>> source >>>>>> will not push the filter down to the MySql servers, but apply the >>>> filters >>>>>> inside the MySql source itself. >>>>>> >>>>>> As you may see, the above configurations do not disable filter >> pushdown >>>>>> itself. They just allow various implementations of filter pushdown. >> And >>>> the >>>>>> configuration name does not give any illusion that filter pushdown is >>>>>> disabled. >>>>>> >>>>>> Thanks, >>>>>> >>>>>> Jiangjie (Becket) Qin >>>>>> >>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun <jiabao....@xtransfer.cn >>>> .invalid> >>>>>> wrote: >>>>>> >>>>>>> Thanks Hang for the suggestion. >>>>>>> >>>>>>> >>>>>>> I think the configuration of TableSource is not closely related to >>>>>>> SourceReader, >>>>>>> so I prefer to introduce a independent configuration class >>>>>>> TableSourceOptions in the flink-table-common module, similar to >>>>>>> LookupOptions. >>>>>>> >>>>>>> For the second point, I suggest adding Java doc to the >>>> SupportsXXXPushDown >>>>>>> interfaces, providing detailed information on these options that >> needs >>>> to >>>>>>> be supported. >>>>>>> >>>>>>> I have made updates in the FLIP document. >>>>>>> Please help check it again. >>>>>>> >>>>>>> >>>>>>> Best, >>>>>>> Jiabao >>>>>>> >>>>>>> >>>>>>>> 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com> 写道: >>>>>>>> >>>>>>>> Thanks for the improvements, Jiabao. >>>>>>>> >>>>>>>> There are some details that I am not sure about. >>>>>>>> 1. The new option `source.filter-push-down.enabled` will be added to >>>>>>> which >>>>>>>> class? I think it should be `SourceReaderOptions`. >>>>>>>> 2. How are the connector developers able to know and follow the >> FLIP? >>>> Do >>>>>>> we >>>>>>>> need an abstract base class or provide a default method? 
>>>>>>>> >>>>>>>> Best, >>>>>>>> Hang >>>>>>>> >>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> 于2023年10月30日周一 >> 14:45写道: >>>>>>>> >>>>>>>>> Hi, all, >>>>>>>>> >>>>>>>>> Thanks for the lively discussion. >>>>>>>>> >>>>>>>>> Based on the discussion, I have made some adjustments to the FLIP >>>>>>> document: >>>>>>>>> >>>>>>>>> 1. The name of the newly added option has been changed to >>>>>>>>> "source.filter-push-down.enabled". >>>>>>>>> 2. Considering compatibility with older versions, the newly added >>>>>>>>> "source.filter-push-down.enabled" option needs to respect the >>>>>>> optimizer's >>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" option. >>>>>>>>> But there is a consideration to remove the old option in Flink 2.0. >>>>>>>>> 3. We can provide more options to disable other source abilities >> with >>>>>>> side >>>>>>>>> effects, such as “source.aggregate.enabled” and >>>>>>> “source.projection.enabled" >>>>>>>>> This is not urgent and can be continuously introduced. >>>>>>>>> >>>>>>>>> Looking forward to your feedback again. >>>>>>>>> >>>>>>>>> Best, >>>>>>>>> Jiabao >>>>>>>>> >>>>>>>>> >>>>>>>>>> 2023年10月29日 08:45,Becket Qin <becket....@gmail.com> 写道: >>>>>>>>>> >>>>>>>>>> Thanks for digging into the git history, Jark. I agree it makes >>>> sense >>>>>>> to >>>>>>>>>> deprecate this API in 2.0. >>>>>>>>>> >>>>>>>>>> Cheers, >>>>>>>>>> >>>>>>>>>> Jiangjie (Becket) Qin >>>>>>>>>> >>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <imj...@gmail.com> wrote: >>>>>>>>>> >>>>>>>>>>> Hi Becket, >>>>>>>>>>> >>>>>>>>>>> I checked the history of " >>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled*", >>>>>>>>>>> it seems it was introduced since the legacy FilterableTableSource >>>>>>>>>>> interface >>>>>>>>>>> which might be an experiential feature at that time. I don't see >>>> the >>>>>>>>>>> necessity >>>>>>>>>>> of this option at the moment. 
Maybe we can deprecate this option >>>> and >>>>>>>>> drop >>>>>>>>>>> it >>>>>>>>>>> in Flink 2.0[1] if it is not necessary anymore. This may help to >>>>>>>>>>> simplify this discussion. >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> Best, >>>>>>>>>>> Jark >>>>>>>>>>> >>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-32383 >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <becket....@gmail.com> >>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents below: >>>>>>>>>>>> >>>>>>>>>>>> 1. If I understand correctly, the motivation of the FLIP is >>>> mainly to >>>>>>>>>>>> make predicate pushdown optional on SOME of the Sources. If so, >>>>>>>>> intuitively >>>>>>>>>>>> the configuration should be Source specific instead of general. >>>>>>>>> Otherwise, >>>>>>>>>>>> we will end up with general configurations that may not take >>>> effect >>>>>>> for >>>>>>>>>>>> some of the Source implementations. This violates the basic rule >>>> of a >>>>>>>>>>>> configuration - it does what it says, regardless of the >>>>>>> implementation. >>>>>>>>>>>> While configuration standardization is usually a good thing, it >>>>>>> should >>>>>>>>> not >>>>>>>>>>>> break the basic rules. >>>>>>>>>>>> If we really want to have this general configuration, for the >>>> sources >>>>>>>>>>>> this configuration does not apply, they should throw an >> exception >>>> to >>>>>>>>> make >>>>>>>>>>>> it clear that this configuration is not supported. However, that >>>>>>> seems >>>>>>>>> ugly. >>>>>>>>>>>> >>>>>>>>>>>> 2. I think the actual motivation of this FLIP is about "how a >>>> source >>>>>>>>>>>> should implement predicate pushdown efficiently", not "whether >>>>>>>>> predicate >>>>>>>>>>>> pushdown should be applied to the source." 
For example, if a >>>> source >>>>>>>>> wants >>>>>>>>>>>> to avoid additional computing load in the external system, it >> can >>>>>>>>> always >>>>>>>>>>>> read the entire record and apply the predicates by itself. >>>> However, >>>>>>>>> from >>>>>>>>>>>> the Flink perspective, the predicate pushdown is applied, it is >>>> just >>>>>>>>>>>> implemented differently by the source. So the design principle >>>> here >>>>>>> is >>>>>>>>> that >>>>>>>>>>>> Flink only cares about whether a source supports predicate >>>> pushdown >>>>>>> or >>>>>>>>> not, >>>>>>>>>>>> it does not care about the implementation efficiency / side >>>> effect of >>>>>>>>> the >>>>>>>>>>>> predicates pushdown. It is the Source implementation's >>>> responsibility >>>>>>>>> to >>>>>>>>>>>> ensure the predicates pushdown is implemented efficiently and >> does >>>>>>> not >>>>>>>>>>>> impose excessive pressure on the external system. And it is OK >> to >>>>>>> have >>>>>>>>>>>> additional configurations to achieve this goal. Obviously, such >>>>>>>>>>>> configurations will be source specific in this case. >>>>>>>>>>>> >>>>>>>>>>>> 3. Regarding the existing configurations of >>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled. >>>>>>>>>>>> *I am not sure why we need it. Supposedly, if a source >> implements >>>> a >>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer should push the >>>>>>>>> corresponding >>>>>>>>>>>> predicates to the Source. I am not sure in which case this >>>>>>>>> configuration >>>>>>>>>>>> would be used. Any ideas @Jark Wu <imj...@gmail.com>? >>>>>>>>>>>> >>>>>>>>>>>> Thanks, >>>>>>>>>>>> >>>>>>>>>>>> Jiangjie (Becket) Qin >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun >>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> Thanks Jane for the detailed explanation. >>>>>>>>>>>>> >>>>>>>>>>>>> I think that for users, we should respect conventions over >>>>>>>>>>>>> configurations. 
>>>>>>>>>>>>> Conventions can be default values explicitly specified in >>>>>>>>>>>>> configurations, or they can be behaviors that follow previous >>>>>>>>> versions. >>>>>>>>>>>>> If the same code has different behaviors in different versions, >>>> it >>>>>>>>> would >>>>>>>>>>>>> be a very bad thing. >>>>>>>>>>>>> >>>>>>>>>>>>> I agree that for regular users, it is not necessary to >> understand >>>>>>> all >>>>>>>>>>>>> the configurations related to Flink. >>>>>>>>>>>>> By following conventions, they can have a good experience. >>>>>>>>>>>>> >>>>>>>>>>>>> Let's get back to the practical situation and consider it. >>>>>>>>>>>>> >>>>>>>>>>>>> Case 1: >>>>>>>>>>>>> The user is not familiar with the purpose of the >>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled configuration >>>> but >>>>>>>>> follows >>>>>>>>>>>>> the convention of allowing predicate pushdown to the source by >>>>>>>>> default. >>>>>>>>>>>>> Just understanding the source.predicate-pushdown-enabled >>>>>>> configuration >>>>>>>>>>>>> and performing fine-grained toggle control will work well. >>>>>>>>>>>>> >>>>>>>>>>>>> Case 2: >>>>>>>>>>>>> The user understands the meaning of the >>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled configuration >>>> and >>>>>>>>> has set >>>>>>>>>>>>> its value to false. >>>>>>>>>>>>> We have reason to believe that the user understands the meaning >>>> of >>>>>>> the >>>>>>>>>>>>> predicate pushdown configuration and the intention is to >> disable >>>>>>>>> predicate >>>>>>>>>>>>> pushdown (rather than whether or not to allow it). >>>>>>>>>>>>> The previous choice of globally disabling it is likely because >> it >>>>>>>>>>>>> couldn't be disabled on individual sources. 
>>>>>>>>>>>>> From this perspective, if we provide more fine-grained >>>> configuration >>>>>>>>>>>>> support and provide detailed explanations of the configuration >>>>>>>>> behaviors in >>>>>>>>>>>>> the documentation, >>>>>>>>>>>>> users can clearly understand the differences between these two >>>>>>>>>>>>> configurations and use them correctly. >>>>>>>>>>>>> >>>>>>>>>>>>> Also, I don't agree that >>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled = true and >>>>>>>>>>>>> source.predicate-pushdown-enabled = false means that the local >>>>>>>>>>>>> configuration overrides the global configuration. >>>>>>>>>>>>> On the contrary, both configurations are functioning correctly. >>>>>>>>>>>>> The optimizer allows predicate pushdown to all sources, but >> some >>>>>>>>> sources >>>>>>>>>>>>> can reject the filters pushed down by the optimizer. >>>>>>>>>>>>> This is natural, just like different components at different >>>> levels >>>>>>>>> are >>>>>>>>>>>>> responsible for different tasks. >>>>>>>>>>>>> >>>>>>>>>>>>> The more serious issue is that if >>>>>>> "source.predicate-pushdown-enabled" >>>>>>>>>>>>> does not respect >>>>>>> "table.optimizer.source.predicate-pushdown-enabled”, >>>>>>>>>>>>> the "table.optimizer.source.predicate-pushdown-enabled" >>>>>>> configuration >>>>>>>>>>>>> will be invalidated. >>>>>>>>>>>>> This means that regardless of whether >>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" is set to >>>> true >>>>>>> or >>>>>>>>>>>>> false, it will have no effect. >>>>>>>>>>>>> >>>>>>>>>>>>> Best, >>>>>>>>>>>>> Jiabao >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>>> 2023年10月25日 22:24,Jane Chan <qingyue....@gmail.com> 写道: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Hi Jiabao, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for the in-depth clarification. 
Here are my cents >>>>>>>>>>>>>> >>>>>>>>>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" >> and >>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for >>>> different >>>>>>>>>>>>>>> components(optimizer and source operator). >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> We cannot assume that every user would be interested in >>>>>>> understanding >>>>>>>>>>>>> the >>>>>>>>>>>>>> internal components of Flink, such as the optimizer or >>>> connectors, >>>>>>>>> and >>>>>>>>>>>>> the >>>>>>>>>>>>>> specific configurations associated with each component. >> Instead, >>>>>>>>> users >>>>>>>>>>>>>> might be more concerned about knowing which configuration >>>> enables >>>>>>> or >>>>>>>>>>>>>> disables the filter push-down feature for all source >> connectors, >>>>>>> and >>>>>>>>>>>>> which >>>>>>>>>>>>>> parameter provides the flexibility to override this behavior >>>> for a >>>>>>>>>>>>> single >>>>>>>>>>>>>> source if needed. >>>>>>>>>>>>>> >>>>>>>>>>>>>> So, from this perspective, I am inclined to divide these two >>>>>>>>> parameters >>>>>>>>>>>>>> based on the scope of their impact from the user's perspective >>>>>>> (i.e. >>>>>>>>>>>>>> global-level or operator-level), rather than categorizing them >>>>>>> based >>>>>>>>>>>>> on the >>>>>>>>>>>>>> component hierarchy from a developer's point of view. >> Therefore, >>>>>>>>> based >>>>>>>>>>>>> on >>>>>>>>>>>>>> this premise, it is intuitive and natural for users to >>>>>>>>>>>>>> understand fine-grained configuration options can override >>>> global >>>>>>>>>>>>>> configurations. 
>>>>>>>>>>>>>> >>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't >>>> respect to >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the >>>>>>> default >>>>>>>>>>>>> value >>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined as true, >>>>>>>>>>>>>>> it means that just modifying >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as false >>>> will >>>>>>>>>>>>> have no >>>>>>>>>>>>>>> effect, and filter pushdown will still be performed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If we define the default value of >>>> "scan.filter-push-down.enabled" >>>>>>> as >>>>>>>>>>>>>>> false, it would introduce a difference in behavior compared >> to >>>> the >>>>>>>>>>>>> previous >>>>>>>>>>>>>>> version. >>>>>>>>>>>>>>> >>>>>>>>>>>>>> >>>>>>>>>>>>>> <1>If I understand correctly, "scan.filter-push-down.enabled" >>>> is a >>>>>>>>>>>>>> connector option, which means the only way to configure it is >> to >>>>>>>>>>>>> explicitly >>>>>>>>>>>>>> specify it in DDL (no matter whether disable or enable), and >> the >>>>>>> SET >>>>>>>>>>>>>> command is not applicable, so I think it's natural to still >>>> respect >>>>>>>>>>>>> user's >>>>>>>>>>>>>> specification here. Otherwise, users might be more confused >>>> about >>>>>>> why >>>>>>>>>>>>> the >>>>>>>>>>>>>> DDL does not work as expected, and the reason is just because >>>> some >>>>>>>>>>>>> other >>>>>>>>>>>>>> "optimizer" configuration is set to a different value. >>>>>>>>>>>>>> >>>>>>>>>>>>>> <2> From the implementation side, I am inclined to keep the >>>>>>>>> parameter's >>>>>>>>>>>>>> priority consistent for all conditions. 
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let "global" denote "table.optimizer.source.predicate-pushdown-enabled",
>>>>>>>>>>>>>> and let "per-source" denote "scan.filter-push-down.enabled" for specific
>>>>>>>>>>>>>> source T, the following truth table (based on the current design)
>>>>>>>>>>>>>> indicates the inconsistent behavior for "per-source override global".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> .--------.------------.-----------------.----------------------------.
>>>>>>>>>>>>>> | global | per-source | push-down for T | per-source override global |
>>>>>>>>>>>>>> :--------+------------+-----------------+----------------------------:
>>>>>>>>>>>>>> | true   | false      | false           | Y                          |
>>>>>>>>>>>>>> :--------+------------+-----------------+----------------------------:
>>>>>>>>>>>>>> | false  | true       | false           | N                          |
>>>>>>>>>>>>>> .--------.------------.-----------------.----------------------------.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand that the configuration of global parallelism and task
>>>>>>>>>>>>>>> parallelism is at different granularities but with the same configuration.
>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and
>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for different
>>>>>>>>>>>>>>> components (optimizer and source operator).
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> From a user's perspective, there are two scenarios: >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 1. Disabling all filter pushdown >>>>>>>>>>>>>>> In this case, setting >>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" >>>>>>>>>>>>>>> to false is sufficient to meet the requirement. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific sources >>>>>>>>>>>>>>> In this scenario, there is no need to adjust the value of >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled". >>>>>>>>>>>>>>> Instead, the focus should be on the configuration of >>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the requirement. >>>>>>>>>>>>>>> In this case, users do not need to set >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false >>>> and >>>>>>>>>>>>> manually >>>>>>>>>>>>>>> enable filter pushdown for specific sources. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't >>>> respect >>>>>>> to >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the >>>>>>> default >>>>>>>>>>>>> value >>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined as true, >>>>>>>>>>>>>>> it means that just modifying >>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as false >>>> will >>>>>>>>>>>>> have no >>>>>>>>>>>>>>> effect, and filter pushdown will still be performed. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> If we define the default value of >>>> "scan.filter-push-down.enabled" >>>>>>> as >>>>>>>>>>>>>>> false, it would introduce a difference in behavior compared >> to >>>> the >>>>>>>>>>>>> previous >>>>>>>>>>>>>>> version. >>>>>>>>>>>>>>> The same SQL query that could successfully push down filters >> in >>>>>>> the >>>>>>>>>>>>> old >>>>>>>>>>>>>>> version but would no longer do so after the upgrade. 
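The two scenarios above, together with the rule that the per-source option respects the optimizer option, can be sketched as a small truth function. This is only a minimal model of the behavior described in this message, not Flink code: the class and method names are hypothetical, and the two booleans simply stand in for "table.optimizer.source.predicate-pushdown-enabled" and "scan.filter-push-down.enabled".

```java
/** Hypothetical model of the proposed precedence; none of this is Flink API. */
public class PushdownGate {

    /**
     * @param plannerEnabled stands in for "table.optimizer.source.predicate-pushdown-enabled"
     * @param sourceEnabled  stands in for the per-source "scan.filter-push-down.enabled"
     * @return whether filters end up being pushed into the source
     */
    public static boolean filtersReachSource(boolean plannerEnabled, boolean sourceEnabled) {
        // The planner-level switch is evaluated first; the source-level switch
        // is only consulted when the planner still attempts the pushdown.
        return plannerEnabled && sourceEnabled;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean planner : values) {
            for (boolean source : values) {
                System.out.printf("planner=%b source=%b -> pushdown=%b%n",
                        planner, source, filtersReachSource(planner, source));
            }
        }
    }
}
```

With both defaults at true, existing queries keep pushing filters down, and switching either option to false is enough to stop pushdown for the affected source.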
>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Best, >>>>>>>>>>>>>>> Jiabao >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> 2023年10月25日 17:10,Benchao Li <libenc...@apache.org> 写道: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations, that helps a >>>> lot, I >>>>>>>>>>>>>>>> understand your rationale now. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is from >> "developer", >>>>>>>>> which >>>>>>>>>>>>>>>> means there is an optimizer and connector component, and if >> we >>>>>>> want >>>>>>>>>>>>> to >>>>>>>>>>>>>>>> enable this feature (pushing filters down into connectors), >>>> you >>>>>>>>> must >>>>>>>>>>>>>>>> enable it firstly in optimizer, and only then connector has >>>> the >>>>>>>>>>>>> chance >>>>>>>>>>>>>>>> to decide to use it or not. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> My perspective is from "user" that (Why a user should care >>>> about >>>>>>>>> the >>>>>>>>>>>>>>>> difference of optimizer/connector) , this is a feature, and >>>> has >>>>>>> two >>>>>>>>>>>>>>>> way to control it, one way is to config it job-level, the >>>> other >>>>>>> one >>>>>>>>>>>>> is >>>>>>>>>>>>>>>> in table properties. What a user expects is that they can >>>>>>> control a >>>>>>>>>>>>>>>> feature in a tiered way, that setting it per job, and then >>>>>>>>>>>>>>>> fine-grained tune it per table. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> This is some kind of similar to other concepts, such as >>>>>>>>> parallelism, >>>>>>>>>>>>>>>> users can set a job level default parallelism, and then >>>>>>>>> fine-grained >>>>>>>>>>>>>>>> tune it per operator. There may be more such debate in the >>>> future >>>>>>>>>>>>>>>> e.g., we can have a job level config about adding key-by >>>> before >>>>>>>>>>>>> lookup >>>>>>>>>>>>>>>> join, and also a hint/table property way to fine-grained >>>> control >>>>>>> it >>>>>>>>>>>>>>>> per lookup operator. Hence we'd better find a unified way >> for >>>> all >>>>>>>>>>>>>>>> those similar kind of features. 
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 15:27, Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks Jane for further explanation.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> These two configurations correspond to different levels.
>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make
>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid.
>>>>>>>>>>>>>>>>> The planner will still push down predicates to all sources.
>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not is determined by the specific
>>>>>>>>>>>>>>>>> source's "scan.filter-push-down.enabled" configuration.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate" does directly affect
>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled".
>>>>>>>>>>>>>>>>> When the planner disables predicate pushdown, the source-level filter
>>>>>>>>>>>>>>>>> pushdown will also not be executed, even if the source allows filter
>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In any case, on points 1 and 2 our expectations are consistent.
>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the planner-level configuration
>>>>>>>>>>>>>>>>> takes precedence over the source-level configuration.
>>>>>>>>>>>>>>>>> It may seem counterintuitive when we globally disable predicate
>>>>>>>>>>>>>>>>> pushdown but allow filter pushdown at the source level.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Oct 25, 2023, at 14:35, Jane Chan <qingyue....@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for clarifying this.
While by >>>>>>>>> "scan.filter-push-down.enabled >>>>>>>>>>>>>>> takes a >>>>>>>>>>>>>>>>>> higher priority" I meant that this value should be >> respected >>>>>>>>>>>>> whenever >>>>>>>>>>>>>>> it is >>>>>>>>>>>>>>>>>> set explicitly. >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> The conclusion that >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and >>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false" >>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate pushdown, but >>>>>>> individual >>>>>>>>>>>>>>> sources do >>>>>>>>>>>>>>>>>>> not enable filter pushdown. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> This indicates that the option >>>> "scan.filter-push-down.enabled = >>>>>>>>>>>>> false" >>>>>>>>>>>>>>> for >>>>>>>>>>>>>>>>>> an individual source connector does indeed override the >>>>>>>>>>>>> global-level >>>>>>>>>>>>>>>>>> planner settings to make a difference. And thus "has a >>>> higher >>>>>>>>>>>>>>> priority". >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> While for >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false" >>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the planner. >>>>>>>>>>>>>>>>>>> Regardless of the value of the >>>> "scan.filter-push-down.enabled" >>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled. >>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains consistent with >> the >>>> old >>>>>>>>>>>>>>> version as >>>>>>>>>>>>>>>>>>> well. >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> I still think "scan.filter-push-down.enabled" should also >> be >>>>>>>>>>>>> respected >>>>>>>>>>>>>>> if >>>>>>>>>>>>>>>>>> it is enabled for individual connectors. WDYT? 
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend keeping the default value of
>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" as true,
>>>>>>>>>>>>>>>>>>> and setting the default value of the newly introduced option
>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true as well.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to maintain consistency with previous
>>>>>>>>>>>>>>>>>>> versions, as whether to perform
>>>>>>>>>>>>>>>>>>> filter pushdown in the old version solely depends on the
>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" option.
>>>>>>>>>>>>>>>>>>> That means by default, as long as a TableSource implements the
>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface, filter pushdown is allowed.
>>>>>>>>>>>>>>>>>>> And it seems that we don't have much benefit in changing the default value
>>>>>>>>>>>>>>>>>>> of "table.optimizer.source.predicate" to false.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Regarding the priority of these two configurations, I believe that
>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>> takes precedence over "scan.filter-push-down.enabled" and it exhibits the
>>>>>>>>>>>>>>>>>>> following behavior.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing filter pushdown for sources.

2. "table.optimizer.source.predicate" = "true" and "scan.filter-push-down.enabled" = "false"
Allow the planner to perform predicate pushdown, but individual sources do not enable filter pushdown.

3. "table.optimizer.source.predicate" = "false"
Predicate pushdown is not allowed for the planner.
Regardless of the value of the "scan.filter-push-down.enabled" configuration, filter pushdown is disabled.
In this scenario, the behavior remains consistent with the old version as well.

From an implementation perspective, setting the priority of "scan.filter-push-down.enabled" higher than "table.optimizer.source.predicate" is difficult to achieve now, because the PushFilterIntoSourceScanRuleBase at the planner level takes precedence over the source-level FilterPushDownSpec.
Only when the PushFilterIntoSourceScanRuleBase is enabled will the source-level filter pushdown be performed.

Additionally, in my opinion, there doesn't seem to be much benefit in setting a higher priority for "scan.filter-push-down.enabled".
It may instead affect compatibility and increase implementation complexity.

WDYT?
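
The three cases listed by Jiabao, and the alternative that Jane and Benchao argue for, can be written down as two small boolean rules. The following is only a sketch for comparing them: `PushdownPrecedence`, `plannerFirst`, and `sourceFirst` are hypothetical names made up for illustration and are not part of Flink, and representing an unset source-level option as an empty `Optional` is an assumption.

```java
import java.util.Optional;

// Hypothetical helper, not part of Flink: models the two precedence rules
// debated in this thread as pure boolean functions.
final class PushdownPrecedence {

    // Planner-first rule: the planner-level switch gates the source-level one,
    // so pushdown happens only when both are true (cases 1 to 3 above).
    static boolean plannerFirst(boolean plannerEnabled, boolean sourceEnabled) {
        return plannerEnabled && sourceEnabled;
    }

    // Source-first rule: an explicitly set source-level option wins;
    // when it is unset, the planner-level value applies.
    static boolean sourceFirst(boolean plannerEnabled, Optional<Boolean> sourceOption) {
        return sourceOption.orElse(plannerEnabled);
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean planner : values) {
            for (boolean source : values) {
                System.out.printf(
                        "planner=%b source=%b -> plannerFirst=%b sourceFirst=%b%n",
                        planner, source,
                        plannerFirst(planner, source),
                        sourceFirst(planner, Optional.of(source)));
            }
        }
    }
}
```

Under these assumptions the two rules disagree in exactly one combination: planner-level false with source-level explicitly true. That combination is the one the replies are arguing about.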

Best,
Jiabao


On Oct 25, 2023 at 11:56, Benchao Li <libenc...@apache.org> wrote:

I agree with Jane that fine-grained configurations should have higher priority than job level configurations.

For current proposal, we can achieve that:
- Set "table.optimizer.source.predicate" = "true" to enable by default, and set "scan.filter-push-down.enabled" = "false" to disable it per table source
- Set "table.optimizer.source.predicate" = "false" to disable by default, and set "scan.filter-push-down.enabled" = "true" to enable it per table source

On Tue, Oct 24, 2023 at 23:55, Jane Chan <qingyue....@gmail.com> wrote:

> I believe that the configuration "table.optimizer.source.predicate" has a higher priority at the planner level than the configuration at the source level,
> and it seems easy to implement now.

Correct me if I'm wrong, but I think the fine-grained configuration "scan.filter-push-down.enabled" should have a higher priority because the default value of "table.optimizer.source.predicate" is true.
As a result, turning off filter push-down for a specific source will not take effect unless the default value of "table.optimizer.source.predicate" is changed to false or, alternatively, users manually set "table.optimizer.source.predicate" to false first and then selectively enable filter push-down for the desired sources, which is less intuitive.
WDYT?

Best,
Jane

On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:

Thanks Jane,

I believe that the configuration "table.optimizer.source.predicate" has a higher priority at the planner level than the configuration at the source level,
and it seems easy to implement now.

Best,
Jiabao


On Oct 24, 2023 at 17:36, Jane Chan <qingyue....@gmail.com> wrote:

Hi Jiabao,

Thanks for driving this discussion.
I have a small question: will "scan.filter-push-down.enabled" take precedence over "table.optimizer.source.predicate" when the two parameters conflict with each other?

Best,
Jane

On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:

Thanks Jark,

If we only add configuration without adding the enableFilterPushDown method in the SupportsFilterPushDown interface, each connector would have to handle the same logic in the applyFilters method to determine whether filter pushdown is needed.
This would increase complexity and violate the original behavior of the applyFilters method.

On the contrary, we only need to pass the configuration parameter in the newly added enableFilterPushDown method to decide whether to perform predicate pushdown.

I think this approach would be clearer and simpler.
WDYT?
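
The connector-side alternative weighed here, where applyFilters itself checks the option and hands every filter back as remaining, might look roughly like the sketch below. It is a self-contained model rather than Flink code: `FilterResult` is a simplified stand-in for Flink's `SupportsFilterPushDown.Result`, `SketchTableSource` and its `filterPushDownEnabled` field are hypothetical, and filters are plain strings instead of `ResolvedExpression`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Simplified stand-in for Flink's SupportsFilterPushDown.Result;
// filters are plain strings here instead of ResolvedExpression.
final class FilterResult {
    final List<String> acceptedFilters;
    final List<String> remainingFilters;

    FilterResult(List<String> acceptedFilters, List<String> remainingFilters) {
        this.acceptedFilters = acceptedFilters;
        this.remainingFilters = remainingFilters;
    }
}

// Hypothetical connector source that reads the per-source option itself.
final class SketchTableSource {
    private final boolean filterPushDownEnabled;

    SketchTableSource(boolean filterPushDownEnabled) {
        this.filterPushDownEnabled = filterPushDownEnabled;
    }

    FilterResult applyFilters(List<String> filters) {
        if (!filterPushDownEnabled) {
            // Reject everything: no filter is accepted, and all of them
            // remain for the planner to evaluate after the scan.
            return new FilterResult(Collections.emptyList(), new ArrayList<>(filters));
        }
        // Simplification for this sketch: accept every filter, leave none remaining.
        return new FilterResult(new ArrayList<>(filters), Collections.emptyList());
    }

    public static void main(String[] args) {
        List<String> filters = Arrays.asList("a > 1", "b = 'x'");
        FilterResult disabled = new SketchTableSource(false).applyFilters(filters);
        System.out.println("accepted=" + disabled.acceptedFilters
                + " remaining=" + disabled.remainingFilters);
    }
}
```

The branch at the top of applyFilters is the part every connector would have to repeat under this approach, which is the duplication the proposed enableFilterPushDown method is meant to avoid.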

Best,
Jiabao


On Oct 24, 2023 at 16:58, Jark Wu <imj...@gmail.com> wrote:

Hi Jiabao,

I think the current interface can already satisfy your requirements.
The connector can reject all the filters by returning the input filters as `Result#remainingFilters`.

So maybe we don't need to introduce a new method to disable pushdown, but just introduce an option for the specific connector.

Best,
Jark

On Tue, 24 Oct 2023 at 16:38, Leonard Xu <xbjt...@gmail.com> wrote:

Thanks @Jiabao for kicking off this discussion.

Could you add a section to explain the difference between the proposed connector-level config `scan.filter-push-down.enabled` and the existing query-level config `table.optimizer.source.predicate-pushdown-enabled`?

Best,
Leonard

On Oct 24, 2023 at 4:18 PM, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:

Hi Devs,

I would like to start a discussion on FLIP-377: support configuration to disable filter pushdown for Table/SQL Sources[1].

Currently, Flink Table/SQL does not expose fine-grained control for users to enable or disable filter pushdown.
However, filter pushdown has some side effects, such as additional computational pressure on external systems.
Moreover, improper queries can lead to issues such as full table scans, which in turn can impact the stability of external systems.

Suppose we have an SQL query with two sources: Kafka and a database.
The database is sensitive to pressure, and we want to configure it to not perform filter pushdown to the database source.
However, we still want to perform filter pushdown to the Kafka source to decrease network IO.

I propose to support configuration to disable filter pushdown for Table/SQL sources to let users decide whether to perform filter pushdown.

Looking forward to your feedback.

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768

Best,
Jiabao

--

Best,
Benchao Li