Thanks for digging into the git history, Jark. I agree it makes sense to
deprecate this API in 2.0.

Cheers,

Jiangjie (Becket) Qin

On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <imj...@gmail.com> wrote:

> Hi Becket,
>
> I checked the history of
> "*table.optimizer.source.predicate-pushdown-enabled*".
> It seems it was introduced with the legacy FilterableTableSource
> interface, which might have been an experimental feature at that time.
> I don't see the necessity of this option at the moment. Maybe we can
> deprecate this option and drop it in Flink 2.0 [1] if it is not necessary
> anymore. This may help to simplify this discussion.
>
>
> Best,
> Jark
>
> [1]: https://issues.apache.org/jira/browse/FLINK-32383
>
>
>
> On Thu, 26 Oct 2023 at 10:14, Becket Qin <becket....@gmail.com> wrote:
>
>> Thanks for the proposal, Jiabao. My two cents below:
>>
>> 1. If I understand correctly, the motivation of the FLIP is mainly to
>> make predicate pushdown optional on SOME of the Sources. If so, intuitively
>> the configuration should be Source specific instead of general. Otherwise,
>> we will end up with general configurations that may not take effect for
>> some of the Source implementations. This violates the basic rule of a
>> configuration - it does what it says, regardless of the implementation.
>> While configuration standardization is usually a good thing, it should not
>> break the basic rules.
>> If we really want to have this general configuration, the sources to
>> which it does not apply should throw an exception to make it clear that
>> the configuration is not supported. However, that seems ugly.
>>
>> 2. I think the actual motivation of this FLIP is about "how a source
>> should implement predicate pushdown efficiently", not "whether predicate
>> pushdown should be applied to the source." For example, if a source wants
>> to avoid additional computing load in the external system, it can always
>> read the entire record and apply the predicates by itself. However, from
>> the Flink perspective, the predicate pushdown is applied; it is just
>> implemented differently by the source. So the design principle here is that
>> Flink only cares about whether a source supports predicate pushdown or not;
>> it does not care about the implementation efficiency / side effects of the
>> predicate pushdown. It is the Source implementation's responsibility to
>> ensure the predicate pushdown is implemented efficiently and does not
>> impose excessive pressure on the external system. And it is OK to have
>> additional configurations to achieve this goal. Obviously, such
>> configurations will be source specific in this case.
>>
>> 3. Regarding the existing configuration
>> *table.optimizer.source.predicate-pushdown-enabled*,
>> I am not sure why we need it. Supposedly, if a source implements a
>> SupportsXXXPushDown interface, the optimizer should push the corresponding
>> predicates to the Source. I am not sure in which case this configuration
>> would be used. Any ideas @Jark Wu <imj...@gmail.com>?
>>
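Point 2 above can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Flink code: `ExternalTable`, `Source`, and the `offload_to_external` switch are hypothetical names standing in for an external system, a connector, and a source-specific option. In both modes the source accepts the predicates, so from the planner's view pushdown is applied; only the place where the predicates are evaluated differs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, int]
Predicate = Callable[[Row], bool]


@dataclass
class ExternalTable:
    """Hypothetical stand-in for an external system such as a database."""
    rows: List[Row]
    predicates_evaluated: int = 0  # work done inside the external system

    def scan(self, predicates: List[Predicate]) -> List[Row]:
        result = []
        for row in self.rows:
            keep = True
            for predicate in predicates:
                self.predicates_evaluated += 1
                if not predicate(row):
                    keep = False
                    break
            if keep:
                result.append(row)
        return result


@dataclass
class Source:
    """Hypothetical connector that always accepts the pushed-down predicates."""
    table: ExternalTable
    offload_to_external: bool = True  # assumed source-specific option
    accepted: List[Predicate] = field(default_factory=list)

    def apply_filters(self, predicates: List[Predicate]) -> List[Predicate]:
        # Accept everything; nothing is left for the planner to re-apply.
        self.accepted = list(predicates)
        return []

    def read(self) -> List[Row]:
        if self.offload_to_external:
            # The external system evaluates the predicates.
            return self.table.scan(self.accepted)
        # Read full records and evaluate the predicates inside the source.
        rows = self.table.scan([])
        return [r for r in rows if all(p(r) for p in self.accepted)]
```

With `offload_to_external=False` the external system evaluates no predicates at all, yet the rows returned to the planner are the same.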
>> Thanks,
>>
>> Jiangjie (Becket) Qin
>>
>>
>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
>> <jiabao....@xtransfer.cn.invalid> wrote:
>>
>>> Thanks Jane for the detailed explanation.
>>>
>>> I think that for users, we should respect conventions over
>>> configurations.
>>> Conventions can be default values explicitly specified in
>>> configurations, or they can be behaviors that follow previous versions.
>>> If the same code has different behaviors in different versions, it would
>>> be a very bad thing.
>>>
>>> I agree that for regular users, it is not necessary to understand all
>>> the configurations related to Flink.
>>> By following conventions, they can have a good experience.
>>>
>>> Let's get back to the practical situation and consider it.
>>>
>>> Case 1:
>>> The user is not familiar with the purpose of the
>>> table.optimizer.source.predicate-pushdown-enabled configuration but follows
>>> the convention of allowing predicate pushdown to the source by default.
>>> Understanding only the source.predicate-pushdown-enabled configuration
>>> and using it as a fine-grained toggle will work well.
>>>
>>> Case 2:
>>> The user understands the meaning of the
>>> table.optimizer.source.predicate-pushdown-enabled configuration and has set
>>> its value to false.
>>> We have reason to believe that the user understands the meaning of the
>>> predicate pushdown configuration and the intention is to disable predicate
>>> pushdown (rather than whether or not to allow it).
>>> The previous choice of globally disabling it is likely because it
>>> couldn't be disabled on individual sources.
>>> From this perspective, if we provide more fine-grained configuration
>>> support and provide detailed explanations of the configuration behaviors in
>>> the documentation,
>>> users can clearly understand the differences between these two
>>> configurations and use them correctly.
>>>
>>> Also, I don't agree that
>>> table.optimizer.source.predicate-pushdown-enabled = true and
>>> source.predicate-pushdown-enabled = false means that the local
>>> configuration overrides the global configuration.
>>> On the contrary, both configurations are functioning correctly.
>>> The optimizer allows predicate pushdown to all sources, but some sources
>>> can reject the filters pushed down by the optimizer.
>>> This is natural, just like different components at different levels are
>>> responsible for different tasks.
>>>
>>> The more serious issue is that if "source.predicate-pushdown-enabled"
>>> does not respect "table.optimizer.source.predicate-pushdown-enabled",
>>> the "table.optimizer.source.predicate-pushdown-enabled" configuration
>>> will be invalidated.
>>> This means that regardless of whether
>>> "table.optimizer.source.predicate-pushdown-enabled" is set to true or
>>> false, it will have no effect.
>>>
>>> Best,
>>> Jiabao
>>>
>>>
>>> > On Oct 25, 2023, at 22:24, Jane Chan <qingyue....@gmail.com> wrote:
>>> >
>>> > Hi Jiabao,
>>> >
>>> > Thanks for the in-depth clarification. Here are my two cents.
>>> >
>>> > However, "table.optimizer.source.predicate-pushdown-enabled" and
>>> >> "scan.filter-push-down.enabled" are configurations for different
>>> >> components(optimizer and source operator).
>>> >>
>>> >
>>> > We cannot assume that every user would be interested in understanding
>>> the
>>> > internal components of Flink, such as the optimizer or connectors, and
>>> the
>>> > specific configurations associated with each component. Instead, users
>>> > might be more concerned about knowing which configuration enables or
>>> > disables the filter push-down feature for all source connectors, and
>>> which
>>> > parameter provides the flexibility to override this behavior for a
>>> single
>>> > source if needed.
>>> >
>>> > So, from this perspective, I am inclined to divide these two parameters
>>> > based on the scope of their impact from the user's perspective (i.e.
>>> > global-level or operator-level), rather than categorizing them based
>>> on the
>>> > component hierarchy from a developer's point of view. Therefore, based
>>> on
>>> > this premise, it is intuitive and natural for users to
>>> > understand that fine-grained configuration options can override global
>>> > configurations.
>>> >
>>> > Additionally, if "scan.filter-push-down.enabled" doesn't respect
>>> >> "table.optimizer.source.predicate-pushdown-enabled" and the default
>>> value
>>> >> of "scan.filter-push-down.enabled" is defined as true,
>>> >> it means that just modifying
>>> >> "table.optimizer.source.predicate-pushdown-enabled" as false will
>>> have no
>>> >> effect, and filter pushdown will still be performed.
>>> >>
>>> >> If we define the default value of "scan.filter-push-down.enabled" as
>>> >> false, it would introduce a difference in behavior compared to the
>>> previous
>>> >> version.
>>> >>
>>> >
>>> > <1>If I understand correctly, "scan.filter-push-down.enabled" is a
>>> > connector option, which means the only way to configure it is to
>>> explicitly
>>> > specify it in DDL (no matter whether disable or enable), and the SET
>>> > command is not applicable, so I think it's natural to still respect
>>> user's
>>> > specification here. Otherwise, users might be more confused about why
>>> the
>>> > DDL does not work as expected, and the reason is just because some
>>> other
>>> > "optimizer" configuration is set to a different value.
>>> >
>>> > <2> From the implementation side, I am inclined to keep the parameter's
>>> > priority consistent for all conditions.
>>> >
>>> > Let "global" denote
>>> "table.optimizer.source.predicate-pushdown-enabled",
>>> > and let "per-source" denote "scan.filter-push-down.enabled" for
>>> specific
>>> > source T,  the following Truth table (based on the current design)
>>> > indicates the inconsistent behavior for "per-source override global".
>>> >
>>> > +--------+------------+-----------------+----------------------------+
>>> > | global | per-source | push-down for T | per-source override global |
>>> > +--------+------------+-----------------+----------------------------+
>>> > | true   | false      | false           | Y                          |
>>> > +--------+------------+-----------------+----------------------------+
>>> > | false  | true       | false           | N                          |
>>> > +--------+------------+-----------------+----------------------------+
>>> >
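Jane's truth table can be reproduced with a short Python sketch. The function names are hypothetical and the model is not Flink code. `pushdown_current_design` encodes the behavior described as the current design (both switches must be on), while `pushdown_with_override` encodes the alternative in which an explicitly set per-source value always wins.

```python
from typing import Optional


def pushdown_current_design(global_enabled: bool, per_source: bool) -> bool:
    """Current design: the planner-level switch gates the per-source switch."""
    return global_enabled and per_source


def pushdown_with_override(global_enabled: bool, per_source: Optional[bool]) -> bool:
    """Alternative: an explicitly set per-source value wins; None means 'not set'."""
    return global_enabled if per_source is None else per_source


# The two conflicting rows from the table.
for global_enabled, per_source in [(True, False), (False, True)]:
    current = pushdown_current_design(global_enabled, per_source)
    # The per-source value "overrides" the global one only if the outcome equals it.
    overrides = "Y" if current == per_source else "N"
    print(global_enabled, per_source, current, overrides)
```

Under the override variant, the row with global = false and per-source = true would enable push-down for T, which is the consistency the table argues for.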
>>> > Best,
>>> > Jane
>>> >
>>> > On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun
>>> > <jiabao....@xtransfer.cn.invalid> wrote:
>>> >
>>> >> Thanks Benchao for the feedback.
>>> >>
>>> >> I understand that the configuration of global parallelism and task
>>> >> parallelism is at different granularities but with the same
>>> configuration.
>>> >> However, "table.optimizer.source.predicate-pushdown-enabled" and
>>> >> "scan.filter-push-down.enabled" are configurations for different
>>> >> components(optimizer and source operator).
>>> >>
>>> >> From a user's perspective, there are two scenarios:
>>> >>
>>> >> 1. Disabling all filter pushdown
>>> >> In this case, setting
>>> "table.optimizer.source.predicate-pushdown-enabled"
>>> >> to false is sufficient to meet the requirement.
>>> >>
>>> >> 2. Disabling filter pushdown for specific sources
>>> >> In this scenario, there is no need to adjust the value of
>>> >> "table.optimizer.source.predicate-pushdown-enabled".
>>> >> Instead, the focus should be on the configuration of
>>> >> "scan.filter-push-down.enabled" to meet the requirement.
>>> >> In this case, users do not need to set
>>> >> "table.optimizer.source.predicate-pushdown-enabled" to false and
>>> manually
>>> >> enable filter pushdown for specific sources.
>>> >>
>>> >> Additionally, if "scan.filter-push-down.enabled" doesn't respect
>>> >> "table.optimizer.source.predicate-pushdown-enabled" and the default
>>> value
>>> >> of "scan.filter-push-down.enabled" is defined as true,
>>> >> it means that just modifying
>>> >> "table.optimizer.source.predicate-pushdown-enabled" as false will
>>> have no
>>> >> effect, and filter pushdown will still be performed.
>>> >>
>>> >> If we define the default value of "scan.filter-push-down.enabled" as
>>> >> false, it would introduce a difference in behavior compared to the
>>> previous
>>> >> version.
>>> >> The same SQL query that could successfully push down filters in the
>>> >> old version would no longer do so after the upgrade.
>>> >>
>>> >> Best,
>>> >> Jiabao
>>> >>
>>> >>
>>> >>> On Oct 25, 2023, at 17:10, Benchao Li <libenc...@apache.org> wrote:
>>> >>>
>>> >>> Thanks Jiabao for the detailed explanations, that helps a lot, I
>>> >>> understand your rationale now.
>>> >>>
>>> >>> Correct me if I'm wrong. Your perspective is that of a "developer":
>>> >>> there is an optimizer component and a connector component, and if we
>>> >>> want to enable this feature (pushing filters down into connectors), we
>>> >>> must first enable it in the optimizer, and only then does the connector
>>> >>> have the chance to decide whether to use it or not.
>>> >>>
>>> >>> My perspective is that of a "user" (why should a user care about the
>>> >>> difference between optimizer and connector?): this is a feature with
>>> >>> two ways to control it. One is to configure it at the job level, the
>>> >>> other is in table properties. What a user expects is that they can
>>> >>> control a feature in a tiered way: set it per job, and then
>>> >>> fine-tune it per table.
>>> >>>
>>> >>> This is somewhat similar to other concepts, such as parallelism:
>>> >>> users can set a job-level default parallelism, and then fine-tune
>>> >>> it per operator. There may be more such debates in the future,
>>> >>> e.g., we can have a job-level config about adding key-by before
>>> >>> lookup join, and also a hint/table property to control it at a fine
>>> >>> granularity per lookup operator. Hence we'd better find a unified
>>> >>> way for all such features.
>>> >>>
>>> >>> On Wed, Oct 25, 2023 at 15:27, Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
>>> >>>>
>>> >>>> Thanks Jane for further explanation.
>>> >>>>
>>> >>>> These two configurations correspond to different levels.
>>> >>>> "scan.filter-push-down.enabled" does not make
>>> >>>> "table.optimizer.source.predicate" invalid.
>>> >>>> The planner will still push down predicates to all sources.
>>> >>>> Whether filter pushdown is allowed or not is determined by the
>>> >>>> specific source's "scan.filter-push-down.enabled" configuration.
>>> >>>>
>>> >>>> However, "table.optimizer.source.predicate" does directly affect
>>> >>>> "scan.filter-push-down.enabled".
>>> >>>> When the planner disables predicate pushdown, the source-level
>>> >>>> filter pushdown will also not be executed, even if the source allows
>>> >>>> filter pushdown.
>>> >>>>
>>> >>>> In any case, on points 1 and 2 our expectations are consistent.
>>> >>>> For the 3rd point, I still think that the planner-level
>>> >>>> configuration takes precedence over the source-level configuration.
>>> >>>> It may seem counterintuitive when we globally disable predicate
>>> >>>> pushdown but allow filter pushdown at the source level.
>>> >>>>
>>> >>>> Best,
>>> >>>> Jiabao
>>> >>>>
>>> >>>>
>>> >>>>
>>> >>>>> On Oct 25, 2023, at 14:35, Jane Chan <qingyue....@gmail.com> wrote:
>>> >>>>>
>>> >>>>> Hi Jiabao,
>>> >>>>>
>>> >>>>> Thanks for clarifying this. By "scan.filter-push-down.enabled
>>> >>>>> takes a higher priority" I meant that this value should be
>>> >>>>> respected whenever it is set explicitly.
>>> >>>>>
>>> >>>>> The conclusion that
>>> >>>>>
>>> >>>>> 2. "table.optimizer.source.predicate" = "true" and
>>> >>>>>> "scan.filter-push-down.enabled" = "false"
>>> >>>>>> Allow the planner to perform predicate pushdown, but individual
>>> >> sources do
>>> >>>>>> not enable filter pushdown.
>>> >>>>>>
>>> >>>>>
>>> >>>>> This indicates that the option "scan.filter-push-down.enabled =
>>> false"
>>> >> for
>>> >>>>> an individual source connector does indeed override the
>>> global-level
>>> >>>>> planner settings to make a difference. And thus "has a higher
>>> >> priority".
>>> >>>>>
>>> >>>>> While for
>>> >>>>>
>>> >>>>> 3. "table.optimizer.source.predicate" = "false"
>>> >>>>>> Predicate pushdown is not allowed for the planner.
>>> >>>>>> Regardless of the value of the "scan.filter-push-down.enabled"
>>> >>>>>> configuration, filter pushdown is disabled.
>>> >>>>>> In this scenario, the behavior remains consistent with the old
>>> >> version as
>>> >>>>>> well.
>>> >>>>>>
>>> >>>>>
>>> >>>>> I still think "scan.filter-push-down.enabled" should also be
>>> respected
>>> >> if
>>> >>>>> it is enabled for individual connectors. WDYT?
>>> >>>>>
>>> >>>>> Best,
>>> >>>>> Jane
>>> >>>>>
>>> >>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <
>>> jiabao....@xtransfer.cn
>>> >> .invalid>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>>> Thanks Benchao for the feedback.
>>> >>>>>>
>>> >>>>>> For the current proposal, we recommend keeping the default value
>>> of
>>> >>>>>> "table.optimizer.source.predicate" as true,
>>> >>>>>> and setting the default value of the newly introduced option
>>> >>>>>> "scan.filter-push-down.enabled" to true as well.
>>> >>>>>>
>>> >>>>>> The main purpose of doing this is to maintain consistency with
>>> >> previous
>>> >>>>>> versions, as whether to perform
>>> >>>>>> filter pushdown in the old version solely depends on the
>>> >>>>>> "table.optimizer.source.predicate" option.
>>> >>>>>> That means by default, as long as a TableSource implements the
>>> >>>>>> SupportsFilterPushDown interface, filter pushdown is allowed.
>>> >>>>>> And it seems that we don't have much benefit in changing the
>>> default
>>> >> value
>>> >>>>>> of "table.optimizer.source.predicate" to false.
>>> >>>>>>
>>> >>>>>> Regarding the priority of these two configurations, I believe that
>>> >>>>>> "table.optimizer.source.predicate"
>>> >>>>>> takes precedence over "scan.filter-push-down.enabled" and it
>>> exhibits
>>> >> the
>>> >>>>>> following behavior.
>>> >>>>>>
>>> >>>>>> 1. "table.optimizer.source.predicate" = "true" and
>>> >>>>>> "scan.filter-push-down.enabled" = "true"
>>> >>>>>> This is the default behavior, allowing filter pushdown for
>>> sources.
>>> >>>>>>
>>> >>>>>> 2. "table.optimizer.source.predicate" = "true" and
>>> >>>>>> "scan.filter-push-down.enabled" = "false"
>>> >>>>>> Allow the planner to perform predicate pushdown, but individual
>>> >> sources do
>>> >>>>>> not enable filter pushdown.
>>> >>>>>>
>>> >>>>>> 3. "table.optimizer.source.predicate" = "false"
>>> >>>>>> Predicate pushdown is not allowed for the planner.
>>> >>>>>> Regardless of the value of the "scan.filter-push-down.enabled"
>>> >>>>>> configuration, filter pushdown is disabled.
>>> >>>>>> In this scenario, the behavior remains consistent with the old
>>> >> version as
>>> >>>>>> well.
>>> >>>>>>
>>> >>>>>>
>>> >>>>>> From an implementation perspective, setting the priority of
>>> >>>>>> "scan.filter-push-down.enabled" higher than
>>> >>>>>> "table.optimizer.source.predicate" is difficult to achieve now,
>>> >>>>>> because the PushFilterIntoSourceScanRuleBase at the planner level
>>> >>>>>> takes precedence over the source-level FilterPushDownSpec.
>>> >>>>>> Only when the PushFilterIntoSourceScanRuleBase is enabled will the
>>> >>>>>> source-level filter pushdown be performed.
>>> >>>>>>
>>> >>>>>> Additionally, in my opinion, there doesn't seem to be much
>>> benefit in
>>> >>>>>> setting a higher priority for "scan.filter-push-down.enabled".
>>> >>>>>> It may instead affect compatibility and increase implementation
>>> >> complexity.
>>> >>>>>>
>>> >>>>>> WDYT?
>>> >>>>>>
>>> >>>>>> Best,
>>> >>>>>> Jiabao
>>> >>>>>>
>>> >>>>>>
>>> >>>>>>> On Oct 25, 2023, at 11:56, Benchao Li <libenc...@apache.org> wrote:
>>> >>>>>>>
>>> >>>>>>> I agree with Jane that fine-grained configurations should have
>>> higher
>>> >>>>>>> priority than job level configurations.
>>> >>>>>>>
>>> >>>>>>> For the current proposal, we can achieve that:
>>> >>>>>>> - Set "table.optimizer.source.predicate" = "true" to enable by
>>> >>>>>>> default, and set "scan.filter-push-down.enabled" = "false" to
>>> >>>>>>> disable it per table source
>>> >>>>>>> - Set "table.optimizer.source.predicate" = "false" to disable by
>>> >>>>>>> default, and set "scan.filter-push-down.enabled" = "true" to
>>> >>>>>>> enable it per table source
>>> >>>>>>>
>>> >>>>>>> On Tue, Oct 24, 2023 at 23:55, Jane Chan <qingyue....@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> I believe that the configuration
>>> "table.optimizer.source.predicate"
>>> >>>>>> has a
>>> >>>>>>>>> higher priority at the planner level than the configuration at
>>> the
>>> >>>>>> source
>>> >>>>>>>>> level,
>>> >>>>>>>>> and it seems easy to implement now.
>>> >>>>>>>>>
>>> >>>>>>>>
>>> >>>>>>>> Correct me if I'm wrong, but I think the fine-grained
>>> configuration
>>> >>>>>>>> "scan.filter-push-down.enabled" should have a higher priority
>>> >> because
>>> >>>>>> the
>>> >>>>>>>> default value of "table.optimizer.source.predicate" is true. As
>>> a
>>> >>>>>> result,
>>> >>>>>>>> turning off filter push-down for a specific source will not take
>>> >> effect
>>> >>>>>>>> unless the default value of "table.optimizer.source.predicate"
>>> is
>>> >>>>>> changed
>>> >>>>>>>> to false, or, alternatively, let users manually set
>>> >>>>>>>> "table.optimizer.source.predicate" to false first and then
>>> >> selectively
>>> >>>>>>>> enable filter push-down for the desired sources, which is less
>>> >>>>>> intuitive.
>>> >>>>>>>> WDYT?
>>> >>>>>>>>
>>> >>>>>>>> Best,
>>> >>>>>>>> Jane
>>> >>>>>>>>
>>> >>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun <
>>> jiabao....@xtransfer.cn
>>> >>>>>> .invalid>
>>> >>>>>>>> wrote:
>>> >>>>>>>>
>>> >>>>>>>>> Thanks Jane,
>>> >>>>>>>>>
>>> >>>>>>>>> I believe that the configuration
>>> "table.optimizer.source.predicate"
>>> >>>>>> has a
>>> >>>>>>>>> higher priority at the planner level than the configuration at
>>> the
>>> >>>>>> source
>>> >>>>>>>>> level,
>>> >>>>>>>>> and it seems easy to implement now.
>>> >>>>>>>>>
>>> >>>>>>>>> Best,
>>> >>>>>>>>> Jiabao
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>> On Oct 24, 2023, at 17:36, Jane Chan <qingyue....@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>> Hi Jiabao,
>>> >>>>>>>>>>
>>> >>>>>>>>>> Thanks for driving this discussion. I have a small question:
>>> >>>>>>>>>> will "scan.filter-push-down.enabled" take precedence over
>>> >>>>>>>>>> "table.optimizer.source.predicate" when the two parameters
>>> >>>>>>>>>> conflict with each other?
>>> >>>>>>>>>>
>>> >>>>>>>>>> Best,
>>> >>>>>>>>>> Jane
>>> >>>>>>>>>>
>>> >>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun <
>>> >> jiabao....@xtransfer.cn
>>> >>>>>>>>> .invalid>
>>> >>>>>>>>>> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>> Thanks Jark,
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> If we only add configuration without adding the
>>> >>>>>>>>>>> enableFilterPushDown method in the SupportsFilterPushDown
>>> >>>>>>>>>>> interface, each connector would have to handle the same logic
>>> >>>>>>>>>>> in the applyFilters method to determine whether filter
>>> >>>>>>>>>>> pushdown is needed.
>>> >>>>>>>>>>> This would increase complexity and violate the original
>>> >>>>>>>>>>> behavior of the applyFilters method.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> In contrast, with the newly added enableFilterPushDown method
>>> >>>>>>>>>>> we only need to pass the configuration parameter to decide
>>> >>>>>>>>>>> whether to perform predicate pushdown.
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> I think this approach would be clearer and simpler.
>>> >>>>>>>>>>> WDYT?
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> Best,
>>> >>>>>>>>>>> Jiabao
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>> On Oct 24, 2023, at 16:58, Jark Wu <imj...@gmail.com> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Hi Jiabao,
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> I think the current interface can already satisfy your
>>> >>>>>>>>>>>> requirements.
>>> >>>>>>>>>>>> The connector can reject all the filters by returning the
>>> >>>>>>>>>>>> input filters as `Result#remainingFilters`.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> So maybe we don't need to introduce a new method to disable
>>> >>>>>>>>>>>> pushdown, but just introduce an option for the specific
>>> >>>>>>>>>>>> connector.
>>> >>>>>>>>>>>>
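Jark's suggestion can be modeled in a few lines of Python. This is a simplified stand-in, not the Flink Java API: `Result`, `apply_filters`, and the `pushdown_enabled` flag are illustrative names, and filters are plain strings. The flag is assumed to be read from a connector option such as the proposed `scan.filter-push-down.enabled`.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Result:
    """Simplified model of the accepted/remaining split returned to the planner."""
    accepted_filters: List[str]
    remaining_filters: List[str]


def apply_filters(filters: List[str], pushdown_enabled: bool) -> Result:
    if not pushdown_enabled:
        # Reject everything: hand all input filters back as remaining,
        # so the planner keeps evaluating them after the scan.
        return Result(accepted_filters=[], remaining_filters=list(filters))
    # Simplification: assume the source can fully evaluate every filter.
    return Result(accepted_filters=list(filters), remaining_filters=[])
```

In this model, disabling pushdown for one connector needs no new interface method; the existing return value already expresses "nothing was pushed down".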
>>> >>>>>>>>>>>> Best,
>>> >>>>>>>>>>>> Jark
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu <xbjt...@gmail.com
>>> >
>>> >> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>> Thanks @Jiabao for kicking off this discussion.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Could you add a section to explain the difference between the
>>> >>>>>>>>>>>>> proposed connector-level config `scan.filter-push-down.enabled`
>>> >>>>>>>>>>>>> and the existing query-level config
>>> >>>>>>>>>>>>> `table.optimizer.source.predicate-pushdown-enabled`?
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> Best,
>>> >>>>>>>>>>>>> Leonard
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> On Oct 24, 2023, at 4:18 PM, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Hi Devs,
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> I would like to start a discussion on FLIP-377: support
>>> >>>>>>>>>>>>>> configuration to disable filter pushdown for Table/SQL
>>> >>>>>>>>>>>>>> Sources [1].
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Currently, Flink Table/SQL does not expose fine-grained
>>> >>>>>>>>>>>>>> control for users to enable or disable filter pushdown.
>>> >>>>>>>>>>>>>> However, filter pushdown has some side effects, such as
>>> >>>>>>>>>>>>>> additional computational pressure on external systems.
>>> >>>>>>>>>>>>>> Moreover, improper queries can lead to issues such as full
>>> >>>>>>>>>>>>>> table scans, which in turn can impact the stability of
>>> >>>>>>>>>>>>>> external systems.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Suppose we have an SQL query with two sources: Kafka and a
>>> >>>>>>>>>>>>>> database.
>>> >>>>>>>>>>>>>> The database is sensitive to pressure, and we want to
>>> >>>>>>>>>>>>>> configure it to not perform filter pushdown to the database
>>> >>>>>>>>>>>>>> source.
>>> >>>>>>>>>>>>>> However, we still want to perform filter pushdown to the
>>> >>>>>>>>>>>>>> Kafka source to decrease network IO.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> I propose to support configuration to disable filter
>>> >>>>>>>>>>>>>> pushdown for Table/SQL sources to let users decide whether
>>> >>>>>>>>>>>>>> to perform filter pushdown.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Looking forward to your feedback.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> [1]
>>> >>>>>>>>>>>>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Best,
>>> >>>>>>>>>>>>>> Jiabao
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> --
>>> >>>>>>>
>>> >>>>>>> Best,
>>> >>>>>>> Benchao Li
>>> >>>>>>
>>> >>>>>>
>>> >>>>
>>> >>>
>>> >>>
>>> >>> --
>>> >>>
>>> >>> Best,
>>> >>> Benchao Li
>>> >>
>>> >>
>>>
>>>
