Thanks Venkatakrishnan for the feedback.

Taking MySQL as an example, if the pushed-down filter does not hit an index, it 
will result in a full table scan. 
For a table with a large amount of data, a full table scan can consume a 
significant amount of CPU resources,
increase response time, hold connections for a long time, and impact the 
overall performance of the database.
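
To make the concern concrete, here is a rough sketch in Python, used only as a
language-neutral model; it is not Flink or connector code. The names Filter,
apply_filters, indexed_columns and pushdown_enabled are made up for
illustration. It assumes, as a simplification, that a filter can use an index
exactly when its column is in a known set of indexed columns; real index
selection in MySQL depends on more than that. The per-source switch is modeled
as handing every filter back as "remaining", which is the mechanism Jark
described with Result#remainingFilters.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Filter:
    """Hypothetical predicate of the form `column <op> value`."""
    column: str
    op: str
    value: object


def apply_filters(
    filters: List[Filter],
    indexed_columns: Set[str],
    pushdown_enabled: bool = True,
) -> Tuple[List[Filter], List[Filter]]:
    """Split filters into (accepted, remaining).

    accepted  -> would be evaluated by the external database
    remaining -> would be evaluated by Flink after the rows are read
    """
    if not pushdown_enabled:
        # Assumed per-source switch: hand every filter back untouched.
        return [], list(filters)
    # Simplifying assumption: a filter "hits an index" iff its column is indexed.
    accepted = [f for f in filters if f.column in indexed_columns]
    remaining = [f for f in filters if f.column not in indexed_columns]
    return accepted, remaining


by_id = Filter("id", "=", 42)
by_note = Filter("note", "=", "urgent")

# Index-aware split: only the filter on the indexed column is accepted.
print(apply_filters([by_id, by_note], indexed_columns={"id"}))

# Switch off: nothing is pushed down, every filter stays on the Flink side.
print(apply_filters([by_id, by_note], indexed_columns={"id"}, pushdown_enabled=False))
```

In this model a filter on a non-indexed column such as "note" is the case that
would otherwise end up as a full table scan on the database side.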

Best,
Jiabao


> On Oct 28, 2023, at 13:34, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
> 
> Thanks for the proposal, Jiabao.
> 
> I agree with Becket: if a *Source* implements a *SupportsXXXPushDown*
> interface (in this case *SupportsFilterPushDown*), then the *Source* (a
> database, in your FLIP example) is designed to support filter pushdown.
> The corresponding Source can have mechanisms built into it to detect cases
> where applying the filter pushdown adds computation pressure that can
> affect the stability of the system and, if so, disable it.
> 
> Could you please elaborate on the use cases where users know upfront (but
> the source cannot detect) that, for a specific job or SQL query,
> *applyFilters* could negatively affect the overall performance of the
> query or the external system, or any other use cases where *XXXPushDown*
> has to be selectively disabled for specific sources?
> 
> Regards
> Venkata krishnan
> 
> 
> On Fri, Oct 27, 2023 at 2:48 AM Jark Wu <imj...@gmail.com> wrote:
> 
>> Hi Becket,
>> 
>> I checked the history of
>> "*table.optimizer.source.predicate-pushdown-enabled*". It seems it was
>> introduced with the legacy FilterableTableSource interface, which might
>> have been an experimental feature at that time. I don't see the necessity
>> of this option at the moment. Maybe we can deprecate this option and drop
>> it in Flink 2.0[1] if it is not necessary anymore. This may help to
>> simplify this discussion.
>> 
>> 
>> Best,
>> Jark
>> 
>> [1]:
>> https://urldefense.com/v3/__https://issues.apache.org/jira/browse/FLINK-32383__;!!IKRxdwAv5BmarQ!dc-Q4Kn9OWLkpDKBZwATS0hujC6KJShXBh_sk3-W2giD8vNbfm3UdHq4mAhiXw5ITHkQSl5HYkzkCw$
>> 
>> 
>> 
>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <becket....@gmail.com> wrote:
>> 
>>> Thanks for the proposal, Jiabao. My two cents below:
>>> 
>>> 1. If I understand correctly, the motivation of the FLIP is mainly to make
>>> predicate pushdown optional on SOME of the Sources. If so, intuitively the
>>> configuration should be Source specific instead of general. Otherwise, we
>>> will end up with general configurations that may not take effect for some
>>> of the Source implementations. This violates the basic rule of a
>>> configuration - it does what it says, regardless of the implementation.
>>> While configuration standardization is usually a good thing, it should not
>>> break the basic rules.
>>> If we really want to have this general configuration, the sources to which
>>> this configuration does not apply should throw an exception to make it
>>> clear that this configuration is not supported. However, that seems ugly.
>>> 
>>> 2. I think the actual motivation of this FLIP is about "how a source
>>> should implement predicate pushdown efficiently", not "whether predicate
>>> pushdown should be applied to the source." For example, if a source wants
>>> to avoid additional computing load in the external system, it can always
>>> read the entire record and apply the predicates by itself. However, from
>>> the Flink perspective, the predicate pushdown is applied, it is just
>>> implemented differently by the source. So the design principle here is that
>>> Flink only cares about whether a source supports predicate pushdown or not;
>>> it does not care about the implementation efficiency / side effects of the
>>> predicate pushdown. It is the Source implementation's responsibility to
>>> ensure the predicate pushdown is implemented efficiently and does not
>>> impose excessive pressure on the external system. And it is OK to have
>>> additional configurations to achieve this goal. Obviously, such
>>> configurations will be source specific in this case.
>>> 
>>> 3. Regarding the existing configuration
>>> *table.optimizer.source.predicate-pushdown-enabled*, I am not sure why we
>>> need it. Supposedly, if a source implements a SupportsXXXPushDown
>>> interface, the optimizer should push the corresponding predicates to the
>>> Source. I am not sure in which case this configuration would be used. Any
>>> ideas @Jark Wu <imj...@gmail.com>?
>>> 
>>> Thanks,
>>> 
>>> Jiangjie (Becket) Qin
>>> 
>>> 
>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>> 
>>>> Thanks Jane for the detailed explanation.
>>>> 
>>>> I think that for users, we should respect conventions over
>>>> configurations.
>>>> Conventions can be default values explicitly specified in
>>>> configurations, or they can be behaviors that follow previous versions.
>>>> If the same code has different behaviors in different versions, it would
>>>> be a very bad thing.
>>>> 
>>>> I agree that for regular users, it is not necessary to understand all
>>>> the configurations related to Flink.
>>>> By following conventions, they can have a good experience.
>>>> 
>>>> Let's get back to the practical situation and consider it.
>>>> 
>>>> Case 1:
>>>> The user is not familiar with the purpose of the
>>>> table.optimizer.source.predicate-pushdown-enabled configuration but
>>>> follows the convention of allowing predicate pushdown to the source by
>>>> default.
>>>> Just understanding the source.predicate-pushdown-enabled configuration
>>>> and performing fine-grained toggle control will work well.
>>>> 
>>>> Case 2:
>>>> The user understands the meaning of the
>>>> table.optimizer.source.predicate-pushdown-enabled configuration and has
>>>> set its value to false.
>>>> We have reason to believe that the user understands the meaning of the
>>>> predicate pushdown configuration and the intention is to disable
>>>> predicate pushdown (rather than whether or not to allow it).
>>>> The previous choice of globally disabling it is likely because it
>>>> couldn't be disabled on individual sources.
>>>> From this perspective, if we provide more fine-grained configuration
>>>> support and provide detailed explanations of the configuration
>>>> behaviors in the documentation,
>>>> users can clearly understand the differences between these two
>>>> configurations and use them correctly.
>>>> 
>>>> Also, I don't agree that
>>>> table.optimizer.source.predicate-pushdown-enabled = true and
>>>> source.predicate-pushdown-enabled = false means that the local
>>>> configuration overrides the global configuration.
>>>> On the contrary, both configurations are functioning correctly.
>>>> The optimizer allows predicate pushdown to all sources, but some sources
>>>> can reject the filters pushed down by the optimizer.
>>>> This is natural, just like different components at different levels are
>>>> responsible for different tasks.
>>>> 
>>>> The more serious issue is that if "source.predicate-pushdown-enabled"
>>>> does not respect "table.optimizer.source.predicate-pushdown-enabled",
>>>> the "table.optimizer.source.predicate-pushdown-enabled" configuration
>>>> will be invalidated.
>>>> This means that regardless of whether
>>>> "table.optimizer.source.predicate-pushdown-enabled" is set to true or
>>>> false, it will have no effect.
>>>> 
>>>> Best,
>>>> Jiabao
>>>> 
>>>> 
>>>>> On Oct 25, 2023, at 22:24, Jane Chan <qingyue....@gmail.com> wrote:
>>>>> 
>>>>> Hi Jiabao,
>>>>> 
>>>>> Thanks for the in-depth clarification. Here are my two cents.
>>>>> 
>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and
>>>>>> "scan.filter-push-down.enabled" are configurations for different
>>>>>> components (optimizer and source operator).
>>>>>> 
>>>>> 
>>>>> We cannot assume that every user would be interested in understanding the
>>>>> internal components of Flink, such as the optimizer or connectors, and the
>>>>> specific configurations associated with each component. Instead, users
>>>>> might be more concerned about knowing which configuration enables or
>>>>> disables the filter push-down feature for all source connectors, and which
>>>>> parameter provides the flexibility to override this behavior for a single
>>>>> source if needed.
>>>>> 
>>>>> So, from this perspective, I am inclined to divide these two parameters
>>>>> based on the scope of their impact from the user's perspective (i.e.
>>>>> global-level or operator-level), rather than categorizing them based on the
>>>>> component hierarchy from a developer's point of view. Therefore, based on
>>>>> this premise, it is intuitive and natural for users to understand that
>>>>> fine-grained configuration options can override global configurations.
>>>>> 
>>>>>> Additionally, if "scan.filter-push-down.enabled" does not respect
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the default value
>>>>>> of "scan.filter-push-down.enabled" is defined as true,
>>>>>> it means that just setting
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false will have no
>>>>>> effect, and filter pushdown will still be performed.
>>>>>> 
>>>>>> If we define the default value of "scan.filter-push-down.enabled" as
>>>>>> false, it would introduce a difference in behavior compared to the
>>>>>> previous version.
>>>>>> 
>>>>> 
>>>>> <1>If I understand correctly, "scan.filter-push-down.enabled" is a
>>>>> connector option, which means the only way to configure it is to explicitly
>>>>> specify it in DDL (whether to disable or enable it), and the SET
>>>>> command is not applicable, so I think it's natural to still respect the
>>>>> user's specification here. Otherwise, users might be more confused about
>>>>> why the DDL does not work as expected, when the reason is just that some
>>>>> other "optimizer" configuration is set to a different value.
>>>>> 
>>>>> <2> From the implementation side, I am inclined to keep the
>>>>> parameter's priority consistent for all conditions.
>>>>> 
>>>>> Let "global" denote "table.optimizer.source.predicate-pushdown-enabled",
>>>>> and let "per-source" denote "scan.filter-push-down.enabled" for a specific
>>>>> source T. The following truth table (based on the current design)
>>>>> shows the inconsistent behavior of "per-source override global".
>>>>> 
>>>>> .--------.------------.-----------------.----------------------------.
>>>>> | global | per-source | push-down for T | per-source override global |
>>>>> :--------+------------+-----------------+----------------------------:
>>>>> | true   | false      | false           | Y                          |
>>>>> :--------+------------+-----------------+----------------------------:
>>>>> | false  | true       | false           | N                          |
>>>>> .--------.------------.-----------------.----------------------------.
>>>>> 
>>>>> Best,
>>>>> Jane
>>>>> 
>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun
>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>> 
>>>>>> Thanks Benchao for the feedback.
>>>>>> 
>>>>>> I understand that the configuration of global parallelism and task
>>>>>> parallelism is at different granularities but with the same
>>>>>> configuration.
>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and
>>>>>> "scan.filter-push-down.enabled" are configurations for different
>>>>>> components (optimizer and source operator).
>>>>>> 
>>>>>> From a user's perspective, there are two scenarios:
>>>>>> 
>>>>>> 1. Disabling all filter pushdown
>>>>>> In this case, setting
>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>> to false is sufficient to meet the requirement.
>>>>>> 
>>>>>> 2. Disabling filter pushdown for specific sources
>>>>>> In this scenario, there is no need to adjust the value of
>>>>>> "table.optimizer.source.predicate-pushdown-enabled".
>>>>>> Instead, the focus should be on the configuration of
>>>>>> "scan.filter-push-down.enabled" to meet the requirement.
>>>>>> In this case, users do not need to set
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false and
>>>>>> manually enable filter pushdown for specific sources.
>>>>>> 
>>>>>> Additionally, if "scan.filter-push-down.enabled" does not respect
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the default value
>>>>>> of "scan.filter-push-down.enabled" is defined as true,
>>>>>> it means that just setting
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false will have no
>>>>>> effect, and filter pushdown will still be performed.
>>>>>> 
>>>>>> If we define the default value of "scan.filter-push-down.enabled" as
>>>>>> false, it would introduce a difference in behavior compared to the
>>>>>> previous version.
>>>>>> The same SQL query that could successfully push down filters in the old
>>>>>> version would no longer do so after the upgrade.
>>>>>> 
>>>>>> Best,
>>>>>> Jiabao
>>>>>> 
>>>>>> 
>>>>>>> On Oct 25, 2023, at 17:10, Benchao Li <libenc...@apache.org> wrote:
>>>>>>> 
>>>>>>> Thanks Jiabao for the detailed explanations, that helps a lot, I
>>>>>>> understand your rationale now.
>>>>>>> 
>>>>>>> Correct me if I'm wrong. Your perspective is that of a "developer", which
>>>>>>> means there is an optimizer and a connector component, and if we want to
>>>>>>> enable this feature (pushing filters down into connectors), we must
>>>>>>> enable it first in the optimizer, and only then does the connector have
>>>>>>> the chance to decide whether to use it or not.
>>>>>>> 
>>>>>>> My perspective is that of a "user" (why should a user care about the
>>>>>>> difference between the optimizer and the connector?): this is a feature,
>>>>>>> and there are two ways to control it. One way is to configure it at the
>>>>>>> job level; the other is in table properties. What a user expects is that
>>>>>>> they can control a feature in a tiered way: set it per job, and then
>>>>>>> fine-tune it per table.
>>>>>>> 
>>>>>>> This is similar to other concepts, such as parallelism:
>>>>>>> users can set a job-level default parallelism, and then fine-tune
>>>>>>> it per operator. There may be more such debates in the future,
>>>>>>> e.g., we can have a job-level config about adding key-by before lookup
>>>>>>> join, and also a hint/table property way to control it at a fine-grained
>>>>>>> level per lookup operator. Hence we'd better find a unified way for all
>>>>>>> features of this kind.
>>>>>>> 
>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote on Wed, Oct 25, 2023
>>>>>>> at 15:27:
>>>>>>>> 
>>>>>>>> Thanks Jane for further explanation.
>>>>>>>> 
>>>>>>>> These two configurations correspond to different levels.
>>>>>>>> "scan.filter-push-down.enabled" does not make
>>>>>>>> "table.optimizer.source.predicate" invalid.
>>>>>>>> The planner will still push down predicates to all sources.
>>>>>>>> Whether filter pushdown is allowed or not is determined by the specific
>>>>>>>> source's "scan.filter-push-down.enabled" configuration.
>>>>>>>> 
>>>>>>>> However, "table.optimizer.source.predicate" does directly affect
>>>>>>>> "scan.filter-push-down.enabled".
>>>>>>>> When the planner disables predicate pushdown, the source-level filter
>>>>>>>> pushdown will also not be executed, even if the source allows filter
>>>>>>>> pushdown.
>>>>>>>> 
>>>>>>>> In any case, on points 1 and 2 our expectations are consistent.
>>>>>>>> For the 3rd point, I still think that the planner-level configuration
>>>>>>>> takes precedence over the source-level configuration.
>>>>>>>> It may seem counterintuitive when we globally disable predicate
>>>>>>>> pushdown but allow filter pushdown at the source level.
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Jiabao
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On Oct 25, 2023, at 14:35, Jane Chan <qingyue....@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> Hi Jiabao,
>>>>>>>>> 
>>>>>>>>> Thanks for clarifying this. By "scan.filter-push-down.enabled takes a
>>>>>>>>> higher priority" I meant that this value should be respected whenever
>>>>>>>>> it is set explicitly.
>>>>>>>>> 
>>>>>>>>> The conclusion that
>>>>>>>>> 
>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>> Allow the planner to perform predicate pushdown, but individual
>>>>>>>>>> sources do not enable filter pushdown.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> This indicates that the option "scan.filter-push-down.enabled = false"
>>>>>>>>> for an individual source connector does indeed override the
>>>>>>>>> global-level planner settings to make a difference, and thus "has a
>>>>>>>>> higher priority".
>>>>>>>>> 
>>>>>>>>> While for
>>>>>>>>> 
>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
>>>>>>>>>> Predicate pushdown is not allowed for the planner.
>>>>>>>>>> Regardless of the value of the "scan.filter-push-down.enabled"
>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>> In this scenario, the behavior remains consistent with the old
>>>>>>>>>> version as well.
>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> I still think "scan.filter-push-down.enabled" should also be respected
>>>>>>>>> if it is enabled for individual connectors. WDYT?
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Jane
>>>>>>>>> 
>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun
>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>> 
>>>>>>>>>> For the current proposal, we recommend keeping the default value of
>>>>>>>>>> "table.optimizer.source.predicate" as true,
>>>>>>>>>> and setting the default value of the newly introduced option
>>>>>>>>>> "scan.filter-push-down.enabled" to true as well.
>>>>>>>>>> 
>>>>>>>>>> The main purpose of doing this is to maintain consistency with previous
>>>>>>>>>> versions, as whether to perform
>>>>>>>>>> filter pushdown in the old version solely depends on the
>>>>>>>>>> "table.optimizer.source.predicate" option.
>>>>>>>>>> That means by default, as long as a TableSource implements the
>>>>>>>>>> SupportsFilterPushDown interface, filter pushdown is allowed.
>>>>>>>>>> And it seems that we don't have much benefit in changing the default
>>>>>>>>>> value of "table.optimizer.source.predicate" to false.
>>>>>>>>>> 
>>>>>>>>>> Regarding the priority of these two configurations, I believe that
>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>> takes precedence over "scan.filter-push-down.enabled" and it exhibits
>>>>>>>>>> the following behavior.
>>>>>>>>>> 
>>>>>>>>>> 1. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
>>>>>>>>>> This is the default behavior, allowing filter pushdown for sources.
>>>>>>>>>> 
>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>> Allow the planner to perform predicate pushdown, but individual
>>>>>>>>>> sources do not enable filter pushdown.
>>>>>>>>>> 
>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
>>>>>>>>>> Predicate pushdown is not allowed for the planner.
>>>>>>>>>> Regardless of the value of the "scan.filter-push-down.enabled"
>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>> In this scenario, the behavior remains consistent with the old
>>>>>>>>>> version as well.
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> From an implementation perspective, setting the priority of
>>>>>>>>>> "scan.filter-push-down.enabled" higher than
>>>>>>>>>> "table.optimizer.source.predicate" is difficult to achieve now,
>>>>>>>>>> because the PushFilterIntoSourceScanRuleBase at the planner level takes
>>>>>>>>>> precedence over the source-level FilterPushDownSpec.
>>>>>>>>>> Only when the PushFilterIntoSourceScanRuleBase is enabled will the
>>>>>>>>>> source-level filter pushdown be performed.
>>>>>>>>>> 
>>>>>>>>>> Additionally, in my opinion, there doesn't seem to be much benefit in
>>>>>>>>>> setting a higher priority for "scan.filter-push-down.enabled".
>>>>>>>>>> It may instead affect compatibility and increase implementation
>>>>>>>>>> complexity.
>>>>>>>>>> 
>>>>>>>>>> WDYT?
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Jiabao
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> On Oct 25, 2023, at 11:56, Benchao Li <libenc...@apache.org> wrote:
>>>>>>>>>>> 
>>>>>>>>>>> I agree with Jane that fine-grained configurations should have higher
>>>>>>>>>>> priority than job level configurations.
>>>>>>>>>>> 
>>>>>>>>>>> For the current proposal, we can achieve that:
>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "true" to enable by
>>>>>>>>>>> default, and set "scan.filter-push-down.enabled" = "false" to disable
>>>>>>>>>>> it per table source
>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "false" to disable by
>>>>>>>>>>> default, and set "scan.filter-push-down.enabled" = "true" to enable
>>>>>>>>>>> it per table source
>>>>>>>>>>> 
>>>>>>>>>>> Jane Chan <qingyue....@gmail.com> wrote on Tue, Oct 24, 2023 at 23:55:
>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I believe that the configuration "table.optimizer.source.predicate"
>>>>>>>>>>>>> has a higher priority at the planner level than the configuration at
>>>>>>>>>>>>> the source level,
>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> Correct me if I'm wrong, but I think the fine-grained configuration
>>>>>>>>>>>> "scan.filter-push-down.enabled" should have a higher priority because
>>>>>>>>>>>> the default value of "table.optimizer.source.predicate" is true. As a
>>>>>>>>>>>> result, turning off filter push-down for a specific source will not
>>>>>>>>>>>> take effect unless the default value of
>>>>>>>>>>>> "table.optimizer.source.predicate" is changed to false, or,
>>>>>>>>>>>> alternatively, users manually set "table.optimizer.source.predicate"
>>>>>>>>>>>> to false first and then selectively enable filter push-down for the
>>>>>>>>>>>> desired sources, which is less intuitive.
>>>>>>>>>>>> WDYT?
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jane
>>>>>>>>>>>> 
>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun
>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks Jane,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> I believe that the configuration "table.optimizer.source.predicate"
>>>>>>>>>>>>> has a higher priority at the planner level than the configuration at
>>>>>>>>>>>>> the source level,
>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Oct 24, 2023, at 17:36, Jane Chan <qingyue....@gmail.com> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks for driving this discussion. I have a small question: will
>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take precedence over
>>>>>>>>>>>>>> "table.optimizer.source.predicate" when the two parameters
>>>>>>>>>>>>>> conflict with each other?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun
>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks Jark,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> If we only add configuration without adding the enableFilterPushDown
>>>>>>>>>>>>>>> method in the SupportsFilterPushDown interface,
>>>>>>>>>>>>>>> each connector would have to handle the same logic in the
>>>>>>>>>>>>>>> applyFilters method to determine whether filter pushdown is needed.
>>>>>>>>>>>>>>> This would increase complexity and violate the original behavior of
>>>>>>>>>>>>>>> the applyFilters method.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Instead, we only need to pass the configuration parameter in
>>>>>>>>>>>>>>> the newly added enableFilterPushDown method
>>>>>>>>>>>>>>> to decide whether to perform predicate pushdown.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I think this approach would be clearer and simpler.
>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Oct 24, 2023, at 16:58, Jark Wu <imj...@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> I think the current interface can already satisfy your requirements.
>>>>>>>>>>>>>>>> The connector can reject all the filters by returning the input
>>>>>>>>>>>>>>>> filters as `Result#remainingFilters`.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> So maybe we don't need to introduce a new method to disable
>>>>>>>>>>>>>>>> pushdown, but just introduce an option for the specific connector.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu <xbjt...@gmail.com> wrote:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this discussion.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Could you add a section to explain the difference between the proposed
>>>>>>>>>>>>>>>>> connector-level config `scan.filter-push-down.enabled` and the existing
>>>>>>>>>>>>>>>>> query-level config `table.optimizer.source.predicate-pushdown-enabled`?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Leonard
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Oct 24, 2023, at 4:18 PM, Jiabao Sun
>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.INVALID> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hi Devs,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I would like to start a discussion on FLIP-377: support configuration
>>>>>>>>>>>>>>>>>> to disable filter pushdown for Table/SQL Sources[1].
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not expose fine-grained control for
>>>>>>>>>>>>>>>>>> users to enable or disable filter pushdown.
>>>>>>>>>>>>>>>>>> However, filter pushdown has some side effects, such as additional
>>>>>>>>>>>>>>>>>> computational pressure on external systems.
>>>>>>>>>>>>>>>>>> Moreover, improper queries can lead to issues such as full table
>>>>>>>>>>>>>>>>>> scans, which in turn can impact the stability of external systems.
>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with two sources: Kafka and a database.
>>>>>>>>>>>>>>>>>> The database is sensitive to pressure, and we want to configure it to
>>>>>>>>>>>>>>>>>> not perform filter pushdown to the database source.
>>>>>>>>>>>>>>>>>> However, we still want to perform filter pushdown to the Kafka source
>>>>>>>>>>>>>>>>>> to decrease network IO.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> I propose to support configuration to disable filter pushdown for
>>>>>>>>>>>>>>>>>> Table/SQL sources to let users decide whether to perform filter
>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768__;!!IKRxdwAv5BmarQ!dc-Q4Kn9OWLkpDKBZwATS0hujC6KJShXBh_sk3-W2giD8vNbfm3UdHq4mAhiXw5ITHkQSl4D3HTulQ$
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> --
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Benchao Li
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> 
>>>>>>> Best,
>>>>>>> Benchao Li
