Hi Becket,

The FLIP document[1] has been updated.
Could you help take a look again?

Thanks,
Jiabao

[1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768


> 2023年12月18日 16:53,Becket Qin <becket....@gmail.com> 写道:
> 
> Yes, that sounds reasonable to me. We can start with ALWAYS and NEVER, and
> add more policies as needed.
> 
> Thanks,
> 
> Jiangjie (Becket) Qin
> 
> On Mon, Dec 18, 2023 at 4:48 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
> wrote:
> 
>> Thanks Bucket,
>> 
>> The jdbc.filter.handling.policy is good to me as it provides sufficient
>> extensibility for future filter pushdown optimizations.
>> However, currently, we don't have an implementation for the AUTO mode, and
>> it seems that the AUTO mode can easily be confused with the ALWAYS mode
>> because users don't have the opportunity to MANUALLY decide which filters
>> to push down.
>> 
>> I suggest that we only introduce the ALWAYS and NEVER modes for now, and
>> we can consider introducing more flexible policies in the future,
>> such as INDEX_ONLY, NUMBERIC_ONLY and so on.
>> 
>> WDYT?
>> 
>> Best,
>> Jiabao
>> 
>> 
>> 
>>> 2023年12月18日 16:27,Becket Qin <becket....@gmail.com> 写道:
>>> 
>>> Hi Jiabao,
>>> 
>>> Please see the reply inline.
>>> 
>>> 
>>>> The MySQL connector is currently in the flink-connector-jdbc repository
>>>> and is not a standalone connector.
>>>> Is it too unique to use "mysql" as the configuration option prefix?
>>> 
>>> If the intended behavior makes sense to all the supported JDBC drivers,
>> we
>>> can make this a JDBC connector configuration.
>>> 
>>> Also, I would like to ask about the difference in behavior between AUTO
>> and
>>>> ALWAYS.
>>>> It seems that we cannot guarantee the pushing down of all filters to the
>>>> external system under the ALWAYS
>>>> mode because not all filters in Flink SQL are supported by the external
>>>> system.
>>>> Should we throw an error when encountering a filter that cannot be
>> pushed
>>>> down in the ALWAYS mode?
>>> 
>>> The idea of AUTO is to do efficiency-aware pushdowns. The source will
>> query
>>> the external system (MySQL, Oracle, SQL Server, etc) first to retrieve
>> the
>>> information of the table. With that information, the source will decide
>>> whether to further push a filter to the external system based on the
>>> efficiency. E.g. only push the indexed fields. In contrast, ALWAYS will
>>> just always push the supported filters to the external system, regardless
>>> of the efficiency. In case there are filters that are not supported,
>>> according to the current contract of SupportsFilterPushdown, these
>>> unsupported filters should just be returned by the
>>> *SupportsFilterPushdown.applyFilters()* method as remaining filters.
>>> Therefore, there is no need to throw exceptions here. This is likely the
>>> desired behavior for most users, IMO. If there are cases that users
>> really
>>> want to get alerted when a filter cannot be pushed to the external
>> system,
>>> we can add another value like "ENFORCED_ALWAYS", which behaves like
>> ALWAYS,
>>> but throws exceptions when a filter cannot be applied to the external
>>> system. But personally I don't see much value in doing this.
>>> 
>>> Thanks,
>>> 
>>> Jiangjie (Becket) Qin
>>> 
>>> 
>>> 
>>> On Mon, Dec 18, 2023 at 3:54 PM Jiabao Sun <jiabao....@xtransfer.cn
>> .invalid>
>>> wrote:
>>> 
>>>> Hi Becket,
>>>> 
>>>> The MySQL connector is currently in the flink-connector-jdbc repository
>>>> and is not a standalone connector.
>>>> Is it too unique to use "mysql" as the configuration option prefix?
>>>> 
>>>> Also, I would like to ask about the difference in behavior between AUTO
>>>> and ALWAYS.
>>>> It seems that we cannot guarantee the pushing down of all filters to the
>>>> external system under the ALWAYS
>>>> mode because not all filters in Flink SQL are supported by the external
>>>> system.
>>>> Should we throw an error when encountering a filter that cannot be
>> pushed
>>>> down in the ALWAYS mode?
>>>> 
>>>> Thanks,
>>>> Jiabao
>>>> 
>>>>> 2023年12月18日 15:34,Becket Qin <becket....@gmail.com> 写道:
>>>>> 
>>>>> Hi JIabao,
>>>>> 
>>>>> Thanks for updating the FLIP. Maybe I did not explain it clearly
>> enough.
>>>> My
>>>>> point is that given there are various good flavors of behaviors
>> handling
>>>>> filters pushed down, we should not have a common config of
>>>>> "ignore.filter.pushdown", because the behavior is not *common*.
>>>>> 
>>>>> It looks like the original motivation of this FLIP is just for MySql.
>>>> Let's
>>>>> focus on what is the best solution for MySql connector here first.
>> After
>>>>> that, if people think the best behavior for MySql happens to be a
>> common
>>>>> one, we can then discuss whether that is worth being added to the base
>>>>> implementation of source. For MySQL, if we are going to introduce a
>>>> config
>>>>> to MySql, why not have something like "mysql.filter.handling.policy"
>> with
>>>>> value of AUTO / NEVER / ALWAYS? Isn't that better than
>>>>> "ignore.filter.pushdown"?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Jiangjie (Becket) Qin
>>>>> 
>>>>> 
>>>>> 
>>>>> On Sun, Dec 17, 2023 at 11:30 PM Jiabao Sun <jiabao....@xtransfer.cn
>>>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi Becket,
>>>>>> 
>>>>>> The FLIP document has been updated as well.
>>>>>> Please take a look when you have time.
>>>>>> 
>>>>>> Thanks,
>>>>>> Jiabao
>>>>>> 
>>>>>> 
>>>>>>> 2023年12月17日 22:54,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道:
>>>>>>> 
>>>>>>> Thanks Becket,
>>>>>>> 
>>>>>>> I apologize for not being able to continue with this proposal due to
>>>>>> being too busy during this period.
>>>>>>> 
>>>>>>> The viewpoints you shared about the design of Flink Source make sense
>>>> to
>>>>>> me
>>>>>>> The native configuration ‘ignore.filter.pushdown’ is good to me.
>>>>>>> Having a unified name or naming style can indeed prevent confusion
>> for
>>>>>> users regarding
>>>>>>> the inconsistent naming of this configuration across different
>>>>>> connectors.
>>>>>>> 
>>>>>>> Currently, there are not many external connectors that support filter
>>>>>> pushdown.
>>>>>>> I propose that we first introduce it in flink-connector-jdbc and
>>>>>> flink-connector-mongodb.
>>>>>>> Do you think this is feasible?
>>>>>>> 
>>>>>>> Best,
>>>>>>> Jiabao
>>>>>>> 
>>>>>>> 
>>>>>>>> 2023年11月16日 17:45,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>> 
>>>>>>>> Hi Jiabao,
>>>>>>>> 
>>>>>>>> Arguments like "because Spark has it so Flink should also have it"
>>>> does
>>>>>> not
>>>>>>>> make sense. Different projects have different API flavors and
>> styles.
>>>>>> What
>>>>>>>> is really important is the rationale and the design principle behind
>>>> the
>>>>>>>> API. They should conform to the convention of the project.
>>>>>>>> 
>>>>>>>> First of all, Spark Source API itself has a few issues and they
>> ended
>>>> up
>>>>>>>> introduce DataSource V2 in Spark 3.0, which added the decorative
>>>>>> interfaces
>>>>>>>> like SupportsPushdownXXX. Some of the configurations predating
>>>>>> DataSource
>>>>>>>> V2 may still be there.
>>>>>>>> 
>>>>>>>> For the Spark configurations you mentioned, they are all the
>>>>>> configurations
>>>>>>>> for FileScanBuilder, which is equivalent to FileSource in Flink.
>>>>>> Currently,
>>>>>>>> regardless of the format (ORC, Parquet, Avro, etc), the FileSource
>>>>>> pushes
>>>>>>>> back all the filters to ensure correctness. The actual filters that
>>>> got
>>>>>>>> applied to the specific format might still be different. This
>>>>>>>> implementation is the same in FileScanBuilder.pushFilters() for
>>>> Spark. I
>>>>>>>> don't know why Spark got separate configurations for each format.
>>>> Maybe
>>>>>> it
>>>>>>>> is because the filters are actually implemented differently for
>>>>>> different
>>>>>>>> format.
>>>>>>>> 
>>>>>>>> At least for the current implementation in FileScanBuilder, these
>>>>>>>> configurations can be merged to one configuration like
>>>>>>>> `apply.filters.to.format.enabled`. Note that this config, as well as
>>>> the
>>>>>>>> separate configs you mentioned, are just visible and used by the
>>>>>>>> FileScanBuilder. It determines whether the filters should be passed
>>>>>> down to
>>>>>>>> the format of the FileScanBuilder instance. Regardless of the value
>> of
>>>>>>>> these configs, FileScanBuilder.pushFilters() will always be called,
>>>> and
>>>>>>>> FileScanBuilder (as well as FileSource in Flink) will always push
>> back
>>>>>> all
>>>>>>>> the filters to the framework.
>>>>>>>> 
>>>>>>>> A MySql source can have a very different way to handle this. For
>>>>>> example, A
>>>>>>>> MySql source  A config in this case might be "my.apply.filters" with
>>>>>> three
>>>>>>>> different values:
>>>>>>>> - AUTO: The Source will issue a DESC Table request to understand
>>>>>> whether a
>>>>>>>> filter can be applied efficiently. And decide which filters can be
>>>>>> applied
>>>>>>>> and which cannot based on that.
>>>>>>>> - NEVER: Never apply filtering. It will always do a full table read
>>>> and
>>>>>>>> let Flink do the filtering.
>>>>>>>> - ALWAYS: Always apply the filtering to the MySql server.
>>>>>>>> 
>>>>>>>> In the above examples of FileSource and MySql Source, I don't think
>> it
>>>>>> is a
>>>>>>>> good idea to shoehorn the behaviors into a naive config of
>>>>>>>> `ignore.filter.pushdown`. That is why I don't think this is a common
>>>>>> config.
>>>>>>>> 
>>>>>>>> To recap, like I said, I do agree that in some cases, we may want to
>>>>>> behave
>>>>>>>> differently when filters are pushed down to the sources, even if a
>>>>>> source
>>>>>>>> implements SupportsFilterPushDown, but I don't think there is a
>>>> suitable
>>>>>>>> common config for this. The behavior is very likely source specific.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Nov 16, 2023 at 3:41 PM Jiabao Sun <jiabao....@xtransfer.cn
>>>>>> .invalid>
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Thanks Becket,
>>>>>>>>> 
>>>>>>>>> I still believe that adding a configuration at the source level to
>>>>>> disable
>>>>>>>>> filter pushdown is needed. This demand exists in spark as well[1].
>>>>>>>>> 
>>>>>>>>> In Spark, most sources that support filter pushdown provide their
>> own
>>>>>>>>> corresponding configuration options to enable or disable filter
>>>>>> pushdown.
>>>>>>>>> For PRs[2-4] that support filter pushdown capability, they also
>>>> provide
>>>>>>>>> configuration options to disable this capability.
>>>>>>>>> 
>>>>>>>>> I believe this configuration is applicable to most scenarios, and
>>>>>> there is
>>>>>>>>> no need to dwell on why this configuration option was not
>> introduced
>>>>>>>>> earlier than the SupportsFilterPushDown interface.
>>>>>>>>> 
>>>>>>>>> spark.sql.parquet.filterPushdown
>>>>>>>>> spark.sql.orc.filterPushdown
>>>>>>>>> spark.sql.csv.filterPushdown.enabled
>>>>>>>>> spark.sql.json.filterPushdown.enabled
>>>>>>>>> spark.sql.avro.filterPushdown.enabled
>>>>>>>>> JDBC Option: pushDownPredicate
>>>>>>>>> 
>>>>>>>>> We can see that the lack of consistency is caused by each connector
>>>>>>>>> introducing different configuration options for the same behavior.
>>>>>>>>> This is one of the motivations for advocating the introduction of a
>>>>>>>>> unified configuration name.
>>>>>>>>> 
>>>>>>>>> [1] https://issues.apache.org/jira/browse/SPARK-24288
>>>>>>>>> [2] https://github.com/apache/spark/pull/27366
>>>>>>>>> [3]https://github.com/apache/spark/pull/26973
>>>>>>>>> [4] https://github.com/apache/spark/pull/29145
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Jiabao
>>>>>>>>> 
>>>>>>>>>> 2023年11月16日 08:10,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>> 
>>>>>>>>>> Hi Jiabao,
>>>>>>>>>> 
>>>>>>>>>> While we can always fix the formality of the config, a more
>>>>>> fundamental
>>>>>>>>>> issue here is whether this configuration is common enough.
>>>> Personally
>>>>>> I
>>>>>>>>> am
>>>>>>>>>> still not convinced it is.
>>>>>>>>>> 
>>>>>>>>>> Remember we don't have a common implementation for
>>>>>> SupportsFilterPushdown
>>>>>>>>>> itself. Why does a potential behavior of the
>>>>>>>>>> SupportsFilterPushdown.applyFilters() method deserve a common
>>>>>>>>>> configuration? A common implementation should always come first,
>>>> then
>>>>>> its
>>>>>>>>>> configuration becomes a common configuration as a natural result.
>>>> But
>>>>>>>>> here
>>>>>>>>>> we are trying to add an impl to a configuration just to fix its
>>>>>>>>> formality.
>>>>>>>>>> 
>>>>>>>>>> I agree that there might be a few Source implementations that may
>>>>>> want to
>>>>>>>>>> avoid additional burdens on the remote system in some
>> circumstances.
>>>>>> And
>>>>>>>>>> these circumstances are very specific:
>>>>>>>>>> 1. The source talks to a remote service that can help perform the
>>>>>> actual
>>>>>>>>>> filtering.
>>>>>>>>>> 2. The filtering done by the remote service is inefficient for
>> some
>>>>>>>>> reason
>>>>>>>>>> (e.g. missing index)
>>>>>>>>>> 3. The external service does not want to perform the inefficient
>>>>>>>>> filtering
>>>>>>>>>> for some reason (e.g. it is a shared service with others)
>>>>>>>>>> 
>>>>>>>>>> There are multiple approaches to address the issue. Pushing back
>> the
>>>>>>>>>> filters is just one way of achieving this. So here we are talking
>>>>>> about a
>>>>>>>>>> config for one of the possible solutions to a scenario with all
>> the
>>>>>> above
>>>>>>>>>> situations. I don't think there is enough justification for the
>>>>>> config to
>>>>>>>>>> be common.
>>>>>>>>>> 
>>>>>>>>>> There is always this trade-off between the proliferation of public
>>>>>>>>>> interfaces and the API standardization. As an extreme example, we
>>>> can
>>>>>>>>> make
>>>>>>>>>> our public API a union of all the configs potentially used in all
>>>> the
>>>>>>>>> cases
>>>>>>>>>> in the name of standardization. Apparently this won't work. So
>> there
>>>>>> must
>>>>>>>>>> be a bar here and this bar might be somewhat subjective. For this
>>>>>> FLIP,
>>>>>>>>>> personally I don't think the config meets my bar for the reason
>>>> stated
>>>>>>>>>> above.
>>>>>>>>>> 
>>>>>>>>>> Therefore, my suggestion remains the same. Keep the config as a
>>>> Source
>>>>>>>>>> implementation specific configuration.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Nov 16, 2023 at 12:36 AM Jiabao Sun <
>>>> jiabao....@xtransfer.cn
>>>>>>>>> .invalid>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Thanks Becket for the feedback,
>>>>>>>>>>> 
>>>>>>>>>>> Regarding concerns about common configurations, I think we can
>>>>>> introduce
>>>>>>>>>>> FiltersApplier to unify the behavior of various connectors.
>>>>>>>>>>> 
>>>>>>>>>>> public static class FiltersApplier {
>>>>>>>>>>> 
>>>>>>>>>>> private final ReadableConfig config;
>>>>>>>>>>> private final Function<List<ResolvedExpression>, Result> action;
>>>>>>>>>>> 
>>>>>>>>>>> private FiltersApplier(
>>>>>>>>>>>       ReadableConfig config,
>>>>>>>>>>>       Function<List<ResolvedExpression>, Result> action) {
>>>>>>>>>>>   this.config = config;
>>>>>>>>>>>   this.action = action;
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>>>>>>>   if (config.get(ENABLE_FILTER_PUSH_DOWN)) {
>>>>>>>>>>>       return action.apply(filters);
>>>>>>>>>>>   } else {
>>>>>>>>>>>       return Result.of(Collections.emptyList(), filters);
>>>>>>>>>>>   }
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> public static FiltersApplier of(
>>>>>>>>>>>       ReadableConfig config,
>>>>>>>>>>>       Function<List<ResolvedExpression>, Result> action) {
>>>>>>>>>>>   return new FiltersApplier(config, action);
>>>>>>>>>>> }
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> For connectors implementation:
>>>>>>>>>>> 
>>>>>>>>>>> @Override
>>>>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>>>>>>> return FiltersApplier.of(config,
>>>>>>>>>>>       f -> Result.of(new ArrayList<>(filters),
>>>>>>>>>>> Collections.emptyList()));
>>>>>>>>>>> }
>>>>>>>>>>> 
>>>>>>>>>>> As for the name, whether it is "source.filter-push-down.enabled"
>> or
>>>>>>>>>>> "source.ignore-pushed-down-filters.enabled", I think both are
>> okay.
>>>>>>>>>>> 
>>>>>>>>>>> Do you think this change is feasible?
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Jiabao
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>>> 2023年11月15日 23:44,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>> 
>>>>>>>>>>>> Yes, I still have concerns.
>>>>>>>>>>>> 
>>>>>>>>>>>> The FLIP violates the following two principles regarding
>>>>>> configuration:
>>>>>>>>>>>> 
>>>>>>>>>>>> 1.* A config of a class should never negate the semantic of a
>>>>>>>>> decorative
>>>>>>>>>>>> interface implemented by that class. *
>>>>>>>>>>>> A decorative interface is a public contract with other
>> components,
>>>>>>>>> while
>>>>>>>>>>> a
>>>>>>>>>>>> config is only internal to the class itself. The configurations
>>>> for
>>>>>> the
>>>>>>>>>>>> Sources are not (and should never be) visible or understood to
>>>>>>>>>>>> other components (e.g. optimizer). A configuration of a Source
>>>> only
>>>>>>>>>>>> controls the behavior of that Source, provided it is not
>> violating
>>>>>> the
>>>>>>>>>>> API
>>>>>>>>>>>> contract / semantic defined by the decorative interface. So
>> when a
>>>>>>>>> Source
>>>>>>>>>>>> implementation implements SupportsFilterPushdown, this is a
>> clear
>>>>>>>>> public
>>>>>>>>>>>> contract with Flink that filters should be pushed down to that
>>>>>> Source.
>>>>>>>>>>>> Therefore, for the same source, there should not be a
>>>> configuration
>>>>>>>>>>>> "source.filter-push-down.enabled" which stops the filters from
>>>> being
>>>>>>>>>>> pushed
>>>>>>>>>>>> down to that Source. However, that specific source
>> implementation
>>>>>> can
>>>>>>>>>>> have
>>>>>>>>>>>> its own config to control its internal behavior, e.g.
>>>>>>>>>>>> "ignore-pushed-down-filters.enabled" which may push back all the
>>>>>> pushed
>>>>>>>>>>>> down filters back to the Flink optimizer.
>>>>>>>>>>>> 
>>>>>>>>>>>> 2. When we are talking about "common configs", in fact we are
>>>>>> talking
>>>>>>>>>>> about
>>>>>>>>>>>> "configs for common (abstract) implementation classes". With
>> that
>>>>>> as a
>>>>>>>>>>>> context, *a common config should always be backed by a common
>>>>>>>>>>>> implementation class, so that consistent behavior can be
>>>>>> guaranteed. *
>>>>>>>>>>>> The LookupOptions you mentioned are configurations defined for
>>>>>> classes
>>>>>>>>>>>> DefaultLookupCache / PeriodicCacheReloadTrigger /
>>>>>>>>>>> TimedCacheReloadTrigger.
>>>>>>>>>>>> These configs are considered as "common" only because the
>>>>>>>>> implementation
>>>>>>>>>>>> classes using them are common building blocks for lookup table
>>>>>>>>>>>> implementations. It would not make sense to have a dangling
>> config
>>>>>> in
>>>>>>>>> the
>>>>>>>>>>>> LookupOptions without the underlying common implementation
>> class,
>>>>>> but
>>>>>>>>>>> only
>>>>>>>>>>>> relies on a specific source to implement the stated behavior.
>>>>>>>>>>>> As a bad example, there is this outlier config "max-retries" in
>>>>>>>>>>>> LookupOptions, which I don't think should be here. This is
>> because
>>>>>> the
>>>>>>>>>>>> retry behavior can be very implementation specific. For example,
>>>>>> there
>>>>>>>>>>> can
>>>>>>>>>>>> be many different flavors of retry related configurations,
>>>>>>>>> retry-backoff,
>>>>>>>>>>>> retry-timeout, retry-async, etc. Why only max-retry is put here?
>>>>>> should
>>>>>>>>>>> all
>>>>>>>>>>>> of them be put here? If we put all such kinds of configs in the
>>>>>> common
>>>>>>>>>>>> configs for "standardization and unification", the number of
>>>> "common
>>>>>>>>>>>> configs" can easily go crazy. And I don't see material benefits
>> of
>>>>>>>>> doing
>>>>>>>>>>>> that. So here I don't think the configuration "max-retry" should
>>>> be
>>>>>> in
>>>>>>>>>>>> LookupOptions, because it is not backed by any common
>>>> implementation
>>>>>>>>>>>> classes. If max-retry is implemented in the HBase source, it
>>>> should
>>>>>>>>> stay
>>>>>>>>>>>> there. For the same reason, the config proposed in this FLIP
>>>>>> (probably
>>>>>>>>>>> with
>>>>>>>>>>>> a name less confusing for the first reason mentioned above)
>>>> should
>>>>>>>>> stay
>>>>>>>>>>> in
>>>>>>>>>>>> the specific Source implementation.
>>>>>>>>>>>> 
>>>>>>>>>>>> For the two reasons above, I am -1 to what the FLIP currently
>>>>>> proposes.
>>>>>>>>>>>> 
>>>>>>>>>>>> I think the right way to address the motivation here is still to
>>>>>> have a
>>>>>>>>>>>> config like "ignore-pushed-down-filters.enabled" for the
>> specific
>>>>>>>>> source
>>>>>>>>>>>> implementation. Please let me know if this solves the problem
>> you
>>>>>> are
>>>>>>>>>>>> facing.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Nov 15, 2023 at 11:52 AM Jiabao Sun <
>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>> .invalid>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The purpose of introducing this configuration is that not all
>>>>>> filter
>>>>>>>>>>>>> pushdowns can improve overall performance.
>>>>>>>>>>>>> If the filter can hit the external index, then pushdown is
>>>>>> definitely
>>>>>>>>>>>>> worth it, as it can not only improve query time but also
>> decrease
>>>>>>>>>>> network
>>>>>>>>>>>>> overhead.
>>>>>>>>>>>>> However, for filters that do not hit the external index, it may
>>>>>>>>>>> increase a
>>>>>>>>>>>>> lot of performance overhead on the external system.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Undeniably, if the connector can make accurate decisions for
>> good
>>>>>> and
>>>>>>>>>>> bad
>>>>>>>>>>>>> filters, we may not need to introduce this configuration option
>>>> to
>>>>>>>>>>> disable
>>>>>>>>>>>>> pushing down filters to the external system.
>>>>>>>>>>>>> However, it is currently not easy to achieve.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> IMO, supporting filter pushdown does not mean that always
>> filter
>>>>>>>>>>> pushdown
>>>>>>>>>>>>> is better.
>>>>>>>>>>>>> In the absence of automatic decision-making, I think we should
>>>>>> leave
>>>>>>>>>>> this
>>>>>>>>>>>>> decision to users.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The newly introduced configuration option is similar to
>>>>>> LookupOptions,
>>>>>>>>>>>>> providing unified naming and default values to avoid confusion
>>>>>> caused
>>>>>>>>> by
>>>>>>>>>>>>> inconsistent naming in different connectors for users.
>>>>>>>>>>>>> Setting the default value to true allows it to maintain
>>>>>> compatibility
>>>>>>>>>>> with
>>>>>>>>>>>>> the default behavior of "always pushdown".
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Do you have any other concerns about this proposal? Please let
>> me
>>>>>>>>> know.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2023年10月31日 17:29,Jiabao Sun <jiabao....@xtransfer.cn
>> .INVALID>
>>>>>> 写道:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Actually, for FileSystemSource, it is not always desired, only
>>>> OCR
>>>>>>>>> file
>>>>>>>>>>>>> formats support filter pushdown.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> We can disable predicate pushdown for FileSystemSource by
>>>> setting
>>>>>>>>>>>>> 'table.optimizer.source.predicate-pushdown-enabled' to false.
>>>>>>>>>>>>>> I think we can also disable filter pushdown at a more granular
>>>>>> level
>>>>>>>>>>>>> through fine-grained configuration.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 2023年10月31日 16:50,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks for the explanation. Maybe it's easier to explain with
>>>> an
>>>>>>>>>>>>> example.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Let's take FileSystemTableSource as an example. Currently it
>>>>>>>>>>> implements
>>>>>>>>>>>>>>> SupportsFilterPushDown interface. With your proposal, does it
>>>>>> have
>>>>>>>>> to
>>>>>>>>>>>>>>> support `source.filter-push-down.enabled` as well? But this
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>> does not quite make sense for the FileSystemTableSource
>> because
>>>>>>>>> filter
>>>>>>>>>>>>>>> pushdown is always desired. However, because this
>> configuration
>>>>>> is a
>>>>>>>>>>>>> part
>>>>>>>>>>>>>>> of the SupportsFilterPushDown interface (which sounds
>> confusing
>>>>>> to
>>>>>>>>>>> begin
>>>>>>>>>>>>>>> with), the FileSystemTableSource can only do one of the
>>>>>> following:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 1. Ignore the user configuration to always apply the pushed
>>>> down
>>>>>>>>>>>>> filters -
>>>>>>>>>>>>>>> this is an apparent anti-pattern because a configuration
>> should
>>>>>>>>> always
>>>>>>>>>>>>> do
>>>>>>>>>>>>>>> what it says.
>>>>>>>>>>>>>>> 2. Throw an exception telling users that this configuration
>> is
>>>>>> not
>>>>>>>>>>>>>>> applicable to the FileSystemTableSource.
>>>>>>>>>>>>>>> 3. Implement this configuration to push back the pushed down
>>>>>>>>> filters,
>>>>>>>>>>>>> even
>>>>>>>>>>>>>>> though this is never desired.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> None of the above options looks awkward. I am curious what
>> your
>>>>>>>>>>>>> solution is
>>>>>>>>>>>>>>> here?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun <
>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks Becket for the further explanation.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Perhaps I didn't explain it clearly.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 1. If a source does not implement the SupportsFilterPushDown
>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>> the newly added configurations do not need to be added to
>>>> either
>>>>>>>>> the
>>>>>>>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>>>>>>>> Similar to LookupOptions, if a source does not implement
>>>>>>>>>>>>>>>> LookupTableSource, there is no need to add LookupOptions to
>>>>>> either
>>>>>>>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2. "And these configs are specific to those sources, instead
>>>> of
>>>>>>>>>>> common
>>>>>>>>>>>>>>>> configs."
>>>>>>>>>>>>>>>> The newly introduced configurations define standardized
>> names
>>>>>> and
>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>> values.
>>>>>>>>>>>>>>>> They still belong to the configuration at the individual
>>>> source
>>>>>>>>>>> level.
>>>>>>>>>>>>>>>> The purpose is to avoid scattered configuration items when
>>>>>>>>> different
>>>>>>>>>>>>>>>> sources implement the same logic.
>>>>>>>>>>>>>>>> Whether a source should accept these configurations is
>>>>>> determined
>>>>>>>>> by
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>> source's Factory.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 2023年10月31日 13:47,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Please see the replies inline.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Introducing common configurations does not mean that all
>>>>>> sources
>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>>>>>>>> The configuration options supported by a source are
>>>>>> determined by
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
>>>> interface.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> This is not true. Both required and optional options are
>>>>>>>>> SUPPORTED.
>>>>>>>>>>>>> That
>>>>>>>>>>>>>>>>> means they are implemented and if one specifies an optional
>>>>>> config
>>>>>>>>>>> it
>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>> still take effect. An OptionalConfig is "Optional" because
>>>> this
>>>>>>>>>>>>>>>>> configuration has a default value. Hence, it is OK that
>> users
>>>>>> do
>>>>>>>>> not
>>>>>>>>>>>>>>>>> specify their own value. In another word, it is "optional"
>>>> for
>>>>>> the
>>>>>>>>>>> end
>>>>>>>>>>>>>>>>> users to set the config, but the implementation and support
>>>> for
>>>>>>>>> that
>>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>>> is NOT optional. In case a source does not support a common
>>>>>>>>> config,
>>>>>>>>>>> an
>>>>>>>>>>>>>>>>> exception must be thrown when the config is provided by the
>>>> end
>>>>>>>>>>> users.
>>>>>>>>>>>>>>>>> However, the config we are talking about in this FLIP is a
>>>>>> common
>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>>> optional to implement, meaning that sometimes the claimed
>>>>>> behavior
>>>>>>>>>>>>> won't
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>> there even if users specify that config.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Similar to sources that do not implement the
>>>> LookupTableSource
>>>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>>>>>>>> interface
>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> First of all, filter pushdown is a behavior of the query
>>>>>>>>> optimizer,
>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> behavior of Sources. The Sources tells the optimizer that
>> it
>>>>>> has
>>>>>>>>> the
>>>>>>>>>>>>>>>>> ability to accept pushed down filters by implementing the
>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface. And this is the only
>>>> contract
>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>> Source and Optimizer regarding whether filters should be
>>>> pushed
>>>>>>>>>>> down.
>>>>>>>>>>>>> As
>>>>>>>>>>>>>>>>> long as a specific source implements this decorative
>>>> interface,
>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>> pushdown should always take place, i.e.
>>>>>>>>>>>>>>>>> *SupportsFilterPushDown.applyFilters()* will be called.
>> There
>>>>>>>>> should
>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>> no
>>>>>>>>>>>>>>>>> other config to disable that call. However, Sources can
>>>> decide
>>>>>> how
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> behave based on their own configurations after
>>>>>> *applyFilters()* is
>>>>>>>>>>>>>>>> called.
>>>>>>>>>>>>>>>>> And these configs are specific to those sources, instead of
>>>>>> common
>>>>>>>>>>>>>>>> configs.
>>>>>>>>>>>>>>>>> Please see the examples I mentioned in the previous email.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun <
>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Sorry, there was a typo in the second point. Let me
>> correct
>>>>>> it:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Introducing common configurations does not mean that all
>>>>>> sources
>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>>>>>>>> The configuration options supported by a source are
>>>>>> determined by
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
>>>> interface.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Similar to sources that do not implement the
>>>> LookupTableSource
>>>>>>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>>>>>>>> interface
>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 2023年10月31日 10:13,Jiabao Sun <jiabao....@xtransfer.cn
>>>>>> .INVALID>
>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks Becket for the feedback.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 1. Currently, the SupportsFilterPushDown#applyFilters
>>>> method
>>>>>>>>>>>>> returns a
>>>>>>>>>>>>>>>>>> result that includes acceptedFilters and remainingFilters.
>>>> The
>>>>>>>>>>> source
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>> decide to push down some filters or not accept any of
>> them.
>>>>>>>>>>>>>>>>>>> 2. Introducing common configuration options does not mean
>>>>>> that a
>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>> that supports the SupportsFilterPushDown capability must
>>>>>> accept
>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>> configuration. Similar to LookupOptions, only sources that
>>>>>>>>>>> implement
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> LookupTableSource interface are necessary to accept these
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>> options.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 2023年10月31日 07:49,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi Jiabao and Ruanhang,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Adding a configuration of
>> source.filter-push-down.enabled
>>>>>> as a
>>>>>>>>>>>>> common
>>>>>>>>>>>>>>>>>>>> source configuration seems problematic.
>>>>>>>>>>>>>>>>>>>> 1. The config name is misleading. filter pushdown should
>>>>>> only
>>>>>>>>> be
>>>>>>>>>>>>>>>>>> determined
>>>>>>>>>>>>>>>>>>>> by whether the SupportsFilterPushdown interface is
>>>>>> implemented
>>>>>>>>> or
>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>>>> 2. The behavior of this configuration is only applicable
>>>> to
>>>>>>>>> some
>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>> implementations. Why is it a common configuration?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Here's my suggestion for design principles:
>>>>>>>>>>>>>>>>>>>> 1. Only add source impl specific configuration to
>>>>>> corresponding
>>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>> 2. The configuration name should not overrule existing
>>>>>> common
>>>>>>>>>>>>>>>> contracts.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> For example, in the case of MySql source. There are
>>>> several
>>>>>>>>>>>>> options:
>>>>>>>>>>>>>>>>>>>> 1. Have a configuration of
>>>>>>>>>>> `*mysql.avoid.remote.full.table.scan`*.
>>>>>>>>>>>>> If
>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> configuration is set, and a filter pushdown does not hit
>>>> an
>>>>>>>>>>> index,
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> MySql source impl would not further pushdown the filter
>> to
>>>>>>>>> MySql
>>>>>>>>>>>>>>>>>> servers.
>>>>>>>>>>>>>>>>>>>> Note that this assumes the MySql source can retrieve the
>>>>>> index
>>>>>>>>>>>>>>>>>> information
>>>>>>>>>>>>>>>>>>>> from the MySql servers.
>>>>>>>>>>>>>>>>>>>> 2. If the MySql index information is not available to
>> the
>>>>>> MySql
>>>>>>>>>>>>>>>> source,
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> configuration could be something like
>>>>>>>>>>>>>>>>>> *`mysql.pushback.pushed.down.filters`*.
>>>>>>>>>>>>>>>>>>>> Once set to true, MySql source would just add all the
>>>>>> filters
>>>>>>>>> to
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> RemainingFilters in the Result returned by
>>>>>>>>>>>>>>>>>>>> *SupportsFilterPushdown.applyFilters().*
>>>>>>>>>>>>>>>>>>>> 3. An alternative to option 2 is to have a `
>>>>>>>>>>>>>>>>>>>> *mysql.apply.predicates.after.scan*`. When it is set to
>>>>>> true,
>>>>>>>>>>> MySql
>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>> will not push the filter down to the MySql servers, but
>>>>>> apply
>>>>>>>>> the
>>>>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>>>> inside the MySql source itself.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> As you may see, the above configurations do not disable
>>>>>> filter
>>>>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>>>> itself. They just allow various implementations of
>> filter
>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>> And
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> configuration name does not give any illusion that
>> filter
>>>>>>>>>>> pushdown
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun <
>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks Hang for the suggestion.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I think the configuration of TableSource is not closely
>>>>>>>>> related
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> SourceReader,
>>>>>>>>>>>>>>>>>>>>> so I prefer to introduce a independent configuration
>>>> class
>>>>>>>>>>>>>>>>>>>>> TableSourceOptions in the flink-table-common module,
>>>>>> similar
>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> LookupOptions.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> For the second point, I suggest adding Java doc to the
>>>>>>>>>>>>>>>>>> SupportsXXXPushDown
>>>>>>>>>>>>>>>>>>>>> interfaces, providing detailed information on these
>>>> options
>>>>>>>>> that
>>>>>>>>>>>>>>>> needs
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> be supported.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I have made updates in the FLIP document.
>>>>>>>>>>>>>>>>>>>>> Please help check it again.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com>
>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Thanks for the improvements, Jiabao.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> There are some details that I am not sure about.
>>>>>>>>>>>>>>>>>>>>>> 1. The new option `source.filter-push-down.enabled`
>> will
>>>>>> be
>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>> class? I think it should be `SourceReaderOptions`.
>>>>>>>>>>>>>>>>>>>>>> 2. How are the connector developers able to know and
>>>>>> follow
>>>>>>>>> the
>>>>>>>>>>>>>>>> FLIP?
>>>>>>>>>>>>>>>>>> Do
>>>>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>> need an abstract base class or provide a default
>> method?
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> Hang
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>> 于2023年10月30日周一
>>>>>>>>>>>>>>>> 14:45写道:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Hi, all,
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks for the lively discussion.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Based on the discussion, I have made some adjustments
>>>> to
>>>>>> the
>>>>>>>>>>>>> FLIP
>>>>>>>>>>>>>>>>>>>>> document:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 1. The name of the newly added option has been
>> changed
>>>> to
>>>>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled".
>>>>>>>>>>>>>>>>>>>>>>> 2. Considering compatibility with older versions, the
>>>>>> newly
>>>>>>>>>>>>> added
>>>>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled" option needs to
>>>> respect
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> optimizer's
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>> option.
>>>>>>>>>>>>>>>>>>>>>>> But there is a consideration to remove the old option
>>>> in
>>>>>>>>> Flink
>>>>>>>>>>>>> 2.0.
>>>>>>>>>>>>>>>>>>>>>>> 3. We can provide more options to disable other
>> source
>>>>>>>>>>> abilities
>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>> side
>>>>>>>>>>>>>>>>>>>>>>> effects, such as “source.aggregate.enabled” and
>>>>>>>>>>>>>>>>>>>>> “source.projection.enabled"
>>>>>>>>>>>>>>>>>>>>>>> This is not urgent and can be continuously
>> introduced.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback again.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月29日 08:45,Becket Qin <becket....@gmail.com>
>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for digging into the git history, Jark. I
>> agree
>>>>>> it
>>>>>>>>>>> makes
>>>>>>>>>>>>>>>>>> sense
>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> deprecate this API in 2.0.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <
>>>>>> imj...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> I checked the history of "
>>>>>>>>>>>>>>>>>>>>>>>>> 
>> *table.optimizer.source.predicate-pushdown-enabled*",
>>>>>>>>>>>>>>>>>>>>>>>>> it seems it was introduced since the legacy
>>>>>>>>>>>>> FilterableTableSource
>>>>>>>>>>>>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>>>>>>>> which might be an experiential feature at that
>> time.
>>>> I
>>>>>>>>> don't
>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>> necessity
>>>>>>>>>>>>>>>>>>>>>>>>> of this option at the moment. Maybe we can
>> deprecate
>>>>>> this
>>>>>>>>>>>>> option
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>> in Flink 2.0[1] if it is not necessary anymore.
>> This
>>>>>> may
>>>>>>>>>>> help
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>> simplify this discussion.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> [1]:
>>>> https://issues.apache.org/jira/browse/FLINK-32383
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <
>>>>>>>>>>>>> becket....@gmail.com>
>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents
>> below:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 1. If I understand correctly, the motivation of
>> the
>>>>>> FLIP
>>>>>>>>> is
>>>>>>>>>>>>>>>>>> mainly to
>>>>>>>>>>>>>>>>>>>>>>>>>> make predicate pushdown optional on SOME of the
>>>>>> Sources.
>>>>>>>>> If
>>>>>>>>>>>>> so,
>>>>>>>>>>>>>>>>>>>>>>> intuitively
>>>>>>>>>>>>>>>>>>>>>>>>>> the configuration should be Source specific
>> instead
>>>> of
>>>>>>>>>>>>> general.
>>>>>>>>>>>>>>>>>>>>>>> Otherwise,
>>>>>>>>>>>>>>>>>>>>>>>>>> we will end up with general configurations that
>> may
>>>>>> not
>>>>>>>>>>> take
>>>>>>>>>>>>>>>>>> effect
>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>> some of the Source implementations. This violates
>>>> the
>>>>>>>>> basic
>>>>>>>>>>>>> rule
>>>>>>>>>>>>>>>>>> of a
>>>>>>>>>>>>>>>>>>>>>>>>>> configuration - it does what it says, regardless
>> of
>>>>>> the
>>>>>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>> While configuration standardization is usually a
>>>> good
>>>>>>>>>>> thing,
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>>> break the basic rules.
>>>>>>>>>>>>>>>>>>>>>>>>>> If we really want to have this general
>>>> configuration,
>>>>>> for
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>> this configuration does not apply, they should
>> throw
>>>>>> an
>>>>>>>>>>>>>>>> exception
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> make
>>>>>>>>>>>>>>>>>>>>>>>>>> it clear that this configuration is not supported.
>>>>>>>>> However,
>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>> seems
>>>>>>>>>>>>>>>>>>>>>>> ugly.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 2. I think the actual motivation of this FLIP is
>>>> about
>>>>>>>>>>> "how a
>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>> should implement predicate pushdown efficiently",
>>>> not
>>>>>>>>>>>>> "whether
>>>>>>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown should be applied to the source." For
>>>>>> example,
>>>>>>>>> if
>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>> wants
>>>>>>>>>>>>>>>>>>>>>>>>>> to avoid additional computing load in the external
>>>>>>>>> system,
>>>>>>>>>>> it
>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>>>> always
>>>>>>>>>>>>>>>>>>>>>>>>>> read the entire record and apply the predicates by
>>>>>>>>> itself.
>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>>>>>>> the Flink perspective, the predicate pushdown is
>>>>>> applied,
>>>>>>>>>>> it
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>>>>>>>>>> implemented differently by the source. So the
>> design
>>>>>>>>>>>>> principle
>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>> Flink only cares about whether a source supports
>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>> not,
>>>>>>>>>>>>>>>>>>>>>>>>>> it does not care about the implementation
>>>> efficiency /
>>>>>>>>> side
>>>>>>>>>>>>>>>>>> effect of
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>> predicates pushdown. It is the Source
>>>> implementation's
>>>>>>>>>>>>>>>>>> responsibility
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>> ensure the predicates pushdown is implemented
>>>>>> efficiently
>>>>>>>>>>> and
>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>>> impose excessive pressure on the external system.
>>>> And
>>>>>> it
>>>>>>>>> is
>>>>>>>>>>>>> OK
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>>>>>>> additional configurations to achieve this goal.
>>>>>>>>> Obviously,
>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>>>>>>>>>> configurations will be source specific in this
>> case.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Regarding the existing configurations of
>>>>>>>>>>>>>>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled.
>>>>>>>>>>>>>>>>>>>>>>>>>> *I am not sure why we need it. Supposedly, if a
>>>> source
>>>>>>>>>>>>>>>> implements
>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer
>> should
>>>>>> push
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>>>>>>>>>>>>>> predicates to the Source. I am not sure in which
>>>> case
>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>> would be used. Any ideas @Jark Wu <
>> imj...@gmail.com
>>>>> ?
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
>>>>>>>>>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for the detailed explanation.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> I think that for users, we should respect
>>>> conventions
>>>>>>>>> over
>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Conventions can be default values explicitly
>>>>>> specified
>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations, or they can be behaviors that
>>>> follow
>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>> versions.
>>>>>>>>>>>>>>>>>>>>>>>>>>> If the same code has different behaviors in
>>>> different
>>>>>>>>>>>>> versions,
>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>>>>>> be a very bad thing.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree that for regular users, it is not
>> necessary
>>>>>> to
>>>>>>>>>>>>>>>> understand
>>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>>>>> the configurations related to Flink.
>>>>>>>>>>>>>>>>>>>>>>>>>>> By following conventions, they can have a good
>>>>>>>>> experience.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Let's get back to the practical situation and
>>>>>> consider
>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Case 1:
>>>>>>>>>>>>>>>>>>>>>>>>>>> The user is not familiar with the purpose of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>>>> follows
>>>>>>>>>>>>>>>>>>>>>>>>>>> the convention of allowing predicate pushdown to
>>>> the
>>>>>>>>>>> source
>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>> default.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Just understanding the
>>>>>> source.predicate-pushdown-enabled
>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> and performing fine-grained toggle control will
>>>> work
>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Case 2:
>>>>>>>>>>>>>>>>>>>>>>>>>>> The user understands the meaning of the
>>>>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> has set
>>>>>>>>>>>>>>>>>>>>>>>>>>> its value to false.
>>>>>>>>>>>>>>>>>>>>>>>>>>> We have reason to believe that the user
>> understands
>>>>>> the
>>>>>>>>>>>>> meaning
>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> predicate pushdown configuration and the
>> intention
>>>>>> is to
>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown (rather than whether or not to allow
>> it).
>>>>>>>>>>>>>>>>>>>>>>>>>>> The previous choice of globally disabling it is
>>>>>> likely
>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>> couldn't be disabled on individual sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>> From this perspective, if we provide more
>>>>>> fine-grained
>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> support and provide detailed explanations of the
>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>> behaviors in
>>>>>>>>>>>>>>>>>>>>>>>>>>> the documentation,
>>>>>>>>>>>>>>>>>>>>>>>>>>> users can clearly understand the differences
>>>> between
>>>>>>>>> these
>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations and use them correctly.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, I don't agree that
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> table.optimizer.source.predicate-pushdown-enabled =
>>>>>> true
>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>> source.predicate-pushdown-enabled = false means
>>>> that
>>>>>> the
>>>>>>>>>>>>> local
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration overrides the global configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, both configurations are
>>>> functioning
>>>>>>>>>>>>> correctly.
>>>>>>>>>>>>>>>>>>>>>>>>>>> The optimizer allows predicate pushdown to all
>>>>>> sources,
>>>>>>>>>>> but
>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>>> can reject the filters pushed down by the
>>>> optimizer.
>>>>>>>>>>>>>>>>>>>>>>>>>>> This is natural, just like different components
>> at
>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>> levels
>>>>>>>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>>>>>>> responsible for different tasks.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> The more serious issue is that if
>>>>>>>>>>>>>>>>>>>>> "source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>> does not respect
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled”,
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> will be invalidated.
>>>>>>>>>>>>>>>>>>>>>>>>>>> This means that regardless of whether
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>> is
>>>>>>>>> set
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> true
>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it will have no effect.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 22:24,Jane Chan <
>>>> qingyue....@gmail.com>
>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the in-depth clarification. Here are
>> my
>>>>>>>>> cents
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are
>>>> configurations
>>>>>> for
>>>>>>>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> We cannot assume that every user would be
>>>>>> interested in
>>>>>>>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> internal components of Flink, such as the
>>>> optimizer
>>>>>> or
>>>>>>>>>>>>>>>>>> connectors,
>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> specific configurations associated with each
>>>>>> component.
>>>>>>>>>>>>>>>> Instead,
>>>>>>>>>>>>>>>>>>>>>>> users
>>>>>>>>>>>>>>>>>>>>>>>>>>>> might be more concerned about knowing which
>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>> enables
>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>>>>>>> disables the filter push-down feature for all
>>>> source
>>>>>>>>>>>>>>>> connectors,
>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>>>>>> parameter provides the flexibility to override
>>>> this
>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>>>>>>>>>>>>>>> source if needed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, from this perspective, I am inclined to
>> divide
>>>>>>>>> these
>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>> based on the scope of their impact from the
>> user's
>>>>>>>>>>>>> perspective
>>>>>>>>>>>>>>>>>>>>> (i.e.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> global-level or operator-level), rather than
>>>>>>>>> categorizing
>>>>>>>>>>>>> them
>>>>>>>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> component hierarchy from a developer's point of
>>>>>> view.
>>>>>>>>>>>>>>>> Therefore,
>>>>>>>>>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>>>>>>>> this premise, it is intuitive and natural for
>>>> users
>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> understand fine-grained configuration options
>> can
>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled"
>>>>>>>>> doesn't
>>>>>>>>>>>>>>>>>> respect to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>> and
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined
>> as
>>>>>> true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be
>>>>>> performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in
>>>> behavior
>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <1>If I understand correctly,
>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector option, which means the only way to
>>>>>> configure
>>>>>>>>>>> it
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>> explicitly
>>>>>>>>>>>>>>>>>>>>>>>>>>>> specify it in DDL (no matter whether disable or
>>>>>>>>> enable),
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> SET
>>>>>>>>>>>>>>>>>>>>>>>>>>>> command is not applicable, so I think it's
>> natural
>>>>>> to
>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>> respect
>>>>>>>>>>>>>>>>>>>>>>>>>>> user's
>>>>>>>>>>>>>>>>>>>>>>>>>>>> specification here. Otherwise, users might be
>> more
>>>>>>>>>>> confused
>>>>>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>>>> why
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> DDL does not work as expected, and the reason is
>>>>>> just
>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>>>>>>>>>>>>>> "optimizer" configuration is set to a different
>>>>>> value.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <2> From the implementation side, I am inclined
>> to
>>>>>> keep
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> parameter's
>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority consistent for all conditions.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let "global" denote
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> "table.optimizer.source.predicate-pushdown-enabled",
>>>>>>>>>>>>>>>>>>>>>>>>>>>> and let "per-source" denote
>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>> source T,  the following Truth table (based on
>> the
>>>>>>>>>>> current
>>>>>>>>>>>>>>>>>> design)
>>>>>>>>>>>>>>>>>>>>>>>>>>>> indicates the inconsistent behavior for
>>>> "per-source
>>>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>>>>> global".
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> .------------.---------------.-------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> | global   | per-source | push-down for T |
>>>>>> per-source
>>>>>>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>>>>>>>> |
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> | true       | false         | false
>>>>>>>>>>> | Y
>>>>>>>>>>>>>>>>>>>>>>>>>>>>          |
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> | false     | true           | false
>>>>>>>>>>> | N
>>>>>>>>>>>>>>>>>>>>>>>>>>>>          |
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> .------------.---------------.-----------------------.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I understand that the configuration of global
>>>>>>>>>>> parallelism
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> task
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parallelism is at different granularities but
>>>> with
>>>>>> the
>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are
>>>> configurations
>>>>>> for
>>>>>>>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From a user's perspective, there are two
>>>> scenarios:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Disabling all filter pushdown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, setting
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false is sufficient to meet the requirement.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific
>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, there is no need to adjust
>> the
>>>>>> value
>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Instead, the focus should be on the
>> configuration
>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the
>>>>>>>>> requirement.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, users do not need to set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to
>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>> manually
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter pushdown for specific sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if
>> "scan.filter-push-down.enabled"
>>>>>>>>> doesn't
>>>>>>>>>>>>>>>>>> respect
>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>> and
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined
>> as
>>>>>> true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be
>>>>>> performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in
>>>> behavior
>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The same SQL query that could successfully push
>>>>>> down
>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version but would no longer do so after the
>>>>>> upgrade.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 17:10,Benchao Li <
>>>>>> libenc...@apache.org>
>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations,
>>>> that
>>>>>>>>>>> helps a
>>>>>>>>>>>>>>>>>> lot, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> understand your rationale now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is
>>>> from
>>>>>>>>>>>>>>>> "developer",
>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> means there is an optimizer and connector
>>>>>> component,
>>>>>>>>>>> and
>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable this feature (pushing filters down into
>>>>>>>>>>>>> connectors),
>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable it firstly in optimizer, and only then
>>>>>>>>> connector
>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide to use it or not.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> My perspective is from "user" that (Why a user
>>>>>> should
>>>>>>>>>>>>> care
>>>>>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> difference of optimizer/connector) , this is a
>>>>>>>>> feature,
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> way to control it, one way is to config it
>>>>>> job-level,
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in table properties. What a user expects is
>> that
>>>>>> they
>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>> control a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> feature in a tiered way, that setting it per
>>>> job,
>>>>>> and
>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fine-grained tune it per table.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is some kind of similar to other
>> concepts,
>>>>>> such
>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>> parallelism,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users can set a job level default parallelism,
>>>> and
>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tune it per operator. There may be more such
>>>>>> debate
>>>>>>>>> in
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> future
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g., we can have a job level config about
>>>> adding
>>>>>>>>>>> key-by
>>>>>>>>>>>>>>>>>> before
>>>>>>>>>>>>>>>>>>>>>>>>>>> lookup
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> join, and also a hint/table property way to
>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> per lookup operator. Hence we'd better find a
>>>>>> unified
>>>>>>>>>>> way
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> those similar kind of features.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>>>>>>>>> 于2023年10月25日周三
>>>>>>>>>>>>>>>>>>>>>>> 15:27写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for further explanation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> These two configurations correspond to
>>>> different
>>>>>>>>>>> levels.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The planner will still push down predicates
>> to
>>>>>> all
>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not is
>>>>>>>>>>> determined
>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source's "scan.filter-push-down.enabled"
>>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate"
>>>> does
>>>>>>>>>>>>> directly
>>>>>>>>>>>>>>>>>> affect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled”.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> When the planner disables predicate pushdown,
>>>> the
>>>>>>>>>>>>>>>>>> source-level
>>>>>>>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown will also not be executed, even if the
>>>>>> source
>>>>>>>>>>>>> allows
>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whatever, in point 1 and 2, our expectation
>> is
>>>>>>>>>>>>> consistent.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the
>>>>>>>>>>> planner-level
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over the source-level
>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may seem counterintuitive when we globally
>>>>>>>>> disable
>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown but allow filter pushdown at the
>> source
>>>>>>>>> level.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 14:35,Jane Chan <
>>>>>> qingyue....@gmail.com
>>>>>>>>>> 
>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for clarifying this. While by
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority" I meant that this value
>>>> should
>>>>>> be
>>>>>>>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>>>>>>>> whenever
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> set explicitly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The conclusion that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" =
>> "true"
>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate
>>>>>> pushdown,
>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This indicates that the option
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled =
>>>>>>>>>>>>>>>>>>>>>>>>>>> false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an individual source connector does indeed
>>>>>> override
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> global-level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planner settings to make a difference. And
>>>> thus
>>>>>>>>> "has
>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> While for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" =
>>>> "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the
>>>>>> planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains
>>>>>> consistent
>>>>>>>>>>> with
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I still think
>> "scan.filter-push-down.enabled"
>>>>>>>>> should
>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it is enabled for individual connectors.
>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend
>>>> keeping
>>>>>> the
>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" as true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and setting the the default value of newly
>>>>>>>>>>> introduced
>>>>>>>>>>>>>>>>>> option
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true as
>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to
>> maintain
>>>>>>>>>>>>> consistency
>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> versions, as whether to perform
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown in the old version solely
>>>>>> depends
>>>>>>>>> on
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" option.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That means by default, as long as a
>>>> TableSource
>>>>>>>>>>>>>>>> implements
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface, filter
>>>>>> pushdown
>>>>>>>>> is
>>>>>>>>>>>>>>>>>> allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And it seems that we don't have much
>> benefit
>>>> in
>>>>>>>>>>>>> changing
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "table.optimizer.source.predicate" to
>>>> false.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding the priority of these two
>>>>>>>>> configurations,
>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>> believe
>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over
>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>> exhibits
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> following behavior.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" =
>>>> "true"
>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing
>> filter
>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" =
>>>> "true"
>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate
>>>>>> pushdown,
>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" =
>>>> "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the
>>>>>> planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains
>>>>>> consistent
>>>>>>>>>>> with
>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From an implementation perspective, setting
>>>> the
>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" higher than
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is
>>>>>> difficult to
>>>>>>>>>>>>>>>> achieve
>>>>>>>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because the
>> PushFilterIntoSourceScanRuleBase
>>>> at
>>>>>>>>> the
>>>>>>>>>>>>>>>> planner
>>>>>>>>>>>>>>>>>>>>>>> level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over the source-level
>>>>>>>>> FilterPushDownSpec.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Only when the
>>>> PushFilterIntoSourceScanRuleBase
>>>>>> is
>>>>>>>>>>>>>>>> enabled,
>>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Source-level filter pushdown be performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, in my opinion, there doesn't
>>>>>> seem to
>>>>>>>>>>> be
>>>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>>>>>>>> benefit in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting a higher priority for
>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may instead affect compatibility and
>>>>>> increase
>>>>>>>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 11:56,Benchao Li <
>>>>>>>>> libenc...@apache.org
>>>>>>>>>>>> 
>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with Jane that fine-grained
>>>>>>>>> configurations
>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority than job level configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For current proposal, we can achieve that:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" =
>>>>>> "true"
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> enable
>>>>>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>>>>>>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>>>>>>>> "false"
>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" =
>>>>>>>>> "false"
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>>>>>>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>>>>>>>> "true" to
>>>>>>>>>>>>>>>>>>>>>>>>>>> enable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane Chan <qingyue....@gmail.com>
>>>>>> 于2023年10月24日周二
>>>>>>>>>>>>>>>> 23:55写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level
>> than
>>>>>> the
>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong, but I think the
>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" should
>>>> have a
>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default value of
>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> true.
>>>>>>>>>>>>>>>>>>>>>>> As
>>>>>>>>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> result,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> turning off filter push-down for a
>> specific
>>>>>>>>> source
>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>> take
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unless the default value of
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false, or, alternatively, let users
>>>>>> manually
>>>>>>>>>>> set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" to
>> false
>>>>>>>>> first
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> selectively
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter push-down for the desired
>>>>>> sources,
>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>> less
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> intuitive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao
>> Sun
>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level
>> than
>>>>>> the
>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 17:36,Jane Chan <
>>>>>>>>>>>>> qingyue....@gmail.com>
>>>>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for driving this discussion. I
>>>> have
>>>>>> a
>>>>>>>>>>> small
>>>>>>>>>>>>>>>>>>>>> question
>>>>>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take
>>>>>>>>> precedence
>>>>>>>>>>>>> over
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" when
>>>> the
>>>>>>>>> two
>>>>>>>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conflict
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each other?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao
>>>> Sun
>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jark,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we only add configuration without
>>>>>> adding
>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enableFilterPushDown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method in the SupportsFilterPushDown
>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each connector would have to handle
>> the
>>>>>> same
>>>>>>>>>>>>> logic
>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method to determine whether filter
>>>>>> pushdown
>>>>>>>>> is
>>>>>>>>>>>>>>>> needed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This would increase complexity and
>>>> violate
>>>>>>>>> the
>>>>>>>>>>>>>>>>>> original
>>>>>>>>>>>>>>>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters method.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, we only need to pass
>>>> the
>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parameter in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> newly added enableFilterPushDown
>> method
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide whether to perform predicate
>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think this approach would be clearer
>>>> and
>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 16:58,Jark Wu <
>>>>>> imj...@gmail.com
>>>>>>>>>> 
>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi JIabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think the current interface can
>>>> already
>>>>>>>>>>>>> satisfy
>>>>>>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The connector can reject all the
>>>> filters
>>>>>> by
>>>>>>>>>>>>>>>> returning
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> input
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as `Result#remainingFilters`.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So maybe we don't need to introduce a
>>>> new
>>>>>>>>>>>>> method to
>>>>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown, but just introduce an
>> option
>>>>>> for
>>>>>>>>> the
>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard
>>>> Xu
>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>> xbjt...@gmail.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this
>>>>>>>>>>> discussion.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you add a section to explain
>> the
>>>>>>>>>>>>> difference
>>>>>>>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector level config
>>>>>>>>>>>>>>>>>> `scan.filter-push-down.enabled`
>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> query
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level config
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> `table.optimizer.source.predicate-pushdown-enabled` ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonard
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Devs,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion
>> on
>>>>>>>>>>>>> FLIP-377:
>>>>>>>>>>>>>>>>>>>>> support
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable filter pushdown for
>> Table/SQL
>>>>>>>>>>>>> Sources[1].
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not
>>>>>> expose
>>>>>>>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users to enable or disable filter
>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, filter pushdown has some
>>>> side
>>>>>>>>>>>>> effects,
>>>>>>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computational pressure on external
>>>>>> systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Moreover, Improper queries can lead
>>>> to
>>>>>>>>>>> issues
>>>>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>> full
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> scans,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which in turn can impact the
>> stability
>>>>>> of
>>>>>>>>>>>>> external
>>>>>>>>>>>>>>>>>>>>>>> systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with
>> two
>>>>>>>>>>> sources:
>>>>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>>>>>>>> and a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> database.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The database is sensitive to
>>>> pressure,
>>>>>> and
>>>>>>>>>>> we
>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configure
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not perform filter pushdown to the
>>>>>> database
>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, we still want to perform
>>>>>> filter
>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> decrease network IO.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I propose to support configuration
>> to
>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>> push
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> down for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Table/SQL sources to let user decide
>>>>>>>>> whether
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>> 
>> 
>> 


Reply via email to