Hi Becket,

The FLIP document has been updated as well.
Please take a look when you have time.

Thanks,
Jiabao


> 2023年12月17日 22:54,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道:
> 
> Thanks Becket,
> 
> I apologize for not being able to continue with this proposal due to being 
> too busy during this period.
> 
> The viewpoints you shared about the design of Flink Source make sense to me 
> The native configuration ‘ignore.filter.pushdown’ is good to me. 
> Having a unified name or naming style can indeed prevent confusion for users 
> regarding
> the inconsistent naming of this configuration across different connectors.
> 
> Currently, there are not many external connectors that support filter 
> pushdown.
> I propose that we first introduce it in flink-connector-jdbc and 
> flink-connector-mongodb. 
> Do you think this is feasible?
> 
> Best,
> Jiabao
> 
> 
>> 2023年11月16日 17:45,Becket Qin <becket....@gmail.com> 写道:
>> 
>> Hi Jiabao,
>> 
>> Arguments like "because Spark has it so Flink should also have it" does not
>> make sense. Different projects have different API flavors and styles. What
>> is really important is the rationale and the design principle behind the
>> API. They should conform to the convention of the project.
>> 
>> First of all, Spark Source API itself has a few issues and they ended up
>> introduce DataSource V2 in Spark 3.0, which added the decorative interfaces
>> like SupportsPushdownXXX. Some of the configurations predating DataSource
>> V2 may still be there.
>> 
>> For the Spark configurations you mentioned, they are all the configurations
>> for FileScanBuilder, which is equivalent to FileSource in Flink. Currently,
>> regardless of the format (ORC, Parquet, Avro, etc), the FileSource pushes
>> back all the filters to ensure correctness. The actual filters that got
>> applied to the specific format might still be different. This
>> implementation is the same in FileScanBuilder.pushFilters() for Spark. I
>> don't know why Spark got separate configurations for each format. Maybe it
>> is because the filters are actually implemented differently for different
>> format.
>> 
>> At least for the current implementation in FileScanBuilder, these
>> configurations can be merged to one configuration like
>> `apply.filters.to.format.enabled`. Note that this config, as well as the
>> separate configs you mentioned, are just visible and used by the
>> FileScanBuilder. It determines whether the filters should be passed down to
>> the format of the FileScanBuilder instance. Regardless of the value of
>> these configs, FileScanBuilder.pushFilters() will always be called, and
>> FileScanBuilder (as well as FileSource in Flink) will always push back all
>> the filters to the framework.
>> 
>> A MySql source can have a very different way to handle this. For example, A
>> MySql source  A config in this case might be "my.apply.filters" with three
>> different values:
>> - AUTO: The Source will issue a DESC Table request to understand whether a
>> filter can be applied efficiently. And decide which filters can be applied
>> and which cannot based on that.
>> - NEVER: Never apply filtering. It will always do a full table read and
>> let Flink do the filtering.
>> - ALWAYS: Always apply the filtering to the MySql server.
>> 
>> In the above examples of FileSource and MySql Source, I don't think it is a
>> good idea to shoehorn the behaviors into a naive config of
>> `ignore.filter.pushdown`. That is why I don't think this is a common config.
>> 
>> To recap, like I said, I do agree that in some cases, we may want to behave
>> differently when filters are pushed down to the sources, even if a source
>> implements SupportsFilterPushDown, but I don't think there is a suitable
>> common config for this. The behavior is very likely source specific.
>> 
>> Thanks,
>> 
>> Jiangjie (Becket) Qin
>> 
>> 
>> 
>> On Thu, Nov 16, 2023 at 3:41 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>> wrote:
>> 
>>> Thanks Becket,
>>> 
>>> I still believe that adding a configuration at the source level to disable
>>> filter pushdown is needed. This demand exists in spark as well[1].
>>> 
>>> In Spark, most sources that support filter pushdown provide their own
>>> corresponding configuration options to enable or disable filter pushdown.
>>> For PRs[2-4] that support filter pushdown capability, they also provide
>>> configuration options to disable this capability.
>>> 
>>> I believe this configuration is applicable to most scenarios, and there is
>>> no need to dwell on why this configuration option was not introduced
>>> earlier than the SupportsFilterPushDown interface.
>>> 
>>> spark.sql.parquet.filterPushdown
>>> spark.sql.orc.filterPushdown
>>> spark.sql.csv.filterPushdown.enabled
>>> spark.sql.json.filterPushdown.enabled
>>> spark.sql.avro.filterPushdown.enabled
>>> JDBC Option: pushDownPredicate
>>> 
>>> We can see that the lack of consistency is caused by each connector
>>> introducing different configuration options for the same behavior.
>>> This is one of the motivations for advocating the introduction of a
>>> unified configuration name.
>>> 
>>> [1] https://issues.apache.org/jira/browse/SPARK-24288
>>> [2] https://github.com/apache/spark/pull/27366
>>> [3]https://github.com/apache/spark/pull/26973
>>> [4] https://github.com/apache/spark/pull/29145
>>> 
>>> Best,
>>> Jiabao
>>> 
>>>> 2023年11月16日 08:10,Becket Qin <becket....@gmail.com> 写道:
>>>> 
>>>> Hi Jiabao,
>>>> 
>>>> While we can always fix the formality of the config, a more fundamental
>>>> issue here is whether this configuration is common enough. Personally I
>>> am
>>>> still not convinced it is.
>>>> 
>>>> Remember we don't have a common implementation for SupportsFilterPushdown
>>>> itself. Why does a potential behavior of the
>>>> SupportsFilterPushdown.applyFilters() method deserve a common
>>>> configuration? A common implementation should always come first, then its
>>>> configuration becomes a common configuration as a natural result. But
>>> here
>>>> we are trying to add an impl to a configuration just to fix its
>>> formality.
>>>> 
>>>> I agree that there might be a few Source implementations that may want to
>>>> avoid additional burdens on the remote system in some circumstances. And
>>>> these circumstances are very specific:
>>>> 1. The source talks to a remote service that can help perform the actual
>>>> filtering.
>>>> 2. The filtering done by the remote service is inefficient for some
>>> reason
>>>> (e.g. missing index)
>>>> 3. The external service does not want to perform the inefficient
>>> filtering
>>>> for some reason (e.g. it is a shared service with others)
>>>> 
>>>> There are multiple approaches to address the issue. Pushing back the
>>>> filters is just one way of achieving this. So here we are talking about a
>>>> config for one of the possible solutions to a scenario with all the above
>>>> situations. I don't think there is enough justification for the config to
>>>> be common.
>>>> 
>>>> There is always this trade-off between the proliferation of public
>>>> interfaces and the API standardization. As an extreme example, we can
>>> make
>>>> our public API a union of all the configs potentially used in all the
>>> cases
>>>> in the name of standardization. Apparently this won't work. So there must
>>>> be a bar here and this bar might be somewhat subjective. For this FLIP,
>>>> personally I don't think the config meets my bar for the reason stated
>>>> above.
>>>> 
>>>> Therefore, my suggestion remains the same. Keep the config as a Source
>>>> implementation specific configuration.
>>>> 
>>>> Thanks,
>>>> 
>>>> Jiangjie (Becket) Qin
>>>> 
>>>> 
>>>> 
>>>> On Thu, Nov 16, 2023 at 12:36 AM Jiabao Sun <jiabao....@xtransfer.cn
>>> .invalid>
>>>> wrote:
>>>> 
>>>>> Thanks Becket for the feedback,
>>>>> 
>>>>> Regarding concerns about common configurations, I think we can introduce
>>>>> FiltersApplier to unify the behavior of various connectors.
>>>>> 
>>>>> public static class FiltersApplier {
>>>>> 
>>>>>  private final ReadableConfig config;
>>>>>  private final Function<List<ResolvedExpression>, Result> action;
>>>>> 
>>>>>  private FiltersApplier(
>>>>>          ReadableConfig config,
>>>>>          Function<List<ResolvedExpression>, Result> action) {
>>>>>      this.config = config;
>>>>>      this.action = action;
>>>>>  }
>>>>> 
>>>>>  public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>      if (config.get(ENABLE_FILTER_PUSH_DOWN)) {
>>>>>          return action.apply(filters);
>>>>>      } else {
>>>>>          return Result.of(Collections.emptyList(), filters);
>>>>>      }
>>>>>  }
>>>>> 
>>>>>  public static FiltersApplier of(
>>>>>          ReadableConfig config,
>>>>>          Function<List<ResolvedExpression>, Result> action) {
>>>>>      return new FiltersApplier(config, action);
>>>>>  }
>>>>> }
>>>>> 
>>>>> For connectors implementation:
>>>>> 
>>>>> @Override
>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>  return FiltersApplier.of(config,
>>>>>          f -> Result.of(new ArrayList<>(filters),
>>>>> Collections.emptyList()));
>>>>> }
>>>>> 
>>>>> As for the name, whether it is "source.filter-push-down.enabled" or
>>>>> "source.ignore-pushed-down-filters.enabled", I think both are okay.
>>>>> 
>>>>> Do you think this change is feasible?
>>>>> 
>>>>> 
>>>>> Best,
>>>>> Jiabao
>>>>> 
>>>>> 
>>>>>> 2023年11月15日 23:44,Becket Qin <becket....@gmail.com> 写道:
>>>>>> 
>>>>>> Hi Jiabao,
>>>>>> 
>>>>>> Yes, I still have concerns.
>>>>>> 
>>>>>> The FLIP violates the following two principles regarding configuration:
>>>>>> 
>>>>>> 1.* A config of a class should never negate the semantic of a
>>> decorative
>>>>>> interface implemented by that class. *
>>>>>> A decorative interface is a public contract with other components,
>>> while
>>>>> a
>>>>>> config is only internal to the class itself. The configurations for the
>>>>>> Sources are not (and should never be) visible or understood to
>>>>>> other components (e.g. optimizer). A configuration of a Source only
>>>>>> controls the behavior of that Source, provided it is not violating the
>>>>> API
>>>>>> contract / semantic defined by the decorative interface. So when a
>>> Source
>>>>>> implementation implements SupportsFilterPushdown, this is a clear
>>> public
>>>>>> contract with Flink that filters should be pushed down to that Source.
>>>>>> Therefore, for the same source, there should not be a configuration
>>>>>> "source.filter-push-down.enabled" which stops the filters from being
>>>>> pushed
>>>>>> down to that Source. However, that specific source implementation can
>>>>> have
>>>>>> its own config to control its internal behavior, e.g.
>>>>>> "ignore-pushed-down-filters.enabled" which may push back all the pushed
>>>>>> down filters back to the Flink optimizer.
>>>>>> 
>>>>>> 2. When we are talking about "common configs", in fact we are talking
>>>>> about
>>>>>> "configs for common (abstract) implementation classes". With that as a
>>>>>> context, *a common config should always be backed by a common
>>>>>> implementation class, so that consistent behavior can be guaranteed. *
>>>>>> The LookupOptions you mentioned are configurations defined for classes
>>>>>> DefaultLookupCache / PeriodicCacheReloadTrigger /
>>>>> TimedCacheReloadTrigger.
>>>>>> These configs are considered as "common" only because the
>>> implementation
>>>>>> classes using them are common building blocks for lookup table
>>>>>> implementations. It would not make sense to have a dangling config in
>>> the
>>>>>> LookupOptions without the underlying common implementation class, but
>>>>> only
>>>>>> relies on a specific source to implement the stated behavior.
>>>>>> As a bad example, there is this outlier config "max-retries" in
>>>>>> LookupOptions, which I don't think should be here. This is because the
>>>>>> retry behavior can be very implementation specific. For example, there
>>>>> can
>>>>>> be many different flavors of retry related configurations,
>>> retry-backoff,
>>>>>> retry-timeout, retry-async, etc. Why only max-retry is put here? should
>>>>> all
>>>>>> of them be put here? If we put all such kinds of configs in the common
>>>>>> configs for "standardization and unification", the number of "common
>>>>>> configs" can easily go crazy. And I don't see material benefits of
>>> doing
>>>>>> that. So here I don't think the configuration "max-retry" should be in
>>>>>> LookupOptions, because it is not backed by any common implementation
>>>>>> classes. If max-retry is implemented in the HBase source, it should
>>> stay
>>>>>> there. For the same reason, the config proposed in this FLIP (probably
>>>>> with
>>>>>> a name less confusing for the first reason mentioned above)  should
>>> stay
>>>>> in
>>>>>> the specific Source implementation.
>>>>>> 
>>>>>> For the two reasons above, I am -1 to what the FLIP currently proposes.
>>>>>> 
>>>>>> I think the right way to address the motivation here is still to have a
>>>>>> config like "ignore-pushed-down-filters.enabled" for the specific
>>> source
>>>>>> implementation. Please let me know if this solves the problem you are
>>>>>> facing.
>>>>>> 
>>>>>> Thanks,
>>>>>> 
>>>>>> Jiangjie (Becket) Qin
>>>>>> 
>>>>>> 
>>>>>> On Wed, Nov 15, 2023 at 11:52 AM Jiabao Sun <jiabao....@xtransfer.cn
>>>>> .invalid>
>>>>>> wrote:
>>>>>> 
>>>>>>> Hi Becket,
>>>>>>> 
>>>>>>> The purpose of introducing this configuration is that not all filter
>>>>>>> pushdowns can improve overall performance.
>>>>>>> If the filter can hit the external index, then pushdown is definitely
>>>>>>> worth it, as it can not only improve query time but also decrease
>>>>> network
>>>>>>> overhead.
>>>>>>> However, for filters that do not hit the external index, it may
>>>>> increase a
>>>>>>> lot of performance overhead on the external system.
>>>>>>> 
>>>>>>> Undeniably, if the connector can make accurate decisions for good and
>>>>> bad
>>>>>>> filters, we may not need to introduce this configuration option to
>>>>> disable
>>>>>>> pushing down filters to the external system.
>>>>>>> However, it is currently not easy to achieve.
>>>>>>> 
>>>>>>> IMO, supporting filter pushdown does not mean that always filter
>>>>> pushdown
>>>>>>> is better.
>>>>>>> In the absence of automatic decision-making, I think we should leave
>>>>> this
>>>>>>> decision to users.
>>>>>>> 
>>>>>>> The newly introduced configuration option is similar to LookupOptions,
>>>>>>> providing unified naming and default values to avoid confusion caused
>>> by
>>>>>>> inconsistent naming in different connectors for users.
>>>>>>> Setting the default value to true allows it to maintain compatibility
>>>>> with
>>>>>>> the default behavior of "always pushdown".
>>>>>>> 
>>>>>>> Do you have any other concerns about this proposal? Please let me
>>> know.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Jiabao
>>>>>>> 
>>>>>>> 
>>>>>>>> 2023年10月31日 17:29,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道:
>>>>>>>> 
>>>>>>>> Hi Becket,
>>>>>>>> 
>>>>>>>> Actually, for FileSystemSource, it is not always desired, only OCR
>>> file
>>>>>>> formats support filter pushdown.
>>>>>>>> 
>>>>>>>> We can disable predicate pushdown for FileSystemSource by setting
>>>>>>> 'table.optimizer.source.predicate-pushdown-enabled' to false.
>>>>>>>> I think we can also disable filter pushdown at a more granular level
>>>>>>> through fine-grained configuration.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Best,
>>>>>>>> Jiabao
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 2023年10月31日 16:50,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>> 
>>>>>>>>> Hi Jiabao,
>>>>>>>>> 
>>>>>>>>> Thanks for the explanation. Maybe it's easier to explain with an
>>>>>>> example.
>>>>>>>>> 
>>>>>>>>> Let's take FileSystemTableSource as an example. Currently it
>>>>> implements
>>>>>>>>> SupportsFilterPushDown interface. With your proposal, does it have
>>> to
>>>>>>>>> support `source.filter-push-down.enabled` as well? But this
>>>>>>> configuration
>>>>>>>>> does not quite make sense for the FileSystemTableSource because
>>> filter
>>>>>>>>> pushdown is always desired. However, because this configuration is a
>>>>>>> part
>>>>>>>>> of the SupportsFilterPushDown interface (which sounds confusing to
>>>>> begin
>>>>>>>>> with), the FileSystemTableSource can only do one of the following:
>>>>>>>>> 
>>>>>>>>> 1. Ignore the user configuration to always apply the pushed down
>>>>>>> filters -
>>>>>>>>> this is an apparent anti-pattern because a configuration should
>>> always
>>>>>>> do
>>>>>>>>> what it says.
>>>>>>>>> 2. Throw an exception telling users that this configuration is not
>>>>>>>>> applicable to the FileSystemTableSource.
>>>>>>>>> 3. Implement this configuration to push back the pushed down
>>> filters,
>>>>>>> even
>>>>>>>>> though this is never desired.
>>>>>>>>> 
>>>>>>>>> None of the above options looks awkward. I am curious what your
>>>>>>> solution is
>>>>>>>>> here?
>>>>>>>>> 
>>>>>>>>> Thanks,
>>>>>>>>> 
>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>> 
>>>>>>>>> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun <jiabao....@xtransfer.cn
>>>>>>> .invalid>
>>>>>>>>> wrote:
>>>>>>>>> 
>>>>>>>>>> Thanks Becket for the further explanation.
>>>>>>>>>> 
>>>>>>>>>> Perhaps I didn't explain it clearly.
>>>>>>>>>> 
>>>>>>>>>> 1. If a source does not implement the SupportsFilterPushDown
>>>>> interface,
>>>>>>>>>> the newly added configurations do not need to be added to either
>>> the
>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>> Similar to LookupOptions, if a source does not implement
>>>>>>>>>> LookupTableSource, there is no need to add LookupOptions to either
>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>> 
>>>>>>>>>> 2. "And these configs are specific to those sources, instead of
>>>>> common
>>>>>>>>>> configs."
>>>>>>>>>> The newly introduced configurations define standardized names and
>>>>>>> default
>>>>>>>>>> values.
>>>>>>>>>> They still belong to the configuration at the individual source
>>>>> level.
>>>>>>>>>> The purpose is to avoid scattered configuration items when
>>> different
>>>>>>>>>> sources implement the same logic.
>>>>>>>>>> Whether a source should accept these configurations is determined
>>> by
>>>>>>> the
>>>>>>>>>> source's Factory.
>>>>>>>>>> 
>>>>>>>>>> Best,
>>>>>>>>>> Jiabao
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>>> 2023年10月31日 13:47,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>> 
>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>> 
>>>>>>>>>>> Please see the replies inline.
>>>>>>>>>>> 
>>>>>>>>>>> Introducing common configurations does not mean that all sources
>>>>> must
>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>> The configuration options supported by a source are determined by
>>>>> the
>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory interface.
>>>>>>>>>>> 
>>>>>>>>>>> This is not true. Both required and optional options are
>>> SUPPORTED.
>>>>>>> That
>>>>>>>>>>> means they are implemented and if one specifies an optional config
>>>>> it
>>>>>>>>>> will
>>>>>>>>>>> still take effect. An OptionalConfig is "Optional" because this
>>>>>>>>>>> configuration has a default value. Hence, it is OK that users do
>>> not
>>>>>>>>>>> specify their own value. In another word, it is "optional" for the
>>>>> end
>>>>>>>>>>> users to set the config, but the implementation and support for
>>> that
>>>>>>>>>> config
>>>>>>>>>>> is NOT optional. In case a source does not support a common
>>> config,
>>>>> an
>>>>>>>>>>> exception must be thrown when the config is provided by the end
>>>>> users.
>>>>>>>>>>> However, the config we are talking about in this FLIP is a common
>>>>>>> config
>>>>>>>>>>> optional to implement, meaning that sometimes the claimed behavior
>>>>>>> won't
>>>>>>>>>> be
>>>>>>>>>>> there even if users specify that config.
>>>>>>>>>>> 
>>>>>>>>>>> Similar to sources that do not implement the LookupTableSource
>>>>>>> interface,
>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>> interface
>>>>>>> also
>>>>>>>>>> do
>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>> 
>>>>>>>>>>> First of all, filter pushdown is a behavior of the query
>>> optimizer,
>>>>>>> not
>>>>>>>>>> the
>>>>>>>>>>> behavior of Sources. The Sources tells the optimizer that it has
>>> the
>>>>>>>>>>> ability to accept pushed down filters by implementing the
>>>>>>>>>>> SupportsFilterPushDown interface. And this is the only contract
>>>>>>> between
>>>>>>>>>> the
>>>>>>>>>>> Source and Optimizer regarding whether filters should be pushed
>>>>> down.
>>>>>>> As
>>>>>>>>>>> long as a specific source implements this decorative interface,
>>>>> filter
>>>>>>>>>>> pushdown should always take place, i.e.
>>>>>>>>>>> *SupportsFilterPushDown.applyFilters()* will be called. There
>>> should
>>>>>>> be
>>>>>>>>>> no
>>>>>>>>>>> other config to disable that call. However, Sources can decide how
>>>>> to
>>>>>>>>>>> behave based on their own configurations after *applyFilters()* is
>>>>>>>>>> called.
>>>>>>>>>>> And these configs are specific to those sources, instead of common
>>>>>>>>>> configs.
>>>>>>>>>>> Please see the examples I mentioned in the previous email.
>>>>>>>>>>> 
>>>>>>>>>>> Thanks,
>>>>>>>>>>> 
>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>> 
>>>>>>>>>>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun <
>>>>> jiabao....@xtransfer.cn
>>>>>>>>>> .invalid>
>>>>>>>>>>> wrote:
>>>>>>>>>>> 
>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>> 
>>>>>>>>>>>> Sorry, there was a typo in the second point. Let me correct it:
>>>>>>>>>>>> 
>>>>>>>>>>>> Introducing common configurations does not mean that all sources
>>>>> must
>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>> The configuration options supported by a source are determined by
>>>>> the
>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory interface.
>>>>>>>>>>>> 
>>>>>>>>>>>> Similar to sources that do not implement the LookupTableSource
>>>>>>>>>> interface,
>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>> interface
>>>>>>> also
>>>>>>>>>> do
>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>>> 
>>>>>>>>>>>> Best,
>>>>>>>>>>>> Jiabao
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>>> 2023年10月31日 10:13,Jiabao Sun <jiabao....@xtransfer.cn.INVALID>
>>>>> 写道:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks Becket for the feedback.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 1. Currently, the SupportsFilterPushDown#applyFilters method
>>>>>>> returns a
>>>>>>>>>>>> result that includes acceptedFilters and remainingFilters. The
>>>>> source
>>>>>>>>>> can
>>>>>>>>>>>> decide to push down some filters or not accept any of them.
>>>>>>>>>>>>> 2. Introducing common configuration options does not mean that a
>>>>>>> source
>>>>>>>>>>>> that supports the SupportsFilterPushDown capability must accept
>>>>> this
>>>>>>>>>>>> configuration. Similar to LookupOptions, only sources that
>>>>> implement
>>>>>>> the
>>>>>>>>>>>> LookupTableSource interface are necessary to accept these
>>>>>>> configuration
>>>>>>>>>>>> options.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2023年10月31日 07:49,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Jiabao and Ruanhang,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Adding a configuration of source.filter-push-down.enabled as a
>>>>>>> common
>>>>>>>>>>>>>> source configuration seems problematic.
>>>>>>>>>>>>>> 1. The config name is misleading. filter pushdown should only
>>> be
>>>>>>>>>>>> determined
>>>>>>>>>>>>>> by whether the SupportsFilterPushdown interface is implemented
>>> or
>>>>>>> not.
>>>>>>>>>>>>>> 2. The behavior of this configuration is only applicable to
>>> some
>>>>>>>>>> source
>>>>>>>>>>>>>> implementations. Why is it a common configuration?
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Here's my suggestion for design principles:
>>>>>>>>>>>>>> 1. Only add source impl specific configuration to corresponding
>>>>>>>>>> sources.
>>>>>>>>>>>>>> 2. The configuration name should not overrule existing common
>>>>>>>>>> contracts.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For example, in the case of MySql source. There are several
>>>>>>> options:
>>>>>>>>>>>>>> 1. Have a configuration of
>>>>> `*mysql.avoid.remote.full.table.scan`*.
>>>>>>> If
>>>>>>>>>>>> this
>>>>>>>>>>>>>> configuration is set, and a filter pushdown does not hit an
>>>>> index,
>>>>>>> the
>>>>>>>>>>>>>> MySql source impl would not further pushdown the filter to
>>> MySql
>>>>>>>>>>>> servers.
>>>>>>>>>>>>>> Note that this assumes the MySql source can retrieve the index
>>>>>>>>>>>> information
>>>>>>>>>>>>>> from the MySql servers.
>>>>>>>>>>>>>> 2. If the MySql index information is not available to the MySql
>>>>>>>>>> source,
>>>>>>>>>>>> the
>>>>>>>>>>>>>> configuration could be something like
>>>>>>>>>>>> *`mysql.pushback.pushed.down.filters`*.
>>>>>>>>>>>>>> Once set to true, MySql source would just add all the filters
>>> to
>>>>>>> the
>>>>>>>>>>>>>> RemainingFilters in the Result returned by
>>>>>>>>>>>>>> *SupportsFilterPushdown.applyFilters().*
>>>>>>>>>>>>>> 3. An alternative to option 2 is to have a `
>>>>>>>>>>>>>> *mysql.apply.predicates.after.scan*`. When it is set to true,
>>>>> MySql
>>>>>>>>>>>> source
>>>>>>>>>>>>>> will not push the filter down to the MySql servers, but apply
>>> the
>>>>>>>>>>>> filters
>>>>>>>>>>>>>> inside the MySql source itself.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> As you may see, the above configurations do not disable filter
>>>>>>>>>> pushdown
>>>>>>>>>>>>>> itself. They just allow various implementations of filter
>>>>> pushdown.
>>>>>>>>>> And
>>>>>>>>>>>> the
>>>>>>>>>>>>>> configuration name does not give any illusion that filter
>>>>> pushdown
>>>>>>> is
>>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun <
>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks Hang for the suggestion.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I think the configuration of TableSource is not closely
>>> related
>>>>> to
>>>>>>>>>>>>>>> SourceReader,
>>>>>>>>>>>>>>> so I prefer to introduce a independent configuration class
>>>>>>>>>>>>>>> TableSourceOptions in the flink-table-common module, similar
>>> to
>>>>>>>>>>>>>>> LookupOptions.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> For the second point, I suggest adding Java doc to the
>>>>>>>>>>>> SupportsXXXPushDown
>>>>>>>>>>>>>>> interfaces, providing detailed information on these options
>>> that
>>>>>>>>>> needs
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> be supported.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I have made updates in the FLIP document.
>>>>>>>>>>>>>>> Please help check it again.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com> 写道:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Thanks for the improvements, Jiabao.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> There are some details that I am not sure about.
>>>>>>>>>>>>>>>> 1. The new option `source.filter-push-down.enabled` will be
>>>>>>> added to
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>> class? I think it should be `SourceReaderOptions`.
>>>>>>>>>>>>>>>> 2. How are the connector developers able to know and follow
>>> the
>>>>>>>>>> FLIP?
>>>>>>>>>>>> Do
>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>> need an abstract base class or provide a default method?
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Hang
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> 于2023年10月30日周一
>>>>>>>>>> 14:45写道:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi, all,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks for the lively discussion.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Based on the discussion, I have made some adjustments to the
>>>>>>> FLIP
>>>>>>>>>>>>>>> document:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 1. The name of the newly added option has been changed to
>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled".
>>>>>>>>>>>>>>>>> 2. Considering compatibility with older versions, the newly
>>>>>>> added
>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled" option needs to respect
>>> the
>>>>>>>>>>>>>>> optimizer's
>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" option.
>>>>>>>>>>>>>>>>> But there is a consideration to remove the old option in
>>> Flink
>>>>>>> 2.0.
>>>>>>>>>>>>>>>>> 3. We can provide more options to disable other source
>>>>> abilities
>>>>>>>>>> with
>>>>>>>>>>>>>>> side
>>>>>>>>>>>>>>>>> effects, such as “source.aggregate.enabled” and
>>>>>>>>>>>>>>> “source.projection.enabled"
>>>>>>>>>>>>>>>>> This is not urgent and can be continuously introduced.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Looking forward to your feedback again.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 2023年10月29日 08:45,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks for digging into the git history, Jark. I agree it
>>>>> makes
>>>>>>>>>>>> sense
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>> deprecate this API in 2.0.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <imj...@gmail.com>
>>>>>>> wrote:
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> I checked the history of "
>>>>>>>>>>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled*",
>>>>>>>>>>>>>>>>>>> it seems it was introduced since the legacy
>>>>>>> FilterableTableSource
>>>>>>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>> which might be an experiential feature at that time. I
>>> don't
>>>>>>> see
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> necessity
>>>>>>>>>>>>>>>>>>> of this option at the moment. Maybe we can deprecate this
>>>>>>> option
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>> in Flink 2.0[1] if it is not necessary anymore. This may
>>>>> help
>>>>>>> to
>>>>>>>>>>>>>>>>>>> simplify this discussion.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-32383
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <
>>>>>>> becket....@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents below:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 1. If I understand correctly, the motivation of the FLIP
>>> is
>>>>>>>>>>>> mainly to
>>>>>>>>>>>>>>>>>>>> make predicate pushdown optional on SOME of the Sources.
>>> If
>>>>>>> so,
>>>>>>>>>>>>>>>>> intuitively
>>>>>>>>>>>>>>>>>>>> the configuration should be Source specific instead of
>>>>>>> general.
>>>>>>>>>>>>>>>>> Otherwise,
>>>>>>>>>>>>>>>>>>>> we will end up with general configurations that may not
>>>>> take
>>>>>>>>>>>> effect
>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> some of the Source implementations. This violates the
>>> basic
>>>>>>> rule
>>>>>>>>>>>> of a
>>>>>>>>>>>>>>>>>>>> configuration - it does what it says, regardless of the
>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>>>> While configuration standardization is usually a good
>>>>> thing,
>>>>>>> it
>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>> break the basic rules.
>>>>>>>>>>>>>>>>>>>> If we really want to have this general configuration, for
>>>>> the
>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>> this configuration does not apply, they should throw an
>>>>>>>>>> exception
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> make
>>>>>>>>>>>>>>>>>>>> it clear that this configuration is not supported.
>>> However,
>>>>>>> that
>>>>>>>>>>>>>>> seems
>>>>>>>>>>>>>>>>> ugly.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 2. I think the actual motivation of this FLIP is about
>>>>> "how a
>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>> should implement predicate pushdown efficiently", not
>>>>>>> "whether
>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>> pushdown should be applied to the source." For example,
>>> if
>>>>> a
>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>> wants
>>>>>>>>>>>>>>>>>>>> to avoid additional computing load in the external
>>> system,
>>>>> it
>>>>>>>>>> can
>>>>>>>>>>>>>>>>> always
>>>>>>>>>>>>>>>>>>>> read the entire record and apply the predicates by
>>> itself.
>>>>>>>>>>>> However,
>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>> the Flink perspective, the predicate pushdown is applied,
>>>>> it
>>>>>>> is
>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>>>> implemented differently by the source. So the design
>>>>>>> principle
>>>>>>>>>>>> here
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>> Flink only cares about whether a source supports
>>> predicate
>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>> not,
>>>>>>>>>>>>>>>>>>>> it does not care about the implementation efficiency /
>>> side
>>>>>>>>>>>> effect of
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> predicates pushdown. It is the Source implementation's
>>>>>>>>>>>> responsibility
>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> ensure the predicates pushdown is implemented efficiently
>>>>> and
>>>>>>>>>> does
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>> impose excessive pressure on the external system. And it
>>> is
>>>>>>> OK
>>>>>>>>>> to
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>> additional configurations to achieve this goal.
>>> Obviously,
>>>>>>> such
>>>>>>>>>>>>>>>>>>>> configurations will be source specific in this case.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 3. Regarding the existing configurations of
>>>>>>>>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled.
>>>>>>>>>>>>>>>>>>>> *I am not sure why we need it. Supposedly, if a source
>>>>>>>>>> implements
>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer should push
>>>>> the
>>>>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>>>>>>>> predicates to the Source. I am not sure in which case
>>> this
>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>> would be used. Any ideas @Jark Wu <imj...@gmail.com>?
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
>>>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks Jane for the detailed explanation.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I think that for users, we should respect conventions
>>> over
>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>> Conventions can be default values explicitly specified
>>> in
>>>>>>>>>>>>>>>>>>>>> configurations, or they can be behaviors that follow
>>>>>>> previous
>>>>>>>>>>>>>>>>> versions.
>>>>>>>>>>>>>>>>>>>>> If the same code has different behaviors in different
>>>>>>> versions,
>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>> be a very bad thing.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> I agree that for regular users, it is not necessary to
>>>>>>>>>> understand
>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>> the configurations related to Flink.
>>>>>>>>>>>>>>>>>>>>> By following conventions, they can have a good
>>> experience.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Let's get back to the practical situation and consider
>>> it.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Case 1:
>>>>>>>>>>>>>>>>>>>>> The user is not familiar with the purpose of the
>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>> configuration
>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>> follows
>>>>>>>>>>>>>>>>>>>>> the convention of allowing predicate pushdown to the
>>>>> source
>>>>>>> by
>>>>>>>>>>>>>>>>> default.
>>>>>>>>>>>>>>>>>>>>> Just understanding the source.predicate-pushdown-enabled
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>> and performing fine-grained toggle control will work
>>> well.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Case 2:
>>>>>>>>>>>>>>>>>>>>> The user understands the meaning of the
>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>> configuration
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>> has set
>>>>>>>>>>>>>>>>>>>>> its value to false.
>>>>>>>>>>>>>>>>>>>>> We have reason to believe that the user understands the
>>>>>>> meaning
>>>>>>>>>>>> of
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> predicate pushdown configuration and the intention is to
>>>>>>>>>> disable
>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>> pushdown (rather than whether or not to allow it).
>>>>>>>>>>>>>>>>>>>>> The previous choice of globally disabling it is likely
>>>>>>> because
>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>> couldn't be disabled on individual sources.
>>>>>>>>>>>>>>>>>>>>> From this perspective, if we provide more fine-grained
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>> support and provide detailed explanations of the
>>>>>>> configuration
>>>>>>>>>>>>>>>>> behaviors in
>>>>>>>>>>>>>>>>>>>>> the documentation,
>>>>>>>>>>>>>>>>>>>>> users can clearly understand the differences between
>>> these
>>>>>>> two
>>>>>>>>>>>>>>>>>>>>> configurations and use them correctly.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Also, I don't agree that
>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled = true
>>>>> and
>>>>>>>>>>>>>>>>>>>>> source.predicate-pushdown-enabled = false means that the
>>>>>>> local
>>>>>>>>>>>>>>>>>>>>> configuration overrides the global configuration.
>>>>>>>>>>>>>>>>>>>>> On the contrary, both configurations are functioning
>>>>>>> correctly.
>>>>>>>>>>>>>>>>>>>>> The optimizer allows predicate pushdown to all sources,
>>>>> but
>>>>>>>>>> some
>>>>>>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>>> can reject the filters pushed down by the optimizer.
>>>>>>>>>>>>>>>>>>>>> This is natural, just like different components at
>>>>> different
>>>>>>>>>>>> levels
>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>> responsible for different tasks.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> The more serious issue is that if
>>>>>>>>>>>>>>> "source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>> does not respect
>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled”,
>>>>>>>>>>>>>>>>>>>>> the "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>> will be invalidated.
>>>>>>>>>>>>>>>>>>>>> This means that regardless of whether
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" is
>>> set
>>>>>>> to
>>>>>>>>>>>> true
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>> false, it will have no effect.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 22:24,Jane Chan <qingyue....@gmail.com>
>>> 写道:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Thanks for the in-depth clarification. Here are my
>>> cents
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for
>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> We cannot assume that every user would be interested in
>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> internal components of Flink, such as the optimizer or
>>>>>>>>>>>> connectors,
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> specific configurations associated with each component.
>>>>>>>>>> Instead,
>>>>>>>>>>>>>>>>> users
>>>>>>>>>>>>>>>>>>>>>> might be more concerned about knowing which
>>> configuration
>>>>>>>>>>>> enables
>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>> disables the filter push-down feature for all source
>>>>>>>>>> connectors,
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>> parameter provides the flexibility to override this
>>>>>>> behavior
>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>>>>>>>>> source if needed.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> So, from this perspective, I am inclined to divide
>>> these
>>>>>>> two
>>>>>>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>>> based on the scope of their impact from the user's
>>>>>>> perspective
>>>>>>>>>>>>>>> (i.e.
>>>>>>>>>>>>>>>>>>>>>> global-level or operator-level), rather than
>>> categorizing
>>>>>>> them
>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>>>>>>> component hierarchy from a developer's point of view.
>>>>>>>>>> Therefore,
>>>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>> this premise, it is intuitive and natural for users to
>>>>>>>>>>>>>>>>>>>>>> understand fine-grained configuration options can
>>>>> override
>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled"
>>> doesn't
>>>>>>>>>>>> respect to
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>> and
>>>>>>> the
>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined as true,
>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>> false
>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be performed.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in behavior
>>>>>>> compared
>>>>>>>>>> to
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> <1>If I understand correctly,
>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>>>>>> connector option, which means the only way to configure
>>>>> it
>>>>>>> is
>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>> explicitly
>>>>>>>>>>>>>>>>>>>>>> specify it in DDL (no matter whether disable or
>>> enable),
>>>>>>> and
>>>>>>>>>> the
>>>>>>>>>>>>>>> SET
>>>>>>>>>>>>>>>>>>>>>> command is not applicable, so I think it's natural to
>>>>> still
>>>>>>>>>>>> respect
>>>>>>>>>>>>>>>>>>>>> user's
>>>>>>>>>>>>>>>>>>>>>> specification here. Otherwise, users might be more
>>>>> confused
>>>>>>>>>>>> about
>>>>>>>>>>>>>>> why
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> DDL does not work as expected, and the reason is just
>>>>>>> because
>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>>>>>>>> "optimizer" configuration is set to a different value.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> <2> From the implementation side, I am inclined to keep
>>>>> the
>>>>>>>>>>>>>>>>> parameter's
>>>>>>>>>>>>>>>>>>>>>> priority consistent for all conditions.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Let "global" denote
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled",
>>>>>>>>>>>>>>>>>>>>>> and let "per-source" denote
>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>> source T,  the following Truth table (based on the
>>>>> current
>>>>>>>>>>>> design)
>>>>>>>>>>>>>>>>>>>>>> indicates the inconsistent behavior for "per-source
>>>>>>> override
>>>>>>>>>>>>>>> global".
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> .------------.---------------.-------------------
>>>>>>>>>>>>>>>>>>>>>> ----.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>> | global   | per-source | push-down for T | per-source
>>>>>>>>>> override
>>>>>>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>> |
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>> | true       | false         | false
>>>>> | Y
>>>>>>>>>>>>>>>>>>>>>>             |
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>> | false     | true           | false
>>>>> | N
>>>>>>>>>>>>>>>>>>>>>>             |
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> .------------.---------------.-----------------------.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <
>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I understand that the configuration of global
>>>>> parallelism
>>>>>>> and
>>>>>>>>>>>> task
>>>>>>>>>>>>>>>>>>>>>>> parallelism is at different granularities but with the
>>>>>>> same
>>>>>>>>>>>>>>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for
>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> From a user's perspective, there are two scenarios:
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 1. Disabling all filter pushdown
>>>>>>>>>>>>>>>>>>>>>>> In this case, setting
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>> to false is sufficient to meet the requirement.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific sources
>>>>>>>>>>>>>>>>>>>>>>> In this scenario, there is no need to adjust the value
>>>>> of
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled".
>>>>>>>>>>>>>>>>>>>>>>> Instead, the focus should be on the configuration of
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the
>>> requirement.
>>>>>>>>>>>>>>>>>>>>>>> In this case, users do not need to set
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to
>>>>>>> false
>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>> manually
>>>>>>>>>>>>>>>>>>>>>>> enable filter pushdown for specific sources.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled"
>>> doesn't
>>>>>>>>>>>> respect
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>> and
>>>>>>> the
>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined as true,
>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>> false
>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be performed.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in behavior
>>>>>>> compared
>>>>>>>>>> to
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>> The same SQL query that could successfully push down
>>>>>>> filters
>>>>>>>>>> in
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> version but would no longer do so after the upgrade.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 17:10,Benchao Li <libenc...@apache.org>
>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations, that
>>>>> helps a
>>>>>>>>>>>> lot, I
>>>>>>>>>>>>>>>>>>>>>>>> understand your rationale now.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is from
>>>>>>>>>> "developer",
>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>> means there is an optimizer and connector component,
>>>>> and
>>>>>>> if
>>>>>>>>>> we
>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>> enable this feature (pushing filters down into
>>>>>>> connectors),
>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>>>>>>>> enable it firstly in optimizer, and only then
>>> connector
>>>>>>> has
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>>>>>>>>> to decide to use it or not.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> My perspective is from "user" that (Why a user should
>>>>>>> care
>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>> difference of optimizer/connector) , this is a
>>> feature,
>>>>>>> and
>>>>>>>>>>>> has
>>>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>> way to control it, one way is to config it job-level,
>>>>> the
>>>>>>>>>>>> other
>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>> in table properties. What a user expects is that they
>>>>> can
>>>>>>>>>>>>>>> control a
>>>>>>>>>>>>>>>>>>>>>>>> feature in a tiered way, that setting it per job, and
>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>> fine-grained tune it per table.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> This is some kind of similar to other concepts, such
>>> as
>>>>>>>>>>>>>>>>> parallelism,
>>>>>>>>>>>>>>>>>>>>>>>> users can set a job level default parallelism, and
>>> then
>>>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>> tune it per operator. There may be more such debate
>>> in
>>>>>>> the
>>>>>>>>>>>> future
>>>>>>>>>>>>>>>>>>>>>>>> e.g., we can have a job level config about adding
>>>>> key-by
>>>>>>>>>>>> before
>>>>>>>>>>>>>>>>>>>>> lookup
>>>>>>>>>>>>>>>>>>>>>>>> join, and also a hint/table property way to
>>>>> fine-grained
>>>>>>>>>>>> control
>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>> per lookup operator. Hence we'd better find a unified
>>>>> way
>>>>>>>>>> for
>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>> those similar kind of features.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>>> 于2023年10月25日周三
>>>>>>>>>>>>>>>>> 15:27写道:
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for further explanation.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> These two configurations correspond to different
>>>>> levels.
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid.
>>>>>>>>>>>>>>>>>>>>>>>>> The planner will still push down predicates to all
>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not is
>>>>> determined
>>>>>>> by
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>> source's "scan.filter-push-down.enabled"
>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate" does
>>>>>>> directly
>>>>>>>>>>>> affect
>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled”.
>>>>>>>>>>>>>>>>>>>>>>>>> When the planner disables predicate pushdown, the
>>>>>>>>>>>> source-level
>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>> pushdown will also not be executed, even if the source
>>>>>>> allows
>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Whatever, in point 1 and 2, our expectation is
>>>>>>> consistent.
>>>>>>>>>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the
>>>>> planner-level
>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>> takes precedence over the source-level configuration.
>>>>>>>>>>>>>>>>>>>>>>>>> It may seem counterintuitive when we globally
>>> disable
>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>> pushdown but allow filter pushdown at the source
>>> level.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 14:35,Jane Chan <qingyue....@gmail.com
>>>> 
>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for clarifying this. While by
>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled
>>>>>>>>>>>>>>>>>>>>>>> takes a
>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority" I meant that this value should be
>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>> whenever
>>>>>>>>>>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>>>>>>>>>>>>>> set explicitly.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> The conclusion that
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate pushdown,
>>> but
>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> This indicates that the option
>>>>>>>>>>>> "scan.filter-push-down.enabled =
>>>>>>>>>>>>>>>>>>>>> false"
>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>> an individual source connector does indeed override
>>>>> the
>>>>>>>>>>>>>>>>>>>>> global-level
>>>>>>>>>>>>>>>>>>>>>>>>>> planner settings to make a difference. And thus
>>> "has
>>>>> a
>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>> priority".
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> While for
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains consistent
>>>>> with
>>>>>>>>>> the
>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> I still think "scan.filter-push-down.enabled"
>>> should
>>>>>>> also
>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>>>>>>>>>>>> it is enabled for individual connectors. WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend keeping the
>>>>>>>>>> default
>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" as true,
>>>>>>>>>>>>>>>>>>>>>>>>>>> and setting the the default value of newly
>>>>> introduced
>>>>>>>>>>>> option
>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true as well.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to maintain
>>>>>>> consistency
>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>> versions, as whether to perform
>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown in the old version solely depends
>>> on
>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" option.
>>>>>>>>>>>>>>>>>>>>>>>>>>> That means by default, as long as a TableSource
>>>>>>>>>> implements
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface, filter pushdown
>>> is
>>>>>>>>>>>> allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>> And it seems that we don't have much benefit in
>>>>>>> changing
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>> of "table.optimizer.source.predicate" to false.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding the priority of these two
>>> configurations,
>>>>> I
>>>>>>>>>>>> believe
>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over
>>>>> "scan.filter-push-down.enabled"
>>>>>>> and
>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>> exhibits
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> following behavior.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
>>>>>>>>>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing filter
>>>>> pushdown
>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate pushdown,
>>> but
>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains consistent
>>>>> with
>>>>>>>>>> the
>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> From an implementation perspective, setting the
>>>>>>> priority
>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" higher than
>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is difficult to
>>>>>>>>>> achieve
>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Because the PushFilterIntoSourceScanRuleBase at
>>> the
>>>>>>>>>> planner
>>>>>>>>>>>>>>>>> level
>>>>>>>>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over the source-level
>>> FilterPushDownSpec.
>>>>>>>>>>>>>>>>>>>>>>>>>>> Only when the PushFilterIntoSourceScanRuleBase is
>>>>>>>>>> enabled,
>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> Source-level filter pushdown be performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, in my opinion, there doesn't seem to
>>>>> be
>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>> benefit in
>>>>>>>>>>>>>>>>>>>>>>>>>>> setting a higher priority for
>>>>>>>>>>>> "scan.filter-push-down.enabled".
>>>>>>>>>>>>>>>>>>>>>>>>>>> It may instead affect compatibility and increase
>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 11:56,Benchao Li <
>>> libenc...@apache.org
>>>>>> 
>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with Jane that fine-grained
>>> configurations
>>>>>>>>>> should
>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority than job level configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> For current proposal, we can achieve that:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "true"
>>>>> to
>>>>>>>>>>>> enable
>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>> "false"
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" =
>>> "false"
>>>>> to
>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>> "true" to
>>>>>>>>>>>>>>>>>>>>> enable
>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane Chan <qingyue....@gmail.com> 于2023年10月24日周二
>>>>>>>>>> 23:55写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level than the
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong, but I think the
>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" should have a
>>>>> higher
>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default value of
>>>>> "table.optimizer.source.predicate"
>>>>>>> is
>>>>>>>>>>>> true.
>>>>>>>>>>>>>>>>> As
>>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>>> result,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> turning off filter push-down for a specific
>>> source
>>>>>>> will
>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>> take
>>>>>>>>>>>>>>>>>>>>>>> effect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unless the default value of
>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false, or, alternatively, let users manually
>>>>> set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" to false
>>> first
>>>>>>> and
>>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>> selectively
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter push-down for the desired sources,
>>>>>>> which
>>>>>>>>>> is
>>>>>>>>>>>>>>> less
>>>>>>>>>>>>>>>>>>>>>>>>>>> intuitive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level than the
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 17:36,Jane Chan <
>>>>>>> qingyue....@gmail.com>
>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for driving this discussion. I have a
>>>>> small
>>>>>>>>>>>>>>> question
>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take
>>> precedence
>>>>>>> over
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" when the
>>> two
>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>>>>>>>>> conflict
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each other?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jark,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we only add configuration without adding
>>> the
>>>>>>>>>>>>>>>>>>>>>>> enableFilterPushDown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method in the SupportsFilterPushDown
>>> interface,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each connector would have to handle the same
>>>>>>> logic
>>>>>>>>>> in
>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method to determine whether filter pushdown
>>> is
>>>>>>>>>> needed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This would increase complexity and violate
>>> the
>>>>>>>>>>>> original
>>>>>>>>>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters method.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, we only need to pass the
>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>> parameter in
>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> newly added enableFilterPushDown method
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide whether to perform predicate
>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think this approach would be clearer and
>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 16:58,Jark Wu <imj...@gmail.com
>>>> 
>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi JIabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think the current interface can already
>>>>>>> satisfy
>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>>>>>>>> requirements.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The connector can reject all the filters by
>>>>>>>>>> returning
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> input
>>>>>>>>>>>>>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as `Result#remainingFilters`.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So maybe we don't need to introduce a new
>>>>>>> method to
>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown, but just introduce an option for
>>> the
>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>> connector.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu <
>>>>>>>>>>>>>>>>> xbjt...@gmail.com
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this
>>>>> discussion.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you add a section to explain the
>>>>>>> difference
>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector level config
>>>>>>>>>>>> `scan.filter-push-down.enabled`
>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> query
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level config
>>>>>>>>>>>>>>>>>>>>>>> `table.optimizer.source.predicate-pushdown-enabled` ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonard
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun <
>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>>>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Devs,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on
>>>>>>> FLIP-377:
>>>>>>>>>>>>>>> support
>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable filter pushdown for Table/SQL
>>>>>>> Sources[1].
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not expose
>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users to enable or disable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, filter pushdown has some side
>>>>>>> effects,
>>>>>>>>>>>> such
>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computational pressure on external systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Moreover, Improper queries can lead to
>>>>> issues
>>>>>>>>>> such
>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>> full
>>>>>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> scans,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which in turn can impact the stability of
>>>>>>> external
>>>>>>>>>>>>>>>>> systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with two
>>>>> sources:
>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>> and a
>>>>>>>>>>>>>>>>>>>>>>>>>>> database.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The database is sensitive to pressure, and
>>>>> we
>>>>>>>>>> want
>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> configure
>>>>>>>>>>>>>>>>>>>>>>>>>>> it to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not perform filter pushdown to the database
>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, we still want to perform filter
>>>>>>> pushdown
>>>>>>>>>>>> to
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> decrease network IO.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I propose to support configuration to
>>>>> disable
>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>> push
>>>>>>>>>>>>>>>>>>>>>>> down for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Table/SQL sources to let user decide
>>> whether
>>>>> to
>>>>>>>>>>>> perform
>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>> 
>>>>> 
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>> 
>>>>> 
>>> 
>>> 


Reply via email to