Thanks Becket for the suggestions,

Updated.
Please help review it again when you have time.

Best,
Jiabao


> 2023年12月19日 09:06,Becket Qin <becket....@gmail.com> 写道:
> 
> Hi JIabao,
> 
> Thanks for updating the FLIP. It looks better. Some suggestions / questions:
> 
> 1. In the motivation section:
> 
>> *Currently, Flink Table/SQL does not expose fine-grained control for users
>> to control filter pushdown. **However, filter pushdown has some side
>> effects, such as additional computational pressure on external
>> systems. Moreover, Improper queries can lead to issues such as full table
>> scans, which in turn can impact the stability of external systems.*
> 
> This statement sounds like the side effects are there for all the systems,
> which is inaccurate. Maybe we can say:
> *Currently, Flink Table/SQL does not expose fine-grained control for users
> to control filter pushdown. **However, filter pushdown may have side
> effects in some cases, **such as additional computational pressure on
> external systems. The JDBC source is a typical example of that. If a filter
> is pushed down to the database, an expensive full table scan may happen if
> the filter involves unindexed columns.*
> 
> 2. Regarding the prefix, usually a prefix is not required for the top level
> connector options. This is because the *connector* option is already there.
> So
>    'connector' = 'jdbc',
>  'filter.handling.policy' = 'ALWAYS'
> is sufficient.
> 
> The prefix is needed when the option is for a 2nd+ level. For example,
>    'connector' = 'jdbc',
>    'format' = 'orc',
>    'orc.some.option' = 'some_value'
> In this case, the prefix of "orc" is needed to make it clear this option is
> for the format.
> 
> I am guessing that the reason that currently the connector prefix is there
> is because the values of this configuration may be different depending on
> the connectors. For example, jdbc may have INDEXED_ONLY while MongoDB may
> have something else. Personally speaking, I am fine if we do not have a
> prefix in this case because users have already specified the connector type
> and it is intuitive enough that the option value is for that connector, not
> others.
> 
> 3. can we clarify on the following statement:
> 
>> *Introduce the native configuration [prefix].filter.handling.policy in the
>> connector.*
> 
> What do you mean by "native configuration"? From what I understand, the
> FLIP does the following:
> - introduces a new configuration to the JDBC and MongoDB connector.
> - Suggests a convention option name if other connectors are going to add an
> option for the same purpose.
> 
> Thanks,
> 
> Jiangjie (Becket) Qin
> 
> 
> 
> On Mon, Dec 18, 2023 at 5:45 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
> wrote:
> 
>> Hi Becket,
>> 
>> The FLIP document[1] has been updated.
>> Could you help take a look again?
>> 
>> Thanks,
>> Jiabao
>> 
>> [1]
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
>> 
>> 
>>> 2023年12月18日 16:53,Becket Qin <becket....@gmail.com> 写道:
>>> 
>>> Yes, that sounds reasonable to me. We can start with ALWAYS and NEVER,
>> and
>>> add more policies as needed.
>>> 
>>> Thanks,
>>> 
>>> Jiangjie (Becket) Qin
>>> 
>>> On Mon, Dec 18, 2023 at 4:48 PM Jiabao Sun <jiabao....@xtransfer.cn
>> .invalid>
>>> wrote:
>>> 
>>>> Thanks Bucket,
>>>> 
>>>> The jdbc.filter.handling.policy is good to me as it provides sufficient
>>>> extensibility for future filter pushdown optimizations.
>>>> However, currently, we don't have an implementation for the AUTO mode,
>> and
>>>> it seems that the AUTO mode can easily be confused with the ALWAYS mode
>>>> because users don't have the opportunity to MANUALLY decide which
>> filters
>>>> to push down.
>>>> 
>>>> I suggest that we only introduce the ALWAYS and NEVER modes for now, and
>>>> we can consider introducing more flexible policies in the future,
>>>> such as INDEX_ONLY, NUMBERIC_ONLY and so on.
>>>> 
>>>> WDYT?
>>>> 
>>>> Best,
>>>> Jiabao
>>>> 
>>>> 
>>>> 
>>>>> 2023年12月18日 16:27,Becket Qin <becket....@gmail.com> 写道:
>>>>> 
>>>>> Hi Jiabao,
>>>>> 
>>>>> Please see the reply inline.
>>>>> 
>>>>> 
>>>>>> The MySQL connector is currently in the flink-connector-jdbc
>> repository
>>>>>> and is not a standalone connector.
>>>>>> Is it too unique to use "mysql" as the configuration option prefix?
>>>>> 
>>>>> If the intended behavior makes sense to all the supported JDBC drivers,
>>>> we
>>>>> can make this a JDBC connector configuration.
>>>>> 
>>>>> Also, I would like to ask about the difference in behavior between AUTO
>>>> and
>>>>>> ALWAYS.
>>>>>> It seems that we cannot guarantee the pushing down of all filters to
>> the
>>>>>> external system under the ALWAYS
>>>>>> mode because not all filters in Flink SQL are supported by the
>> external
>>>>>> system.
>>>>>> Should we throw an error when encountering a filter that cannot be
>>>> pushed
>>>>>> down in the ALWAYS mode?
>>>>> 
>>>>> The idea of AUTO is to do efficiency-aware pushdowns. The source will
>>>> query
>>>>> the external system (MySQL, Oracle, SQL Server, etc) first to retrieve
>>>> the
>>>>> information of the table. With that information, the source will decide
>>>>> whether to further push a filter to the external system based on the
>>>>> efficiency. E.g. only push the indexed fields. In contrast, ALWAYS will
>>>>> just always push the supported filters to the external system,
>> regardless
>>>>> of the efficiency. In case there are filters that are not supported,
>>>>> according to the current contract of SupportsFilterPushdown, these
>>>>> unsupported filters should just be returned by the
>>>>> *SupportsFilterPushdown.applyFilters()* method as remaining filters.
>>>>> Therefore, there is no need to throw exceptions here. This is likely
>> the
>>>>> desired behavior for most users, IMO. If there are cases that users
>>>> really
>>>>> want to get alerted when a filter cannot be pushed to the external
>>>> system,
>>>>> we can add another value like "ENFORCED_ALWAYS", which behaves like
>>>> ALWAYS,
>>>>> but throws exceptions when a filter cannot be applied to the external
>>>>> system. But personally I don't see much value in doing this.
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Jiangjie (Becket) Qin
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mon, Dec 18, 2023 at 3:54 PM Jiabao Sun <jiabao....@xtransfer.cn
>>>> .invalid>
>>>>> wrote:
>>>>> 
>>>>>> Hi Becket,
>>>>>> 
>>>>>> The MySQL connector is currently in the flink-connector-jdbc
>> repository
>>>>>> and is not a standalone connector.
>>>>>> Is it too unique to use "mysql" as the configuration option prefix?
>>>>>> 
>>>>>> Also, I would like to ask about the difference in behavior between
>> AUTO
>>>>>> and ALWAYS.
>>>>>> It seems that we cannot guarantee the pushing down of all filters to
>> the
>>>>>> external system under the ALWAYS
>>>>>> mode because not all filters in Flink SQL are supported by the
>> external
>>>>>> system.
>>>>>> Should we throw an error when encountering a filter that cannot be
>>>> pushed
>>>>>> down in the ALWAYS mode?
>>>>>> 
>>>>>> Thanks,
>>>>>> Jiabao
>>>>>> 
>>>>>>> 2023年12月18日 15:34,Becket Qin <becket....@gmail.com> 写道:
>>>>>>> 
>>>>>>> Hi JIabao,
>>>>>>> 
>>>>>>> Thanks for updating the FLIP. Maybe I did not explain it clearly
>>>> enough.
>>>>>> My
>>>>>>> point is that given there are various good flavors of behaviors
>>>> handling
>>>>>>> filters pushed down, we should not have a common config of
>>>>>>> "ignore.filter.pushdown", because the behavior is not *common*.
>>>>>>> 
>>>>>>> It looks like the original motivation of this FLIP is just for MySql.
>>>>>> Let's
>>>>>>> focus on what is the best solution for MySql connector here first.
>>>> After
>>>>>>> that, if people think the best behavior for MySql happens to be a
>>>> common
>>>>>>> one, we can then discuss whether that is worth being added to the
>> base
>>>>>>> implementation of source. For MySQL, if we are going to introduce a
>>>>>> config
>>>>>>> to MySql, why not have something like "mysql.filter.handling.policy"
>>>> with
>>>>>>> value of AUTO / NEVER / ALWAYS? Isn't that better than
>>>>>>> "ignore.filter.pushdown"?
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> 
>>>>>>> Jiangjie (Becket) Qin
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> On Sun, Dec 17, 2023 at 11:30 PM Jiabao Sun <jiabao....@xtransfer.cn
>>>>>> .invalid>
>>>>>>> wrote:
>>>>>>> 
>>>>>>>> Hi Becket,
>>>>>>>> 
>>>>>>>> The FLIP document has been updated as well.
>>>>>>>> Please take a look when you have time.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Jiabao
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> 2023年12月17日 22:54,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道:
>>>>>>>>> 
>>>>>>>>> Thanks Becket,
>>>>>>>>> 
>>>>>>>>> I apologize for not being able to continue with this proposal due
>> to
>>>>>>>> being too busy during this period.
>>>>>>>>> 
>>>>>>>>> The viewpoints you shared about the design of Flink Source make
>> sense
>>>>>> to
>>>>>>>> me
>>>>>>>>> The native configuration ‘ignore.filter.pushdown’ is good to me.
>>>>>>>>> Having a unified name or naming style can indeed prevent confusion
>>>> for
>>>>>>>> users regarding
>>>>>>>>> the inconsistent naming of this configuration across different
>>>>>>>> connectors.
>>>>>>>>> 
>>>>>>>>> Currently, there are not many external connectors that support
>> filter
>>>>>>>> pushdown.
>>>>>>>>> I propose that we first introduce it in flink-connector-jdbc and
>>>>>>>> flink-connector-mongodb.
>>>>>>>>> Do you think this is feasible?
>>>>>>>>> 
>>>>>>>>> Best,
>>>>>>>>> Jiabao
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>>> 2023年11月16日 17:45,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>> 
>>>>>>>>>> Hi Jiabao,
>>>>>>>>>> 
>>>>>>>>>> Arguments like "because Spark has it so Flink should also have it"
>>>>>> does
>>>>>>>> not
>>>>>>>>>> make sense. Different projects have different API flavors and
>>>> styles.
>>>>>>>> What
>>>>>>>>>> is really important is the rationale and the design principle
>> behind
>>>>>> the
>>>>>>>>>> API. They should conform to the convention of the project.
>>>>>>>>>> 
>>>>>>>>>> First of all, Spark Source API itself has a few issues and they
>>>> ended
>>>>>> up
>>>>>>>>>> introduce DataSource V2 in Spark 3.0, which added the decorative
>>>>>>>> interfaces
>>>>>>>>>> like SupportsPushdownXXX. Some of the configurations predating
>>>>>>>> DataSource
>>>>>>>>>> V2 may still be there.
>>>>>>>>>> 
>>>>>>>>>> For the Spark configurations you mentioned, they are all the
>>>>>>>> configurations
>>>>>>>>>> for FileScanBuilder, which is equivalent to FileSource in Flink.
>>>>>>>> Currently,
>>>>>>>>>> regardless of the format (ORC, Parquet, Avro, etc), the FileSource
>>>>>>>> pushes
>>>>>>>>>> back all the filters to ensure correctness. The actual filters
>> that
>>>>>> got
>>>>>>>>>> applied to the specific format might still be different. This
>>>>>>>>>> implementation is the same in FileScanBuilder.pushFilters() for
>>>>>> Spark. I
>>>>>>>>>> don't know why Spark got separate configurations for each format.
>>>>>> Maybe
>>>>>>>> it
>>>>>>>>>> is because the filters are actually implemented differently for
>>>>>>>> different
>>>>>>>>>> format.
>>>>>>>>>> 
>>>>>>>>>> At least for the current implementation in FileScanBuilder, these
>>>>>>>>>> configurations can be merged to one configuration like
>>>>>>>>>> `apply.filters.to.format.enabled`. Note that this config, as well
>> as
>>>>>> the
>>>>>>>>>> separate configs you mentioned, are just visible and used by the
>>>>>>>>>> FileScanBuilder. It determines whether the filters should be
>> passed
>>>>>>>> down to
>>>>>>>>>> the format of the FileScanBuilder instance. Regardless of the
>> value
>>>> of
>>>>>>>>>> these configs, FileScanBuilder.pushFilters() will always be
>> called,
>>>>>> and
>>>>>>>>>> FileScanBuilder (as well as FileSource in Flink) will always push
>>>> back
>>>>>>>> all
>>>>>>>>>> the filters to the framework.
>>>>>>>>>> 
>>>>>>>>>> A MySql source can have a very different way to handle this. For
>>>>>>>> example, A
>>>>>>>>>> MySql source  A config in this case might be "my.apply.filters"
>> with
>>>>>>>> three
>>>>>>>>>> different values:
>>>>>>>>>> - AUTO: The Source will issue a DESC Table request to understand
>>>>>>>> whether a
>>>>>>>>>> filter can be applied efficiently. And decide which filters can be
>>>>>>>> applied
>>>>>>>>>> and which cannot based on that.
>>>>>>>>>> - NEVER: Never apply filtering. It will always do a full table
>> read
>>>>>> and
>>>>>>>>>> let Flink do the filtering.
>>>>>>>>>> - ALWAYS: Always apply the filtering to the MySql server.
>>>>>>>>>> 
>>>>>>>>>> In the above examples of FileSource and MySql Source, I don't
>> think
>>>> it
>>>>>>>> is a
>>>>>>>>>> good idea to shoehorn the behaviors into a naive config of
>>>>>>>>>> `ignore.filter.pushdown`. That is why I don't think this is a
>> common
>>>>>>>> config.
>>>>>>>>>> 
>>>>>>>>>> To recap, like I said, I do agree that in some cases, we may want
>> to
>>>>>>>> behave
>>>>>>>>>> differently when filters are pushed down to the sources, even if a
>>>>>>>> source
>>>>>>>>>> implements SupportsFilterPushDown, but I don't think there is a
>>>>>> suitable
>>>>>>>>>> common config for this. The behavior is very likely source
>> specific.
>>>>>>>>>> 
>>>>>>>>>> Thanks,
>>>>>>>>>> 
>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> On Thu, Nov 16, 2023 at 3:41 PM Jiabao Sun <
>> jiabao....@xtransfer.cn
>>>>>>>> .invalid>
>>>>>>>>>> wrote:
>>>>>>>>>> 
>>>>>>>>>>> Thanks Becket,
>>>>>>>>>>> 
>>>>>>>>>>> I still believe that adding a configuration at the source level
>> to
>>>>>>>> disable
>>>>>>>>>>> filter pushdown is needed. This demand exists in spark as
>> well[1].
>>>>>>>>>>> 
>>>>>>>>>>> In Spark, most sources that support filter pushdown provide their
>>>> own
>>>>>>>>>>> corresponding configuration options to enable or disable filter
>>>>>>>> pushdown.
>>>>>>>>>>> For PRs[2-4] that support filter pushdown capability, they also
>>>>>> provide
>>>>>>>>>>> configuration options to disable this capability.
>>>>>>>>>>> 
>>>>>>>>>>> I believe this configuration is applicable to most scenarios, and
>>>>>>>> there is
>>>>>>>>>>> no need to dwell on why this configuration option was not
>>>> introduced
>>>>>>>>>>> earlier than the SupportsFilterPushDown interface.
>>>>>>>>>>> 
>>>>>>>>>>> spark.sql.parquet.filterPushdown
>>>>>>>>>>> spark.sql.orc.filterPushdown
>>>>>>>>>>> spark.sql.csv.filterPushdown.enabled
>>>>>>>>>>> spark.sql.json.filterPushdown.enabled
>>>>>>>>>>> spark.sql.avro.filterPushdown.enabled
>>>>>>>>>>> JDBC Option: pushDownPredicate
>>>>>>>>>>> 
>>>>>>>>>>> We can see that the lack of consistency is caused by each
>> connector
>>>>>>>>>>> introducing different configuration options for the same
>> behavior.
>>>>>>>>>>> This is one of the motivations for advocating the introduction
>> of a
>>>>>>>>>>> unified configuration name.
>>>>>>>>>>> 
>>>>>>>>>>> [1] https://issues.apache.org/jira/browse/SPARK-24288
>>>>>>>>>>> [2] https://github.com/apache/spark/pull/27366
>>>>>>>>>>> [3]https://github.com/apache/spark/pull/26973
>>>>>>>>>>> [4] https://github.com/apache/spark/pull/29145
>>>>>>>>>>> 
>>>>>>>>>>> Best,
>>>>>>>>>>> Jiabao
>>>>>>>>>>> 
>>>>>>>>>>>> 2023年11月16日 08:10,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>> 
>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>> 
>>>>>>>>>>>> While we can always fix the formality of the config, a more
>>>>>>>> fundamental
>>>>>>>>>>>> issue here is whether this configuration is common enough.
>>>>>> Personally
>>>>>>>> I
>>>>>>>>>>> am
>>>>>>>>>>>> still not convinced it is.
>>>>>>>>>>>> 
>>>>>>>>>>>> Remember we don't have a common implementation for
>>>>>>>> SupportsFilterPushdown
>>>>>>>>>>>> itself. Why does a potential behavior of the
>>>>>>>>>>>> SupportsFilterPushdown.applyFilters() method deserve a common
>>>>>>>>>>>> configuration? A common implementation should always come first,
>>>>>> then
>>>>>>>> its
>>>>>>>>>>>> configuration becomes a common configuration as a natural
>> result.
>>>>>> But
>>>>>>>>>>> here
>>>>>>>>>>>> we are trying to add an impl to a configuration just to fix its
>>>>>>>>>>> formality.
>>>>>>>>>>>> 
>>>>>>>>>>>> I agree that there might be a few Source implementations that
>> may
>>>>>>>> want to
>>>>>>>>>>>> avoid additional burdens on the remote system in some
>>>> circumstances.
>>>>>>>> And
>>>>>>>>>>>> these circumstances are very specific:
>>>>>>>>>>>> 1. The source talks to a remote service that can help perform
>> the
>>>>>>>> actual
>>>>>>>>>>>> filtering.
>>>>>>>>>>>> 2. The filtering done by the remote service is inefficient for
>>>> some
>>>>>>>>>>> reason
>>>>>>>>>>>> (e.g. missing index)
>>>>>>>>>>>> 3. The external service does not want to perform the inefficient
>>>>>>>>>>> filtering
>>>>>>>>>>>> for some reason (e.g. it is a shared service with others)
>>>>>>>>>>>> 
>>>>>>>>>>>> There are multiple approaches to address the issue. Pushing back
>>>> the
>>>>>>>>>>>> filters is just one way of achieving this. So here we are
>> talking
>>>>>>>> about a
>>>>>>>>>>>> config for one of the possible solutions to a scenario with all
>>>> the
>>>>>>>> above
>>>>>>>>>>>> situations. I don't think there is enough justification for the
>>>>>>>> config to
>>>>>>>>>>>> be common.
>>>>>>>>>>>> 
>>>>>>>>>>>> There is always this trade-off between the proliferation of
>> public
>>>>>>>>>>>> interfaces and the API standardization. As an extreme example,
>> we
>>>>>> can
>>>>>>>>>>> make
>>>>>>>>>>>> our public API a union of all the configs potentially used in
>> all
>>>>>> the
>>>>>>>>>>> cases
>>>>>>>>>>>> in the name of standardization. Apparently this won't work. So
>>>> there
>>>>>>>> must
>>>>>>>>>>>> be a bar here and this bar might be somewhat subjective. For
>> this
>>>>>>>> FLIP,
>>>>>>>>>>>> personally I don't think the config meets my bar for the reason
>>>>>> stated
>>>>>>>>>>>> above.
>>>>>>>>>>>> 
>>>>>>>>>>>> Therefore, my suggestion remains the same. Keep the config as a
>>>>>> Source
>>>>>>>>>>>> implementation specific configuration.
>>>>>>>>>>>> 
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> 
>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Thu, Nov 16, 2023 at 12:36 AM Jiabao Sun <
>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>> .invalid>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> 
>>>>>>>>>>>>> Thanks Becket for the feedback,
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Regarding concerns about common configurations, I think we can
>>>>>>>> introduce
>>>>>>>>>>>>> FiltersApplier to unify the behavior of various connectors.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> public static class FiltersApplier {
>>>>>>>>>>>>> 
>>>>>>>>>>>>> private final ReadableConfig config;
>>>>>>>>>>>>> private final Function<List<ResolvedExpression>, Result>
>> action;
>>>>>>>>>>>>> 
>>>>>>>>>>>>> private FiltersApplier(
>>>>>>>>>>>>>      ReadableConfig config,
>>>>>>>>>>>>>      Function<List<ResolvedExpression>, Result> action) {
>>>>>>>>>>>>>  this.config = config;
>>>>>>>>>>>>>  this.action = action;
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>>>>>>>>>  if (config.get(ENABLE_FILTER_PUSH_DOWN)) {
>>>>>>>>>>>>>      return action.apply(filters);
>>>>>>>>>>>>>  } else {
>>>>>>>>>>>>>      return Result.of(Collections.emptyList(), filters);
>>>>>>>>>>>>>  }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> public static FiltersApplier of(
>>>>>>>>>>>>>      ReadableConfig config,
>>>>>>>>>>>>>      Function<List<ResolvedExpression>, Result> action) {
>>>>>>>>>>>>>  return new FiltersApplier(config, action);
>>>>>>>>>>>>> }
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> For connectors implementation:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> @Override
>>>>>>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
>>>>>>>>>>>>> return FiltersApplier.of(config,
>>>>>>>>>>>>>      f -> Result.of(new ArrayList<>(filters),
>>>>>>>>>>>>> Collections.emptyList()));
>>>>>>>>>>>>> }
>>>>>>>>>>>>> 
>>>>>>>>>>>>> As for the name, whether it is
>> "source.filter-push-down.enabled"
>>>> or
>>>>>>>>>>>>> "source.ignore-pushed-down-filters.enabled", I think both are
>>>> okay.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Do you think this change is feasible?
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Best,
>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2023年11月15日 23:44,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Yes, I still have concerns.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> The FLIP violates the following two principles regarding
>>>>>>>> configuration:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 1.* A config of a class should never negate the semantic of a
>>>>>>>>>>> decorative
>>>>>>>>>>>>>> interface implemented by that class. *
>>>>>>>>>>>>>> A decorative interface is a public contract with other
>>>> components,
>>>>>>>>>>> while
>>>>>>>>>>>>> a
>>>>>>>>>>>>>> config is only internal to the class itself. The
>> configurations
>>>>>> for
>>>>>>>> the
>>>>>>>>>>>>>> Sources are not (and should never be) visible or understood to
>>>>>>>>>>>>>> other components (e.g. optimizer). A configuration of a Source
>>>>>> only
>>>>>>>>>>>>>> controls the behavior of that Source, provided it is not
>>>> violating
>>>>>>>> the
>>>>>>>>>>>>> API
>>>>>>>>>>>>>> contract / semantic defined by the decorative interface. So
>>>> when a
>>>>>>>>>>> Source
>>>>>>>>>>>>>> implementation implements SupportsFilterPushdown, this is a
>>>> clear
>>>>>>>>>>> public
>>>>>>>>>>>>>> contract with Flink that filters should be pushed down to that
>>>>>>>> Source.
>>>>>>>>>>>>>> Therefore, for the same source, there should not be a
>>>>>> configuration
>>>>>>>>>>>>>> "source.filter-push-down.enabled" which stops the filters from
>>>>>> being
>>>>>>>>>>>>> pushed
>>>>>>>>>>>>>> down to that Source. However, that specific source
>>>> implementation
>>>>>>>> can
>>>>>>>>>>>>> have
>>>>>>>>>>>>>> its own config to control its internal behavior, e.g.
>>>>>>>>>>>>>> "ignore-pushed-down-filters.enabled" which may push back all
>> the
>>>>>>>> pushed
>>>>>>>>>>>>>> down filters back to the Flink optimizer.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 2. When we are talking about "common configs", in fact we are
>>>>>>>> talking
>>>>>>>>>>>>> about
>>>>>>>>>>>>>> "configs for common (abstract) implementation classes". With
>>>> that
>>>>>>>> as a
>>>>>>>>>>>>>> context, *a common config should always be backed by a common
>>>>>>>>>>>>>> implementation class, so that consistent behavior can be
>>>>>>>> guaranteed. *
>>>>>>>>>>>>>> The LookupOptions you mentioned are configurations defined for
>>>>>>>> classes
>>>>>>>>>>>>>> DefaultLookupCache / PeriodicCacheReloadTrigger /
>>>>>>>>>>>>> TimedCacheReloadTrigger.
>>>>>>>>>>>>>> These configs are considered as "common" only because the
>>>>>>>>>>> implementation
>>>>>>>>>>>>>> classes using them are common building blocks for lookup table
>>>>>>>>>>>>>> implementations. It would not make sense to have a dangling
>>>> config
>>>>>>>> in
>>>>>>>>>>> the
>>>>>>>>>>>>>> LookupOptions without the underlying common implementation
>>>> class,
>>>>>>>> but
>>>>>>>>>>>>> only
>>>>>>>>>>>>>> relies on a specific source to implement the stated behavior.
>>>>>>>>>>>>>> As a bad example, there is this outlier config "max-retries"
>> in
>>>>>>>>>>>>>> LookupOptions, which I don't think should be here. This is
>>>> because
>>>>>>>> the
>>>>>>>>>>>>>> retry behavior can be very implementation specific. For
>> example,
>>>>>>>> there
>>>>>>>>>>>>> can
>>>>>>>>>>>>>> be many different flavors of retry related configurations,
>>>>>>>>>>> retry-backoff,
>>>>>>>>>>>>>> retry-timeout, retry-async, etc. Why only max-retry is put
>> here?
>>>>>>>> should
>>>>>>>>>>>>> all
>>>>>>>>>>>>>> of them be put here? If we put all such kinds of configs in
>> the
>>>>>>>> common
>>>>>>>>>>>>>> configs for "standardization and unification", the number of
>>>>>> "common
>>>>>>>>>>>>>> configs" can easily go crazy. And I don't see material
>> benefits
>>>> of
>>>>>>>>>>> doing
>>>>>>>>>>>>>> that. So here I don't think the configuration "max-retry"
>> should
>>>>>> be
>>>>>>>> in
>>>>>>>>>>>>>> LookupOptions, because it is not backed by any common
>>>>>> implementation
>>>>>>>>>>>>>> classes. If max-retry is implemented in the HBase source, it
>>>>>> should
>>>>>>>>>>> stay
>>>>>>>>>>>>>> there. For the same reason, the config proposed in this FLIP
>>>>>>>> (probably
>>>>>>>>>>>>> with
>>>>>>>>>>>>>> a name less confusing for the first reason mentioned above)
>>>>>> should
>>>>>>>>>>> stay
>>>>>>>>>>>>> in
>>>>>>>>>>>>>> the specific Source implementation.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> For the two reasons above, I am -1 to what the FLIP currently
>>>>>>>> proposes.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think the right way to address the motivation here is still
>> to
>>>>>>>> have a
>>>>>>>>>>>>>> config like "ignore-pushed-down-filters.enabled" for the
>>>> specific
>>>>>>>>>>> source
>>>>>>>>>>>>>> implementation. Please let me know if this solves the problem
>>>> you
>>>>>>>> are
>>>>>>>>>>>>>> facing.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Wed, Nov 15, 2023 at 11:52 AM Jiabao Sun <
>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The purpose of introducing this configuration is that not all
>>>>>>>> filter
>>>>>>>>>>>>>>> pushdowns can improve overall performance.
>>>>>>>>>>>>>>> If the filter can hit the external index, then pushdown is
>>>>>>>> definitely
>>>>>>>>>>>>>>> worth it, as it can not only improve query time but also
>>>> decrease
>>>>>>>>>>>>> network
>>>>>>>>>>>>>>> overhead.
>>>>>>>>>>>>>>> However, for filters that do not hit the external index, it
>> may
>>>>>>>>>>>>> increase a
>>>>>>>>>>>>>>> lot of performance overhead on the external system.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Undeniably, if the connector can make accurate decisions for
>>>> good
>>>>>>>> and
>>>>>>>>>>>>> bad
>>>>>>>>>>>>>>> filters, we may not need to introduce this configuration
>> option
>>>>>> to
>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>> pushing down filters to the external system.
>>>>>>>>>>>>>>> However, it is currently not easy to achieve.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> IMO, supporting filter pushdown does not mean that always
>>>> filter
>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>> is better.
>>>>>>>>>>>>>>> In the absence of automatic decision-making, I think we
>> should
>>>>>>>> leave
>>>>>>>>>>>>> this
>>>>>>>>>>>>>>> decision to users.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The newly introduced configuration option is similar to
>>>>>>>> LookupOptions,
>>>>>>>>>>>>>>> providing unified naming and default values to avoid
>> confusion
>>>>>>>> caused
>>>>>>>>>>> by
>>>>>>>>>>>>>>> inconsistent naming in different connectors for users.
>>>>>>>>>>>>>>> Setting the default value to true allows it to maintain
>>>>>>>> compatibility
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>> the default behavior of "always pushdown".
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Do you have any other concerns about this proposal? Please
>> let
>>>> me
>>>>>>>>>>> know.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 2023年10月31日 17:29,Jiabao Sun <jiabao....@xtransfer.cn
>>>> .INVALID>
>>>>>>>> 写道:
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Actually, for FileSystemSource, it is not always desired,
>> only
>>>>>> OCR
>>>>>>>>>>> file
>>>>>>>>>>>>>>> formats support filter pushdown.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> We can disable predicate pushdown for FileSystemSource by
>>>>>> setting
>>>>>>>>>>>>>>> 'table.optimizer.source.predicate-pushdown-enabled' to false.
>>>>>>>>>>>>>>>> I think we can also disable filter pushdown at a more
>> granular
>>>>>>>> level
>>>>>>>>>>>>>>> through fine-grained configuration.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 2023年10月31日 16:50,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks for the explanation. Maybe it's easier to explain
>> with
>>>>>> an
>>>>>>>>>>>>>>> example.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Let's take FileSystemTableSource as an example. Currently
>> it
>>>>>>>>>>>>> implements
>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface. With your proposal, does
>> it
>>>>>>>> have
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>> support `source.filter-push-down.enabled` as well? But this
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>> does not quite make sense for the FileSystemTableSource
>>>> because
>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>> pushdown is always desired. However, because this
>>>> configuration
>>>>>>>> is a
>>>>>>>>>>>>>>> part
>>>>>>>>>>>>>>>>> of the SupportsFilterPushDown interface (which sounds
>>>> confusing
>>>>>>>> to
>>>>>>>>>>>>> begin
>>>>>>>>>>>>>>>>> with), the FileSystemTableSource can only do one of the
>>>>>>>> following:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> 1. Ignore the user configuration to always apply the pushed
>>>>>> down
>>>>>>>>>>>>>>> filters -
>>>>>>>>>>>>>>>>> this is an apparent anti-pattern because a configuration
>>>> should
>>>>>>>>>>> always
>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>> what it says.
>>>>>>>>>>>>>>>>> 2. Throw an exception telling users that this configuration
>>>> is
>>>>>>>> not
>>>>>>>>>>>>>>>>> applicable to the FileSystemTableSource.
>>>>>>>>>>>>>>>>> 3. Implement this configuration to push back the pushed
>> down
>>>>>>>>>>> filters,
>>>>>>>>>>>>>>> even
>>>>>>>>>>>>>>>>> though this is never desired.
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> None of the above options looks awkward. I am curious what
>>>> your
>>>>>>>>>>>>>>> solution is
>>>>>>>>>>>>>>>>> here?
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun <
>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Thanks Becket for the further explanation.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Perhaps I didn't explain it clearly.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 1. If a source does not implement the
>> SupportsFilterPushDown
>>>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>> the newly added configurations do not need to be added to
>>>>>> either
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>>>>>>>>>> Similar to LookupOptions, if a source does not implement
>>>>>>>>>>>>>>>>>> LookupTableSource, there is no need to add LookupOptions
>> to
>>>>>>>> either
>>>>>>>>>>>>>>>>>> requiredOptions or optionalOptions.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 2. "And these configs are specific to those sources,
>> instead
>>>>>> of
>>>>>>>>>>>>> common
>>>>>>>>>>>>>>>>>> configs."
>>>>>>>>>>>>>>>>>> The newly introduced configurations define standardized
>>>> names
>>>>>>>> and
>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>> values.
>>>>>>>>>>>>>>>>>> They still belong to the configuration at the individual
>>>>>> source
>>>>>>>>>>>>> level.
>>>>>>>>>>>>>>>>>> The purpose is to avoid scattered configuration items when
>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>> sources implement the same logic.
>>>>>>>>>>>>>>>>>> Whether a source should accept these configurations is
>>>>>>>> determined
>>>>>>>>>>> by
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> source's Factory.
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> 2023年10月31日 13:47,Becket Qin <becket....@gmail.com> 写道:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Please see the replies inline.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Introducing common configurations does not mean that all
>>>>>>>> sources
>>>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>>>>>>>>>> The configuration options supported by a source are
>>>>>>>> determined by
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
>>>>>> interface.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> This is not true. Both required and optional options are
>>>>>>>>>>> SUPPORTED.
>>>>>>>>>>>>>>> That
>>>>>>>>>>>>>>>>>>> means they are implemented and if one specifies an
>> optional
>>>>>>>> config
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>> still take effect. An OptionalConfig is "Optional"
>> because
>>>>>> this
>>>>>>>>>>>>>>>>>>> configuration has a default value. Hence, it is OK that
>>>> users
>>>>>>>> do
>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>> specify their own value. In another word, it is
>> "optional"
>>>>>> for
>>>>>>>> the
>>>>>>>>>>>>> end
>>>>>>>>>>>>>>>>>>> users to set the config, but the implementation and
>> support
>>>>>> for
>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>>>>> is NOT optional. In case a source does not support a
>> common
>>>>>>>>>>> config,
>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>> exception must be thrown when the config is provided by
>> the
>>>>>> end
>>>>>>>>>>>>> users.
>>>>>>>>>>>>>>>>>>> However, the config we are talking about in this FLIP is
>> a
>>>>>>>> common
>>>>>>>>>>>>>>> config
>>>>>>>>>>>>>>>>>>> optional to implement, meaning that sometimes the claimed
>>>>>>>> behavior
>>>>>>>>>>>>>>> won't
>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>> there even if users specify that config.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Similar to sources that do not implement the
>>>>>> LookupTableSource
>>>>>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>>>>>>>>>> interface
>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> First of all, filter pushdown is a behavior of the query
>>>>>>>>>>> optimizer,
>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> behavior of Sources. The Sources tells the optimizer that
>>>> it
>>>>>>>> has
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> ability to accept pushed down filters by implementing the
>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface. And this is the only
>>>>>> contract
>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>> Source and Optimizer regarding whether filters should be
>>>>>> pushed
>>>>>>>>>>>>> down.
>>>>>>>>>>>>>>> As
>>>>>>>>>>>>>>>>>>> long as a specific source implements this decorative
>>>>>> interface,
>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>> pushdown should always take place, i.e.
>>>>>>>>>>>>>>>>>>> *SupportsFilterPushDown.applyFilters()* will be called.
>>>> There
>>>>>>>>>>> should
>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> no
>>>>>>>>>>>>>>>>>>> other config to disable that call. However, Sources can
>>>>>> decide
>>>>>>>> how
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>> behave based on their own configurations after
>>>>>>>> *applyFilters()* is
>>>>>>>>>>>>>>>>>> called.
>>>>>>>>>>>>>>>>>>> And these configs are specific to those sources, instead
>> of
>>>>>>>> common
>>>>>>>>>>>>>>>>>> configs.
>>>>>>>>>>>>>>>>>>> Please see the examples I mentioned in the previous
>> email.
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun <
>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Sorry, there was a typo in the second point. Let me
>>>> correct
>>>>>>>> it:
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Introducing common configurations does not mean that all
>>>>>>>> sources
>>>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>>>> accept these configuration options.
>>>>>>>>>>>>>>>>>>>> The configuration options supported by a source are
>>>>>>>> determined by
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
>>>>>> interface.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Similar to sources that do not implement the
>>>>>> LookupTableSource
>>>>>>>>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
>>>>>>>>>>> interface
>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>>>> do
>>>>>>>>>>>>>>>>>>>> not need to accept newly introduced options.
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 2023年10月31日 10:13,Jiabao Sun <jiabao....@xtransfer.cn
>>>>>>>> .INVALID>
>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Thanks Becket for the feedback.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 1. Currently, the SupportsFilterPushDown#applyFilters
>>>>>> method
>>>>>>>>>>>>>>> returns a
>>>>>>>>>>>>>>>>>>>> result that includes acceptedFilters and
>> remainingFilters.
>>>>>> The
>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>> decide to push down some filters or not accept any of
>>>> them.
>>>>>>>>>>>>>>>>>>>>> 2. Introducing common configuration options does not
>> mean
>>>>>>>> that a
>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>> that supports the SupportsFilterPushDown capability must
>>>>>>>> accept
>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>> configuration. Similar to LookupOptions, only sources
>> that
>>>>>>>>>>>>> implement
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> LookupTableSource interface are necessary to accept
>> these
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>> options.
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> 2023年10月31日 07:49,Becket Qin <becket....@gmail.com>
>> 写道:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao and Ruanhang,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Adding a configuration of
>>>> source.filter-push-down.enabled
>>>>>>>> as a
>>>>>>>>>>>>>>> common
>>>>>>>>>>>>>>>>>>>>>> source configuration seems problematic.
>>>>>>>>>>>>>>>>>>>>>> 1. The config name is misleading. filter pushdown
>> should
>>>>>>>> only
>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>> determined
>>>>>>>>>>>>>>>>>>>>>> by whether the SupportsFilterPushdown interface is
>>>>>>>> implemented
>>>>>>>>>>> or
>>>>>>>>>>>>>>> not.
>>>>>>>>>>>>>>>>>>>>>> 2. The behavior of this configuration is only
>> applicable
>>>>>> to
>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>> implementations. Why is it a common configuration?
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Here's my suggestion for design principles:
>>>>>>>>>>>>>>>>>>>>>> 1. Only add source impl specific configuration to
>>>>>>>> corresponding
>>>>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>> 2. The configuration name should not overrule existing
>>>>>>>> common
>>>>>>>>>>>>>>>>>> contracts.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> For example, in the case of MySql source. There are
>>>>>> several
>>>>>>>>>>>>>>> options:
>>>>>>>>>>>>>>>>>>>>>> 1. Have a configuration of
>>>>>>>>>>>>> `*mysql.avoid.remote.full.table.scan`*.
>>>>>>>>>>>>>>> If
>>>>>>>>>>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>>> configuration is set, and a filter pushdown does not
>> hit
>>>>>> an
>>>>>>>>>>>>> index,
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> MySql source impl would not further pushdown the
>> filter
>>>> to
>>>>>>>>>>> MySql
>>>>>>>>>>>>>>>>>>>> servers.
>>>>>>>>>>>>>>>>>>>>>> Note that this assumes the MySql source can retrieve
>> the
>>>>>>>> index
>>>>>>>>>>>>>>>>>>>> information
>>>>>>>>>>>>>>>>>>>>>> from the MySql servers.
>>>>>>>>>>>>>>>>>>>>>> 2. If the MySql index information is not available to
>>>> the
>>>>>>>> MySql
>>>>>>>>>>>>>>>>>> source,
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> configuration could be something like
>>>>>>>>>>>>>>>>>>>> *`mysql.pushback.pushed.down.filters`*.
>>>>>>>>>>>>>>>>>>>>>> Once set to true, MySql source would just add all the
>>>>>>>> filters
>>>>>>>>>>> to
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> RemainingFilters in the Result returned by
>>>>>>>>>>>>>>>>>>>>>> *SupportsFilterPushdown.applyFilters().*
>>>>>>>>>>>>>>>>>>>>>> 3. An alternative to option 2 is to have a `
>>>>>>>>>>>>>>>>>>>>>> *mysql.apply.predicates.after.scan*`. When it is set
>> to
>>>>>>>> true,
>>>>>>>>>>>>> MySql
>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>> will not push the filter down to the MySql servers,
>> but
>>>>>>>> apply
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>>>>>> inside the MySql source itself.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> As you may see, the above configurations do not
>> disable
>>>>>>>> filter
>>>>>>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>>>>>> itself. They just allow various implementations of
>>>> filter
>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>> And
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>> configuration name does not give any illusion that
>>>> filter
>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>> disabled.
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun <
>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Thanks Hang for the suggestion.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I think the configuration of TableSource is not
>> closely
>>>>>>>>>>> related
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> SourceReader,
>>>>>>>>>>>>>>>>>>>>>>> so I prefer to introduce a independent configuration
>>>>>> class
>>>>>>>>>>>>>>>>>>>>>>> TableSourceOptions in the flink-table-common module,
>>>>>>>> similar
>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> LookupOptions.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> For the second point, I suggest adding Java doc to
>> the
>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown
>>>>>>>>>>>>>>>>>>>>>>> interfaces, providing detailed information on these
>>>>>> options
>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>> needs
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> be supported.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> I have made updates in the FLIP document.
>>>>>>>>>>>>>>>>>>>>>>> Please help check it again.
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com
>>> 
>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the improvements, Jiabao.
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> There are some details that I am not sure about.
>>>>>>>>>>>>>>>>>>>>>>>> 1. The new option `source.filter-push-down.enabled`
>>>> will
>>>>>>>> be
>>>>>>>>>>>>>>> added to
>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>> class? I think it should be `SourceReaderOptions`.
>>>>>>>>>>>>>>>>>>>>>>>> 2. How are the connector developers able to know and
>>>>>>>> follow
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> FLIP?
>>>>>>>>>>>>>>>>>>>> Do
>>>>>>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>>>> need an abstract base class or provide a default
>>>> method?
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>> Hang
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid>
>>>>>>>> 于2023年10月30日周一
>>>>>>>>>>>>>>>>>> 14:45写道:
>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Hi, all,
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the lively discussion.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Based on the discussion, I have made some
>> adjustments
>>>>>> to
>>>>>>>> the
>>>>>>>>>>>>>>> FLIP
>>>>>>>>>>>>>>>>>>>>>>> document:
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 1. The name of the newly added option has been
>>>> changed
>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled".
>>>>>>>>>>>>>>>>>>>>>>>>> 2. Considering compatibility with older versions,
>> the
>>>>>>>> newly
>>>>>>>>>>>>>>> added
>>>>>>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled" option needs to
>>>>>> respect
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> optimizer's
>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>> option.
>>>>>>>>>>>>>>>>>>>>>>>>> But there is a consideration to remove the old
>> option
>>>>>> in
>>>>>>>>>>> Flink
>>>>>>>>>>>>>>> 2.0.
>>>>>>>>>>>>>>>>>>>>>>>>> 3. We can provide more options to disable other
>>>> source
>>>>>>>>>>>>> abilities
>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>>> side
>>>>>>>>>>>>>>>>>>>>>>>>> effects, such as “source.aggregate.enabled” and
>>>>>>>>>>>>>>>>>>>>>>> “source.projection.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>> This is not urgent and can be continuously
>>>> introduced.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback again.
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月29日 08:45,Becket Qin <
>> becket....@gmail.com>
>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for digging into the git history, Jark. I
>>>> agree
>>>>>>>> it
>>>>>>>>>>>>> makes
>>>>>>>>>>>>>>>>>>>> sense
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>> deprecate this API in 2.0.
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <
>>>>>>>> imj...@gmail.com>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> I checked the history of "
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> *table.optimizer.source.predicate-pushdown-enabled*",
>>>>>>>>>>>>>>>>>>>>>>>>>>> it seems it was introduced since the legacy
>>>>>>>>>>>>>>> FilterableTableSource
>>>>>>>>>>>>>>>>>>>>>>>>>>> interface
>>>>>>>>>>>>>>>>>>>>>>>>>>> which might be an experiential feature at that
>>>> time.
>>>>>> I
>>>>>>>>>>> don't
>>>>>>>>>>>>>>> see
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>> necessity
>>>>>>>>>>>>>>>>>>>>>>>>>>> of this option at the moment. Maybe we can
>>>> deprecate
>>>>>>>> this
>>>>>>>>>>>>>>> option
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>> drop
>>>>>>>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>> in Flink 2.0[1] if it is not necessary anymore.
>>>> This
>>>>>>>> may
>>>>>>>>>>>>> help
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>> simplify this discussion.
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]:
>>>>>> https://issues.apache.org/jira/browse/FLINK-32383
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <
>>>>>>>>>>>>>>> becket....@gmail.com>
>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents
>>>> below:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. If I understand correctly, the motivation of
>>>> the
>>>>>>>> FLIP
>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>> mainly to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> make predicate pushdown optional on SOME of the
>>>>>>>> Sources.
>>>>>>>>>>> If
>>>>>>>>>>>>>>> so,
>>>>>>>>>>>>>>>>>>>>>>>>> intuitively
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the configuration should be Source specific
>>>> instead
>>>>>> of
>>>>>>>>>>>>>>> general.
>>>>>>>>>>>>>>>>>>>>>>>>> Otherwise,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> we will end up with general configurations that
>>>> may
>>>>>>>> not
>>>>>>>>>>>>> take
>>>>>>>>>>>>>>>>>>>> effect
>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>> some of the Source implementations. This
>> violates
>>>>>> the
>>>>>>>>>>> basic
>>>>>>>>>>>>>>> rule
>>>>>>>>>>>>>>>>>>>> of a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration - it does what it says, regardless
>>>> of
>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> implementation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> While configuration standardization is usually a
>>>>>> good
>>>>>>>>>>>>> thing,
>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>>>>> break the basic rules.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we really want to have this general
>>>>>> configuration,
>>>>>>>> for
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>>>> this configuration does not apply, they should
>>>> throw
>>>>>>>> an
>>>>>>>>>>>>>>>>>> exception
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>> make
>>>>>>>>>>>>>>>>>>>>>>>>>>>> it clear that this configuration is not
>> supported.
>>>>>>>>>>> However,
>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>> seems
>>>>>>>>>>>>>>>>>>>>>>>>> ugly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. I think the actual motivation of this FLIP is
>>>>>> about
>>>>>>>>>>>>> "how a
>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>> should implement predicate pushdown
>> efficiently",
>>>>>> not
>>>>>>>>>>>>>>> "whether
>>>>>>>>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown should be applied to the source." For
>>>>>>>> example,
>>>>>>>>>>> if
>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>> wants
>>>>>>>>>>>>>>>>>>>>>>>>>>>> to avoid additional computing load in the
>> external
>>>>>>>>>>> system,
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>>>>>> always
>>>>>>>>>>>>>>>>>>>>>>>>>>>> read the entire record and apply the predicates
>> by
>>>>>>>>>>> itself.
>>>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>>>>>>>>>>>> from
>>>>>>>>>>>>>>>>>>>>>>>>>>>> the Flink perspective, the predicate pushdown is
>>>>>>>> applied,
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>> just
>>>>>>>>>>>>>>>>>>>>>>>>>>>> implemented differently by the source. So the
>>>> design
>>>>>>>>>>>>>>> principle
>>>>>>>>>>>>>>>>>>>> here
>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Flink only cares about whether a source supports
>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>>>> not,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> it does not care about the implementation
>>>>>> efficiency /
>>>>>>>>>>> side
>>>>>>>>>>>>>>>>>>>> effect of
>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>> predicates pushdown. It is the Source
>>>>>> implementation's
>>>>>>>>>>>>>>>>>>>> responsibility
>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>> ensure the predicates pushdown is implemented
>>>>>>>> efficiently
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> does
>>>>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>>>>> impose excessive pressure on the external
>> system.
>>>>>> And
>>>>>>>> it
>>>>>>>>>>> is
>>>>>>>>>>>>>>> OK
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>>>>>>>>> additional configurations to achieve this goal.
>>>>>>>>>>> Obviously,
>>>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations will be source specific in this
>>>> case.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. Regarding the existing configurations of
>>>>>>>>>>>>>>>>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>>> *I am not sure why we need it. Supposedly, if a
>>>>>> source
>>>>>>>>>>>>>>>>>> implements
>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer
>>>> should
>>>>>>>> push
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>> corresponding
>>>>>>>>>>>>>>>>>>>>>>>>>>>> predicates to the Source. I am not sure in which
>>>>>> case
>>>>>>>>>>> this
>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>> would be used. Any ideas @Jark Wu <
>>>> imj...@gmail.com
>>>>>>> ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
>>>>>>>>>>>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for the detailed explanation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think that for users, we should respect
>>>>>> conventions
>>>>>>>>>>> over
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Conventions can be default values explicitly
>>>>>>>> specified
>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations, or they can be behaviors that
>>>>>> follow
>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>> versions.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If the same code has different behaviors in
>>>>>> different
>>>>>>>>>>>>>>> versions,
>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>> would
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> be a very bad thing.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree that for regular users, it is not
>>>> necessary
>>>>>>>> to
>>>>>>>>>>>>>>>>>> understand
>>>>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the configurations related to Flink.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> By following conventions, they can have a good
>>>>>>>>>>> experience.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let's get back to the practical situation and
>>>>>>>> consider
>>>>>>>>>>> it.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Case 1:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The user is not familiar with the purpose of
>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>>>>>> follows
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the convention of allowing predicate pushdown
>> to
>>>>>> the
>>>>>>>>>>>>> source
>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>> default.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Just understanding the
>>>>>>>> source.predicate-pushdown-enabled
>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and performing fine-grained toggle control will
>>>>>> work
>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Case 2:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The user understands the meaning of the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>> table.optimizer.source.predicate-pushdown-enabled
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>> has set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> its value to false.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We have reason to believe that the user
>>>> understands
>>>>>>>> the
>>>>>>>>>>>>>>> meaning
>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> predicate pushdown configuration and the
>>>> intention
>>>>>>>> is to
>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown (rather than whether or not to allow
>>>> it).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The previous choice of globally disabling it is
>>>>>>>> likely
>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> couldn't be disabled on individual sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From this perspective, if we provide more
>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> support and provide detailed explanations of
>> the
>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>> behaviors in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the documentation,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users can clearly understand the differences
>>>>>> between
>>>>>>>>>>> these
>>>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations and use them correctly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Also, I don't agree that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> table.optimizer.source.predicate-pushdown-enabled =
>>>>>>>> true
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source.predicate-pushdown-enabled = false means
>>>>>> that
>>>>>>>> the
>>>>>>>>>>>>>>> local
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration overrides the global
>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, both configurations are
>>>>>> functioning
>>>>>>>>>>>>>>> correctly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The optimizer allows predicate pushdown to all
>>>>>>>> sources,
>>>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>>>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> can reject the filters pushed down by the
>>>>>> optimizer.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is natural, just like different components
>>>> at
>>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>> levels
>>>>>>>>>>>>>>>>>>>>>>>>> are
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> responsible for different tasks.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The more serious issue is that if
>>>>>>>>>>>>>>>>>>>>>>> "source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> does not respect
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled”,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will be invalidated.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This means that regardless of whether
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>> is
>>>>>>>>>>> set
>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> true
>>>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it will have no effect.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 22:24,Jane Chan <
>>>>>> qingyue....@gmail.com>
>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the in-depth clarification. Here
>> are
>>>> my
>>>>>>>>>>> cents
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are
>>>>>> configurations
>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> We cannot assume that every user would be
>>>>>>>> interested in
>>>>>>>>>>>>>>>>>>>>>>> understanding
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> internal components of Flink, such as the
>>>>>> optimizer
>>>>>>>> or
>>>>>>>>>>>>>>>>>>>> connectors,
>>>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specific configurations associated with each
>>>>>>>> component.
>>>>>>>>>>>>>>>>>> Instead,
>>>>>>>>>>>>>>>>>>>>>>>>> users
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> might be more concerned about knowing which
>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>> enables
>>>>>>>>>>>>>>>>>>>>>>> or
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disables the filter push-down feature for all
>>>>>> source
>>>>>>>>>>>>>>>>>> connectors,
>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parameter provides the flexibility to override
>>>>>> this
>>>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>>>>> for a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> single
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source if needed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, from this perspective, I am inclined to
>>>> divide
>>>>>>>>>>> these
>>>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> based on the scope of their impact from the
>>>> user's
>>>>>>>>>>>>>>> perspective
>>>>>>>>>>>>>>>>>>>>>>> (i.e.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> global-level or operator-level), rather than
>>>>>>>>>>> categorizing
>>>>>>>>>>>>>>> them
>>>>>>>>>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> component hierarchy from a developer's point
>> of
>>>>>>>> view.
>>>>>>>>>>>>>>>>>> Therefore,
>>>>>>>>>>>>>>>>>>>>>>>>> based
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> on
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this premise, it is intuitive and natural for
>>>>>> users
>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> understand fine-grained configuration options
>>>> can
>>>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if
>> "scan.filter-push-down.enabled"
>>>>>>>>>>> doesn't
>>>>>>>>>>>>>>>>>>>> respect to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>> and
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined
>>>> as
>>>>>>>> true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be
>>>>>>>> performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in
>>>>>> behavior
>>>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <1>If I understand correctly,
>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>> is a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector option, which means the only way to
>>>>>>>> configure
>>>>>>>>>>>>> it
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> explicitly
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specify it in DDL (no matter whether disable
>> or
>>>>>>>>>>> enable),
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> SET
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> command is not applicable, so I think it's
>>>> natural
>>>>>>>> to
>>>>>>>>>>>>> still
>>>>>>>>>>>>>>>>>>>> respect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> user's
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specification here. Otherwise, users might be
>>>> more
>>>>>>>>>>>>> confused
>>>>>>>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>>>>>> why
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> DDL does not work as expected, and the reason
>> is
>>>>>>>> just
>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>> some
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "optimizer" configuration is set to a
>> different
>>>>>>>> value.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> <2> From the implementation side, I am
>> inclined
>>>> to
>>>>>>>> keep
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>> parameter's
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority consistent for all conditions.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let "global" denote
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled",
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and let "per-source" denote
>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source T,  the following Truth table (based on
>>>> the
>>>>>>>>>>>>> current
>>>>>>>>>>>>>>>>>>>> design)
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> indicates the inconsistent behavior for
>>>>>> "per-source
>>>>>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>>>>>>> global".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> .------------.---------------.-------------------
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> ----.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | global   | per-source | push-down for T |
>>>>>>>> per-source
>>>>>>>>>>>>>>>>>> override
>>>>>>>>>>>>>>>>>>>>>>>>> global
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> |
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | true       | false         | false
>>>>>>>>>>>>> | Y
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>         |
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> :-----------+--------------+-----------------------+------------------------------------:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> | false     | true           | false
>>>>>>>>>>>>> | N
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>         |
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> .------------.---------------.-----------------------.-------------------------------------.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I understand that the configuration of global
>>>>>>>>>>>>> parallelism
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> task
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parallelism is at different granularities but
>>>>>> with
>>>>>>>> the
>>>>>>>>>>>>>>> same
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However,
>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are
>>>>>> configurations
>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> different
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator).
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From a user's perspective, there are two
>>>>>> scenarios:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Disabling all filter pushdown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, setting
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false is sufficient to meet the
>> requirement.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific
>>>> sources
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, there is no need to adjust
>>>> the
>>>>>>>> value
>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Instead, the focus should be on the
>>>> configuration
>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the
>>>>>>>>>>> requirement.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, users do not need to set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to
>>>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> manually
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter pushdown for specific sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if
>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>> doesn't
>>>>>>>>>>>>>>>>>>>> respect
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
>>>>>>>>>>> and
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined
>>>> as
>>>>>>>> true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" as
>>>>>>>>>>>>>>> false
>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have no
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be
>>>>>>>> performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in
>>>>>> behavior
>>>>>>>>>>>>>>> compared
>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The same SQL query that could successfully
>> push
>>>>>>>> down
>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version but would no longer do so after the
>>>>>>>> upgrade.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 17:10,Benchao Li <
>>>>>>>> libenc...@apache.org>
>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations,
>>>>>> that
>>>>>>>>>>>>> helps a
>>>>>>>>>>>>>>>>>>>> lot, I
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> understand your rationale now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is
>>>>>> from
>>>>>>>>>>>>>>>>>> "developer",
>>>>>>>>>>>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> means there is an optimizer and connector
>>>>>>>> component,
>>>>>>>>>>>>> and
>>>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable this feature (pushing filters down
>> into
>>>>>>>>>>>>>>> connectors),
>>>>>>>>>>>>>>>>>>>> you
>>>>>>>>>>>>>>>>>>>>>>>>> must
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable it firstly in optimizer, and only
>> then
>>>>>>>>>>> connector
>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> chance
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide to use it or not.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> My perspective is from "user" that (Why a
>> user
>>>>>>>> should
>>>>>>>>>>>>>>> care
>>>>>>>>>>>>>>>>>>>> about
>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> difference of optimizer/connector) , this
>> is a
>>>>>>>>>>> feature,
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> has
>>>>>>>>>>>>>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> way to control it, one way is to config it
>>>>>>>> job-level,
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> other
>>>>>>>>>>>>>>>>>>>>>>> one
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in table properties. What a user expects is
>>>> that
>>>>>>>> they
>>>>>>>>>>>>> can
>>>>>>>>>>>>>>>>>>>>>>> control a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> feature in a tiered way, that setting it per
>>>>>> job,
>>>>>>>> and
>>>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> fine-grained tune it per table.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is some kind of similar to other
>>>> concepts,
>>>>>>>> such
>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>> parallelism,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users can set a job level default
>> parallelism,
>>>>>> and
>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> tune it per operator. There may be more such
>>>>>>>> debate
>>>>>>>>>>> in
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> future
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g., we can have a job level config about
>>>>>> adding
>>>>>>>>>>>>> key-by
>>>>>>>>>>>>>>>>>>>> before
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> lookup
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> join, and also a hint/table property way to
>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> per lookup operator. Hence we'd better find
>> a
>>>>>>>> unified
>>>>>>>>>>>>> way
>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>> all
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> those similar kind of features.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn
>> .invalid>
>>>>>>>>>>>>>>> 于2023年10月25日周三
>>>>>>>>>>>>>>>>>>>>>>>>> 15:27写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for further explanation.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> These two configurations correspond to
>>>>>> different
>>>>>>>>>>>>> levels.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The planner will still push down predicates
>>>> to
>>>>>>>> all
>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not
>> is
>>>>>>>>>>>>> determined
>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source's "scan.filter-push-down.enabled"
>>>>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate"
>>>>>> does
>>>>>>>>>>>>>>> directly
>>>>>>>>>>>>>>>>>>>> affect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled”.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> When the planner disables predicate
>> pushdown,
>>>>>> the
>>>>>>>>>>>>>>>>>>>> source-level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown will also not be executed, even if
>> the
>>>>>>>> source
>>>>>>>>>>>>>>> allows
>>>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whatever, in point 1 and 2, our expectation
>>>> is
>>>>>>>>>>>>>>> consistent.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the
>>>>>>>>>>>>> planner-level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over the source-level
>>>>>>>> configuration.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may seem counterintuitive when we
>> globally
>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>> predicate
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown but allow filter pushdown at the
>>>> source
>>>>>>>>>>> level.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 14:35,Jane Chan <
>>>>>>>> qingyue....@gmail.com
>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for clarifying this. While by
>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority" I meant that this value
>>>>>> should
>>>>>>>> be
>>>>>>>>>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> whenever
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> set explicitly.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The conclusion that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" =
>>>> "true"
>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate
>>>>>>>> pushdown,
>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This indicates that the option
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled =
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an individual source connector does indeed
>>>>>>>> override
>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> global-level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planner settings to make a difference. And
>>>>>> thus
>>>>>>>>>>> "has
>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> While for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" =
>>>>>> "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the
>>>>>>>> planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is
>> disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains
>>>>>>>> consistent
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I still think
>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>> should
>>>>>>>>>>>>>>> also
>>>>>>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> respected
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it is enabled for individual connectors.
>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao
>> Sun <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend
>>>>>> keeping
>>>>>>>> the
>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" as
>> true,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and setting the the default value of
>> newly
>>>>>>>>>>>>> introduced
>>>>>>>>>>>>>>>>>>>> option
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true
>> as
>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to
>>>> maintain
>>>>>>>>>>>>>>> consistency
>>>>>>>>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> previous
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> versions, as whether to perform
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown in the old version solely
>>>>>>>> depends
>>>>>>>>>>> on
>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>> option.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That means by default, as long as a
>>>>>> TableSource
>>>>>>>>>>>>>>>>>> implements
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface, filter
>>>>>>>> pushdown
>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>> allowed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And it seems that we don't have much
>>>> benefit
>>>>>> in
>>>>>>>>>>>>>>> changing
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> value
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "table.optimizer.source.predicate" to
>>>>>> false.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding the priority of these two
>>>>>>>>>>> configurations,
>>>>>>>>>>>>> I
>>>>>>>>>>>>>>>>>>>> believe
>>>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over
>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> it
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> exhibits
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> following behavior.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" =
>>>>>> "true"
>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing
>>>> filter
>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" =
>>>>>> "true"
>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate
>>>>>>>> pushdown,
>>>>>>>>>>> but
>>>>>>>>>>>>>>>>>>>>>>> individual
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources do
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" =
>>>>>> "false"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the
>>>>>>>> planner.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is
>> disabled.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains
>>>>>>>> consistent
>>>>>>>>>>>>> with
>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> old
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> version as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From an implementation perspective,
>> setting
>>>>>> the
>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" higher
>> than
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is
>>>>>>>> difficult to
>>>>>>>>>>>>>>>>>> achieve
>>>>>>>>>>>>>>>>>>>>>>> now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because the
>>>> PushFilterIntoSourceScanRuleBase
>>>>>> at
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>> planner
>>>>>>>>>>>>>>>>>>>>>>>>> level
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over the source-level
>>>>>>>>>>> FilterPushDownSpec.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Only when the
>>>>>> PushFilterIntoSourceScanRuleBase
>>>>>>>> is
>>>>>>>>>>>>>>>>>> enabled,
>>>>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Source-level filter pushdown be
>> performed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, in my opinion, there
>> doesn't
>>>>>>>> seem to
>>>>>>>>>>>>> be
>>>>>>>>>>>>>>>>>> much
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> benefit in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting a higher priority for
>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled".
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may instead affect compatibility and
>>>>>>>> increase
>>>>>>>>>>>>>>>>>>>>>>> implementation
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> complexity.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 11:56,Benchao Li <
>>>>>>>>>>> libenc...@apache.org
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with Jane that fine-grained
>>>>>>>>>>> configurations
>>>>>>>>>>>>>>>>>> should
>>>>>>>>>>>>>>>>>>>>>>> have
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority than job level configurations.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For current proposal, we can achieve
>> that:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set
>> "table.optimizer.source.predicate" =
>>>>>>>> "true"
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> enable
>>>>>>>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>>>>>>>>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>>>>>>>>>> "false"
>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set
>> "table.optimizer.source.predicate" =
>>>>>>>>>>> "false"
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>> by
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set
>>>>>>>>>>> ""scan.filter-push-down.enabled" =
>>>>>>>>>>>>>>>>>>>> "true" to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane Chan <qingyue....@gmail.com>
>>>>>>>> 于2023年10月24日周二
>>>>>>>>>>>>>>>>>> 23:55写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level
>>>> than
>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong, but I think
>> the
>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" should
>>>>>> have a
>>>>>>>>>>>>> higher
>>>>>>>>>>>>>>>>>>>>>>> priority
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> because
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default value of
>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>> true.
>>>>>>>>>>>>>>>>>>>>>>>>> As
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> result,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> turning off filter push-down for a
>>>> specific
>>>>>>>>>>> source
>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>> not
>>>>>>>>>>>>>>>>>>>>>>>>> take
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> effect
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unless the default value of
>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> changed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false, or, alternatively, let users
>>>>>>>> manually
>>>>>>>>>>>>> set
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" to
>>>> false
>>>>>>>>>>> first
>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>> then
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> selectively
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter push-down for the desired
>>>>>>>> sources,
>>>>>>>>>>>>>>> which
>>>>>>>>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>>>>>>> less
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> intuitive.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao
>>>> Sun
>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level
>>>> than
>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>> at
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 17:36,Jane Chan <
>>>>>>>>>>>>>>> qingyue....@gmail.com>
>>>>>>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for driving this discussion. I
>>>>>> have
>>>>>>>> a
>>>>>>>>>>>>> small
>>>>>>>>>>>>>>>>>>>>>>> question
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> that
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take
>>>>>>>>>>> precedence
>>>>>>>>>>>>>>> over
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate"
>> when
>>>>>> the
>>>>>>>>>>> two
>>>>>>>>>>>>>>>>>>>> parameters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> might
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conflict
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each other?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM
>> Jiabao
>>>>>> Sun
>>>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jark,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we only add configuration without
>>>>>>>> adding
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enableFilterPushDown
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method in the SupportsFilterPushDown
>>>>>>>>>>> interface,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each connector would have to handle
>>>> the
>>>>>>>> same
>>>>>>>>>>>>>>> logic
>>>>>>>>>>>>>>>>>> in
>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method to determine whether filter
>>>>>>>> pushdown
>>>>>>>>>>> is
>>>>>>>>>>>>>>>>>> needed.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This would increase complexity and
>>>>>> violate
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> original
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> behavior
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters method.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, we only need to
>> pass
>>>>>> the
>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parameter in
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> newly added enableFilterPushDown
>>>> method
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide whether to perform
>> predicate
>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think this approach would be
>> clearer
>>>>>> and
>>>>>>>>>>>>>>> simpler.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 16:58,Jark Wu <
>>>>>>>> imj...@gmail.com
>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi JIabao,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think the current interface can
>>>>>> already
>>>>>>>>>>>>>>> satisfy
>>>>>>>>>>>>>>>>>>>> your
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The connector can reject all the
>>>>>> filters
>>>>>>>> by
>>>>>>>>>>>>>>>>>> returning
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> input
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filters
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as `Result#remainingFilters`.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So maybe we don't need to
>> introduce a
>>>>>> new
>>>>>>>>>>>>>>> method to
>>>>>>>>>>>>>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown, but just introduce an
>>>> option
>>>>>>>> for
>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>> specific
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38,
>> Leonard
>>>>>> Xu
>>>>>>>> <
>>>>>>>>>>>>>>>>>>>>>>>>> xbjt...@gmail.com
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off
>> this
>>>>>>>>>>>>> discussion.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you add a section to explain
>>>> the
>>>>>>>>>>>>>>> difference
>>>>>>>>>>>>>>>>>>>>>>> between
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> proposed
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector level config
>>>>>>>>>>>>>>>>>>>> `scan.filter-push-down.enabled`
>>>>>>>>>>>>>>>>>>>>>>>>> and
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> existing
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> query
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level config
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>> `table.optimizer.source.predicate-pushdown-enabled` ?
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonard
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun <
>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .INVALID>
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 写道:
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Devs,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a
>> discussion
>>>> on
>>>>>>>>>>>>>>> FLIP-377:
>>>>>>>>>>>>>>>>>>>>>>> support
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable filter pushdown for
>>>> Table/SQL
>>>>>>>>>>>>>>> Sources[1].
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does
>> not
>>>>>>>> expose
>>>>>>>>>>>>>>>>>>>>>>> fine-grained
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> control
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users to enable or disable filter
>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, filter pushdown has some
>>>>>> side
>>>>>>>>>>>>>>> effects,
>>>>>>>>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> additional
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computational pressure on external
>>>>>>>> systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Moreover, Improper queries can
>> lead
>>>>>> to
>>>>>>>>>>>>> issues
>>>>>>>>>>>>>>>>>> such
>>>>>>>>>>>>>>>>>>>> as
>>>>>>>>>>>>>>>>>>>>>>>>> full
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> table
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> scans,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which in turn can impact the
>>>> stability
>>>>>>>> of
>>>>>>>>>>>>>>> external
>>>>>>>>>>>>>>>>>>>>>>>>> systems.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with
>>>> two
>>>>>>>>>>>>> sources:
>>>>>>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>>>>>>>>>> and a
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> database.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The database is sensitive to
>>>>>> pressure,
>>>>>>>> and
>>>>>>>>>>>>> we
>>>>>>>>>>>>>>>>>> want
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configure
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not perform filter pushdown to the
>>>>>>>> database
>>>>>>>>>>>>>>>>>> source.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, we still want to perform
>>>>>>>> filter
>>>>>>>>>>>>>>> pushdown
>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>> the
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kafka
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> decrease network IO.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I propose to support
>> configuration
>>>> to
>>>>>>>>>>>>> disable
>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>> push
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> down for
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Table/SQL sources to let user
>> decide
>>>>>>>>>>> whether
>>>>>>>>>>>>> to
>>>>>>>>>>>>>>>>>>>> perform
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>> 
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>>> 
>> 
>> 
>> 

Reply via email to