Yes, that sounds reasonable to me. We can start with ALWAYS and NEVER, and
add more policies as needed.
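
A minimal sketch of how a two-value policy could drive the split between accepted and remaining filters. Everything here is hypothetical: the FilterHandlingPolicy enum, the Split holder, and the "supported" predicate are illustrative names, and filters are modeled as plain strings rather than Flink's ResolvedExpression. The FLIP may end up defining the option differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch; not the connector's actual implementation. */
final class FilterPolicySketch {

    /** Hypothetical values for an option such as jdbc.filter.handling.policy. */
    enum FilterHandlingPolicy { ALWAYS, NEVER }

    /** Mirrors the accepted/remaining pair that applyFilters() reports back. */
    static final class Split {
        final List<String> accepted = new ArrayList<>();
        final List<String> remaining = new ArrayList<>();
    }

    /**
     * ALWAYS pushes every filter the external system supports and hands the
     * rest back; NEVER hands everything back so Flink evaluates all filters.
     */
    static Split apply(
            FilterHandlingPolicy policy, List<String> filters, Predicate<String> supported) {
        Split split = new Split();
        for (String filter : filters) {
            if (policy == FilterHandlingPolicy.ALWAYS && supported.test(filter)) {
                split.accepted.add(filter);
            } else {
                split.remaining.add(filter);
            }
        }
        return split;
    }

    public static void main(String[] args) {
        List<String> filters = List.of("id = 1", "my_udf(name) = 'x'");
        // Assumption for illustration: UDF calls cannot be translated to SQL.
        Predicate<String> supported = f -> !f.contains("my_udf");
        Split always = apply(FilterHandlingPolicy.ALWAYS, filters, supported);
        Split never = apply(FilterHandlingPolicy.NEVER, filters, supported);
        System.out.println(always.accepted + " / " + always.remaining);
        System.out.println(never.accepted + " / " + never.remaining);
    }
}
```

Under ALWAYS an unsupported filter is simply returned as remaining rather than raising an error, which matches the behavior discussed in the quoted reply below.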

Thanks,

Jiangjie (Becket) Qin

On Mon, Dec 18, 2023 at 4:48 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid>
wrote:

> Thanks Becket,
>
> The jdbc.filter.handling.policy is good to me as it provides sufficient
> extensibility for future filter pushdown optimizations.
> However, currently, we don't have an implementation for the AUTO mode, and
> it seems that the AUTO mode can easily be confused with the ALWAYS mode
> because users don't have the opportunity to MANUALLY decide which filters
> to push down.
>
> I suggest that we only introduce the ALWAYS and NEVER modes for now, and
> we can consider introducing more flexible policies in the future,
> such as INDEX_ONLY, NUMERIC_ONLY and so on.
>
> WDYT?
>
> Best,
> Jiabao
>
>
>
> > On Dec 18, 2023, at 16:27, Becket Qin <becket....@gmail.com> wrote:
> >
> > Hi Jiabao,
> >
> > Please see the reply inline.
> >
> >
> >> The MySQL connector is currently in the flink-connector-jdbc repository
> >> and is not a standalone connector.
> >> Is it too unique to use "mysql" as the configuration option prefix?
> >
> > If the intended behavior makes sense to all the supported JDBC drivers,
> we
> > can make this a JDBC connector configuration.
> >
> > Also, I would like to ask about the difference in behavior between AUTO
> and
> >> ALWAYS.
> >> It seems that we cannot guarantee the pushing down of all filters to the
> >> external system under the ALWAYS
> >> mode because not all filters in Flink SQL are supported by the external
> >> system.
> >> Should we throw an error when encountering a filter that cannot be
> pushed
> >> down in the ALWAYS mode?
> >
> > The idea of AUTO is to do efficiency-aware pushdowns. The source will
> query
> > the external system (MySQL, Oracle, SQL Server, etc) first to retrieve
> the
> > information of the table. With that information, the source will decide
> > whether to further push a filter to the external system based on the
> > efficiency. E.g. only push the indexed fields. In contrast, ALWAYS will
> > just always push the supported filters to the external system, regardless
> > of the efficiency. In case there are filters that are not supported,
> > according to the current contract of SupportsFilterPushdown, these
> > unsupported filters should just be returned by the
> > *SupportsFilterPushdown.applyFilters()* method as remaining filters.
> > Therefore, there is no need to throw exceptions here. This is likely the
> > desired behavior for most users, IMO. If there are cases that users
> really
> > want to get alerted when a filter cannot be pushed to the external
> system,
> > we can add another value like "ENFORCED_ALWAYS", which behaves like
> ALWAYS,
> > but throws exceptions when a filter cannot be applied to the external
> > system. But personally I don't see much value in doing this.
> >
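
To make the AUTO / ALWAYS distinction above concrete, here is a rough sketch. It assumes, purely for illustration, that each filter references a single column and that the source has already looked up the set of indexed columns from the external system. ColumnFilter, chooseAccepted, and the Policy enum are hypothetical names and not part of the connector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of efficiency-aware pushdown; names are illustrative only. */
final class AutoPushdownSketch {

    enum Policy { AUTO, ALWAYS, NEVER }

    /** Simplified stand-in for a filter: its column and whether SQL can express it. */
    static final class ColumnFilter {
        final String column;
        final boolean supported;

        ColumnFilter(String column, boolean supported) {
            this.column = column;
            this.supported = supported;
        }
    }

    /** Returns the filters to push down; all others stay with Flink as remaining filters. */
    static List<ColumnFilter> chooseAccepted(
            Policy policy, List<ColumnFilter> filters, Set<String> indexedColumns) {
        List<ColumnFilter> accepted = new ArrayList<>();
        for (ColumnFilter filter : filters) {
            boolean push;
            switch (policy) {
                case ALWAYS:
                    // Push whatever the external system can evaluate, indexed or not.
                    push = filter.supported;
                    break;
                case AUTO:
                    // Push only filters that are supported and hit an index.
                    push = filter.supported && indexedColumns.contains(filter.column);
                    break;
                default:
                    // NEVER: Flink evaluates everything.
                    push = false;
            }
            if (push) {
                accepted.add(filter);
            }
        }
        return accepted;
    }

    public static void main(String[] args) {
        List<ColumnFilter> filters = List.of(
                new ColumnFilter("id", true),
                new ColumnFilter("name", true),
                new ColumnFilter("payload", false));
        Set<String> indexed = Set.of("id");
        System.out.println(chooseAccepted(Policy.AUTO, filters, indexed).size());   // 1
        System.out.println(chooseAccepted(Policy.ALWAYS, filters, indexed).size()); // 2
        System.out.println(chooseAccepted(Policy.NEVER, filters, indexed).size());  // 0
    }
}
```

None of the three policies throws for an unsupported filter. An ENFORCED_ALWAYS-style value would differ only in raising an exception where this sketch leaves a filter out of the accepted list.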
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Mon, Dec 18, 2023 at 3:54 PM Jiabao Sun <jiabao....@xtransfer.cn
> .invalid>
> > wrote:
> >
> >> Hi Becket,
> >>
> >> The MySQL connector is currently in the flink-connector-jdbc repository
> >> and is not a standalone connector.
> >> Is it too unique to use "mysql" as the configuration option prefix?
> >>
> >> Also, I would like to ask about the difference in behavior between AUTO
> >> and ALWAYS.
> >> It seems that we cannot guarantee the pushing down of all filters to the
> >> external system under the ALWAYS
> >> mode because not all filters in Flink SQL are supported by the external
> >> system.
> >> Should we throw an error when encountering a filter that cannot be
> pushed
> >> down in the ALWAYS mode?
> >>
> >> Thanks,
> >> Jiabao
> >>
> >>> On Dec 18, 2023, at 15:34, Becket Qin <becket....@gmail.com> wrote:
> >>>
> >>> Hi Jiabao,
> >>>
> >>> Thanks for updating the FLIP. Maybe I did not explain it clearly
> enough.
> >> My
> >>> point is that given there are various good flavors of behaviors
> handling
> >>> filters pushed down, we should not have a common config of
> >>> "ignore.filter.pushdown", because the behavior is not *common*.
> >>>
> >>> It looks like the original motivation of this FLIP is just for MySql.
> >> Let's
> >>> focus on what is the best solution for MySql connector here first.
> After
> >>> that, if people think the best behavior for MySql happens to be a
> common
> >>> one, we can then discuss whether that is worth being added to the base
> >>> implementation of source. For MySQL, if we are going to introduce a
> >> config
> >>> to MySql, why not have something like "mysql.filter.handling.policy"
> with
> >>> value of AUTO / NEVER / ALWAYS? Isn't that better than
> >>> "ignore.filter.pushdown"?
> >>>
> >>> Thanks,
> >>>
> >>> Jiangjie (Becket) Qin
> >>>
> >>>
> >>>
> >>> On Sun, Dec 17, 2023 at 11:30 PM Jiabao Sun <jiabao....@xtransfer.cn
> >> .invalid>
> >>> wrote:
> >>>
> >>>> Hi Becket,
> >>>>
> >>>> The FLIP document has been updated as well.
> >>>> Please take a look when you have time.
> >>>>
> >>>> Thanks,
> >>>> Jiabao
> >>>>
> >>>>
> >>>>> On Dec 17, 2023, at 22:54, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:
> >>>>>
> >>>>> Thanks Becket,
> >>>>>
> >>>>> I apologize for not being able to continue with this proposal due to
> >>>> being too busy during this period.
> >>>>>
> >>>>> The viewpoints you shared about the design of Flink Source make sense
> >> to
> >>>> me.
> >>>>> The native configuration ‘ignore.filter.pushdown’ is good to me.
> >>>>> Having a unified name or naming style can indeed prevent confusion
> for
> >>>> users regarding
> >>>>> the inconsistent naming of this configuration across different
> >>>> connectors.
> >>>>>
> >>>>> Currently, there are not many external connectors that support filter
> >>>> pushdown.
> >>>>> I propose that we first introduce it in flink-connector-jdbc and
> >>>> flink-connector-mongodb.
> >>>>> Do you think this is feasible?
> >>>>>
> >>>>> Best,
> >>>>> Jiabao
> >>>>>
> >>>>>
> >>>>>> On Nov 16, 2023, at 17:45, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Jiabao,
> >>>>>>
> >>>>>> Arguments like "because Spark has it so Flink should also have it"
> >> do
> >>>> not
> >>>>>> make sense. Different projects have different API flavors and
> styles.
> >>>> What
> >>>>>> is really important is the rationale and the design principle behind
> >> the
> >>>>>> API. They should conform to the convention of the project.
> >>>>>>
> >>>>>> First of all, Spark Source API itself has a few issues and they
> ended
> >> up
> >>>>>> introducing DataSource V2 in Spark 3.0, which added the decorative
> >>>> interfaces
> >>>>>> like SupportsPushdownXXX. Some of the configurations predating
> >>>> DataSource
> >>>>>> V2 may still be there.
> >>>>>>
> >>>>>> For the Spark configurations you mentioned, they are all the
> >>>> configurations
> >>>>>> for FileScanBuilder, which is equivalent to FileSource in Flink.
> >>>> Currently,
> >>>>>> regardless of the format (ORC, Parquet, Avro, etc), the FileSource
> >>>> pushes
> >>>>>> back all the filters to ensure correctness. The actual filters that
> >> got
> >>>>>> applied to the specific format might still be different. This
> >>>>>> implementation is the same in FileScanBuilder.pushFilters() for
> >> Spark. I
> >>>>>> don't know why Spark got separate configurations for each format.
> >> Maybe
> >>>> it
> >>>>>> is because the filters are actually implemented differently for
> >>>> different
> >>>>>> formats.
> >>>>>>
> >>>>>> At least for the current implementation in FileScanBuilder, these
> >>>>>> configurations can be merged to one configuration like
> >>>>>> `apply.filters.to.format.enabled`. Note that this config, as well as
> >> the
> >>>>>> separate configs you mentioned, are just visible and used by the
> >>>>>> FileScanBuilder. It determines whether the filters should be passed
> >>>> down to
> >>>>>> the format of the FileScanBuilder instance. Regardless of the value
> of
> >>>>>> these configs, FileScanBuilder.pushFilters() will always be called,
> >> and
> >>>>>> FileScanBuilder (as well as FileSource in Flink) will always push
> back
> >>>> all
> >>>>>> the filters to the framework.
> >>>>>>
> >>>>>> A MySql source can have a very different way to handle this. For
> >>>> example, a
> >>>>>> config for a MySql source might be "my.apply.filters" with
> >>>> three
> >>>>>> different values:
> >>>>>> - AUTO: The Source will issue a DESC Table request to understand
> >>>> whether a
> >>>>>> filter can be applied efficiently. And decide which filters can be
> >>>> applied
> >>>>>> and which cannot based on that.
> >>>>>> - NEVER: Never apply filtering. It will always do a full table read
> >> and
> >>>>>> let Flink do the filtering.
> >>>>>> - ALWAYS: Always apply the filtering to the MySql server.
> >>>>>>
> >>>>>> In the above examples of FileSource and MySql Source, I don't think
> it
> >>>> is a
> >>>>>> good idea to shoehorn the behaviors into a naive config of
> >>>>>> `ignore.filter.pushdown`. That is why I don't think this is a common
> >>>> config.
> >>>>>>
> >>>>>> To recap, like I said, I do agree that in some cases, we may want to
> >>>> behave
> >>>>>> differently when filters are pushed down to the sources, even if a
> >>>> source
> >>>>>> implements SupportsFilterPushDown, but I don't think there is a
> >> suitable
> >>>>>> common config for this. The behavior is very likely source specific.
> >>>>>>
> >>>>>> Thanks,
> >>>>>>
> >>>>>> Jiangjie (Becket) Qin
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On Thu, Nov 16, 2023 at 3:41 PM Jiabao Sun <jiabao....@xtransfer.cn
> >>>> .invalid>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks Becket,
> >>>>>>>
> >>>>>>> I still believe that adding a configuration at the source level to
> >>>> disable
> >>>>>>> filter pushdown is needed. This demand exists in spark as well[1].
> >>>>>>>
> >>>>>>> In Spark, most sources that support filter pushdown provide their
> own
> >>>>>>> corresponding configuration options to enable or disable filter
> >>>> pushdown.
> >>>>>>> For PRs[2-4] that support filter pushdown capability, they also
> >> provide
> >>>>>>> configuration options to disable this capability.
> >>>>>>>
> >>>>>>> I believe this configuration is applicable to most scenarios, and
> >>>> there is
> >>>>>>> no need to dwell on why this configuration option was not
> introduced
> >>>>>>> earlier than the SupportsFilterPushDown interface.
> >>>>>>>
> >>>>>>> spark.sql.parquet.filterPushdown
> >>>>>>> spark.sql.orc.filterPushdown
> >>>>>>> spark.sql.csv.filterPushdown.enabled
> >>>>>>> spark.sql.json.filterPushdown.enabled
> >>>>>>> spark.sql.avro.filterPushdown.enabled
> >>>>>>> JDBC Option: pushDownPredicate
> >>>>>>>
> >>>>>>> We can see that the lack of consistency is caused by each connector
> >>>>>>> introducing different configuration options for the same behavior.
> >>>>>>> This is one of the motivations for advocating the introduction of a
> >>>>>>> unified configuration name.
> >>>>>>>
> >>>>>>> [1] https://issues.apache.org/jira/browse/SPARK-24288
> >>>>>>> [2] https://github.com/apache/spark/pull/27366
> >>>>>>> [3] https://github.com/apache/spark/pull/26973
> >>>>>>> [4] https://github.com/apache/spark/pull/29145
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Jiabao
> >>>>>>>
> >>>>>>>> On Nov 16, 2023, at 08:10, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> Hi Jiabao,
> >>>>>>>>
> >>>>>>>> While we can always fix the formality of the config, a more
> >>>> fundamental
> >>>>>>>> issue here is whether this configuration is common enough.
> >> Personally
> >>>> I
> >>>>>>> am
> >>>>>>>> still not convinced it is.
> >>>>>>>>
> >>>>>>>> Remember we don't have a common implementation for
> >>>> SupportsFilterPushdown
> >>>>>>>> itself. Why does a potential behavior of the
> >>>>>>>> SupportsFilterPushdown.applyFilters() method deserve a common
> >>>>>>>> configuration? A common implementation should always come first,
> >> then
> >>>> its
> >>>>>>>> configuration becomes a common configuration as a natural result.
> >> But
> >>>>>>> here
> >>>>>>>> we are trying to add an impl to a configuration just to fix its
> >>>>>>> formality.
> >>>>>>>>
> >>>>>>>> I agree that there might be a few Source implementations that may
> >>>> want to
> >>>>>>>> avoid additional burdens on the remote system in some
> circumstances.
> >>>> And
> >>>>>>>> these circumstances are very specific:
> >>>>>>>> 1. The source talks to a remote service that can help perform the
> >>>> actual
> >>>>>>>> filtering.
> >>>>>>>> 2. The filtering done by the remote service is inefficient for
> some
> >>>>>>> reason
> >>>>>>>> (e.g. missing index)
> >>>>>>>> 3. The external service does not want to perform the inefficient
> >>>>>>> filtering
> >>>>>>>> for some reason (e.g. it is a shared service with others)
> >>>>>>>>
> >>>>>>>> There are multiple approaches to address the issue. Pushing back
> the
> >>>>>>>> filters is just one way of achieving this. So here we are talking
> >>>> about a
> >>>>>>>> config for one of the possible solutions to a scenario with all
> the
> >>>> above
> >>>>>>>> situations. I don't think there is enough justification for the
> >>>> config to
> >>>>>>>> be common.
> >>>>>>>>
> >>>>>>>> There is always this trade-off between the proliferation of public
> >>>>>>>> interfaces and the API standardization. As an extreme example, we
> >> can
> >>>>>>> make
> >>>>>>>> our public API a union of all the configs potentially used in all
> >> the
> >>>>>>> cases
> >>>>>>>> in the name of standardization. Apparently this won't work. So
> there
> >>>> must
> >>>>>>>> be a bar here and this bar might be somewhat subjective. For this
> >>>> FLIP,
> >>>>>>>> personally I don't think the config meets my bar for the reason
> >> stated
> >>>>>>>> above.
> >>>>>>>>
> >>>>>>>> Therefore, my suggestion remains the same. Keep the config as a
> >> Source
> >>>>>>>> implementation specific configuration.
> >>>>>>>>
> >>>>>>>> Thanks,
> >>>>>>>>
> >>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Thu, Nov 16, 2023 at 12:36 AM Jiabao Sun <
> >> jiabao....@xtransfer.cn
> >>>>>>> .invalid>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks Becket for the feedback,
> >>>>>>>>>
> >>>>>>>>> Regarding concerns about common configurations, I think we can
> >>>> introduce
> >>>>>>>>> FiltersApplier to unify the behavior of various connectors.
> >>>>>>>>>
> >>>>>>>>> public static class FiltersApplier {
> >>>>>>>>>
> >>>>>>>>> private final ReadableConfig config;
> >>>>>>>>> private final Function<List<ResolvedExpression>, Result> action;
> >>>>>>>>>
> >>>>>>>>> private FiltersApplier(
> >>>>>>>>>        ReadableConfig config,
> >>>>>>>>>        Function<List<ResolvedExpression>, Result> action) {
> >>>>>>>>>    this.config = config;
> >>>>>>>>>    this.action = action;
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
> >>>>>>>>>    if (config.get(ENABLE_FILTER_PUSH_DOWN)) {
> >>>>>>>>>        return action.apply(filters);
> >>>>>>>>>    } else {
> >>>>>>>>>        return Result.of(Collections.emptyList(), filters);
> >>>>>>>>>    }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> public static FiltersApplier of(
> >>>>>>>>>        ReadableConfig config,
> >>>>>>>>>        Function<List<ResolvedExpression>, Result> action) {
> >>>>>>>>>    return new FiltersApplier(config, action);
> >>>>>>>>> }
> >>>>>>>>> }
> >>>>>>>>>
> >>>>>>>>> For connectors implementation:
> >>>>>>>>>
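> >>>>>>>>> @Override
> >>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) {
> >>>>>>>>>    // Accept all filters when pushdown is enabled; otherwise
> >>>>>>>>>    // FiltersApplier returns them all as remaining filters.
> >>>>>>>>>    return FiltersApplier.of(config,
> >>>>>>>>>            f -> Result.of(new ArrayList<>(f), Collections.emptyList()))
> >>>>>>>>>        .applyFilters(filters);
> >>>>>>>>> }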
> >>>>>>>>>
> >>>>>>>>> As for the name, whether it is "source.filter-push-down.enabled"
> or
> >>>>>>>>> "source.ignore-pushed-down-filters.enabled", I think both are
> okay.
> >>>>>>>>>
> >>>>>>>>> Do you think this change is feasible?
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> Best,
> >>>>>>>>> Jiabao
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>> On Nov 15, 2023, at 23:44, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>
> >>>>>>>>>> Yes, I still have concerns.
> >>>>>>>>>>
> >>>>>>>>>> The FLIP violates the following two principles regarding
> >>>> configuration:
> >>>>>>>>>>
> >>>>>>>>>> 1. *A config of a class should never negate the semantic of a
> >>>>>>> decorative
> >>>>>>>>>> interface implemented by that class.*
> >>>>>>>>>> A decorative interface is a public contract with other
> components,
> >>>>>>> while
> >>>>>>>>> a
> >>>>>>>>>> config is only internal to the class itself. The configurations
> >> for
> >>>> the
> >>>>>>>>>> Sources are not (and should never be) visible or understood to
> >>>>>>>>>> other components (e.g. optimizer). A configuration of a Source
> >> only
> >>>>>>>>>> controls the behavior of that Source, provided it is not
> violating
> >>>> the
> >>>>>>>>> API
> >>>>>>>>>> contract / semantic defined by the decorative interface. So
> when a
> >>>>>>> Source
> >>>>>>>>>> implementation implements SupportsFilterPushdown, this is a
> clear
> >>>>>>> public
> >>>>>>>>>> contract with Flink that filters should be pushed down to that
> >>>> Source.
> >>>>>>>>>> Therefore, for the same source, there should not be a
> >> configuration
> >>>>>>>>>> "source.filter-push-down.enabled" which stops the filters from
> >> being
> >>>>>>>>> pushed
> >>>>>>>>>> down to that Source. However, that specific source
> implementation
> >>>> can
> >>>>>>>>> have
> >>>>>>>>>> its own config to control its internal behavior, e.g.
> >>>>>>>>>> "ignore-pushed-down-filters.enabled" which may push back all the
> >>>> pushed
> >>>>>>>>>> down filters back to the Flink optimizer.
> >>>>>>>>>>
> >>>>>>>>>> 2. When we are talking about "common configs", in fact we are
> >>>> talking
> >>>>>>>>> about
> >>>>>>>>>> "configs for common (abstract) implementation classes". With
> that
> >>>> as a
> >>>>>>>>>> context, *a common config should always be backed by a common
> >>>>>>>>>> implementation class, so that consistent behavior can be
> >>>> guaranteed.*
> >>>>>>>>>> The LookupOptions you mentioned are configurations defined for
> >>>> classes
> >>>>>>>>>> DefaultLookupCache / PeriodicCacheReloadTrigger /
> >>>>>>>>> TimedCacheReloadTrigger.
> >>>>>>>>>> These configs are considered as "common" only because the
> >>>>>>> implementation
> >>>>>>>>>> classes using them are common building blocks for lookup table
> >>>>>>>>>> implementations. It would not make sense to have a dangling
> config
> >>>> in
> >>>>>>> the
> >>>>>>>>>> LookupOptions without the underlying common implementation
> class,
> >>>> but
> >>>>>>>>> only
> >>>>>>>>>> relies on a specific source to implement the stated behavior.
> >>>>>>>>>> As a bad example, there is this outlier config "max-retries" in
> >>>>>>>>>> LookupOptions, which I don't think should be here. This is
> because
> >>>> the
> >>>>>>>>>> retry behavior can be very implementation specific. For example,
> >>>> there
> >>>>>>>>> can
> >>>>>>>>>> be many different flavors of retry related configurations,
> >>>>>>> retry-backoff,
> >>>>>>>>>> retry-timeout, retry-async, etc. Why is only max-retry put here?
> >>>> Should
> >>>>>>>>> all
> >>>>>>>>>> of them be put here? If we put all such kinds of configs in the
> >>>> common
> >>>>>>>>>> configs for "standardization and unification", the number of
> >> "common
> >>>>>>>>>> configs" can easily go crazy. And I don't see material benefits
> of
> >>>>>>> doing
> >>>>>>>>>> that. So here I don't think the configuration "max-retry" should
> >> be
> >>>> in
> >>>>>>>>>> LookupOptions, because it is not backed by any common
> >> implementation
> >>>>>>>>>> classes. If max-retry is implemented in the HBase source, it
> >> should
> >>>>>>> stay
> >>>>>>>>>> there. For the same reason, the config proposed in this FLIP
> >>>> (probably
> >>>>>>>>> with
> >>>>>>>>>> a name less confusing for the first reason mentioned above)
> >> should
> >>>>>>> stay
> >>>>>>>>> in
> >>>>>>>>>> the specific Source implementation.
> >>>>>>>>>>
> >>>>>>>>>> For the two reasons above, I am -1 to what the FLIP currently
> >>>> proposes.
> >>>>>>>>>>
> >>>>>>>>>> I think the right way to address the motivation here is still to
> >>>> have a
> >>>>>>>>>> config like "ignore-pushed-down-filters.enabled" for the
> specific
> >>>>>>> source
> >>>>>>>>>> implementation. Please let me know if this solves the problem
> you
> >>>> are
> >>>>>>>>>> facing.
> >>>>>>>>>>
> >>>>>>>>>> Thanks,
> >>>>>>>>>>
> >>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Nov 15, 2023 at 11:52 AM Jiabao Sun <
> >>>> jiabao....@xtransfer.cn
> >>>>>>>>> .invalid>
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Becket,
> >>>>>>>>>>>
> >>>>>>>>>>> The purpose of introducing this configuration is that not all
> >>>> filter
> >>>>>>>>>>> pushdowns can improve overall performance.
> >>>>>>>>>>> If the filter can hit the external index, then pushdown is
> >>>> definitely
> >>>>>>>>>>> worth it, as it can not only improve query time but also
> decrease
> >>>>>>>>> network
> >>>>>>>>>>> overhead.
> >>>>>>>>>>> However, for filters that do not hit the external index, it may
> >>>>>>>>> increase a
> >>>>>>>>>>> lot of performance overhead on the external system.
> >>>>>>>>>>>
> >>>>>>>>>>> Undeniably, if the connector can make accurate decisions for
> good
> >>>> and
> >>>>>>>>> bad
> >>>>>>>>>>> filters, we may not need to introduce this configuration option
> >> to
> >>>>>>>>> disable
> >>>>>>>>>>> pushing down filters to the external system.
> >>>>>>>>>>> However, it is currently not easy to achieve.
> >>>>>>>>>>>
> >>>>>>>>>>> IMO, supporting filter pushdown does not mean that filter pushdown
> >>>>>>>>>>> is always better.
> >>>>>>>>>>> In the absence of automatic decision-making, I think we should
> >>>> leave
> >>>>>>>>> this
> >>>>>>>>>>> decision to users.
> >>>>>>>>>>>
> >>>>>>>>>>> The newly introduced configuration option is similar to
> >>>> LookupOptions,
> >>>>>>>>>>> providing unified naming and default values to avoid confusion
> >>>> caused
> >>>>>>> by
> >>>>>>>>>>> inconsistent naming in different connectors for users.
> >>>>>>>>>>> Setting the default value to true allows it to maintain
> >>>> compatibility
> >>>>>>>>> with
> >>>>>>>>>>> the default behavior of "always pushdown".
> >>>>>>>>>>>
> >>>>>>>>>>> Do you have any other concerns about this proposal? Please let
> me
> >>>>>>> know.
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks,
> >>>>>>>>>>> Jiabao
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>> On Oct 31, 2023, at 17:29, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Hi Becket,
> >>>>>>>>>>>>
> >>>>>>>>>>>> Actually, for FileSystemSource, it is not always desired, only
> >> ORC
> >>>>>>> file
> >>>>>>>>>>> formats support filter pushdown.
> >>>>>>>>>>>>
> >>>>>>>>>>>> We can disable predicate pushdown for FileSystemSource by
> >> setting
> >>>>>>>>>>> 'table.optimizer.source.predicate-pushdown-enabled' to false.
> >>>>>>>>>>>> I think we can also disable filter pushdown at a more granular
> >>>> level
> >>>>>>>>>>> through fine-grained configuration.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best,
> >>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>> On Oct 31, 2023, at 16:50, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for the explanation. Maybe it's easier to explain with
> >> an
> >>>>>>>>>>> example.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Let's take FileSystemTableSource as an example. Currently it
> >>>>>>>>> implements
> >>>>>>>>>>>>> SupportsFilterPushDown interface. With your proposal, does it
> >>>> have
> >>>>>>> to
> >>>>>>>>>>>>> support `source.filter-push-down.enabled` as well? But this
> >>>>>>>>>>> configuration
> >>>>>>>>>>>>> does not quite make sense for the FileSystemTableSource
> because
> >>>>>>> filter
> >>>>>>>>>>>>> pushdown is always desired. However, because this
> configuration
> >>>> is a
> >>>>>>>>>>> part
> >>>>>>>>>>>>> of the SupportsFilterPushDown interface (which sounds
> confusing
> >>>> to
> >>>>>>>>> begin
> >>>>>>>>>>>>> with), the FileSystemTableSource can only do one of the
> >>>> following:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> 1. Ignore the user configuration to always apply the pushed
> >> down
> >>>>>>>>>>> filters -
> >>>>>>>>>>>>> this is an apparent anti-pattern because a configuration
> should
> >>>>>>> always
> >>>>>>>>>>> do
> >>>>>>>>>>>>> what it says.
> >>>>>>>>>>>>> 2. Throw an exception telling users that this configuration
> is
> >>>> not
> >>>>>>>>>>>>> applicable to the FileSystemTableSource.
> >>>>>>>>>>>>> 3. Implement this configuration to push back the pushed down
> >>>>>>> filters,
> >>>>>>>>>>> even
> >>>>>>>>>>>>> though this is never desired.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> None of the above options looks good. I am curious what
> your
> >>>>>>>>>>> solution is
> >>>>>>>>>>>>> here?
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun <
> >>>> jiabao....@xtransfer.cn
> >>>>>>>>>>> .invalid>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Thanks Becket for the further explanation.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Perhaps I didn't explain it clearly.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 1. If a source does not implement the SupportsFilterPushDown
> >>>>>>>>> interface,
> >>>>>>>>>>>>>> the newly added configurations do not need to be added to
> >> either
> >>>>>>> the
> >>>>>>>>>>>>>> requiredOptions or optionalOptions.
> >>>>>>>>>>>>>> Similar to LookupOptions, if a source does not implement
> >>>>>>>>>>>>>> LookupTableSource, there is no need to add LookupOptions to
> >>>> either
> >>>>>>>>>>>>>> requiredOptions or optionalOptions.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> 2. "And these configs are specific to those sources, instead
> >> of
> >>>>>>>>> common
> >>>>>>>>>>>>>> configs."
> >>>>>>>>>>>>>> The newly introduced configurations define standardized
> names
> >>>> and
> >>>>>>>>>>> default
> >>>>>>>>>>>>>> values.
> >>>>>>>>>>>>>> They still belong to the configuration at the individual
> >> source
> >>>>>>>>> level.
> >>>>>>>>>>>>>> The purpose is to avoid scattered configuration items when
> >>>>>>> different
> >>>>>>>>>>>>>> sources implement the same logic.
> >>>>>>>>>>>>>> Whether a source should accept these configurations is
> >>>> determined
> >>>>>>> by
> >>>>>>>>>>> the
> >>>>>>>>>>>>>> source's Factory.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Oct 31, 2023, at 13:47, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Please see the replies inline.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Introducing common configurations does not mean that all
> >>>> sources
> >>>>>>>>> must
> >>>>>>>>>>>>>>>> accept these configuration options.
> >>>>>>>>>>>>>>>> The configuration options supported by a source are
> >>>> determined by
> >>>>>>>>> the
> >>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
> >> interface.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This is not true. Both required and optional options are
> >>>>>>> SUPPORTED.
> >>>>>>>>>>> That
> >>>>>>>>>>>>>>> means they are implemented and if one specifies an optional
> >>>> config
> >>>>>>>>> it
> >>>>>>>>>>>>>> will
> >>>>>>>>>>>>>>> still take effect. An OptionalConfig is "Optional" because
> >> this
> >>>>>>>>>>>>>>> configuration has a default value. Hence, it is OK that
> users
> >>>> do
> >>>>>>> not
> >>>>>>>>>>>>>>> specify their own value. In other words, it is "optional"
> >> for
> >>>> the
> >>>>>>>>> end
> >>>>>>>>>>>>>>> users to set the config, but the implementation and support
> >> for
> >>>>>>> that
> >>>>>>>>>>>>>> config
> >>>>>>>>>>>>>>> is NOT optional. In case a source does not support a common
> >>>>>>> config,
> >>>>>>>>> an
> >>>>>>>>>>>>>>> exception must be thrown when the config is provided by the
> >> end
> >>>>>>>>> users.
> >>>>>>>>>>>>>>> However, the config we are talking about in this FLIP is a
> >>>> common
> >>>>>>>>>>> config
> >>>>>>>>>>>>>>> optional to implement, meaning that sometimes the claimed
> >>>> behavior
> >>>>>>>>>>> won't
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>> there even if users specify that config.
> >>>>>>>>>>>>>>>
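
The distinction above can be sketched as follows: an optional option is one the user may omit because it has a default, while an option the source does not declare at all is rejected. This is a simplified illustration and not Flink's actual factory validation code. The validate method and the option names used in main are hypothetical.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of required/optional option validation. */
final class OptionValidationSketch {

    /**
     * Required options must be present. Optional options may be omitted,
     * but if present they are supported and take effect. Anything that is
     * neither required nor optional is rejected with an exception.
     */
    static void validate(
            Map<String, String> userOptions, Set<String> required, Set<String> optional) {
        for (String key : required) {
            if (!userOptions.containsKey(key)) {
                throw new IllegalArgumentException("Missing required option: " + key);
            }
        }
        Set<String> supported = new HashSet<>(required);
        supported.addAll(optional);
        for (String key : userOptions.keySet()) {
            if (!supported.contains(key)) {
                throw new IllegalArgumentException("Unsupported option: " + key);
            }
        }
    }

    public static void main(String[] args) {
        // Illustrative option names only.
        Set<String> required = Set.of("url");
        Set<String> optional = Set.of("fetch-size");

        // Omitting an optional option is fine: its default applies.
        validate(Map.of("url", "jdbc:mysql://localhost/db"), required, optional);

        try {
            // An option the source never declared must not be silently ignored.
            validate(
                    Map.of("url", "jdbc:mysql://localhost/db", "filter.policy", "NEVER"),
                    required,
                    optional);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```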
> >>>>>>>>>>>>>>> Similar to sources that do not implement the
> >> LookupTableSource
> >>>>>>>>>>> interface,
> >>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
> >>>>>>> interface
> >>>>>>>>>>> also
> >>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>> not need to accept newly introduced options.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> First of all, filter pushdown is a behavior of the query
> >>>>>>> optimizer,
> >>>>>>>>>>> not
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> behavior of Sources. A Source tells the optimizer that
> it
> >>>> has
> >>>>>>> the
> >>>>>>>>>>>>>>> ability to accept pushed down filters by implementing the
> >>>>>>>>>>>>>>> SupportsFilterPushDown interface. And this is the only
> >> contract
> >>>>>>>>>>> between
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>> Source and Optimizer regarding whether filters should be
> >> pushed
> >>>>>>>>> down.
> >>>>>>>>>>> As
> >>>>>>>>>>>>>>> long as a specific source implements this decorative
> >> interface,
> >>>>>>>>> filter
> >>>>>>>>>>>>>>> pushdown should always take place, i.e.
> >>>>>>>>>>>>>>> *SupportsFilterPushDown.applyFilters()* will be called.
> There
> >>>>>>> should
> >>>>>>>>>>> be
> >>>>>>>>>>>>>> no
> >>>>>>>>>>>>>>> other config to disable that call. However, Sources can
> >> decide
> >>>> how
> >>>>>>>>> to
> >>>>>>>>>>>>>>> behave based on their own configurations after
> >>>> *applyFilters()* is
> >>>>>>>>>>>>>> called.
> >>>>>>>>>>>>>>> And these configs are specific to those sources, instead of
> >>>> common
> >>>>>>>>>>>>>> configs.
> >>>>>>>>>>>>>>> Please see the examples I mentioned in the previous email.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun <
> >>>>>>>>> jiabao....@xtransfer.cn
> >>>>>>>>>>>>>> .invalid>
> >>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Hi Becket,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Sorry, there was a typo in the second point. Let me
> correct
> >>>> it:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Introducing common configurations does not mean that all
> >>>> sources
> >>>>>>>>> must
> >>>>>>>>>>>>>>>> accept these configuration options.
> >>>>>>>>>>>>>>>> The configuration options supported by a source are
> >>>> determined by
> >>>>>>>>> the
> >>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory
> >> interface.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Similar to sources that do not implement the
> >> LookupTableSource
> >>>>>>>>>>>>>> interface,
> >>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown
> >>>>>>> interface
> >>>>>>>>>>> also
> >>>>>>>>>>>>>> do
> >>>>>>>>>>>>>>>> not need to accept newly introduced options.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Oct 31, 2023, at 10:13, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks Becket for the feedback.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 1. Currently, the SupportsFilterPushDown#applyFilters method
> >>>>>>>>>>>>>>>>> returns a result that includes acceptedFilters and
> >>>>>>>>>>>>>>>>> remainingFilters. The source can decide to push down some filters
> >>>>>>>>>>>>>>>>> or not accept any of them.
> >>>>>>>>>>>>>>>>> 2. Introducing common configuration options does not mean that a
> >>>>>>>>>>>>>>>>> source that supports the SupportsFilterPushDown capability must
> >>>>>>>>>>>>>>>>> accept this configuration. Similar to LookupOptions, only sources
> >>>>>>>>>>>>>>>>> that implement the LookupTableSource interface need to accept
> >>>>>>>>>>>>>>>>> these configuration options.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Oct 31, 2023, at 07:49, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Hi Jiabao and Ruanhang,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Adding a configuration of source.filter-push-down.enabled as a
> >>>>>>>>>>>>>>>>>> common source configuration seems problematic.
> >>>>>>>>>>>>>>>>>> 1. The config name is misleading. Filter pushdown should only be
> >>>>>>>>>>>>>>>>>> determined by whether the SupportsFilterPushDown interface is
> >>>>>>>>>>>>>>>>>> implemented or not.
> >>>>>>>>>>>>>>>>>> 2. The behavior of this configuration is only applicable to some
> >>>>>>>>>>>>>>>>>> source implementations. Why is it a common configuration?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Here's my suggestion for design principles:
> >>>>>>>>>>>>>>>>>> 1. Only add source-impl-specific configuration to the
> >>>>>>>>>>>>>>>>>> corresponding sources.
> >>>>>>>>>>>>>>>>>> 2. The configuration name should not overrule existing common
> >>>>>>>>>>>>>>>>>> contracts.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> For example, in the case of the MySql source, there are several
> >>>>>>>>>>>>>>>>>> options:
> >>>>>>>>>>>>>>>>>> 1. Have a configuration of `mysql.avoid.remote.full.table.scan`.
> >>>>>>>>>>>>>>>>>> If this configuration is set and a filter pushdown does not hit
> >>>>>>>>>>>>>>>>>> an index, the MySql source impl would not further push the filter
> >>>>>>>>>>>>>>>>>> down to the MySql servers. Note that this assumes the MySql
> >>>>>>>>>>>>>>>>>> source can retrieve the index information from the MySql servers.
> >>>>>>>>>>>>>>>>>> 2. If the MySql index information is not available to the MySql
> >>>>>>>>>>>>>>>>>> source, the configuration could be something like
> >>>>>>>>>>>>>>>>>> `mysql.pushback.pushed.down.filters`. Once set to true, the MySql
> >>>>>>>>>>>>>>>>>> source would just add all the filters to the RemainingFilters in
> >>>>>>>>>>>>>>>>>> the Result returned by SupportsFilterPushDown.applyFilters().
> >>>>>>>>>>>>>>>>>> 3. An alternative to option 2 is to have a
> >>>>>>>>>>>>>>>>>> `mysql.apply.predicates.after.scan`. When it is set to true, the
> >>>>>>>>>>>>>>>>>> MySql source will not push the filter down to the MySql servers,
> >>>>>>>>>>>>>>>>>> but apply the filters inside the MySql source itself.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> As you may see, the above configurations do not disable filter
> >>>>>>>>>>>>>>>>>> pushdown itself. They just allow various implementations of
> >>>>>>>>>>>>>>>>>> filter pushdown. And the configuration name does not give any
> >>>>>>>>>>>>>>>>>> illusion that filter pushdown is disabled.
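The "push back" idea in option 2 can be illustrated with a minimal, self-contained Java sketch. It is not the Flink API: `FilterResult` is a hypothetical stand-in for the accepted/remaining pair that a source returns from applyFilters, filters are modeled as strings, and the `pushBack` flag stands for the proposed (not existing) `mysql.pushback.pushed.down.filters` option.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class PushbackSketch {
    /**
     * Models a source's filter handling. The interface is still "implemented" either way;
     * with pushBack = true the source accepts nothing and hands every filter back as remaining,
     * so the filters are evaluated after the scan instead of in the external system.
     */
    static FilterResult applyFilters(List<String> filters, boolean pushBack) {
        if (pushBack) {
            return new FilterResult(Collections.<String>emptyList(), filters);
        }
        // Simplification: without push-back, assume the external system can evaluate all filters.
        return new FilterResult(filters, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        List<String> filters = Arrays.asList("id > 100", "name = 'flink'");
        FilterResult pushedBack = applyFilters(filters, true);
        System.out.println("accepted=" + pushedBack.acceptedFilters
                + " remaining=" + pushedBack.remainingFilters);
    }
}

/** Hypothetical stand-in for the accepted/remaining filter lists (assumption, not the Flink type). */
final class FilterResult {
    final List<String> acceptedFilters;
    final List<String> remainingFilters;

    FilterResult(List<String> accepted, List<String> remaining) {
        this.acceptedFilters = Collections.unmodifiableList(new ArrayList<>(accepted));
        this.remainingFilters = Collections.unmodifiableList(new ArrayList<>(remaining));
    }
}
```

The design point is that the option changes how the source answers the pushdown request, not whether the planner offers filters to it.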
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks Hang for the suggestion.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think the configuration of TableSource is not closely related
> >>>>>>>>>>>>>>>>>>> to SourceReader, so I prefer to introduce an independent
> >>>>>>>>>>>>>>>>>>> configuration class TableSourceOptions in the flink-table-common
> >>>>>>>>>>>>>>>>>>> module, similar to LookupOptions.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> For the second point, I suggest adding Javadoc to the
> >>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interfaces, providing detailed information on
> >>>>>>>>>>>>>>>>>>> the options that need to be supported.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I have made updates in the FLIP document.
> >>>>>>>>>>>>>>>>>>> Please help check it again.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Oct 30, 2023, at 17:23, Hang Ruan <ruanhang1...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Thanks for the improvements, Jiabao.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> There are some details that I am not sure about.
> >>>>>>>>>>>>>>>>>>>> 1. Which class will the new option
> >>>>>>>>>>>>>>>>>>>> `source.filter-push-down.enabled` be added to? I think it should
> >>>>>>>>>>>>>>>>>>>> be `SourceReaderOptions`.
> >>>>>>>>>>>>>>>>>>>> 2. How will connector developers know about and follow the FLIP?
> >>>>>>>>>>>>>>>>>>>> Do we need an abstract base class, or should we provide a default
> >>>>>>>>>>>>>>>>>>>> method?
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>> Hang
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote on Mon, Oct 30, 2023 at 14:45:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Hi, all,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Thanks for the lively discussion.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Based on the discussion, I have made some adjustments to the FLIP
> >>>>>>>>>>>>>>>>>>>>> document:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 1. The name of the newly added option has been changed to
> >>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled".
> >>>>>>>>>>>>>>>>>>>>> 2. For compatibility with older versions, the newly added
> >>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled" option needs to respect the
> >>>>>>>>>>>>>>>>>>>>> optimizer's "table.optimizer.source.predicate-pushdown-enabled"
> >>>>>>>>>>>>>>>>>>>>> option. That said, we are considering removing the old option in
> >>>>>>>>>>>>>>>>>>>>> Flink 2.0.
> >>>>>>>>>>>>>>>>>>>>> 3. We can provide more options to disable other source abilities
> >>>>>>>>>>>>>>>>>>>>> with side effects, such as "source.aggregate.enabled" and
> >>>>>>>>>>>>>>>>>>>>> "source.projection.enabled". This is not urgent and can be
> >>>>>>>>>>>>>>>>>>>>> introduced incrementally.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback again.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Oct 29, 2023, at 08:45, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Thanks for digging into the git history, Jark. I agree it makes
> >>>>>>>>>>>>>>>>>>>>>> sense to deprecate this API in 2.0.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu <imj...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Hi Becket,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> I checked the history of
> >>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled". It seems it
> >>>>>>>>>>>>>>>>>>>>>>> was introduced along with the legacy FilterableTableSource
> >>>>>>>>>>>>>>>>>>>>>>> interface, which might have been an experimental feature at that
> >>>>>>>>>>>>>>>>>>>>>>> time. I don't see the necessity of this option at the moment.
> >>>>>>>>>>>>>>>>>>>>>>> Maybe we can deprecate this option and drop it in Flink 2.0 [1]
> >>>>>>>>>>>>>>>>>>>>>>> if it is not necessary anymore. This may help to simplify this
> >>>>>>>>>>>>>>>>>>>>>>> discussion.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> [1]: https://issues.apache.org/jira/browse/FLINK-32383
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin <becket....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents below:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 1. If I understand correctly, the motivation of the FLIP is
> >>>>>>>>>>>>>>>>>>>>>>>> mainly to make predicate pushdown optional on SOME of the
> >>>>>>>>>>>>>>>>>>>>>>>> Sources. If so, intuitively the configuration should be Source
> >>>>>>>>>>>>>>>>>>>>>>>> specific instead of general. Otherwise, we will end up with
> >>>>>>>>>>>>>>>>>>>>>>>> general configurations that may not take effect for some of the
> >>>>>>>>>>>>>>>>>>>>>>>> Source implementations. This violates the basic rule of a
> >>>>>>>>>>>>>>>>>>>>>>>> configuration: it does what it says, regardless of the
> >>>>>>>>>>>>>>>>>>>>>>>> implementation. While configuration standardization is usually a
> >>>>>>>>>>>>>>>>>>>>>>>> good thing, it should not break the basic rules.
> >>>>>>>>>>>>>>>>>>>>>>>> If we really want to have this general configuration, the
> >>>>>>>>>>>>>>>>>>>>>>>> sources to which it does not apply should throw an exception to
> >>>>>>>>>>>>>>>>>>>>>>>> make it clear that this configuration is not supported. However,
> >>>>>>>>>>>>>>>>>>>>>>>> that seems ugly.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 2. I think the actual motivation of this FLIP is about "how a
> >>>>>>>>>>>>>>>>>>>>>>>> source should implement predicate pushdown efficiently", not
> >>>>>>>>>>>>>>>>>>>>>>>> "whether predicate pushdown should be applied to the source."
> >>>>>>>>>>>>>>>>>>>>>>>> For example, if a source wants to avoid additional computing
> >>>>>>>>>>>>>>>>>>>>>>>> load in the external system, it can always read the entire
> >>>>>>>>>>>>>>>>>>>>>>>> record and apply the predicates by itself. However, from the
> >>>>>>>>>>>>>>>>>>>>>>>> Flink perspective, the predicate pushdown is still applied; it
> >>>>>>>>>>>>>>>>>>>>>>>> is just implemented differently by the source. So the design
> >>>>>>>>>>>>>>>>>>>>>>>> principle here is that Flink only cares about whether a source
> >>>>>>>>>>>>>>>>>>>>>>>> supports predicate pushdown or not; it does not care about the
> >>>>>>>>>>>>>>>>>>>>>>>> implementation efficiency or side effects of the predicate
> >>>>>>>>>>>>>>>>>>>>>>>> pushdown. It is the Source implementation's responsibility to
> >>>>>>>>>>>>>>>>>>>>>>>> ensure the predicate pushdown is implemented efficiently and
> >>>>>>>>>>>>>>>>>>>>>>>> does not impose excessive pressure on the external system. And
> >>>>>>>>>>>>>>>>>>>>>>>> it is OK to have additional configurations to achieve this goal.
> >>>>>>>>>>>>>>>>>>>>>>>> Obviously, such configurations will be source specific in this
> >>>>>>>>>>>>>>>>>>>>>>>> case.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 3. Regarding the existing configuration
> >>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled", I am not
> >>>>>>>>>>>>>>>>>>>>>>>> sure why we need it. Supposedly, if a source implements a
> >>>>>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer should push the
> >>>>>>>>>>>>>>>>>>>>>>>> corresponding predicates to the Source. I am not sure in which
> >>>>>>>>>>>>>>>>>>>>>>>> case this configuration would be used. Any ideas
> >>>>>>>>>>>>>>>>>>>>>>>> @Jark Wu <imj...@gmail.com>?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun
> >>>>>>>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for the detailed explanation.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I think that for users, we should respect conventions over
> >>>>>>>>>>>>>>>>>>>>>>>>> configurations.
> >>>>>>>>>>>>>>>>>>>>>>>>> Conventions can be default values explicitly specified in
> >>>>>>>>>>>>>>>>>>>>>>>>> configurations, or they can be behaviors that follow previous
> >>>>>>>>>>>>>>>>>>>>>>>>> versions.
> >>>>>>>>>>>>>>>>>>>>>>>>> If the same code has different behaviors in different versions,
> >>>>>>>>>>>>>>>>>>>>>>>>> it would be a very bad thing.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I agree that for regular users, it is not necessary to
> >>>>>>>>>>>>>>>>>>>>>>>>> understand all the configurations related to Flink.
> >>>>>>>>>>>>>>>>>>>>>>>>> By following conventions, they can have a good experience.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Let's get back to the practical situation and consider it.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Case 1:
> >>>>>>>>>>>>>>>>>>>>>>>>> The user is not familiar with the purpose of the
> >>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled configuration
> >>>>>>>>>>>>>>>>>>>>>>>>> but follows the convention of allowing predicate pushdown to
> >>>>>>>>>>>>>>>>>>>>>>>>> the source by default.
> >>>>>>>>>>>>>>>>>>>>>>>>> Just understanding the source.predicate-pushdown-enabled
> >>>>>>>>>>>>>>>>>>>>>>>>> configuration and using it as a fine-grained toggle will work
> >>>>>>>>>>>>>>>>>>>>>>>>> well.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Case 2:
> >>>>>>>>>>>>>>>>>>>>>>>>> The user understands the meaning of the
> >>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled configuration
> >>>>>>>>>>>>>>>>>>>>>>>>> and has set its value to false.
> >>>>>>>>>>>>>>>>>>>>>>>>> We have reason to believe that the user understands the meaning
> >>>>>>>>>>>>>>>>>>>>>>>>> of the predicate pushdown configuration and that the intention
> >>>>>>>>>>>>>>>>>>>>>>>>> is to disable predicate pushdown (rather than to decide whether
> >>>>>>>>>>>>>>>>>>>>>>>>> or not to allow it).
> >>>>>>>>>>>>>>>>>>>>>>>>> The previous choice of disabling it globally was likely made
> >>>>>>>>>>>>>>>>>>>>>>>>> because it couldn't be disabled on individual sources.
> >>>>>>>>>>>>>>>>>>>>>>>>> From this perspective, if we provide more fine-grained
> >>>>>>>>>>>>>>>>>>>>>>>>> configuration support and explain the configuration behaviors
> >>>>>>>>>>>>>>>>>>>>>>>>> in detail in the documentation,
> >>>>>>>>>>>>>>>>>>>>>>>>> users can clearly understand the differences between these two
> >>>>>>>>>>>>>>>>>>>>>>>>> configurations and use them correctly.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Also, I don't agree that
> >>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled = true and
> >>>>>>>>>>>>>>>>>>>>>>>>> source.predicate-pushdown-enabled = false means that the local
> >>>>>>>>>>>>>>>>>>>>>>>>> configuration overrides the global configuration.
> >>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, both configurations are functioning correctly.
> >>>>>>>>>>>>>>>>>>>>>>>>> The optimizer allows predicate pushdown to all sources, but
> >>>>>>>>>>>>>>>>>>>>>>>>> some sources can reject the filters pushed down by the
> >>>>>>>>>>>>>>>>>>>>>>>>> optimizer.
> >>>>>>>>>>>>>>>>>>>>>>>>> This is natural, just like different components at different
> >>>>>>>>>>>>>>>>>>>>>>>>> levels are responsible for different tasks.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> The more serious issue is that if
> >>>>>>>>>>>>>>>>>>>>>>>>> "source.predicate-pushdown-enabled" does not respect
> >>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled", the
> >>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled"
> >>>>>>>>>>>>>>>>>>>>>>>>> configuration will be invalidated.
> >>>>>>>>>>>>>>>>>>>>>>>>> This means that regardless of whether
> >>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" is set to
> >>>>>>>>>>>>>>>>>>>>>>>>> true or false, it will have no effect.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 25, 2023, at 22:24, Jane Chan <qingyue....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the in-depth clarification. Here are my two cents.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for different
> >>>>>>>>>>>>>>>>>>>>>>>>>>> components (optimizer and source operator).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> We cannot assume that every user would be interested in
> >>>>>>>>>>>>>>>>>>>>>>>>>> understanding the internal components of Flink, such as the
> >>>>>>>>>>>>>>>>>>>>>>>>>> optimizer or connectors, and the specific configurations
> >>>>>>>>>>>>>>>>>>>>>>>>>> associated with each component. Instead, users might be more
> >>>>>>>>>>>>>>>>>>>>>>>>>> concerned about knowing which configuration enables or disables
> >>>>>>>>>>>>>>>>>>>>>>>>>> the filter push-down feature for all source connectors, and
> >>>>>>>>>>>>>>>>>>>>>>>>>> which parameter provides the flexibility to override this
> >>>>>>>>>>>>>>>>>>>>>>>>>> behavior for a single source if needed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> So, from this perspective, I am inclined to divide these two
> >>>>>>>>>>>>>>>>>>>>>>>>>> parameters based on the scope of their impact from the user's
> >>>>>>>>>>>>>>>>>>>>>>>>>> perspective (i.e. global-level or operator-level), rather than
> >>>>>>>>>>>>>>>>>>>>>>>>>> categorizing them based on the component hierarchy from a
> >>>>>>>>>>>>>>>>>>>>>>>>>> developer's point of view. Therefore, based on this premise, it
> >>>>>>>>>>>>>>>>>>>>>>>>>> is intuitive and natural for users to understand that
> >>>>>>>>>>>>>>>>>>>>>>>>>> fine-grained configuration options can override global
> >>>>>>>>>>>>>>>>>>>>>>>>>> configurations.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't respect
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> default value of "scan.filter-push-down.enabled" is defined as
> >>>>>>>>>>>>>>>>>>>>>>>>>>> true, it means that just changing
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false
> >>>>>>>>>>>>>>>>>>>>>>>>>>> will have no effect, and filter pushdown will still be
> >>>>>>>>>>>>>>>>>>>>>>>>>>> performed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" as false, it would introduce a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> difference in behavior compared to the previous version.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> <1> If I understand correctly, "scan.filter-push-down.enabled"
> >>>>>>>>>>>>>>>>>>>>>>>>>> is a connector option, which means the only way to configure it
> >>>>>>>>>>>>>>>>>>>>>>>>>> is to specify it explicitly in DDL (whether to disable or
> >>>>>>>>>>>>>>>>>>>>>>>>>> enable it), and the SET command is not applicable. So I think
> >>>>>>>>>>>>>>>>>>>>>>>>>> it's natural to still respect the user's specification here.
> >>>>>>>>>>>>>>>>>>>>>>>>>> Otherwise, users might be more confused about why the DDL does
> >>>>>>>>>>>>>>>>>>>>>>>>>> not work as expected, when the reason is just that some other
> >>>>>>>>>>>>>>>>>>>>>>>>>> "optimizer" configuration is set to a different value.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> <2> From the implementation side, I am inclined to keep the
> >>>>>>>>>>>>>>>>>>>>>>>>>> parameter's priority consistent for all conditions.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Let "global" denote
> >>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled", and let
> >>>>>>>>>>>>>>>>>>>>>>>>>> "per-source" denote "scan.filter-push-down.enabled" for a
> >>>>>>>>>>>>>>>>>>>>>>>>>> specific source T. The following truth table (based on the
> >>>>>>>>>>>>>>>>>>>>>>>>>> current design) shows the inconsistent behavior of "per-source
> >>>>>>>>>>>>>>>>>>>>>>>>>> override global".
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> .--------.------------.-----------------.----------------------------.
> >>>>>>>>>>>>>>>>>>>>>>>>>> | global | per-source | push-down for T | per-source override global |
> >>>>>>>>>>>>>>>>>>>>>>>>>> :--------+------------+-----------------+----------------------------:
> >>>>>>>>>>>>>>>>>>>>>>>>>> | true   | false      | false           | Y                          |
> >>>>>>>>>>>>>>>>>>>>>>>>>> :--------+------------+-----------------+----------------------------:
> >>>>>>>>>>>>>>>>>>>>>>>>>> | false  | true       | false           | N                          |
> >>>>>>>>>>>>>>>>>>>>>>>>>> .--------.------------.-----------------.----------------------------.
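The truth table can be reproduced with a small, self-contained Java sketch. It rests on one assumption about the design under discussion: filters reach source T only when both the global option and the per-source option are true. The class and method names are hypothetical; "override" here means the effective behavior equals what the per-source option asked for.

```java
final class PushDownTruthTable {
    /** Assumed current design: the per-source option respects the global optimizer option. */
    static boolean effectivePushDown(boolean global, boolean perSource) {
        return global && perSource;
    }

    /** True when the outcome for source T matches the per-source setting, i.e. it "won". */
    static boolean perSourceOverridesGlobal(boolean global, boolean perSource) {
        return effectivePushDown(global, perSource) == perSource;
    }

    public static void main(String[] args) {
        boolean[] values = {true, false};
        for (boolean global : values) {
            for (boolean perSource : values) {
                System.out.printf(
                        "global=%-5b per-source=%-5b push-down=%-5b override=%s%n",
                        global,
                        perSource,
                        effectivePushDown(global, perSource),
                        perSourceOverridesGlobal(global, perSource) ? "Y" : "N");
            }
        }
    }
}
```

Under that assumption the per-source value wins when it is false but loses when it is true and the global option is false, which is the asymmetry the table points out.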
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I understand that global parallelism and task parallelism are
> >>>>>>>>>>>>>>>>>>>>>>>>>>> configured at different granularities but with the same
> >>>>>>>>>>>>>>>>>>>>>>>>>>> configuration.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate-pushdown-enabled" and
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are configurations for different
> >>>>>>>>>>>>>>>>>>>>>>>>>>> components (optimizer and source operator).
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> From a user's perspective, there are two scenarios:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Disabling all filter pushdown
> >>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, setting
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false is
> >>>>>>>>>>>>>>>>>>>>>>>>>>> sufficient to meet the requirement.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific sources
> >>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, there is no need to adjust the value of
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled".
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Instead, the focus should be on the configuration of
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the requirement.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, users do not need to set
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false and
> >>>>>>>>>>>>>>>>>>>>>>>>>>> manually enable filter pushdown for specific sources.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled" doesn't respect
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" and the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> default value of "scan.filter-push-down.enabled" is defined as
> >>>>>>>>>>>>>>>>>>>>>>>>>>> true, it means that just changing
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" to false
> >>>>>>>>>>>>>>>>>>>>>>>>>>> will have no effect, and filter pushdown will still be
> >>>>>>>>>>>>>>>>>>>>>>>>>>> performed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of
> >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" as false, it would introduce a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> difference in behavior compared to the previous version.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> The same SQL query that could successfully push down filters in
> >>>>>>>>>>>>>>>>>>>>>>>>>>> the old version would no longer do so after the upgrade.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 25, 2023, at 17:10, Benchao Li <libenc...@apache.org> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations,
> >> that
> >>>>>>>>> helps a
> >>>>>>>>>>>>>>>> lot, I
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> understand your rationale now.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is
> >> from
> >>>>>>>>>>>>>> "developer",
> >>>>>>>>>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> means there is an optimizer and connector
> >>>> component,
> >>>>>>>>> and
> >>>>>>>>>>> if
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>> want
> >>>>>>>>>>>>>>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> enable this feature (pushing filters down into
> >>>>>>>>>>> connectors),
> >>>>>>>>>>>>>>>> you
> >>>>>>>>>>>>>>>>>>>>> must
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> enable it firstly in optimizer, and only then
> >>>>>>> connector
> >>>>>>>>>>> has
> >>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>> chance
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide to use it or not.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> My perspective is that of a "user" (why should a user care
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> about the difference between optimizer and connector?):
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> this is one feature with two ways to control it, one at the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> job level and the other in table properties. What a user
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> expects is to control a feature in a tiered way: set it per
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> job, and then fine-tune it per table.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> This is similar to other concepts, such as parallelism:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> users can set a job-level default parallelism and then
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> fine-tune it per operator. There may be more such debates in
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> the future, e.g., we could have a job-level config for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> adding a key-by before lookup joins, and also a hint/table
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> property for fine-grained control per lookup operator.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Hence we had better find a unified way to handle all
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> features of this kind.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
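The tiered control described above (a job-level default that an explicit table-level setting can override) could be sketched roughly as follows. This only illustrates the resolution order argued for in this message; it is not how Flink resolves these options, and the names TieredOption and resolve are hypothetical.

```java
import java.util.Optional;

/** Hypothetical illustration only; this class does not exist in Flink. */
final class TieredOption {
    private TieredOption() {}

    /**
     * Resolves a feature flag in a tiered way: an explicitly set
     * table-level value wins, otherwise the job-level default applies.
     */
    static boolean resolve(boolean jobLevelDefault, Optional<Boolean> tableLevelValue) {
        return tableLevelValue.orElse(jobLevelDefault);
    }
}
```

Under this scheme a table that sets nothing simply inherits the job-level value, which mirrors the parallelism analogy in the message.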
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 15:27, Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for further explanation.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> These two configurations correspond to different levels.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The planner will still push down predicates to all sources.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not is determined by
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> the specific source's "scan.filter-push-down.enabled"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate" does directly
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> affect "scan.filter-push-down.enabled".
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> When the planner disables predicate pushdown, the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> source-level filter pushdown will also not be executed,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> even if the source allows filter pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> In any case, on points 1 and 2 our expectations are
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistent.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the planner-level
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration takes precedence over the source-level
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may seem counterintuitive when we globally disable
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> predicate pushdown but allow filter pushdown at the source
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> level.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 25, 2023, at 14:35, Jane Chan <qingyue....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for clarifying this. By "scan.filter-push-down.enabled
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes a higher priority" I meant that this value should be
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> respected whenever it is set explicitly.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The conclusion that
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate pushdown, but
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> individual sources do not enable filter pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This indicates that the option
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled = false" for an individual
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source connector does indeed override the global-level
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planner settings to make a difference, and thus "has a
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority".
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> While for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the planner.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" configuration, filter
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown is disabled.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains consistent with
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the old version as well.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I still think "scan.filter-push-down.enabled" should also be
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> respected if it is enabled for individual connectors. WDYT?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend keeping the default
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> value of "table.optimizer.source.predicate" as true, and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting the default value of the newly introduced option
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true as well.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to maintain consistency
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> with previous versions, since in the old version whether
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown is performed depends solely on the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" option.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That means by default, as long as a TableSource implements
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the SupportsFilterPushDown interface, filter pushdown is
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> allowed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There also seems to be little benefit in changing the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default value of "table.optimizer.source.predicate" to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> false.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding the priority of these two configurations, I
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> believe that "table.optimizer.source.predicate" takes
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over "scan.filter-push-down.enabled", which
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> results in the following behavior.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" = "true" and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing filter pushdown for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = "true" and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate pushdown, but
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> individual sources do not enable filter pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = "false"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the planner.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" configuration, filter
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown is disabled.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains consistent with
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the old version as well.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
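The three cases listed above amount to one rule: the planner-level option gates the source-level option. A minimal Java sketch of that rule follows. The class and method names are hypothetical and only model the behavior described in this message; they are not Flink planner code.

```java
/** Hypothetical illustration only; this class does not exist in Flink. */
final class PushdownPrecedence {
    private PushdownPrecedence() {}

    /**
     * @param plannerPredicatePushdown value of "table.optimizer.source.predicate"
     *                                 as the option is named in this thread
     * @param sourceFilterPushdown     value of the proposed
     *                                 "scan.filter-push-down.enabled"
     * @return true if filters end up pushed into the source
     */
    static boolean filtersPushedToSource(
            boolean plannerPredicatePushdown, boolean sourceFilterPushdown) {
        // Case 3: the planner rule is off, so the source option is never consulted.
        if (!plannerPredicatePushdown) {
            return false;
        }
        // Cases 1 and 2: the planner allows pushdown, the source decides.
        return sourceFilterPushdown;
    }
}
```

With both options at their proposed default of true, the result is true, which matches the pre-existing behavior the message wants to preserve.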
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From an implementation perspective, setting the priority of
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" higher than
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is difficult to achieve
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> now, because the PushFilterIntoSourceScanRuleBase at the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planner level takes precedence over the source-level
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> FilterPushDownSpec.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Only when the PushFilterIntoSourceScanRuleBase is enabled
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> will the source-level filter pushdown be performed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, in my opinion, there doesn't seem to be much
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> benefit in setting a higher priority for
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled".
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may instead affect compatibility and increase
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> implementation complexity.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 25, 2023, at 11:56, Benchao Li <libenc...@apache.org> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with Jane that fine-grained configurations should
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> have higher priority than job level configurations.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we can achieve that:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "true" to enable
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> by default, and set "scan.filter-push-down.enabled" =
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "false" to disable it per table source
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = "false" to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable by default, and set "scan.filter-push-down.enabled"
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> = "true" to enable it per table source
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 23:55, Jane Chan <qingyue....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" has a higher priority
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at the planner level than the configuration at the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source level, and it seems easy to implement now.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong, but I think the fine-grained
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration "scan.filter-push-down.enabled" should have
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a higher priority because the default value of
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is true. As a result,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> turning off filter push-down for a specific source will
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not take effect unless the default value of
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is changed to false
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> or, alternatively, users manually set
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" to false first and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> then selectively enable filter push-down for the desired
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> sources, which is less intuitive.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" has a higher priority
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> at the planner level than the configuration at the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source level, and it seems easy to implement now.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 24, 2023, at 17:36, Jane Chan <qingyue....@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for driving this discussion. I have a small
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> question: will "scan.filter-push-down.enabled" take
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over "table.optimizer.source.predicate" when
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the two parameters conflict with each other?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jark,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we only add a configuration option without adding the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enableFilterPushDown method to the SupportsFilterPushDown
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> interface, each connector would have to handle the same
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> logic in the applyFilters method to determine whether
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown is needed.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This would increase complexity and violate the original
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> behavior of the applyFilters method.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In contrast, with the newly added enableFilterPushDown
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method we only need to pass in the configuration
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> parameter to decide whether to perform predicate
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think this approach would be clearer and simpler.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 24, 2023, at 16:58, Jark Wu <imj...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think the current interface can already satisfy your
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> requirements.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The connector can reject all the filters by returning the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> input filters as `Result#remainingFilters`.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So maybe we don't need to introduce a new method to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable pushdown, but just introduce an option for the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> specific connector.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
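The suggestion above (a connector rejects every filter by handing the input filters back as the remaining filters) can be sketched as follows. In Flink the real entry point is SupportsFilterPushDown#applyFilters, which returns a Result holding accepted and remaining filters. To keep the example self-contained, the sketch uses simplified stand-in types with plain strings as filters, and the pushDownEnabled flag stands for a hypothetical connector-specific option.

```java
import java.util.List;

/** Simplified stand-ins; these are not Flink's SupportsFilterPushDown types. */
final class RejectAllFiltersSketch {
    private RejectAllFiltersSketch() {}

    /** Minimal stand-in for the accepted/remaining pair returned to the planner. */
    static final class Result {
        final List<String> acceptedFilters;
        final List<String> remainingFilters;

        Result(List<String> acceptedFilters, List<String> remainingFilters) {
            this.acceptedFilters = acceptedFilters;
            this.remainingFilters = remainingFilters;
        }
    }

    /**
     * When the hypothetical connector option is off, nothing is accepted and
     * every input filter is returned as remaining, so the planner keeps
     * evaluating all of them itself.
     */
    static Result applyFilters(List<String> filters, boolean pushDownEnabled) {
        if (!pushDownEnabled) {
            return new Result(List.of(), List.copyOf(filters));
        }
        // For this sketch only: assume the source can evaluate every filter.
        return new Result(List.copyOf(filters), List.of());
    }
}
```

The trade-off raised in the reply quoted further down still applies: every connector that wants such an option has to repeat this check inside its own applyFilters.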
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard Xu <xbjt...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this discussion.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you add a section to explain the difference between
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the proposed connector-level config
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `scan.filter-push-down.enabled` and the existing
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> query-level config
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> `table.optimizer.source.predicate-pushdown-enabled`?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonard
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Oct 24, 2023, at 4:18 PM, Jiabao Sun <jiabao....@xtransfer.cn.INVALID> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Devs,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion on FLIP-377: support
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration to disable filter pushdown for Table/SQL
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Sources[1].
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not expose fine-grained
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> control for users to enable or disable filter pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, filter pushdown has some side effects, such as
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> additional computational pressure on external systems.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Moreover, improper queries can lead to issues such as
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> full table scans, which in turn can impact the stability
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of external systems.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with two sources: Kafka and
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> a database.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The database is sensitive to pressure, and we want to
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configure the query so that filters are not pushed down
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to the database source.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, we still want to perform filter pushdown to the
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Kafka source to decrease network IO.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I propose supporting a configuration to disable filter
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown for Table/SQL sources, so that users can decide
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> whether to perform filter pushdown.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback.
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> --
> >>>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>
> >>>>
> >>
> >>
>
>
>
