Yes, that sounds reasonable to me. We can start with ALWAYS and NEVER, and add more policies as needed.
Thanks, Jiangjie (Becket) Qin On Mon, Dec 18, 2023 at 4:48 PM Jiabao Sun <jiabao....@xtransfer.cn.invalid> wrote: > Thanks Bucket, > > The jdbc.filter.handling.policy is good to me as it provides sufficient > extensibility for future filter pushdown optimizations. > However, currently, we don't have an implementation for the AUTO mode, and > it seems that the AUTO mode can easily be confused with the ALWAYS mode > because users don't have the opportunity to MANUALLY decide which filters > to push down. > > I suggest that we only introduce the ALWAYS and NEVER modes for now, and > we can consider introducing more flexible policies in the future, > such as INDEX_ONLY, NUMBERIC_ONLY and so on. > > WDYT? > > Best, > Jiabao > > > > > 2023年12月18日 16:27,Becket Qin <becket....@gmail.com> 写道: > > > > Hi Jiabao, > > > > Please see the reply inline. > > > > > >> The MySQL connector is currently in the flink-connector-jdbc repository > >> and is not a standalone connector. > >> Is it too unique to use "mysql" as the configuration option prefix? > > > > If the intended behavior makes sense to all the supported JDBC drivers, > we > > can make this a JDBC connector configuration. > > > > Also, I would like to ask about the difference in behavior between AUTO > and > >> ALWAYS. > >> It seems that we cannot guarantee the pushing down of all filters to the > >> external system under the ALWAYS > >> mode because not all filters in Flink SQL are supported by the external > >> system. > >> Should we throw an error when encountering a filter that cannot be > pushed > >> down in the ALWAYS mode? > > > > The idea of AUTO is to do efficiency-aware pushdowns. The source will > query > > the external system (MySQL, Oracle, SQL Server, etc) first to retrieve > the > > information of the table. With that information, the source will decide > > whether to further push a filter to the external system based on the > > efficiency. E.g. only push the indexed fields. In contrast, ALWAYS will > > just always push the supported filters to the external system, regardless > > of the efficiency. In case there are filters that are not supported, > > according to the current contract of SupportsFilterPushdown, these > > unsupported filters should just be returned by the > > *SupportsFilterPushdown.applyFilters()* method as remaining filters. > > Therefore, there is no need to throw exceptions here. This is likely the > > desired behavior for most users, IMO. If there are cases that users > really > > want to get alerted when a filter cannot be pushed to the external > system, > > we can add another value like "ENFORCED_ALWAYS", which behaves like > ALWAYS, > > but throws exceptions when a filter cannot be applied to the external > > system. But personally I don't see much value in doing this. > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > > > > > On Mon, Dec 18, 2023 at 3:54 PM Jiabao Sun <jiabao....@xtransfer.cn > .invalid> > > wrote: > > > >> Hi Becket, > >> > >> The MySQL connector is currently in the flink-connector-jdbc repository > >> and is not a standalone connector. > >> Is it too unique to use "mysql" as the configuration option prefix? > >> > >> Also, I would like to ask about the difference in behavior between AUTO > >> and ALWAYS. > >> It seems that we cannot guarantee the pushing down of all filters to the > >> external system under the ALWAYS > >> mode because not all filters in Flink SQL are supported by the external > >> system. > >> Should we throw an error when encountering a filter that cannot be > pushed > >> down in the ALWAYS mode? > >> > >> Thanks, > >> Jiabao > >> > >>> 2023年12月18日 15:34,Becket Qin <becket....@gmail.com> 写道: > >>> > >>> Hi JIabao, > >>> > >>> Thanks for updating the FLIP. Maybe I did not explain it clearly > enough. > >> My > >>> point is that given there are various good flavors of behaviors > handling > >>> filters pushed down, we should not have a common config of > >>> "ignore.filter.pushdown", because the behavior is not *common*. > >>> > >>> It looks like the original motivation of this FLIP is just for MySql. > >> Let's > >>> focus on what is the best solution for MySql connector here first. > After > >>> that, if people think the best behavior for MySql happens to be a > common > >>> one, we can then discuss whether that is worth being added to the base > >>> implementation of source. For MySQL, if we are going to introduce a > >> config > >>> to MySql, why not have something like "mysql.filter.handling.policy" > with > >>> value of AUTO / NEVER / ALWAYS? Isn't that better than > >>> "ignore.filter.pushdown"? > >>> > >>> Thanks, > >>> > >>> Jiangjie (Becket) Qin > >>> > >>> > >>> > >>> On Sun, Dec 17, 2023 at 11:30 PM Jiabao Sun <jiabao....@xtransfer.cn > >> .invalid> > >>> wrote: > >>> > >>>> Hi Becket, > >>>> > >>>> The FLIP document has been updated as well. > >>>> Please take a look when you have time. > >>>> > >>>> Thanks, > >>>> Jiabao > >>>> > >>>> > >>>>> 2023年12月17日 22:54,Jiabao Sun <jiabao....@xtransfer.cn.INVALID> 写道: > >>>>> > >>>>> Thanks Becket, > >>>>> > >>>>> I apologize for not being able to continue with this proposal due to > >>>> being too busy during this period. > >>>>> > >>>>> The viewpoints you shared about the design of Flink Source make sense > >> to > >>>> me > >>>>> The native configuration ‘ignore.filter.pushdown’ is good to me. > >>>>> Having a unified name or naming style can indeed prevent confusion > for > >>>> users regarding > >>>>> the inconsistent naming of this configuration across different > >>>> connectors. > >>>>> > >>>>> Currently, there are not many external connectors that support filter > >>>> pushdown. > >>>>> I propose that we first introduce it in flink-connector-jdbc and > >>>> flink-connector-mongodb. > >>>>> Do you think this is feasible? > >>>>> > >>>>> Best, > >>>>> Jiabao > >>>>> > >>>>> > >>>>>> 2023年11月16日 17:45,Becket Qin <becket....@gmail.com> 写道: > >>>>>> > >>>>>> Hi Jiabao, > >>>>>> > >>>>>> Arguments like "because Spark has it so Flink should also have it" > >> does > >>>> not > >>>>>> make sense. Different projects have different API flavors and > styles. > >>>> What > >>>>>> is really important is the rationale and the design principle behind > >> the > >>>>>> API. They should conform to the convention of the project. > >>>>>> > >>>>>> First of all, Spark Source API itself has a few issues and they > ended > >> up > >>>>>> introduce DataSource V2 in Spark 3.0, which added the decorative > >>>> interfaces > >>>>>> like SupportsPushdownXXX. Some of the configurations predating > >>>> DataSource > >>>>>> V2 may still be there. > >>>>>> > >>>>>> For the Spark configurations you mentioned, they are all the > >>>> configurations > >>>>>> for FileScanBuilder, which is equivalent to FileSource in Flink. > >>>> Currently, > >>>>>> regardless of the format (ORC, Parquet, Avro, etc), the FileSource > >>>> pushes > >>>>>> back all the filters to ensure correctness. The actual filters that > >> got > >>>>>> applied to the specific format might still be different. This > >>>>>> implementation is the same in FileScanBuilder.pushFilters() for > >> Spark. I > >>>>>> don't know why Spark got separate configurations for each format. > >> Maybe > >>>> it > >>>>>> is because the filters are actually implemented differently for > >>>> different > >>>>>> format. > >>>>>> > >>>>>> At least for the current implementation in FileScanBuilder, these > >>>>>> configurations can be merged to one configuration like > >>>>>> `apply.filters.to.format.enabled`. Note that this config, as well as > >> the > >>>>>> separate configs you mentioned, are just visible and used by the > >>>>>> FileScanBuilder. It determines whether the filters should be passed > >>>> down to > >>>>>> the format of the FileScanBuilder instance. Regardless of the value > of > >>>>>> these configs, FileScanBuilder.pushFilters() will always be called, > >> and > >>>>>> FileScanBuilder (as well as FileSource in Flink) will always push > back > >>>> all > >>>>>> the filters to the framework. > >>>>>> > >>>>>> A MySql source can have a very different way to handle this. For > >>>> example, A > >>>>>> MySql source A config in this case might be "my.apply.filters" with > >>>> three > >>>>>> different values: > >>>>>> - AUTO: The Source will issue a DESC Table request to understand > >>>> whether a > >>>>>> filter can be applied efficiently. And decide which filters can be > >>>> applied > >>>>>> and which cannot based on that. > >>>>>> - NEVER: Never apply filtering. It will always do a full table read > >> and > >>>>>> let Flink do the filtering. > >>>>>> - ALWAYS: Always apply the filtering to the MySql server. > >>>>>> > >>>>>> In the above examples of FileSource and MySql Source, I don't think > it > >>>> is a > >>>>>> good idea to shoehorn the behaviors into a naive config of > >>>>>> `ignore.filter.pushdown`. That is why I don't think this is a common > >>>> config. > >>>>>> > >>>>>> To recap, like I said, I do agree that in some cases, we may want to > >>>> behave > >>>>>> differently when filters are pushed down to the sources, even if a > >>>> source > >>>>>> implements SupportsFilterPushDown, but I don't think there is a > >> suitable > >>>>>> common config for this. The behavior is very likely source specific. > >>>>>> > >>>>>> Thanks, > >>>>>> > >>>>>> Jiangjie (Becket) Qin > >>>>>> > >>>>>> > >>>>>> > >>>>>> On Thu, Nov 16, 2023 at 3:41 PM Jiabao Sun <jiabao....@xtransfer.cn > >>>> .invalid> > >>>>>> wrote: > >>>>>> > >>>>>>> Thanks Becket, > >>>>>>> > >>>>>>> I still believe that adding a configuration at the source level to > >>>> disable > >>>>>>> filter pushdown is needed. This demand exists in spark as well[1]. > >>>>>>> > >>>>>>> In Spark, most sources that support filter pushdown provide their > own > >>>>>>> corresponding configuration options to enable or disable filter > >>>> pushdown. > >>>>>>> For PRs[2-4] that support filter pushdown capability, they also > >> provide > >>>>>>> configuration options to disable this capability. > >>>>>>> > >>>>>>> I believe this configuration is applicable to most scenarios, and > >>>> there is > >>>>>>> no need to dwell on why this configuration option was not > introduced > >>>>>>> earlier than the SupportsFilterPushDown interface. > >>>>>>> > >>>>>>> spark.sql.parquet.filterPushdown > >>>>>>> spark.sql.orc.filterPushdown > >>>>>>> spark.sql.csv.filterPushdown.enabled > >>>>>>> spark.sql.json.filterPushdown.enabled > >>>>>>> spark.sql.avro.filterPushdown.enabled > >>>>>>> JDBC Option: pushDownPredicate > >>>>>>> > >>>>>>> We can see that the lack of consistency is caused by each connector > >>>>>>> introducing different configuration options for the same behavior. > >>>>>>> This is one of the motivations for advocating the introduction of a > >>>>>>> unified configuration name. > >>>>>>> > >>>>>>> [1] https://issues.apache.org/jira/browse/SPARK-24288 > >>>>>>> [2] https://github.com/apache/spark/pull/27366 > >>>>>>> [3]https://github.com/apache/spark/pull/26973 > >>>>>>> [4] https://github.com/apache/spark/pull/29145 > >>>>>>> > >>>>>>> Best, > >>>>>>> Jiabao > >>>>>>> > >>>>>>>> 2023年11月16日 08:10,Becket Qin <becket....@gmail.com> 写道: > >>>>>>>> > >>>>>>>> Hi Jiabao, > >>>>>>>> > >>>>>>>> While we can always fix the formality of the config, a more > >>>> fundamental > >>>>>>>> issue here is whether this configuration is common enough. > >> Personally > >>>> I > >>>>>>> am > >>>>>>>> still not convinced it is. > >>>>>>>> > >>>>>>>> Remember we don't have a common implementation for > >>>> SupportsFilterPushdown > >>>>>>>> itself. Why does a potential behavior of the > >>>>>>>> SupportsFilterPushdown.applyFilters() method deserve a common > >>>>>>>> configuration? A common implementation should always come first, > >> then > >>>> its > >>>>>>>> configuration becomes a common configuration as a natural result. > >> But > >>>>>>> here > >>>>>>>> we are trying to add an impl to a configuration just to fix its > >>>>>>> formality. > >>>>>>>> > >>>>>>>> I agree that there might be a few Source implementations that may > >>>> want to > >>>>>>>> avoid additional burdens on the remote system in some > circumstances. > >>>> And > >>>>>>>> these circumstances are very specific: > >>>>>>>> 1. The source talks to a remote service that can help perform the > >>>> actual > >>>>>>>> filtering. > >>>>>>>> 2. The filtering done by the remote service is inefficient for > some > >>>>>>> reason > >>>>>>>> (e.g. missing index) > >>>>>>>> 3. The external service does not want to perform the inefficient > >>>>>>> filtering > >>>>>>>> for some reason (e.g. it is a shared service with others) > >>>>>>>> > >>>>>>>> There are multiple approaches to address the issue. Pushing back > the > >>>>>>>> filters is just one way of achieving this. So here we are talking > >>>> about a > >>>>>>>> config for one of the possible solutions to a scenario with all > the > >>>> above > >>>>>>>> situations. I don't think there is enough justification for the > >>>> config to > >>>>>>>> be common. > >>>>>>>> > >>>>>>>> There is always this trade-off between the proliferation of public > >>>>>>>> interfaces and the API standardization. As an extreme example, we > >> can > >>>>>>> make > >>>>>>>> our public API a union of all the configs potentially used in all > >> the > >>>>>>> cases > >>>>>>>> in the name of standardization. Apparently this won't work. So > there > >>>> must > >>>>>>>> be a bar here and this bar might be somewhat subjective. For this > >>>> FLIP, > >>>>>>>> personally I don't think the config meets my bar for the reason > >> stated > >>>>>>>> above. > >>>>>>>> > >>>>>>>> Therefore, my suggestion remains the same. Keep the config as a > >> Source > >>>>>>>> implementation specific configuration. > >>>>>>>> > >>>>>>>> Thanks, > >>>>>>>> > >>>>>>>> Jiangjie (Becket) Qin > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>>>> On Thu, Nov 16, 2023 at 12:36 AM Jiabao Sun < > >> jiabao....@xtransfer.cn > >>>>>>> .invalid> > >>>>>>>> wrote: > >>>>>>>> > >>>>>>>>> Thanks Becket for the feedback, > >>>>>>>>> > >>>>>>>>> Regarding concerns about common configurations, I think we can > >>>> introduce > >>>>>>>>> FiltersApplier to unify the behavior of various connectors. > >>>>>>>>> > >>>>>>>>> public static class FiltersApplier { > >>>>>>>>> > >>>>>>>>> private final ReadableConfig config; > >>>>>>>>> private final Function<List<ResolvedExpression>, Result> action; > >>>>>>>>> > >>>>>>>>> private FiltersApplier( > >>>>>>>>> ReadableConfig config, > >>>>>>>>> Function<List<ResolvedExpression>, Result> action) { > >>>>>>>>> this.config = config; > >>>>>>>>> this.action = action; > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) { > >>>>>>>>> if (config.get(ENABLE_FILTER_PUSH_DOWN)) { > >>>>>>>>> return action.apply(filters); > >>>>>>>>> } else { > >>>>>>>>> return Result.of(Collections.emptyList(), filters); > >>>>>>>>> } > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> public static FiltersApplier of( > >>>>>>>>> ReadableConfig config, > >>>>>>>>> Function<List<ResolvedExpression>, Result> action) { > >>>>>>>>> return new FiltersApplier(config, action); > >>>>>>>>> } > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> For connectors implementation: > >>>>>>>>> > >>>>>>>>> @Override > >>>>>>>>> public Result applyFilters(List<ResolvedExpression> filters) { > >>>>>>>>> return FiltersApplier.of(config, > >>>>>>>>> f -> Result.of(new ArrayList<>(filters), > >>>>>>>>> Collections.emptyList())); > >>>>>>>>> } > >>>>>>>>> > >>>>>>>>> As for the name, whether it is "source.filter-push-down.enabled" > or > >>>>>>>>> "source.ignore-pushed-down-filters.enabled", I think both are > okay. > >>>>>>>>> > >>>>>>>>> Do you think this change is feasible? > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Best, > >>>>>>>>> Jiabao > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> 2023年11月15日 23:44,Becket Qin <becket....@gmail.com> 写道: > >>>>>>>>>> > >>>>>>>>>> Hi Jiabao, > >>>>>>>>>> > >>>>>>>>>> Yes, I still have concerns. > >>>>>>>>>> > >>>>>>>>>> The FLIP violates the following two principles regarding > >>>> configuration: > >>>>>>>>>> > >>>>>>>>>> 1.* A config of a class should never negate the semantic of a > >>>>>>> decorative > >>>>>>>>>> interface implemented by that class. * > >>>>>>>>>> A decorative interface is a public contract with other > components, > >>>>>>> while > >>>>>>>>> a > >>>>>>>>>> config is only internal to the class itself. The configurations > >> for > >>>> the > >>>>>>>>>> Sources are not (and should never be) visible or understood to > >>>>>>>>>> other components (e.g. optimizer). A configuration of a Source > >> only > >>>>>>>>>> controls the behavior of that Source, provided it is not > violating > >>>> the > >>>>>>>>> API > >>>>>>>>>> contract / semantic defined by the decorative interface. So > when a > >>>>>>> Source > >>>>>>>>>> implementation implements SupportsFilterPushdown, this is a > clear > >>>>>>> public > >>>>>>>>>> contract with Flink that filters should be pushed down to that > >>>> Source. > >>>>>>>>>> Therefore, for the same source, there should not be a > >> configuration > >>>>>>>>>> "source.filter-push-down.enabled" which stops the filters from > >> being > >>>>>>>>> pushed > >>>>>>>>>> down to that Source. However, that specific source > implementation > >>>> can > >>>>>>>>> have > >>>>>>>>>> its own config to control its internal behavior, e.g. > >>>>>>>>>> "ignore-pushed-down-filters.enabled" which may push back all the > >>>> pushed > >>>>>>>>>> down filters back to the Flink optimizer. > >>>>>>>>>> > >>>>>>>>>> 2. When we are talking about "common configs", in fact we are > >>>> talking > >>>>>>>>> about > >>>>>>>>>> "configs for common (abstract) implementation classes". With > that > >>>> as a > >>>>>>>>>> context, *a common config should always be backed by a common > >>>>>>>>>> implementation class, so that consistent behavior can be > >>>> guaranteed. * > >>>>>>>>>> The LookupOptions you mentioned are configurations defined for > >>>> classes > >>>>>>>>>> DefaultLookupCache / PeriodicCacheReloadTrigger / > >>>>>>>>> TimedCacheReloadTrigger. > >>>>>>>>>> These configs are considered as "common" only because the > >>>>>>> implementation > >>>>>>>>>> classes using them are common building blocks for lookup table > >>>>>>>>>> implementations. It would not make sense to have a dangling > config > >>>> in > >>>>>>> the > >>>>>>>>>> LookupOptions without the underlying common implementation > class, > >>>> but > >>>>>>>>> only > >>>>>>>>>> relies on a specific source to implement the stated behavior. > >>>>>>>>>> As a bad example, there is this outlier config "max-retries" in > >>>>>>>>>> LookupOptions, which I don't think should be here. This is > because > >>>> the > >>>>>>>>>> retry behavior can be very implementation specific. For example, > >>>> there > >>>>>>>>> can > >>>>>>>>>> be many different flavors of retry related configurations, > >>>>>>> retry-backoff, > >>>>>>>>>> retry-timeout, retry-async, etc. Why only max-retry is put here? > >>>> should > >>>>>>>>> all > >>>>>>>>>> of them be put here? If we put all such kinds of configs in the > >>>> common > >>>>>>>>>> configs for "standardization and unification", the number of > >> "common > >>>>>>>>>> configs" can easily go crazy. And I don't see material benefits > of > >>>>>>> doing > >>>>>>>>>> that. So here I don't think the configuration "max-retry" should > >> be > >>>> in > >>>>>>>>>> LookupOptions, because it is not backed by any common > >> implementation > >>>>>>>>>> classes. If max-retry is implemented in the HBase source, it > >> should > >>>>>>> stay > >>>>>>>>>> there. For the same reason, the config proposed in this FLIP > >>>> (probably > >>>>>>>>> with > >>>>>>>>>> a name less confusing for the first reason mentioned above) > >> should > >>>>>>> stay > >>>>>>>>> in > >>>>>>>>>> the specific Source implementation. > >>>>>>>>>> > >>>>>>>>>> For the two reasons above, I am -1 to what the FLIP currently > >>>> proposes. > >>>>>>>>>> > >>>>>>>>>> I think the right way to address the motivation here is still to > >>>> have a > >>>>>>>>>> config like "ignore-pushed-down-filters.enabled" for the > specific > >>>>>>> source > >>>>>>>>>> implementation. Please let me know if this solves the problem > you > >>>> are > >>>>>>>>>> facing. > >>>>>>>>>> > >>>>>>>>>> Thanks, > >>>>>>>>>> > >>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> On Wed, Nov 15, 2023 at 11:52 AM Jiabao Sun < > >>>> jiabao....@xtransfer.cn > >>>>>>>>> .invalid> > >>>>>>>>>> wrote: > >>>>>>>>>> > >>>>>>>>>>> Hi Becket, > >>>>>>>>>>> > >>>>>>>>>>> The purpose of introducing this configuration is that not all > >>>> filter > >>>>>>>>>>> pushdowns can improve overall performance. > >>>>>>>>>>> If the filter can hit the external index, then pushdown is > >>>> definitely > >>>>>>>>>>> worth it, as it can not only improve query time but also > decrease > >>>>>>>>> network > >>>>>>>>>>> overhead. > >>>>>>>>>>> However, for filters that do not hit the external index, it may > >>>>>>>>> increase a > >>>>>>>>>>> lot of performance overhead on the external system. > >>>>>>>>>>> > >>>>>>>>>>> Undeniably, if the connector can make accurate decisions for > good > >>>> and > >>>>>>>>> bad > >>>>>>>>>>> filters, we may not need to introduce this configuration option > >> to > >>>>>>>>> disable > >>>>>>>>>>> pushing down filters to the external system. > >>>>>>>>>>> However, it is currently not easy to achieve. > >>>>>>>>>>> > >>>>>>>>>>> IMO, supporting filter pushdown does not mean that always > filter > >>>>>>>>> pushdown > >>>>>>>>>>> is better. > >>>>>>>>>>> In the absence of automatic decision-making, I think we should > >>>> leave > >>>>>>>>> this > >>>>>>>>>>> decision to users. > >>>>>>>>>>> > >>>>>>>>>>> The newly introduced configuration option is similar to > >>>> LookupOptions, > >>>>>>>>>>> providing unified naming and default values to avoid confusion > >>>> caused > >>>>>>> by > >>>>>>>>>>> inconsistent naming in different connectors for users. > >>>>>>>>>>> Setting the default value to true allows it to maintain > >>>> compatibility > >>>>>>>>> with > >>>>>>>>>>> the default behavior of "always pushdown". > >>>>>>>>>>> > >>>>>>>>>>> Do you have any other concerns about this proposal? Please let > me > >>>>>>> know. > >>>>>>>>>>> > >>>>>>>>>>> Thanks, > >>>>>>>>>>> Jiabao > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>>> 2023年10月31日 17:29,Jiabao Sun <jiabao....@xtransfer.cn > .INVALID> > >>>> 写道: > >>>>>>>>>>>> > >>>>>>>>>>>> Hi Becket, > >>>>>>>>>>>> > >>>>>>>>>>>> Actually, for FileSystemSource, it is not always desired, only > >> OCR > >>>>>>> file > >>>>>>>>>>> formats support filter pushdown. > >>>>>>>>>>>> > >>>>>>>>>>>> We can disable predicate pushdown for FileSystemSource by > >> setting > >>>>>>>>>>> 'table.optimizer.source.predicate-pushdown-enabled' to false. > >>>>>>>>>>>> I think we can also disable filter pushdown at a more granular > >>>> level > >>>>>>>>>>> through fine-grained configuration. > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>> Best, > >>>>>>>>>>>> Jiabao > >>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>>>> 2023年10月31日 16:50,Becket Qin <becket....@gmail.com> 写道: > >>>>>>>>>>>>> > >>>>>>>>>>>>> Hi Jiabao, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks for the explanation. Maybe it's easier to explain with > >> an > >>>>>>>>>>> example. > >>>>>>>>>>>>> > >>>>>>>>>>>>> Let's take FileSystemTableSource as an example. Currently it > >>>>>>>>> implements > >>>>>>>>>>>>> SupportsFilterPushDown interface. With your proposal, does it > >>>> have > >>>>>>> to > >>>>>>>>>>>>> support `source.filter-push-down.enabled` as well? But this > >>>>>>>>>>> configuration > >>>>>>>>>>>>> does not quite make sense for the FileSystemTableSource > because > >>>>>>> filter > >>>>>>>>>>>>> pushdown is always desired. However, because this > configuration > >>>> is a > >>>>>>>>>>> part > >>>>>>>>>>>>> of the SupportsFilterPushDown interface (which sounds > confusing > >>>> to > >>>>>>>>> begin > >>>>>>>>>>>>> with), the FileSystemTableSource can only do one of the > >>>> following: > >>>>>>>>>>>>> > >>>>>>>>>>>>> 1. Ignore the user configuration to always apply the pushed > >> down > >>>>>>>>>>> filters - > >>>>>>>>>>>>> this is an apparent anti-pattern because a configuration > should > >>>>>>> always > >>>>>>>>>>> do > >>>>>>>>>>>>> what it says. > >>>>>>>>>>>>> 2. Throw an exception telling users that this configuration > is > >>>> not > >>>>>>>>>>>>> applicable to the FileSystemTableSource. > >>>>>>>>>>>>> 3. Implement this configuration to push back the pushed down > >>>>>>> filters, > >>>>>>>>>>> even > >>>>>>>>>>>>> though this is never desired. > >>>>>>>>>>>>> > >>>>>>>>>>>>> None of the above options looks awkward. I am curious what > your > >>>>>>>>>>> solution is > >>>>>>>>>>>>> here? > >>>>>>>>>>>>> > >>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>> > >>>>>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>>>>> > >>>>>>>>>>>>> On Tue, Oct 31, 2023 at 3:11 PM Jiabao Sun < > >>>> jiabao....@xtransfer.cn > >>>>>>>>>>> .invalid> > >>>>>>>>>>>>> wrote: > >>>>>>>>>>>>> > >>>>>>>>>>>>>> Thanks Becket for the further explanation. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Perhaps I didn't explain it clearly. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 1. If a source does not implement the SupportsFilterPushDown > >>>>>>>>> interface, > >>>>>>>>>>>>>> the newly added configurations do not need to be added to > >> either > >>>>>>> the > >>>>>>>>>>>>>> requiredOptions or optionalOptions. > >>>>>>>>>>>>>> Similar to LookupOptions, if a source does not implement > >>>>>>>>>>>>>> LookupTableSource, there is no need to add LookupOptions to > >>>> either > >>>>>>>>>>>>>> requiredOptions or optionalOptions. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> 2. "And these configs are specific to those sources, instead > >> of > >>>>>>>>> common > >>>>>>>>>>>>>> configs." > >>>>>>>>>>>>>> The newly introduced configurations define standardized > names > >>>> and > >>>>>>>>>>> default > >>>>>>>>>>>>>> values. > >>>>>>>>>>>>>> They still belong to the configuration at the individual > >> source > >>>>>>>>> level. > >>>>>>>>>>>>>> The purpose is to avoid scattered configuration items when > >>>>>>> different > >>>>>>>>>>>>>> sources implement the same logic. > >>>>>>>>>>>>>> Whether a source should accept these configurations is > >>>> determined > >>>>>>> by > >>>>>>>>>>> the > >>>>>>>>>>>>>> source's Factory. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> 2023年10月31日 13:47,Becket Qin <becket....@gmail.com> 写道: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi Jiabao, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Please see the replies inline. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Introducing common configurations does not mean that all > >>>> sources > >>>>>>>>> must > >>>>>>>>>>>>>>>> accept these configuration options. > >>>>>>>>>>>>>>>> The configuration options supported by a source are > >>>> determined by > >>>>>>>>> the > >>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory > >> interface. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> This is not true. Both required and optional options are > >>>>>>> SUPPORTED. > >>>>>>>>>>> That > >>>>>>>>>>>>>>> means they are implemented and if one specifies an optional > >>>> config > >>>>>>>>> it > >>>>>>>>>>>>>> will > >>>>>>>>>>>>>>> still take effect. An OptionalConfig is "Optional" because > >> this > >>>>>>>>>>>>>>> configuration has a default value. Hence, it is OK that > users > >>>> do > >>>>>>> not > >>>>>>>>>>>>>>> specify their own value. In another word, it is "optional" > >> for > >>>> the > >>>>>>>>> end > >>>>>>>>>>>>>>> users to set the config, but the implementation and support > >> for > >>>>>>> that > >>>>>>>>>>>>>> config > >>>>>>>>>>>>>>> is NOT optional. In case a source does not support a common > >>>>>>> config, > >>>>>>>>> an > >>>>>>>>>>>>>>> exception must be thrown when the config is provided by the > >> end > >>>>>>>>> users. > >>>>>>>>>>>>>>> However, the config we are talking about in this FLIP is a > >>>> common > >>>>>>>>>>> config > >>>>>>>>>>>>>>> optional to implement, meaning that sometimes the claimed > >>>> behavior > >>>>>>>>>>> won't > >>>>>>>>>>>>>> be > >>>>>>>>>>>>>>> there even if users specify that config. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Similar to sources that do not implement the > >> LookupTableSource > >>>>>>>>>>> interface, > >>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown > >>>>>>> interface > >>>>>>>>>>> also > >>>>>>>>>>>>>> do > >>>>>>>>>>>>>>>> not need to accept newly introduced options. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> First of all, filter pushdown is a behavior of the query > >>>>>>> optimizer, > >>>>>>>>>>> not > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>> behavior of Sources. The Sources tells the optimizer that > it > >>>> has > >>>>>>> the > >>>>>>>>>>>>>>> ability to accept pushed down filters by implementing the > >>>>>>>>>>>>>>> SupportsFilterPushDown interface. And this is the only > >> contract > >>>>>>>>>>> between > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>> Source and Optimizer regarding whether filters should be > >> pushed > >>>>>>>>> down. > >>>>>>>>>>> As > >>>>>>>>>>>>>>> long as a specific source implements this decorative > >> interface, > >>>>>>>>> filter > >>>>>>>>>>>>>>> pushdown should always take place, i.e. > >>>>>>>>>>>>>>> *SupportsFilterPushDown.applyFilters()* will be called. > There > >>>>>>> should > >>>>>>>>>>> be > >>>>>>>>>>>>>> no > >>>>>>>>>>>>>>> other config to disable that call. However, Sources can > >> decide > >>>> how > >>>>>>>>> to > >>>>>>>>>>>>>>> behave based on their own configurations after > >>>> *applyFilters()* is > >>>>>>>>>>>>>> called. > >>>>>>>>>>>>>>> And these configs are specific to those sources, instead of > >>>> common > >>>>>>>>>>>>>> configs. > >>>>>>>>>>>>>>> Please see the examples I mentioned in the previous email. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> On Tue, Oct 31, 2023 at 10:27 AM Jiabao Sun < > >>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Hi Becket, > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Sorry, there was a typo in the second point. Let me > correct > >>>> it: > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Introducing common configurations does not mean that all > >>>> sources > >>>>>>>>> must > >>>>>>>>>>>>>>>> accept these configuration options. > >>>>>>>>>>>>>>>> The configuration options supported by a source are > >>>> determined by > >>>>>>>>> the > >>>>>>>>>>>>>>>> requiredOptions and optionalOptions in the Factory > >> interface. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Similar to sources that do not implement the > >> LookupTableSource > >>>>>>>>>>>>>> interface, > >>>>>>>>>>>>>>>> sources that do not implement the SupportsFilterPushDown > >>>>>>> interface > >>>>>>>>>>> also > >>>>>>>>>>>>>> do > >>>>>>>>>>>>>>>> not need to accept newly introduced options. > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> 2023年10月31日 10:13,Jiabao Sun <jiabao....@xtransfer.cn > >>>> .INVALID> > >>>>>>>>> 写道: > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Thanks Becket for the feedback. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> 1. Currently, the SupportsFilterPushDown#applyFilters > >> method > >>>>>>>>>>> returns a > >>>>>>>>>>>>>>>> result that includes acceptedFilters and remainingFilters. > >> The > >>>>>>>>> source > >>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>> decide to push down some filters or not accept any of > them. > >>>>>>>>>>>>>>>>> 2. Introducing common configuration options does not mean > >>>> that a > >>>>>>>>>>> source > >>>>>>>>>>>>>>>> that supports the SupportsFilterPushDown capability must > >>>> accept > >>>>>>>>> this > >>>>>>>>>>>>>>>> configuration. Similar to LookupOptions, only sources that > >>>>>>>>> implement > >>>>>>>>>>> the > >>>>>>>>>>>>>>>> LookupTableSource interface are necessary to accept these > >>>>>>>>>>> configuration > >>>>>>>>>>>>>>>> options. > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> 2023年10月31日 07:49,Becket Qin <becket....@gmail.com> 写道: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Hi Jiabao and Ruanhang, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Adding a configuration of > source.filter-push-down.enabled > >>>> as a > >>>>>>>>>>> common > >>>>>>>>>>>>>>>>>> source configuration seems problematic. > >>>>>>>>>>>>>>>>>> 1. The config name is misleading. filter pushdown should > >>>> only > >>>>>>> be > >>>>>>>>>>>>>>>> determined > >>>>>>>>>>>>>>>>>> by whether the SupportsFilterPushdown interface is > >>>> implemented > >>>>>>> or > >>>>>>>>>>> not. > >>>>>>>>>>>>>>>>>> 2. The behavior of this configuration is only applicable > >> to > >>>>>>> some > >>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>> implementations. Why is it a common configuration? > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Here's my suggestion for design principles: > >>>>>>>>>>>>>>>>>> 1. Only add source impl specific configuration to > >>>> corresponding > >>>>>>>>>>>>>> sources. > >>>>>>>>>>>>>>>>>> 2. The configuration name should not overrule existing > >>>> common > >>>>>>>>>>>>>> contracts. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> For example, in the case of MySql source. There are > >> several > >>>>>>>>>>> options: > >>>>>>>>>>>>>>>>>> 1. Have a configuration of > >>>>>>>>> `*mysql.avoid.remote.full.table.scan`*. > >>>>>>>>>>> If > >>>>>>>>>>>>>>>> this > >>>>>>>>>>>>>>>>>> configuration is set, and a filter pushdown does not hit > >> an > >>>>>>>>> index, > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> MySql source impl would not further pushdown the filter > to > >>>>>>> MySql > >>>>>>>>>>>>>>>> servers. > >>>>>>>>>>>>>>>>>> Note that this assumes the MySql source can retrieve the > >>>> index > >>>>>>>>>>>>>>>> information > >>>>>>>>>>>>>>>>>> from the MySql servers. > >>>>>>>>>>>>>>>>>> 2. If the MySql index information is not available to > the > >>>> MySql > >>>>>>>>>>>>>> source, > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> configuration could be something like > >>>>>>>>>>>>>>>> *`mysql.pushback.pushed.down.filters`*. > >>>>>>>>>>>>>>>>>> Once set to true, MySql source would just add all the > >>>> filters > >>>>>>> to > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> RemainingFilters in the Result returned by > >>>>>>>>>>>>>>>>>> *SupportsFilterPushdown.applyFilters().* > >>>>>>>>>>>>>>>>>> 3. An alternative to option 2 is to have a ` > >>>>>>>>>>>>>>>>>> *mysql.apply.predicates.after.scan*`. When it is set to > >>>> true, > >>>>>>>>> MySql > >>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>> will not push the filter down to the MySql servers, but > >>>> apply > >>>>>>> the > >>>>>>>>>>>>>>>> filters > >>>>>>>>>>>>>>>>>> inside the MySql source itself. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> As you may see, the above configurations do not disable > >>>> filter > >>>>>>>>>>>>>> pushdown > >>>>>>>>>>>>>>>>>> itself. They just allow various implementations of > filter > >>>>>>>>> pushdown. > >>>>>>>>>>>>>> And > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>> configuration name does not give any illusion that > filter > >>>>>>>>> pushdown > >>>>>>>>>>> is > >>>>>>>>>>>>>>>>>> disabled. > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>> On Mon, Oct 30, 2023 at 11:58 PM Jiabao Sun < > >>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Thanks Hang for the suggestion. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I think the configuration of TableSource is not closely > >>>>>>> related > >>>>>>>>> to > >>>>>>>>>>>>>>>>>>> SourceReader, > >>>>>>>>>>>>>>>>>>> so I prefer to introduce a independent configuration > >> class > >>>>>>>>>>>>>>>>>>> TableSourceOptions in the flink-table-common module, > >>>> similar > >>>>>>> to > >>>>>>>>>>>>>>>>>>> LookupOptions. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> For the second point, I suggest adding Java doc to the > >>>>>>>>>>>>>>>> SupportsXXXPushDown > >>>>>>>>>>>>>>>>>>> interfaces, providing detailed information on these > >> options > >>>>>>> that > >>>>>>>>>>>>>> needs > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> be supported. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> I have made updates in the FLIP document. > >>>>>>>>>>>>>>>>>>> Please help check it again. > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> 2023年10月30日 17:23,Hang Ruan <ruanhang1...@gmail.com> > >> 写道: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Thanks for the improvements, Jiabao. > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> There are some details that I am not sure about. > >>>>>>>>>>>>>>>>>>>> 1. The new option `source.filter-push-down.enabled` > will > >>>> be > >>>>>>>>>>> added to > >>>>>>>>>>>>>>>>>>> which > >>>>>>>>>>>>>>>>>>>> class? I think it should be `SourceReaderOptions`. > >>>>>>>>>>>>>>>>>>>> 2. How are the connector developers able to know and > >>>> follow > >>>>>>> the > >>>>>>>>>>>>>> FLIP? > >>>>>>>>>>>>>>>> Do > >>>>>>>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>>> need an abstract base class or provide a default > method? > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>> Hang > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> > >>>> 于2023年10月30日周一 > >>>>>>>>>>>>>> 14:45写道: > >>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Hi, all, > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Thanks for the lively discussion. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Based on the discussion, I have made some adjustments > >> to > >>>> the > >>>>>>>>>>> FLIP > >>>>>>>>>>>>>>>>>>> document: > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> 1. The name of the newly added option has been > changed > >> to > >>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled". > >>>>>>>>>>>>>>>>>>>>> 2. Considering compatibility with older versions, the > >>>> newly > >>>>>>>>>>> added > >>>>>>>>>>>>>>>>>>>>> "source.filter-push-down.enabled" option needs to > >> respect > >>>>>>> the > >>>>>>>>>>>>>>>>>>> optimizer's > >>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" > >>>> option. > >>>>>>>>>>>>>>>>>>>>> But there is a consideration to remove the old option > >> in > >>>>>>> Flink > >>>>>>>>>>> 2.0. > >>>>>>>>>>>>>>>>>>>>> 3. We can provide more options to disable other > source > >>>>>>>>> abilities > >>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>> side > >>>>>>>>>>>>>>>>>>>>> effects, such as “source.aggregate.enabled” and > >>>>>>>>>>>>>>>>>>> “source.projection.enabled" > >>>>>>>>>>>>>>>>>>>>> This is not urgent and can be continuously > introduced. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback again. > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> 2023年10月29日 08:45,Becket Qin <becket....@gmail.com> > >> 写道: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Thanks for digging into the git history, Jark. I > agree > >>>> it > >>>>>>>>> makes > >>>>>>>>>>>>>>>> sense > >>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>> deprecate this API in 2.0. > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Cheers, > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>> On Fri, Oct 27, 2023 at 5:47 PM Jark Wu < > >>>> imj...@gmail.com> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Hi Becket, > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> I checked the history of " > >>>>>>>>>>>>>>>>>>>>>>> > *table.optimizer.source.predicate-pushdown-enabled*", > >>>>>>>>>>>>>>>>>>>>>>> it seems it was introduced since the legacy > >>>>>>>>>>> FilterableTableSource > >>>>>>>>>>>>>>>>>>>>>>> interface > >>>>>>>>>>>>>>>>>>>>>>> which might be an experiential feature at that > time. > >> I > >>>>>>> don't > >>>>>>>>>>> see > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>> necessity > >>>>>>>>>>>>>>>>>>>>>>> of this option at the moment. Maybe we can > deprecate > >>>> this > >>>>>>>>>>> option > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>> drop > >>>>>>>>>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>>>> in Flink 2.0[1] if it is not necessary anymore. > This > >>>> may > >>>>>>>>> help > >>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>> simplify this discussion. > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>> Jark > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> [1]: > >> https://issues.apache.org/jira/browse/FLINK-32383 > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>> On Thu, 26 Oct 2023 at 10:14, Becket Qin < > >>>>>>>>>>> becket....@gmail.com> > >>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the proposal, Jiabao. My two cents > below: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> 1. If I understand correctly, the motivation of > the > >>>> FLIP > >>>>>>> is > >>>>>>>>>>>>>>>> mainly to > >>>>>>>>>>>>>>>>>>>>>>>> make predicate pushdown optional on SOME of the > >>>> Sources. > >>>>>>> If > >>>>>>>>>>> so, > >>>>>>>>>>>>>>>>>>>>> intuitively > >>>>>>>>>>>>>>>>>>>>>>>> the configuration should be Source specific > instead > >> of > >>>>>>>>>>> general. > >>>>>>>>>>>>>>>>>>>>> Otherwise, > >>>>>>>>>>>>>>>>>>>>>>>> we will end up with general configurations that > may > >>>> not > >>>>>>>>> take > >>>>>>>>>>>>>>>> effect > >>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>> some of the Source implementations. This violates > >> the > >>>>>>> basic > >>>>>>>>>>> rule > >>>>>>>>>>>>>>>> of a > >>>>>>>>>>>>>>>>>>>>>>>> configuration - it does what it says, regardless > of > >>>> the > >>>>>>>>>>>>>>>>>>> implementation. > >>>>>>>>>>>>>>>>>>>>>>>> While configuration standardization is usually a > >> good > >>>>>>>>> thing, > >>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>> should > >>>>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>>>> break the basic rules. > >>>>>>>>>>>>>>>>>>>>>>>> If we really want to have this general > >> configuration, > >>>> for > >>>>>>>>> the > >>>>>>>>>>>>>>>> sources > >>>>>>>>>>>>>>>>>>>>>>>> this configuration does not apply, they should > throw > >>>> an > >>>>>>>>>>>>>> exception > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>> make > >>>>>>>>>>>>>>>>>>>>>>>> it clear that this configuration is not supported. > >>>>>>> However, > >>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>> seems > >>>>>>>>>>>>>>>>>>>>> ugly. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> 2. I think the actual motivation of this FLIP is > >> about > >>>>>>>>> "how a > >>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>>>>>>>> should implement predicate pushdown efficiently", > >> not > >>>>>>>>>>> "whether > >>>>>>>>>>>>>>>>>>>>> predicate > >>>>>>>>>>>>>>>>>>>>>>>> pushdown should be applied to the source." For > >>>> example, > >>>>>>> if > >>>>>>>>> a > >>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>>>>> wants > >>>>>>>>>>>>>>>>>>>>>>>> to avoid additional computing load in the external > >>>>>>> system, > >>>>>>>>> it > >>>>>>>>>>>>>> can > >>>>>>>>>>>>>>>>>>>>> always > >>>>>>>>>>>>>>>>>>>>>>>> read the entire record and apply the predicates by > >>>>>>> itself. > >>>>>>>>>>>>>>>> However, > >>>>>>>>>>>>>>>>>>>>> from > >>>>>>>>>>>>>>>>>>>>>>>> the Flink perspective, the predicate pushdown is > >>>> applied, > >>>>>>>>> it > >>>>>>>>>>> is > >>>>>>>>>>>>>>>> just > >>>>>>>>>>>>>>>>>>>>>>>> implemented differently by the source. So the > design > >>>>>>>>>>> principle > >>>>>>>>>>>>>>>> here > >>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>> Flink only cares about whether a source supports > >>>>>>> predicate > >>>>>>>>>>>>>>>> pushdown > >>>>>>>>>>>>>>>>>>> or > >>>>>>>>>>>>>>>>>>>>> not, > >>>>>>>>>>>>>>>>>>>>>>>> it does not care about the implementation > >> efficiency / > >>>>>>> side > >>>>>>>>>>>>>>>> effect of > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>> predicates pushdown. It is the Source > >> implementation's > >>>>>>>>>>>>>>>> responsibility > >>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>> ensure the predicates pushdown is implemented > >>>> efficiently > >>>>>>>>> and > >>>>>>>>>>>>>> does > >>>>>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>>>>> impose excessive pressure on the external system. > >> And > >>>> it > >>>>>>> is > >>>>>>>>>>> OK > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> have > >>>>>>>>>>>>>>>>>>>>>>>> additional configurations to achieve this goal. > >>>>>>> Obviously, > >>>>>>>>>>> such > >>>>>>>>>>>>>>>>>>>>>>>> configurations will be source specific in this > case. > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> 3. Regarding the existing configurations of > >>>>>>>>>>>>>>>>>>>>> *table.optimizer.source.predicate-pushdown-enabled. > >>>>>>>>>>>>>>>>>>>>>>>> *I am not sure why we need it. Supposedly, if a > >> source > >>>>>>>>>>>>>> implements > >>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>>> SupportsXXXPushDown interface, the optimizer > should > >>>> push > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> corresponding > >>>>>>>>>>>>>>>>>>>>>>>> predicates to the Source. I am not sure in which > >> case > >>>>>>> this > >>>>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>> would be used. Any ideas @Jark Wu < > imj...@gmail.com > >>> ? > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Thanks, > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 11:55 PM Jiabao Sun > >>>>>>>>>>>>>>>>>>>>>>>> <jiabao....@xtransfer.cn.invalid> wrote: > >>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for the detailed explanation. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> I think that for users, we should respect > >> conventions > >>>>>>> over > >>>>>>>>>>>>>>>>>>>>>>>>> configurations. > >>>>>>>>>>>>>>>>>>>>>>>>> Conventions can be default values explicitly > >>>> specified > >>>>>>> in > >>>>>>>>>>>>>>>>>>>>>>>>> configurations, or they can be behaviors that > >> follow > >>>>>>>>>>> previous > >>>>>>>>>>>>>>>>>>>>> versions. > >>>>>>>>>>>>>>>>>>>>>>>>> If the same code has different behaviors in > >> different > >>>>>>>>>>> versions, > >>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>> would > >>>>>>>>>>>>>>>>>>>>>>>>> be a very bad thing. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> I agree that for regular users, it is not > necessary > >>>> to > >>>>>>>>>>>>>> understand > >>>>>>>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>>>>>>>>> the configurations related to Flink. > >>>>>>>>>>>>>>>>>>>>>>>>> By following conventions, they can have a good > >>>>>>> experience. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Let's get back to the practical situation and > >>>> consider > >>>>>>> it. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Case 1: > >>>>>>>>>>>>>>>>>>>>>>>>> The user is not familiar with the purpose of the > >>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled > >>>>>>>>>>> configuration > >>>>>>>>>>>>>>>> but > >>>>>>>>>>>>>>>>>>>>> follows > >>>>>>>>>>>>>>>>>>>>>>>>> the convention of allowing predicate pushdown to > >> the > >>>>>>>>> source > >>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>> default. > >>>>>>>>>>>>>>>>>>>>>>>>> Just understanding the > >>>> source.predicate-pushdown-enabled > >>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>> and performing fine-grained toggle control will > >> work > >>>>>>> well. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Case 2: > >>>>>>>>>>>>>>>>>>>>>>>>> The user understands the meaning of the > >>>>>>>>>>>>>>>>>>>>>>>>> table.optimizer.source.predicate-pushdown-enabled > >>>>>>>>>>> configuration > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>> has set > >>>>>>>>>>>>>>>>>>>>>>>>> its value to false. > >>>>>>>>>>>>>>>>>>>>>>>>> We have reason to believe that the user > understands > >>>> the > >>>>>>>>>>> meaning > >>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> predicate pushdown configuration and the > intention > >>>> is to > >>>>>>>>>>>>>> disable > >>>>>>>>>>>>>>>>>>>>> predicate > >>>>>>>>>>>>>>>>>>>>>>>>> pushdown (rather than whether or not to allow > it). > >>>>>>>>>>>>>>>>>>>>>>>>> The previous choice of globally disabling it is > >>>> likely > >>>>>>>>>>> because > >>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>>>>>> couldn't be disabled on individual sources. > >>>>>>>>>>>>>>>>>>>>>>>>> From this perspective, if we provide more > >>>> fine-grained > >>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>> support and provide detailed explanations of the > >>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>> behaviors in > >>>>>>>>>>>>>>>>>>>>>>>>> the documentation, > >>>>>>>>>>>>>>>>>>>>>>>>> users can clearly understand the differences > >> between > >>>>>>> these > >>>>>>>>>>> two > >>>>>>>>>>>>>>>>>>>>>>>>> configurations and use them correctly. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Also, I don't agree that > >>>>>>>>>>>>>>>>>>>>>>>>> > table.optimizer.source.predicate-pushdown-enabled = > >>>> true > >>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>> source.predicate-pushdown-enabled = false means > >> that > >>>> the > >>>>>>>>>>> local > >>>>>>>>>>>>>>>>>>>>>>>>> configuration overrides the global configuration. > >>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, both configurations are > >> functioning > >>>>>>>>>>> correctly. > >>>>>>>>>>>>>>>>>>>>>>>>> The optimizer allows predicate pushdown to all > >>>> sources, > >>>>>>>>> but > >>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>>>>> sources > >>>>>>>>>>>>>>>>>>>>>>>>> can reject the filters pushed down by the > >> optimizer. > >>>>>>>>>>>>>>>>>>>>>>>>> This is natural, just like different components > at > >>>>>>>>> different > >>>>>>>>>>>>>>>> levels > >>>>>>>>>>>>>>>>>>>>> are > >>>>>>>>>>>>>>>>>>>>>>>>> responsible for different tasks. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> The more serious issue is that if > >>>>>>>>>>>>>>>>>>> "source.predicate-pushdown-enabled" > >>>>>>>>>>>>>>>>>>>>>>>>> does not respect > >>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled”, > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>> "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>> will be invalidated. > >>>>>>>>>>>>>>>>>>>>>>>>> This means that regardless of whether > >>>>>>>>>>>>>>>>>>>>>>>>> > "table.optimizer.source.predicate-pushdown-enabled" > >>>> is > >>>>>>> set > >>>>>>>>>>> to > >>>>>>>>>>>>>>>> true > >>>>>>>>>>>>>>>>>>> or > >>>>>>>>>>>>>>>>>>>>>>>>> false, it will have no effect. > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 22:24,Jane Chan < > >> qingyue....@gmail.com> > >>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao, > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for the in-depth clarification. Here are > my > >>>>>>> cents > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> However, > >>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are > >> configurations > >>>> for > >>>>>>>>>>>>>>>> different > >>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator). > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> We cannot assume that every user would be > >>>> interested in > >>>>>>>>>>>>>>>>>>> understanding > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> internal components of Flink, such as the > >> optimizer > >>>> or > >>>>>>>>>>>>>>>> connectors, > >>>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> specific configurations associated with each > >>>> component. > >>>>>>>>>>>>>> Instead, > >>>>>>>>>>>>>>>>>>>>> users > >>>>>>>>>>>>>>>>>>>>>>>>>> might be more concerned about knowing which > >>>>>>> configuration > >>>>>>>>>>>>>>>> enables > >>>>>>>>>>>>>>>>>>> or > >>>>>>>>>>>>>>>>>>>>>>>>>> disables the filter push-down feature for all > >> source > >>>>>>>>>>>>>> connectors, > >>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>> which > >>>>>>>>>>>>>>>>>>>>>>>>>> parameter provides the flexibility to override > >> this > >>>>>>>>>>> behavior > >>>>>>>>>>>>>>>> for a > >>>>>>>>>>>>>>>>>>>>>>>>> single > >>>>>>>>>>>>>>>>>>>>>>>>>> source if needed. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> So, from this perspective, I am inclined to > divide > >>>>>>> these > >>>>>>>>>>> two > >>>>>>>>>>>>>>>>>>>>> parameters > >>>>>>>>>>>>>>>>>>>>>>>>>> based on the scope of their impact from the > user's > >>>>>>>>>>> perspective > >>>>>>>>>>>>>>>>>>> (i.e. > >>>>>>>>>>>>>>>>>>>>>>>>>> global-level or operator-level), rather than > >>>>>>> categorizing > >>>>>>>>>>> them > >>>>>>>>>>>>>>>>>>> based > >>>>>>>>>>>>>>>>>>>>>>>>> on the > >>>>>>>>>>>>>>>>>>>>>>>>>> component hierarchy from a developer's point of > >>>> view. > >>>>>>>>>>>>>> Therefore, > >>>>>>>>>>>>>>>>>>>>> based > >>>>>>>>>>>>>>>>>>>>>>>>> on > >>>>>>>>>>>>>>>>>>>>>>>>>> this premise, it is intuitive and natural for > >> users > >>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>> understand fine-grained configuration options > can > >>>>>>>>> override > >>>>>>>>>>>>>>>> global > >>>>>>>>>>>>>>>>>>>>>>>>>> configurations. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if "scan.filter-push-down.enabled" > >>>>>>> doesn't > >>>>>>>>>>>>>>>> respect to > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >> "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>> and > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> default > >>>>>>>>>>>>>>>>>>>>>>>>> value > >>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined > as > >>>> true, > >>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>> "table.optimizer.source.predicate-pushdown-enabled" as > >>>>>>>>>>> false > >>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>>>>>>>> have no > >>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be > >>>> performed. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in > >> behavior > >>>>>>>>>>> compared > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> previous > >>>>>>>>>>>>>>>>>>>>>>>>>>> version. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> <1>If I understand correctly, > >>>>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>>>> is a > >>>>>>>>>>>>>>>>>>>>>>>>>> connector option, which means the only way to > >>>> configure > >>>>>>>>> it > >>>>>>>>>>> is > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>> explicitly > >>>>>>>>>>>>>>>>>>>>>>>>>> specify it in DDL (no matter whether disable or > >>>>>>> enable), > >>>>>>>>>>> and > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> SET > >>>>>>>>>>>>>>>>>>>>>>>>>> command is not applicable, so I think it's > natural > >>>> to > >>>>>>>>> still > >>>>>>>>>>>>>>>> respect > >>>>>>>>>>>>>>>>>>>>>>>>> user's > >>>>>>>>>>>>>>>>>>>>>>>>>> specification here. Otherwise, users might be > more > >>>>>>>>> confused > >>>>>>>>>>>>>>>> about > >>>>>>>>>>>>>>>>>>> why > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>> DDL does not work as expected, and the reason is > >>>> just > >>>>>>>>>>> because > >>>>>>>>>>>>>>>> some > >>>>>>>>>>>>>>>>>>>>>>>>> other > >>>>>>>>>>>>>>>>>>>>>>>>>> "optimizer" configuration is set to a different > >>>> value. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> <2> From the implementation side, I am inclined > to > >>>> keep > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>> parameter's > >>>>>>>>>>>>>>>>>>>>>>>>>> priority consistent for all conditions. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Let "global" denote > >>>>>>>>>>>>>>>>>>>>>>>>> > >> "table.optimizer.source.predicate-pushdown-enabled", > >>>>>>>>>>>>>>>>>>>>>>>>>> and let "per-source" denote > >>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>>> specific > >>>>>>>>>>>>>>>>>>>>>>>>>> source T, the following Truth table (based on > the > >>>>>>>>> current > >>>>>>>>>>>>>>>> design) > >>>>>>>>>>>>>>>>>>>>>>>>>> indicates the inconsistent behavior for > >> "per-source > >>>>>>>>>>> override > >>>>>>>>>>>>>>>>>>> global". > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> > .------------.---------------.------------------- > >>>>>>>>>>>>>>>>>>>>>>>>>> ----.-------------------------------------. > >>>>>>>>>>>>>>>>>>>>>>>>>> | global | per-source | push-down for T | > >>>> per-source > >>>>>>>>>>>>>> override > >>>>>>>>>>>>>>>>>>>>> global > >>>>>>>>>>>>>>>>>>>>>>>>> | > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>> > >> > :-----------+--------------+-----------------------+------------------------------------: > >>>>>>>>>>>>>>>>>>>>>>>>>> | true | false | false > >>>>>>>>> | Y > >>>>>>>>>>>>>>>>>>>>>>>>>> | > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>> > >> > :-----------+--------------+-----------------------+------------------------------------: > >>>>>>>>>>>>>>>>>>>>>>>>>> | false | true | false > >>>>>>>>> | N > >>>>>>>>>>>>>>>>>>>>>>>>>> | > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>> > >> > .------------.---------------.-----------------------.-------------------------------------. > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>> Jane > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 6:22 PM Jiabao Sun < > >>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> I understand that the configuration of global > >>>>>>>>> parallelism > >>>>>>>>>>> and > >>>>>>>>>>>>>>>> task > >>>>>>>>>>>>>>>>>>>>>>>>>>> parallelism is at different granularities but > >> with > >>>> the > >>>>>>>>>>> same > >>>>>>>>>>>>>>>>>>>>>>>>> configuration. > >>>>>>>>>>>>>>>>>>>>>>>>>>> However, > >>>>>>>>>>> "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" are > >> configurations > >>>> for > >>>>>>>>>>>>>>>> different > >>>>>>>>>>>>>>>>>>>>>>>>>>> components(optimizer and source operator). > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> From a user's perspective, there are two > >> scenarios: > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> 1. Disabling all filter pushdown > >>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, setting > >>>>>>>>>>>>>>>>>>>>>>>>> > "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>>>>>>>>>>>>>>>>>>>>>> to false is sufficient to meet the requirement. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> 2. Disabling filter pushdown for specific > sources > >>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, there is no need to adjust > the > >>>> value > >>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>> "table.optimizer.source.predicate-pushdown-enabled". > >>>>>>>>>>>>>>>>>>>>>>>>>>> Instead, the focus should be on the > configuration > >>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to meet the > >>>>>>> requirement. > >>>>>>>>>>>>>>>>>>>>>>>>>>> In this case, users do not need to set > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>> "table.optimizer.source.predicate-pushdown-enabled" to > >>>>>>>>>>> false > >>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>> manually > >>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter pushdown for specific sources. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, if > "scan.filter-push-down.enabled" > >>>>>>> doesn't > >>>>>>>>>>>>>>>> respect > >>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >> "table.optimizer.source.predicate-pushdown-enabled" > >>>>>>> and > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>> default > >>>>>>>>>>>>>>>>>>>>>>>>> value > >>>>>>>>>>>>>>>>>>>>>>>>>>> of "scan.filter-push-down.enabled" is defined > as > >>>> true, > >>>>>>>>>>>>>>>>>>>>>>>>>>> it means that just modifying > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>> "table.optimizer.source.predicate-pushdown-enabled" as > >>>>>>>>>>> false > >>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>>>>>>>> have no > >>>>>>>>>>>>>>>>>>>>>>>>>>> effect, and filter pushdown will still be > >>>> performed. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> If we define the default value of > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>>>>>>>>>> false, it would introduce a difference in > >> behavior > >>>>>>>>>>> compared > >>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> previous > >>>>>>>>>>>>>>>>>>>>>>>>>>> version. > >>>>>>>>>>>>>>>>>>>>>>>>>>> The same SQL query that could successfully push > >>>> down > >>>>>>>>>>> filters > >>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> old > >>>>>>>>>>>>>>>>>>>>>>>>>>> version but would no longer do so after the > >>>> upgrade. > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 17:10,Benchao Li < > >>>> libenc...@apache.org> > >>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jiabao for the detailed explanations, > >> that > >>>>>>>>> helps a > >>>>>>>>>>>>>>>> lot, I > >>>>>>>>>>>>>>>>>>>>>>>>>>>> understand your rationale now. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong. Your perspective is > >> from > >>>>>>>>>>>>>> "developer", > >>>>>>>>>>>>>>>>>>>>> which > >>>>>>>>>>>>>>>>>>>>>>>>>>>> means there is an optimizer and connector > >>>> component, > >>>>>>>>> and > >>>>>>>>>>> if > >>>>>>>>>>>>>> we > >>>>>>>>>>>>>>>>>>> want > >>>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>> enable this feature (pushing filters down into > >>>>>>>>>>> connectors), > >>>>>>>>>>>>>>>> you > >>>>>>>>>>>>>>>>>>>>> must > >>>>>>>>>>>>>>>>>>>>>>>>>>>> enable it firstly in optimizer, and only then > >>>>>>> connector > >>>>>>>>>>> has > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> chance > >>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide to use it or not. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> My perspective is from "user" that (Why a user > >>>> should > >>>>>>>>>>> care > >>>>>>>>>>>>>>>> about > >>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>> difference of optimizer/connector) , this is a > >>>>>>> feature, > >>>>>>>>>>> and > >>>>>>>>>>>>>>>> has > >>>>>>>>>>>>>>>>>>> two > >>>>>>>>>>>>>>>>>>>>>>>>>>>> way to control it, one way is to config it > >>>> job-level, > >>>>>>>>> the > >>>>>>>>>>>>>>>> other > >>>>>>>>>>>>>>>>>>> one > >>>>>>>>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>>>>>>> in table properties. What a user expects is > that > >>>> they > >>>>>>>>> can > >>>>>>>>>>>>>>>>>>> control a > >>>>>>>>>>>>>>>>>>>>>>>>>>>> feature in a tiered way, that setting it per > >> job, > >>>> and > >>>>>>>>>>> then > >>>>>>>>>>>>>>>>>>>>>>>>>>>> fine-grained tune it per table. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> This is some kind of similar to other > concepts, > >>>> such > >>>>>>> as > >>>>>>>>>>>>>>>>>>>>> parallelism, > >>>>>>>>>>>>>>>>>>>>>>>>>>>> users can set a job level default parallelism, > >> and > >>>>>>> then > >>>>>>>>>>>>>>>>>>>>> fine-grained > >>>>>>>>>>>>>>>>>>>>>>>>>>>> tune it per operator. There may be more such > >>>> debate > >>>>>>> in > >>>>>>>>>>> the > >>>>>>>>>>>>>>>> future > >>>>>>>>>>>>>>>>>>>>>>>>>>>> e.g., we can have a job level config about > >> adding > >>>>>>>>> key-by > >>>>>>>>>>>>>>>> before > >>>>>>>>>>>>>>>>>>>>>>>>> lookup > >>>>>>>>>>>>>>>>>>>>>>>>>>>> join, and also a hint/table property way to > >>>>>>>>> fine-grained > >>>>>>>>>>>>>>>> control > >>>>>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>>>>>>>>> per lookup operator. Hence we'd better find a > >>>> unified > >>>>>>>>> way > >>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>> all > >>>>>>>>>>>>>>>>>>>>>>>>>>>> those similar kind of features. > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao Sun <jiabao....@xtransfer.cn.invalid> > >>>>>>>>>>> 于2023年10月25日周三 > >>>>>>>>>>>>>>>>>>>>> 15:27写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane for further explanation. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> These two configurations correspond to > >> different > >>>>>>>>> levels. > >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" does not make > >>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" invalid. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> The planner will still push down predicates > to > >>>> all > >>>>>>>>>>> sources. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whether filter pushdown is allowed or not is > >>>>>>>>> determined > >>>>>>>>>>> by > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> specific > >>>>>>>>>>>>>>>>>>>>>>>>>>> source's "scan.filter-push-down.enabled" > >>>>>>> configuration. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, "table.optimizer.source.predicate" > >> does > >>>>>>>>>>> directly > >>>>>>>>>>>>>>>> affect > >>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled”. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> When the planner disables predicate pushdown, > >> the > >>>>>>>>>>>>>>>> source-level > >>>>>>>>>>>>>>>>>>>>>>>>> filter > >>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown will also not be executed, even if the > >>>> source > >>>>>>>>>>> allows > >>>>>>>>>>>>>>>>>>> filter > >>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Whatever, in point 1 and 2, our expectation > is > >>>>>>>>>>> consistent. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the 3rd point, I still think that the > >>>>>>>>> planner-level > >>>>>>>>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over the source-level > >>>> configuration. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may seem counterintuitive when we globally > >>>>>>> disable > >>>>>>>>>>>>>>>> predicate > >>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown but allow filter pushdown at the > source > >>>>>>> level. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 14:35,Jane Chan < > >>>> qingyue....@gmail.com > >>>>>>>> > >>>>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for clarifying this. While by > >>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled > >>>>>>>>>>>>>>>>>>>>>>>>>>> takes a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority" I meant that this value > >> should > >>>> be > >>>>>>>>>>>>>> respected > >>>>>>>>>>>>>>>>>>>>>>>>> whenever > >>>>>>>>>>>>>>>>>>>>>>>>>>> it is > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> set explicitly. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The conclusion that > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = > "true" > >>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate > >>>> pushdown, > >>>>>>> but > >>>>>>>>>>>>>>>>>>> individual > >>>>>>>>>>>>>>>>>>>>>>>>>>> sources do > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This indicates that the option > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled = > >>>>>>>>>>>>>>>>>>>>>>>>> false" > >>>>>>>>>>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an individual source connector does indeed > >>>> override > >>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> global-level > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> planner settings to make a difference. And > >> thus > >>>>>>> "has > >>>>>>>>> a > >>>>>>>>>>>>>>>> higher > >>>>>>>>>>>>>>>>>>>>>>>>>>> priority". > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> While for > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = > >> "false" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the > >>>> planner. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains > >>>> consistent > >>>>>>>>> with > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>> old > >>>>>>>>>>>>>>>>>>>>>>>>>>> version as > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I still think > "scan.filter-push-down.enabled" > >>>>>>> should > >>>>>>>>>>> also > >>>>>>>>>>>>>> be > >>>>>>>>>>>>>>>>>>>>>>>>> respected > >>>>>>>>>>>>>>>>>>>>>>>>>>> if > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it is enabled for individual connectors. > WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 25, 2023 at 1:27 PM Jiabao Sun < > >>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Benchao for the feedback. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For the current proposal, we recommend > >> keeping > >>>> the > >>>>>>>>>>>>>> default > >>>>>>>>>>>>>>>>>>> value > >>>>>>>>>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" as true, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and setting the the default value of newly > >>>>>>>>> introduced > >>>>>>>>>>>>>>>> option > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" to true as > >>>> well. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The main purpose of doing this is to > maintain > >>>>>>>>>>> consistency > >>>>>>>>>>>>>>>> with > >>>>>>>>>>>>>>>>>>>>>>>>>>> previous > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> versions, as whether to perform > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filter pushdown in the old version solely > >>>> depends > >>>>>>> on > >>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" option. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> That means by default, as long as a > >> TableSource > >>>>>>>>>>>>>> implements > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> SupportsFilterPushDown interface, filter > >>>> pushdown > >>>>>>> is > >>>>>>>>>>>>>>>> allowed. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> And it seems that we don't have much > benefit > >> in > >>>>>>>>>>> changing > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> default > >>>>>>>>>>>>>>>>>>>>>>>>>>> value > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> of "table.optimizer.source.predicate" to > >> false. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regarding the priority of these two > >>>>>>> configurations, > >>>>>>>>> I > >>>>>>>>>>>>>>>> believe > >>>>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> takes precedence over > >>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>> and > >>>>>>>>>>>>>>>> it > >>>>>>>>>>>>>>>>>>>>>>>>> exhibits > >>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> following behavior. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 1. "table.optimizer.source.predicate" = > >> "true" > >>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "true" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This is the default behavior, allowing > filter > >>>>>>>>> pushdown > >>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>>> sources. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2. "table.optimizer.source.predicate" = > >> "true" > >>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" = "false" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Allow the planner to perform predicate > >>>> pushdown, > >>>>>>> but > >>>>>>>>>>>>>>>>>>> individual > >>>>>>>>>>>>>>>>>>>>>>>>>>> sources do > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not enable filter pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 3. "table.optimizer.source.predicate" = > >> "false" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Predicate pushdown is not allowed for the > >>>> planner. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Regardless of the value of the > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration, filter pushdown is disabled. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> In this scenario, the behavior remains > >>>> consistent > >>>>>>>>> with > >>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>> old > >>>>>>>>>>>>>>>>>>>>>>>>>>> version as > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> well. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> From an implementation perspective, setting > >> the > >>>>>>>>>>> priority > >>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" higher than > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" is > >>>> difficult to > >>>>>>>>>>>>>> achieve > >>>>>>>>>>>>>>>>>>> now. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Because the > PushFilterIntoSourceScanRuleBase > >> at > >>>>>>> the > >>>>>>>>>>>>>> planner > >>>>>>>>>>>>>>>>>>>>> level > >>>>>>>>>>>>>>>>>>>>>>>>>>> takes > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> precedence over the source-level > >>>>>>> FilterPushDownSpec. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Only when the > >> PushFilterIntoSourceScanRuleBase > >>>> is > >>>>>>>>>>>>>> enabled, > >>>>>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Source-level filter pushdown be performed. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Additionally, in my opinion, there doesn't > >>>> seem to > >>>>>>>>> be > >>>>>>>>>>>>>> much > >>>>>>>>>>>>>>>>>>>>>>>>> benefit in > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> setting a higher priority for > >>>>>>>>>>>>>>>> "scan.filter-push-down.enabled". > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It may instead affect compatibility and > >>>> increase > >>>>>>>>>>>>>>>>>>> implementation > >>>>>>>>>>>>>>>>>>>>>>>>>>> complexity. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月25日 11:56,Benchao Li < > >>>>>>> libenc...@apache.org > >>>>>>>>>> > >>>>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I agree with Jane that fine-grained > >>>>>>> configurations > >>>>>>>>>>>>>> should > >>>>>>>>>>>>>>>>>>> have > >>>>>>>>>>>>>>>>>>>>>>>>> higher > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> priority than job level configurations. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> For current proposal, we can achieve that: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = > >>>> "true" > >>>>>>>>> to > >>>>>>>>>>>>>>>> enable > >>>>>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set > >>>>>>> ""scan.filter-push-down.enabled" = > >>>>>>>>>>>>>>>> "false" > >>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>> disable > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> - Set "table.optimizer.source.predicate" = > >>>>>>> "false" > >>>>>>>>> to > >>>>>>>>>>>>>>>> disable > >>>>>>>>>>>>>>>>>>>>> by > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default, and set > >>>>>>> ""scan.filter-push-down.enabled" = > >>>>>>>>>>>>>>>> "true" to > >>>>>>>>>>>>>>>>>>>>>>>>> enable > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it per table source > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane Chan <qingyue....@gmail.com> > >>>> 于2023年10月24日周二 > >>>>>>>>>>>>>> 23:55写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration > >>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level > than > >>>> the > >>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>> at > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Correct me if I'm wrong, but I think the > >>>>>>>>>>> fine-grained > >>>>>>>>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" should > >> have a > >>>>>>>>> higher > >>>>>>>>>>>>>>>>>>> priority > >>>>>>>>>>>>>>>>>>>>>>>>>>> because > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> default value of > >>>>>>>>> "table.optimizer.source.predicate" > >>>>>>>>>>> is > >>>>>>>>>>>>>>>> true. > >>>>>>>>>>>>>>>>>>>>> As > >>>>>>>>>>>>>>>>>>>>>>>>> a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> result, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> turning off filter push-down for a > specific > >>>>>>> source > >>>>>>>>>>> will > >>>>>>>>>>>>>>>> not > >>>>>>>>>>>>>>>>>>>>> take > >>>>>>>>>>>>>>>>>>>>>>>>>>> effect > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> unless the default value of > >>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" > >>>>>>>>>>>>>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> changed > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to false, or, alternatively, let users > >>>> manually > >>>>>>>>> set > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" to > false > >>>>>>> first > >>>>>>>>>>> and > >>>>>>>>>>>>>>>> then > >>>>>>>>>>>>>>>>>>>>>>>>>>> selectively > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> enable filter push-down for the desired > >>>> sources, > >>>>>>>>>>> which > >>>>>>>>>>>>>> is > >>>>>>>>>>>>>>>>>>> less > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> intuitive. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 6:05 PM Jiabao > Sun > >> < > >>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jane, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I believe that the configuration > >>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> has a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> higher priority at the planner level > than > >>>> the > >>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>> at > >>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> and it seems easy to implement now. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 17:36,Jane Chan < > >>>>>>>>>>> qingyue....@gmail.com> > >>>>>>>>>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Jiabao, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks for driving this discussion. I > >> have > >>>> a > >>>>>>>>> small > >>>>>>>>>>>>>>>>>>> question > >>>>>>>>>>>>>>>>>>>>>>>>> that > >>>>>>>>>>>>>>>>>>>>>>>>>>> will > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "scan.filter-push-down.enabled" take > >>>>>>> precedence > >>>>>>>>>>> over > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> "table.optimizer.source.predicate" when > >> the > >>>>>>> two > >>>>>>>>>>>>>>>> parameters > >>>>>>>>>>>>>>>>>>>>>>>>> might > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> conflict > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each other? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jane > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Oct 24, 2023 at 5:05 PM Jiabao > >> Sun > >>>> < > >>>>>>>>>>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> .invalid> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks Jark, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> If we only add configuration without > >>>> adding > >>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>> enableFilterPushDown > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method in the SupportsFilterPushDown > >>>>>>> interface, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> each connector would have to handle > the > >>>> same > >>>>>>>>>>> logic > >>>>>>>>>>>>>> in > >>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> method to determine whether filter > >>>> pushdown > >>>>>>> is > >>>>>>>>>>>>>> needed. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> This would increase complexity and > >> violate > >>>>>>> the > >>>>>>>>>>>>>>>> original > >>>>>>>>>>>>>>>>>>>>>>>>> behavior > >>>>>>>>>>>>>>>>>>>>>>>>>>> of > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> applyFilters method. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On the contrary, we only need to pass > >> the > >>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>>>> parameter in > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> newly added enableFilterPushDown > method > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to decide whether to perform predicate > >>>>>>>>> pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think this approach would be clearer > >> and > >>>>>>>>>>> simpler. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> WDYT? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 16:58,Jark Wu < > >>>> imj...@gmail.com > >>>>>>>> > >>>>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi JIabao, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I think the current interface can > >> already > >>>>>>>>>>> satisfy > >>>>>>>>>>>>>>>> your > >>>>>>>>>>>>>>>>>>>>>>>>>>> requirements. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The connector can reject all the > >> filters > >>>> by > >>>>>>>>>>>>>> returning > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> input > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> filters > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> as `Result#remainingFilters`. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So maybe we don't need to introduce a > >> new > >>>>>>>>>>> method to > >>>>>>>>>>>>>>>>>>>>> disable > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown, but just introduce an > option > >>>> for > >>>>>>> the > >>>>>>>>>>>>>>>> specific > >>>>>>>>>>>>>>>>>>>>>>>>>>> connector. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jark > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Tue, 24 Oct 2023 at 16:38, Leonard > >> Xu > >>>> < > >>>>>>>>>>>>>>>>>>>>> xbjt...@gmail.com > >>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Thanks @Jiabao for kicking off this > >>>>>>>>> discussion. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Could you add a section to explain > the > >>>>>>>>>>> difference > >>>>>>>>>>>>>>>>>>> between > >>>>>>>>>>>>>>>>>>>>>>>>>>> proposed > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> connector level config > >>>>>>>>>>>>>>>> `scan.filter-push-down.enabled` > >>>>>>>>>>>>>>>>>>>>> and > >>>>>>>>>>>>>>>>>>>>>>>>>>> existing > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> query > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> level config > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>> `table.optimizer.source.predicate-pushdown-enabled` ? > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Leonard > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 2023年10月24日 下午4:18,Jiabao Sun < > >>>>>>>>>>>>>>>>>>> jiabao....@xtransfer.cn > >>>>>>>>>>>>>>>>>>>>>>>>>>> .INVALID> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> 写道: > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Hi Devs, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I would like to start a discussion > on > >>>>>>>>>>> FLIP-377: > >>>>>>>>>>>>>>>>>>> support > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> configuration > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> disable filter pushdown for > Table/SQL > >>>>>>>>>>> Sources[1]. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Currently, Flink Table/SQL does not > >>>> expose > >>>>>>>>>>>>>>>>>>> fine-grained > >>>>>>>>>>>>>>>>>>>>>>>>>>> control > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> for > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> users to enable or disable filter > >>>> pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, filter pushdown has some > >> side > >>>>>>>>>>> effects, > >>>>>>>>>>>>>>>> such > >>>>>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>>>>>>>>>> additional > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> computational pressure on external > >>>> systems. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Moreover, Improper queries can lead > >> to > >>>>>>>>> issues > >>>>>>>>>>>>>> such > >>>>>>>>>>>>>>>> as > >>>>>>>>>>>>>>>>>>>>> full > >>>>>>>>>>>>>>>>>>>>>>>>>>> table > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> scans, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> which in turn can impact the > stability > >>>> of > >>>>>>>>>>> external > >>>>>>>>>>>>>>>>>>>>> systems. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Suppose we have an SQL query with > two > >>>>>>>>> sources: > >>>>>>>>>>>>>>>> Kafka > >>>>>>>>>>>>>>>>>>>>> and a > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> database. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> The database is sensitive to > >> pressure, > >>>> and > >>>>>>>>> we > >>>>>>>>>>>>>> want > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>> configure > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> it to > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> not perform filter pushdown to the > >>>> database > >>>>>>>>>>>>>> source. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> However, we still want to perform > >>>> filter > >>>>>>>>>>> pushdown > >>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>> the > >>>>>>>>>>>>>>>>>>>>>>>>> Kafka > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> source > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> to > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> decrease network IO. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I propose to support configuration > to > >>>>>>>>> disable > >>>>>>>>>>>>>>>> filter > >>>>>>>>>>>>>>>>>>>>> push > >>>>>>>>>>>>>>>>>>>>>>>>>>> down for > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Table/SQL sources to let user decide > >>>>>>> whether > >>>>>>>>> to > >>>>>>>>>>>>>>>> perform > >>>>>>>>>>>>>>>>>>>>>>>>> filter > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pushdown. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to your feedback. > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [1] > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>> > >> > https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=276105768 > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jiabao > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> -- > >>>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Best, > >>>>>>>>>>>>>>>>>>>>>>>>>>>> Benchao Li > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> > >>>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>> > >>>>>>> > >>>> > >>>> > >> > >> > > >