Based on an offline discussion with Becket Qin, I added *fieldIndices*, the field index of the nested field at every level, back to the *NestedFieldReferenceExpression* in FLIP-356 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown>. Two reasons for doing so:

1. I agree with using *fieldIndices* as the only contract for referring to a column of the underlying data source.
2. It keeps *NestedFieldReferenceExpression* consistent with *FieldReferenceExpression*.

Having said that, I see that with *projection pushdown* the indices of the fields are used, whereas with *filter pushdown* (based on scanning a few table sources) the *FieldReferenceExpression*'s name is used, for example even in Flink's *FileSystemTableSource*, *IcebergSource* and *JDBCDatsource*. Because of this, I feel the contract is not quite clear or explicit. I would like to hear others' thoughts as well.

Regards
Venkata krishnan

On Tue, Sep 5, 2023 at 5:34 PM Becket Qin <becket....@gmail.com> wrote:

> Hi Venkata,
>
>> Also I made minor changes to the *NestedFieldReferenceExpression*, instead of *fieldIndexArray* we can just do away with *fieldNames* array that includes fieldName at every level for the nested field.
>
> I don't think keeping only the field names array would work. At the end of the day, the contract between Flink SQL and the connectors is based on the indexes, not the names. Technically speaking, the connectors only emit a bunch of RowData which is based on positions. The field names are added by the SQL framework via the DDL for those RowData. In this sense, the connectors may not be aware of the field names in Flink DDL at all. The common language between Flink SQL and source is just positions. This is also why ProjectionPushDown would work by only relying on the indexes, not the field names. So I think the field index array is a must have here in the NestedFieldReferenceExpression.
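The position-based contract described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: `IndexedNestedRef` and `NestedFieldSketch` are hypothetical names, rows are modeled as plain `Object[]` rather than Flink's RowData, and nothing here is the actual FLIP-356 API. It shows that a source can locate a nested field from the index at every level alone, while the names are carried only for readability.

```java
// Hypothetical stand-in for a nested field reference; not Flink's actual class.
final class IndexedNestedRef {
    final String[] fieldNames;  // name at every nesting level (informational only)
    final int[] fieldIndices;   // position at every nesting level (the contract)

    IndexedNestedRef(String[] fieldNames, int[] fieldIndices) {
        if (fieldNames.length != fieldIndices.length) {
            throw new IllegalArgumentException("one name and one index per level");
        }
        this.fieldNames = fieldNames;
        this.fieldIndices = fieldIndices;
    }
}

final class NestedFieldSketch {
    // Walks a row by position only; nested rows are modeled as nested Object[].
    static Object resolve(Object[] row, IndexedNestedRef ref) {
        Object current = row;
        for (int index : ref.fieldIndices) {
            current = ((Object[]) current)[index];
        }
        return current;
    }

    public static void main(String[] args) {
        Object[] address = {"Main St", "Tempe", "85281"};
        Object[] user = {"alice", address};
        Object[] row = {42L, user};
        IndexedNestedRef zip = new IndexedNestedRef(
                new String[] {"user", "address", "zip"},
                new int[] {1, 1, 2});
        System.out.println(resolve(row, zip)); // prints 85281
    }
}
```

Note that `resolve` never reads `fieldNames`, which mirrors the argument that a connector may not know the DDL names at all.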
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Sep 1, 2023 at 8:12 AM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>
>> Gentle ping on the vote for FLIP-356: Support Nested fields filter pushdown <https://www.mail-archive.com/dev@flink.apache.org/msg69289.html>.
>>
>> Regards
>> Venkata krishnan
>>
>> On Tue, Aug 29, 2023 at 9:18 PM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>
>>> Sure, will reference this discussion to resume where we started as part of the flip to refactor SupportsProjectionPushDown.
>>>
>>> On Tue, Aug 29, 2023, 7:22 PM Jark Wu <imj...@gmail.com> wrote:
>>>
>>>> I'm fine with this. `ReferenceExpression` and `SupportsProjectionPushDown` can be another FLIP. However, could you summarize the design of this part in the future part of the FLIP? This can be easier to get started with in the future.
>>>>
>>>> Best,
>>>> Jark
>>>>
>>>> On Wed, 30 Aug 2023 at 02:45, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>
>>>>> Thanks Jark. Sounds good.
>>>>>
>>>>> One more thing, earlier in my summary I mentioned,
>>>>>
>>>>>> Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*) abstract class which will be extended by both *FieldReferenceExpression* and *NestedFieldReferenceExpression* (to be introduced as part of this FLIP)
>>>>>
>>>>> This can be punted for now and can be handled as part of refactoring SupportsProjectionPushDown.
>>>>>
>>>>> Also I made minor changes to the *NestedFieldReferenceExpression*, instead of *fieldIndexArray* we can just do away with *fieldNames* array that includes fieldName at every level for the nested field.
>>>>>
>>>>> Updated the FLIP-356 <https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown> wiki as well.
>>>>>
>>>>> Regards
>>>>> Venkata krishnan
>>>>>
>>>>> On Tue, Aug 29, 2023 at 5:21 AM Jark Wu <imj...@gmail.com> wrote:
>>>>>
>>>>>> Hi Venkata,
>>>>>>
>>>>>> Your summary looks good to me. +1 to start a vote.
>>>>>>
>>>>>> I think we don't need "inputIndex" in NestedFieldReferenceExpression. Actually, I think it is also not needed in FieldReferenceExpression, and we should try to remove it (another topic). The RexInputRef in Calcite also doesn't require an inputIndex because the field index should represent the index of the field in the underlying row type. Field references shouldn't be aware of the number of inputs.
>>>>>>
>>>>>> Best,
>>>>>> Jark
>>>>>>
>>>>>> On Tue, 29 Aug 2023 at 02:24, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>>>
>>>>>>> Hi Jingsong,
>>>>>>>
>>>>>>> Thanks for your comments.
>>>>>>>
>>>>>>>> What is inputIndex in NestedFieldReferenceExpression?
>>>>>>>
>>>>>>> I haven't looked at it before. Do you mean, given that it is now only used to push filters it won't be subsequently used in further planning/optimization and therefore it is not required at this time?
>>>>>>>
>>>>>>>> So if NestedFieldReferenceExpression doesn't need inputIndex, is there a need to introduce a base class `ReferenceExpression`?
>>>>>>>
>>>>>>> For SupportsFilterPushDown itself, the *ReferenceExpression* base class is not needed. But there were discussions around cleaning up and standardizing the API for Supports*PushDown. SupportsProjectionPushDown currently pushes the projections as a 2-d array; it would be better to use the standard API instead, which seems to be the *ResolvedExpression*. For SupportsProjectionPushDown, either FieldReferenceExpression (top level fields) or NestedFieldReferenceExpression (nested fields) is enough. In order to provide a single API that handles both top level and nested fields, ReferenceExpression will be introduced as a base class.
>>>>>>>
>>>>>>> Eventually, *SupportsProjectionPushDown#applyProjections* would evolve as applyProjection(List<ReferenceExpression> projectedFields) and nested fields would be pushed only if *supportsNestedProjections* returns true.
>>>>>>>
>>>>>>> Regards
>>>>>>> Venkata krishnan
>>>>>>>
>>>>>>> On Sun, Aug 27, 2023 at 11:12 PM Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>>>
>>>>>>>> So if NestedFieldReferenceExpression doesn't need inputIndex, is there a need to introduce a base class `ReferenceExpression`?
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Jingsong
>>>>>>>>
>>>>>>>> On Mon, Aug 28, 2023 at 2:09 PM Jingsong Li <jingsongl...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi thanks all for your discussion.
>>>>>>>>>
>>>>>>>>> What is inputIndex in NestedFieldReferenceExpression?
>>>>>>>>>
>>>>>>>>> I know inputIndex has special usage in FieldReferenceExpression, but it is only for Join operators, and it is only for SQL optimization. It looks like there is no requirement for Nested.
>>>>>>>>>
>>>>>>>>> Best,
>>>>>>>>> Jingsong
>>>>>>>>>
>>>>>>>>> On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks for all the feedback and discussion everyone. Looks like we have reached a consensus here.
>>>>>>>>>>
>>>>>>>>>> Just to summarize:
>>>>>>>>>>
>>>>>>>>>> 1. Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*) abstract class which will be extended by both *FieldReferenceExpression* and *NestedFieldReferenceExpression* (to be introduced as part of this FLIP)
>>>>>>>>>> 2. No need of *supportsNestedFilters* check as the current *SupportsFilterPushDown* should already ignore unknown expressions (*NestedFieldReferenceExpression* for example) and return them as *remainingFilters*. Maybe this should be clarified explicitly in the Javadoc of *SupportsFilterPushDown*. I will file a separate JIRA to fix the documentation.
>>>>>>>>>> 3. Refactor *SupportsProjectionPushDown* to use *ReferenceExpression* instead of existing 2-d arrays to consolidate and be consistent with other Supports*PushDown APIs - *outside the scope of this FLIP*
>>>>>>>>>> 4. Similarly *SupportsAggregatePushDown* should also be evolved whenever nested fields support is added to use the *ReferenceExpression* - *outside the scope of this FLIP*
>>>>>>>>>>
>>>>>>>>>> Does this sound good? Please let me know if I have missed anything here. If there are no concerns, I will start a vote tomorrow. I will also get the FLIP-356 wiki updated. Thanks everyone once again!
>>>>>>>>>>
>>>>>>>>>> Regards
>>>>>>>>>> Venkata krishnan
>>>>>>>>>>
>>>>>>>>>> On Thu, Aug 24, 2023 at 8:19 PM Becket Qin <becket....@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>
>>>>>>>>>>>> How about having a separate NestedFieldReferenceExpression, and abstracting a common base class "ReferenceExpression" for NestedFieldReferenceExpression and FieldReferenceExpression? This makes unifying expressions in "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression> ...)" possible.
>>>>>>>>>>>
>>>>>>>>>>> I'd be fine with this. It at least provides a consistent API style / formality.
>>>>>>>>>>>
>>>>>>>>>>> Re: Yunhong,
>>>>>>>>>>>
>>>>>>>>>>>> 3. Finally, I think we need to look at the costs and benefits of unifying the SupportsFilterPushDown and SupportsProjectionPushDown (or others) from the perspective of interface implementers. A stable API can reduce user development and change costs, if the current API can fully meet the functional requirements at the framework level, I personally suggest reducing the impact on connector developers.
>>>>>>>>>>>
>>>>>>>>>>> I agree that the cost and benefit should be measured. And the measurement should be in the long term instead of short term. That is why we always need to align on the ideal end state first.
>>>>>>>>>>> Meeting functionality requirements is the bare minimum bar for an API. Simplicity, intuitiveness, robustness and evolvability are also important.
>>>>>>>>>>> In addition, for projects with many APIs, such as Flink, a consistent API style is also critical for the user adoption as well as bug avoidance. It is very helpful for the community to agree on some API design conventions / principles.
>>>>>>>>>>> For example, in this particular case, via our discussion, hopefully we sort of established the following API design conventions / principles for all the Supports*PushDown interfaces.
>>>>>>>>>>>
>>>>>>>>>>> 1. By default, expressions should be used if applicable instead of other representations.
>>>>>>>>>>> 2. In general, the pushdown method should not assume all the pushdowns will succeed. So the applyX() method should return a boolean or List<X>, to handle the cases that some of the pushdowns cannot be fulfilled by the implementation.
>>>>>>>>>>>
>>>>>>>>>>> Establishing such conventions and principles demands careful thinking for the aspects I mentioned earlier in addition to the API functionalities. This helps lower the bar of understanding, reduces the chance of having loose ends in the API, and will benefit all the participants in the project over time. I think this is the right way to achieve real API stability. Otherwise, we may end up chasing our tails to find ways not to change the existing non-ideal APIs.
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>>
>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 25, 2023 at 9:33 AM yh z <zhengyunhon...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi, Venkat,
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for the FLIP, it sounds good to support nested fields filter pushdown. Based on the design of the FLIP and the above options, I would like to make a few suggestions:
>>>>>>>>>>>>
>>>>>>>>>>>> 1. At present, introducing NestedFieldReferenceExpression looks like a better solution, which can fully meet our requirements while reducing modifications to base class FieldReferenceExpression. In the long run, I tend to abstract a basic class for NestedFieldReferenceExpression and FieldReferenceExpression as you suggested.
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Personally, I don't recommend introducing *supportsNestedFilters()* in supportsFilterPushdown. We just need to better declare the return value of the method *applyFilters*.
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Finally, I think we need to look at the costs and benefits of unifying the SupportsFilterPushDown and SupportsProjectionPushDown (or others) from the perspective of interface implementers. A stable API can reduce user development and change costs, if the current API can fully meet the functional requirements at the framework level, I personally suggest reducing the impact on connector developers.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Yunhong Zheng (Swuferhong)
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Aug 25, 2023 at 01:25, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> To keep it backwards compatible, introduce another API *applyAggregates* with *List<ReferenceExpression>* when nested field support is added and deprecate the current API. This will by default throw an exception. In the Flink planner, first try *applyAggregates* with nested fields and, if it throws an exception, fall back to *applyAggregates* without nested fields.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Venkata krishnan
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Aug 24, 2023 at 10:13 AM Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Jark,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about having a separate NestedFieldReferenceExpression, and abstracting a common base class "ReferenceExpression" for NestedFieldReferenceExpression and FieldReferenceExpression? This makes unifying expressions in "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression> ...)" possible.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This should be fine for *SupportsProjectionPushDown* and *SupportsFilterPushDown*. One concern in the case of *SupportsAggregatePushDown* with nested fields support (to be added in the future), with this proposal, the API will become backwards incompatible as the *args* for the aggregate function is *List<FieldReferenceExpression>* that needs to change to *List<ReferenceExpression>*.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>> Venkata krishnan
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Aug 24, 2023 at 1:18 AM Jark Wu <imj...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Becket,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think it is the second case, that a FieldReferenceExpression is constructed by the framework and passed to the connector (interfaces listed by Venkata[1] and Catalog#listPartitionsByFilter). Besides, understanding the nested field is optional for users/connectors (just treat it as an unknown expression if the connector doesn't want to support it).
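The "treat it as an unknown expression" behavior can be sketched as follows. This is a simplified illustration, not Flink's real API: `RefExpr`, `TopLevelRef`, `NestedRef`, `GreaterThanFilter`, `PushdownResult` and `LegacyConnectorSketch` are hypothetical stand-ins for the reference expressions, a filter predicate, and the accepted/remaining result discussed in this thread. A connector written before nested references existed only recognizes top-level references, so any filter on a nested reference is handed back to the planner untouched.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified stand-ins; these are not Flink's real classes.
interface RefExpr {}

final class TopLevelRef implements RefExpr {
    final String name;
    final int index;
    TopLevelRef(String name, int index) { this.name = name; this.index = index; }
}

final class NestedRef implements RefExpr {
    final String[] names;
    final int[] indices;
    NestedRef(String[] names, int[] indices) { this.names = names; this.indices = indices; }
}

// A single predicate shape, "ref > literal", is enough for the illustration.
final class GreaterThanFilter {
    final RefExpr ref;
    final long literal;
    GreaterThanFilter(RefExpr ref, long literal) { this.ref = ref; this.literal = literal; }
}

final class PushdownResult {
    final List<GreaterThanFilter> accepted = new ArrayList<>();
    final List<GreaterThanFilter> remaining = new ArrayList<>();
}

final class LegacyConnectorSketch {
    // Accepts only what it understands; everything else stays with the planner.
    static PushdownResult applyFilters(List<GreaterThanFilter> filters) {
        PushdownResult result = new PushdownResult();
        for (GreaterThanFilter filter : filters) {
            if (filter.ref instanceof TopLevelRef) {
                result.accepted.add(filter);
            } else {
                result.remaining.add(filter); // unknown to this connector
            }
        }
        return result;
    }

    public static void main(String[] args) {
        GreaterThanFilter onId = new GreaterThanFilter(new TopLevelRef("id", 0), 10L);
        GreaterThanFilter onNested = new GreaterThanFilter(
                new NestedRef(new String[] {"col", "nested"}, new int[] {1, 0}), 10L);
        PushdownResult result = applyFilters(Arrays.asList(onId, onNested));
        System.out.println(result.accepted.size() + " accepted, "
                + result.remaining.size() + " remaining");
    }
}
```

Because the nested reference is a distinct type in this sketch, the connector cannot mistake it for a top-level column, which is the migration property being argued for.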
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we extend FieldReferenceExpression, in the case of "where col.nested > 10", connectors that already support filter/delete pushdown may wrongly push down "col > 10" instead of "nested > 10" because they still treat FieldReferenceExpression as a top-level column. This problem can be resolved by introducing an additional "supportedNestedPushdown" for each interface, but that method is not elegant and is hard to remove in the future, and this could be avoided if we have a separate NestedFieldReferenceExpression.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we want to extend FieldReferenceExpression, we have to add protections for every related API in one shot. Besides, FieldReferenceExpression is a fundamental class in the planner, so we have to go through all the code that is using it to make sure it properly handles a nested field, which is a big effort for the community.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If we were designing this API on day 1, I fully support merging them in a FieldReferenceExpression. But in this case, I'm thinking about how to provide users with a smooth migration path, and allow the community to gradually put efforts into evolving the API, and not block the "Nested Fields Filter Pushdown" requirement.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> How about having a separate NestedFieldReferenceExpression, and abstracting a common base class "ReferenceExpression" for NestedFieldReferenceExpression and FieldReferenceExpression? This makes unifying expressions in "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression> ...)" possible.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Jark
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, 24 Aug 2023 at 07:00, Venkatakrishnan Sowrirajan <vsowr...@asu.edu> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Becket and Jark,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Deprecate all the other methods except tryApplyFilters() and tryApplyProjections().
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For *SupportsProjectionPushDown*, we still need a *supportsNestedProjections* API on the table source as some of the table sources might not be able to handle nested fields and therefore the Flink planner should not push down the nested projections or else the *applyProjection* API has to be appropriately changed to return *unconvertibleProjections* similar to *SupportsFilterPushDown*.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Or we have to introduce two different applyProjections() methods for FieldReferenceExpression / NestedFieldReferenceExpression respectively.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Agree this is not preferred. Given that *supportNestedProjections* cannot be deprecated/removed based on the current API form, extending *FieldReferenceExpression* to support nested fields should be okay.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Another alternative could be to change *applyProjections* to take List<ResolvedExpression> and on the connector side they choose to handle *FieldReferenceExpression* and *NestedFieldReferenceExpression* as applicable and return the remainingProjections. In the case of nested field projections not supported, it should return them back but only projecting the top level fields. IMO, this is also *not preferred*.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *SupportsAggregatePushDown*
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> *AggregateExpression* currently takes in a list of *FieldReferenceExpression* as args for the aggregate function, if in future *SupportsAggregatePushDown* adds support for aggregate pushdown on nested fields then the AggregateExpression API also has to change if a new NestedFieldReferenceExpression is introduced for nested fields.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If we add a flag for each new filter, the interface will be filled with lots of flags (e.g., supportsBetween, supportsIN)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In an ideal situation, I completely agree with you. But in the current state, *supportsNestedFilters* can act as a bridge to reach the eventual desired state which is to have a clean and consistent set of APIs throughout all Supports*PushDown.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also shared some thoughts on the end state API <https://docs.google.com/document/d/1stLRPKOcxlEv8eHblkrOh0Zf5PLM-h76WMhEINHOyPY/edit?usp=sharing> with extension to the *FieldReferenceExpression* to support nested fields. Please take a look.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Regards
>>>>>>>>>>>>>>>> Venkata krishnan
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Aug 22, 2023 at 5:02 PM Becket Qin <becket....@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Hi Jark,
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Regarding the migration path, it would be useful to scrutinize the use case of FieldReferenceExpression and ResolvedExpressions. There are two kinds of use cases:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. A ResolvedExpression is constructed by the user or connector / plugin developers.
>>>>>>>>>>>>>>>>> 2. A ResolvedExpression is constructed by the framework and passed to user or connector / plugin developers.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the first case, both of the approaches provide the same migration experience.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For the second case, generally speaking, introducing NestedFieldReferenceExpression and extending FieldReferenceExpression would have the same impact for backwards compatibility. SupportsFilterPushDown is a special case here because understanding the filter expressions is optional for the source implementation. In other use cases, if understanding the reference to a nested field is a must have, the user code has to be changed, regardless of which approach we take to support nested fields.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Therefore, I think we have to check each public API where the nested field reference is exposed. If we have many public APIs where understanding nested fields is optional for the user / plugin / connector developers, having a separate NestedFieldReferenceExpression would give a smoother migration. Otherwise, there seems to be no difference between the two approaches.
> > >> > > > > > > > > > >> > > > > >> > > > > > > > > > >> > > Migration path aside, the main reason I > prefer > > >> > > extending > > >> > > > > > > > > > >> > > FieldReferenceExpression over a new > > >> > > > > > > > NestedFieldReferenceExpression > > >> > > > > > > > > > is > > >> > > > > > > > > > >> > > because this makes the > > SupportsProjectionPushDown > > >> > > > > interface > > >> > > > > > > > > simpler. > > >> > > > > > > > > > >> > > Otherwise, we have to treat it as a special > > case > > >> > that > > >> > > > > does not > > >> > > > > > > > > match > > >> > > > > > > > > > >> the > > >> > > > > > > > > > >> > > overall API style. Or we have to introduce > two > > >> > > different > > >> > > > > > > > > > >> > applyProjections() > > >> > > > > > > > > > >> > > methods for FieldReferenceExpression / > > >> > > > > > > > > > NestedFieldReferenceExpression > > >> > > > > > > > > > >> > > respectively. This issue further extends to > > >> > > > > implementation in > > >> > > > > > > > > > >> addition to > > >> > > > > > > > > > >> > > public API. A single FieldReferenceExpression > > >> might > > >> > > help > > >> > > > > > > > simplify > > >> > > > > > > > > > the > > >> > > > > > > > > > >> > > implementation code a little bit. For > example, > > >> in a > > >> > > > > recursive > > >> > > > > > > > > > >> processing > > >> > > > > > > > > > >> > of > > >> > > > > > > > > > >> > > a row with nested rows, we may not need to > > switch > > >> > > > between > > >> > > > > > > > > > >> > > FieldReferenceExpression and > > >> > > > > NestedFieldReferenceExpression > > >> > > > > > > > > > depending > > >> > > > > > > > > > >> on > > >> > > > > > > > > > >> > > whether the record being processed is a top > > level > > >> > > record > > >> > > > > or > > >> > > > > > > > nested > > >> > > > > > > > > > >> > record. 
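
The point about recursive processing can be illustrated with a minimal Java sketch. It assumes rows are modelled as plain Object[] arrays and that a field reference is just an int[] path of positions; NestedRowAccess and resolve are hypothetical names for illustration, not Flink classes.

```java
// Hypothetical helper, not part of Flink: rows are modelled as Object[]
// and nested rows as Object[] values inside their parent row.
final class NestedRowAccess {
    private NestedRowAccess() {}

    // Walks the index path one level at a time. A top-level column is
    // simply a path of length 1, so no separate code branch is needed.
    static Object resolve(Object[] row, int[] fieldIndices) {
        Object current = row;
        for (int index : fieldIndices) {
            current = ((Object[]) current)[index];
        }
        return current;
    }
}
```

With a single path-based reference, the same loop serves both top-level and nested fields, which is the simplification described above.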

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 22, 2023 at 11:43 PM Jark Wu <imj...@gmail.com> wrote:

Hi Becket,

I totally agree we should try to have a consistent API for a final
state. The only concern I have mentioned is the "smooth" migration path.
FieldReferenceExpression is widely used in many public APIs, not only
in SupportsFilterPushDown. Yes, we can change every method in two
steps, but is it good to change the API back and forth for this?
Personally, I'm fine with a separate NestedFieldReferenceExpression
class. TBH, I prefer the separated way because it makes the reference
expression clearer and more concise.

Best,
Jark

On Tue, 22 Aug 2023 at 16:53, Becket Qin <becket....@gmail.com> wrote:

Thanks for the reply, Jark.

I think it will be helpful to understand the final state we want to
eventually achieve first, then we can discuss the steps towards that
final state.

It looks like there are two proposed end states now:

1. Have a separate NestedFieldReferenceExpression class; keep
SupportsFilterPushDown and SupportsProjectionPushDown the same. It is
just a one step change.
   - Regarding the supportsNestedFilterPushDown() method, if our
     contract with the connector developer today is "The implementation
     should ignore unrecognized expressions by putting them into the
     remaining filters, instead of throwing exceptions", then there is
     no need for this method. I am not sure about the current contract.
     We should probably make it clear in the interface Java doc.

2. Extend the existing FieldReferenceExpression class to support nested
fields; SupportsFilterPushDown only has one method of
applyFilters(List<ResolvedExpression>); SupportsProjectionPushDown only
has one method of applyProjections(List<FieldReferenceExpression>,
DataType). It could just be two steps if we are not too obsessed with
the exact names of "applyFilters" and "applyProjections". More
specifically, it takes two steps to achieve this final state:
   a. introduce a new method tryApplyFilters(List<ResolvedExpression>)
      to SupportsFilterPushDown, which may have
      FieldReferenceExpression with nested fields. The default
      implementation throws an exception. The runtime will first call
      tryApplyFilters() with nested fields. In case of exception, it
      calls the existing applyFilters() without including the nested
      filters. Similarly, in SupportsProjectionPushDown, introduce a
      tryApplyProjections(List<NestedFieldReference>) method returning
      a Result. The Result also contains the accepted and inapplicable
      projections. The default implementation also throws an exception.
      Deprecate all the other methods except tryApplyFilters() and
      tryApplyProjections().
   b. remove the deprecated methods in the next major version bump.

Now the question is, putting the migration steps aside, which end state
do we prefer? While the first end state is acceptable for me,
personally, I prefer the latter if we are designing from scratch. It is
clean, consistent and intuitive. Given the size of Flink, keeping APIs
in the same style over time is important. The migration is also not
that complicated.
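
Step (a) of the second end state can be sketched roughly as follows. This is a self-contained illustration under stated assumptions, not Flink's actual API: FilterExpr, FilterResult, FilterPushDownSource and PushDownPlanner are hypothetical stand-ins, and a filter is reduced to the index path of the field it references.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a filter on one field; a path longer than
// one index means the filter references a nested field.
final class FilterExpr {
    final int[] fieldIndices;
    FilterExpr(int... fieldIndices) { this.fieldIndices = fieldIndices; }
    boolean isNested() { return fieldIndices.length > 1; }
}

// Hypothetical result type: what the source accepted and what remains.
final class FilterResult {
    final List<FilterExpr> accepted;
    final List<FilterExpr> remaining;
    FilterResult(List<FilterExpr> accepted, List<FilterExpr> remaining) {
        this.accepted = accepted;
        this.remaining = remaining;
    }
}

interface FilterPushDownSource {
    // Existing method: only ever receives top-level filters.
    FilterResult applyFilters(List<FilterExpr> filters);

    // New method from step (a): the default throws, so sources that
    // were written before nested filters existed keep working.
    default FilterResult tryApplyFilters(List<FilterExpr> filters) {
        throw new UnsupportedOperationException("nested filters not supported");
    }
}

final class PushDownPlanner {
    private PushDownPlanner() {}

    // Try the new method first; on failure, fall back to the existing
    // method without the nested filters and keep them as remaining.
    static FilterResult pushDown(FilterPushDownSource source, List<FilterExpr> filters) {
        try {
            return source.tryApplyFilters(filters);
        } catch (UnsupportedOperationException e) {
            List<FilterExpr> topLevel = new ArrayList<>();
            List<FilterExpr> nested = new ArrayList<>();
            for (FilterExpr f : filters) {
                (f.isNested() ? nested : topLevel).add(f);
            }
            FilterResult legacy = source.applyFilters(topLevel);
            List<FilterExpr> remaining = new ArrayList<>(legacy.remaining);
            remaining.addAll(nested);
            return new FilterResult(legacy.accepted, remaining);
        }
    }
}
```

A source that only implements applyFilters() never sees a nested filter in this sketch; the nested filters end up in the remaining list and would still be evaluated after the scan.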

Thanks,

Jiangjie (Becket) Qin

On Tue, Aug 22, 2023 at 2:23 PM Jark Wu <imj...@gmail.com> wrote:

Hi Venkat,

Thanks for the proposal.

I have some minor comments about the FLIP.

1. I think we don't need to add the
SupportsFilterPushDown#supportsNestedFilters() method, because
connectors can skip nested filters by putting them in
Result#remainingFilters(). And this is backward-compatible because
unknown expressions were added to the remaining filters. The planner
should push down as many predicate expressions as possible. If we add a
flag for each new filter, the interface will be filled with lots of
flags (e.g., supportsBetween, supportsIN).

2. NestedFieldReferenceExpression#nestedFieldName should be an array of
field names? Each string represents a field name part of the field
path. Just keep aligning with `nestedFieldIndexArray`.

3. My concern about making FieldReferenceExpression support nested
fields is the compatibility. It is a public API and users/connectors
are already using it. People assumed it is a top-level column
reference, and applied logic on it. But that's not true now and this
may lead to unexpected errors.
Having a separate NestedFieldReferenceExpression sounds safer to me.
Mixing them in a class may confuse users about the meaning of
getFieldName() and getFieldIndex().

Regarding using NestedFieldReferenceExpression in
SupportsProjectionPushDown, do you have any concerns @Timo Walther
<twal...@apache.org>?

Best,
Jark

On Tue, 22 Aug 2023 at 05:55, Venkatakrishnan Sowrirajan
<vsowr...@asu.edu> wrote:

Sounds like a great suggestion, Becket. +1. Agree with cleaning up the
APIs and making them consistent across all the pushdown APIs.

Your suggested approach seems fine to me, unless anyone else has any
other concerns. Just have a couple of clarifying questions:

1. Do you think we should standardize the APIs across all the pushdown
supports like SupportsPartitionPushdown, SupportsDynamicFiltering etc.
in the end state?

> The current proposal works if we do not want to migrate
> SupportsFilterPushdown to also use NestedFieldReferenceExpression in
> the long term.

Did you mean *FieldReferenceExpression* instead of
*NestedFieldReferenceExpression*?

> 2. Extend the FieldReferenceExpression to support nested fields.
>    - Change the index field type from int to int[].
>    - Add a new method int[] getFieldIndexArray().
>    - Deprecate the int getFieldIndex() method, the code will be
>      removed in the next major version bump.

I assume getFieldIndex would return fieldIndexArray[0], right?

Thanks
Venkat

On Fri, Aug 18, 2023 at 4:47 PM Becket Qin <becket....@gmail.com> wrote:

> Thanks for the proposal, Venkata.
>
> The current proposal works if we do not want to migrate
> SupportsFilterPushdown to also use NestedFieldReferenceExpression in
> the long term.

Did you mean *FieldReferenceExpression* instead of
*NestedFieldReferenceExpression*?

> Otherwise, the alternative solution briefly mentioned in the rejected
> alternatives would be the following:
>
> Phase 1:
> 1. Introduce a supportsNestedFilters() method to the
>    SupportsFilterPushdown interface. (same as current proposal).
> 2. Extend the FieldReferenceExpression to support nested fields.
>    - Change the index field type from int to int[].
>    - Add a new method int[] getFieldIndexArray().
>    - Deprecate the int getFieldIndex() method, the code will be
>      removed in the next major version bump.
> 3. In the SupportsProjectionPushDown interface
>    - add a new method applyProjection(List<FieldReferenceExpression>,
>      DataType), with default implementation invoking
>      applyProjection(int[][], DataType)
>    - deprecate the current applyProjection(int[][], DataType) method
>
> Phase 2 (in the next major version bump)
> 1. remove the deprecated methods.
>
> Phase 3 (optional)
> 1. deprecate and remove the supportsNestedFilters() /
>    supportsNestedProjection() methods from the SupportsFilterPushDown /
>    SupportsProjectionPushDown interfaces.
>
> Personally I prefer this alternative. It takes longer to finish the
> work, but the API eventually becomes clean and consistent. But I can
> live with the current proposal.
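
Phase 1, items 2 and 3, could look roughly like the Java sketch below. FieldRef and ProjectionSource are hypothetical stand-ins rather than Flink's real classes, the DataType parameter is left out to keep the example self-contained, and having getFieldIndex() return the first element of the index array is only the assumption raised as a question in this thread, not a confirmed design.

```java
import java.util.List;

// Hypothetical stand-in for a field reference whose position is a path
// of indices (one per nesting level) instead of a single int.
final class FieldRef {
    private final String name;
    private final int[] fieldIndexArray;

    FieldRef(String name, int... fieldIndexArray) {
        if (fieldIndexArray.length == 0) {
            throw new IllegalArgumentException("at least one index is required");
        }
        this.name = name;
        this.fieldIndexArray = fieldIndexArray.clone();
    }

    String getName() { return name; }

    int[] getFieldIndexArray() { return fieldIndexArray.clone(); }

    // Assumption: the deprecated accessor returns the top-level index.
    @Deprecated
    int getFieldIndex() { return fieldIndexArray[0]; }
}

interface ProjectionSource {
    // Existing style of method, to be deprecated in phase 1.
    @Deprecated
    void applyProjection(int[][] projectedFields);

    // New method: the default implementation converts the references to
    // index paths and delegates to the existing method, so sources that
    // only implement the old method keep working.
    default void applyProjection(List<FieldRef> projectedFields) {
        int[][] paths = new int[projectedFields.size()][];
        for (int i = 0; i < paths.length; i++) {
            paths[i] = projectedFields.get(i).getFieldIndexArray();
        }
        applyProjection(paths);
    }
}
```

In this sketch an existing source needs no change during phase 1; once the deprecated int[][] method is removed in phase 2, implementing the list-based method becomes mandatory.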

> Thanks,
>
> Jiangjie (Becket) Qin

On Sat, Aug 19, 2023 at 12:09 AM Venkatakrishnan Sowrirajan
<vsowr...@asu.edu> wrote:

Gentle ping for reviews/feedback.

On Tue, Aug 15, 2023, 5:37 PM Venkatakrishnan Sowrirajan
<vsowr...@asu.edu> wrote:

Hi All,

I am opening this thread to discuss FLIP-356: Support Nested Fields
Filter Pushdown. The FLIP can be found at
https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!clxXJwshKpn559SAkQiieqgGe0ZduXCzUKCmYLtFIbQLmrmEEgdmuEIM8ZM1M3O_uGqOploU4ailqGpukAg$

This FLIP adds support for pushing down nested fields filters to the
underlying TableSource. In our data lake, we find a lot of datasets
have nested fields and also user queries with filters defined on the
nested fields. This would drastically improve the performance for those
sets of queries.

Appreciate any comments or feedback you may have on this proposal.

Regards
Venkata krishnan