Based on an offline discussion with Becket Qin, I added *fieldIndices
*back which
is the field index of the nested field at every level to the
*NestedFieldReferenceExpression
*in FLIP-356
<https://cwiki.apache.org/confluence/display/FLINK/FLIP-356%3A+Support+Nested+Fields+Filter+Pushdown>
*. *2 reasons to do it:

1. Agree with using *fieldIndices *as the only contract to refer to the
column from the underlying datasource.
2. To keep it consistent with *FieldReferenceExpression*

Having said that, I see that with *projection pushdown, *index of the
fields are used whereas with *filter pushdown (*based on scanning few
tablesources) *FieldReferenceExpression*'s name is used for eg: even in the
Flink's *FileSystemTableSource, IcebergSource, JDBCDatsource*. This way, I
feel the contract is not quite clear and explicit. Wanted to understand
other's thoughts as well.

Regards
Venkata krishnan


On Tue, Sep 5, 2023 at 5:34 PM Becket Qin <becket....@gmail.com> wrote:

> Hi Venkata,
>
>
> > Also I made minor changes to the *NestedFieldReferenceExpression,
> *instead
> > of *fieldIndexArray* we can just do away with *fieldNames *array that
> > includes fieldName at every level for the nested field.
>
>
> I don't think keeping only the field names array would work. At the end of
> the day, the contract between Flink SQL and the connectors is based on the
> indexes, not the names. Technically speaking, the connectors only emit a
> bunch of RowData which is based on positions. The field names are added by
> the SQL framework via the DDL for those RowData. In this sense, the
> connectors may not be aware of the field names in Flink DDL at all. The
> common language between Flink SQL and source is just positions. This is
> also why ProjectionPushDown would work by only relying on the indexes, not
> the field names. So I think the field index array is a must have here in
> the NestedFieldReferenceExpression.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Fri, Sep 1, 2023 at 8:12 AM Venkatakrishnan Sowrirajan <
> vsowr...@asu.edu>
> wrote:
>
> > Gentle ping on the vote for FLIP-356: Support Nested fields filter
> pushdown
> > <
> https://urldefense.com/v3/__https://www.mail-archive.com/dev@flink.apache.org/msg69289.html__;!!IKRxdwAv5BmarQ!bOW26WlafOQQcb32eWtUiXBAl0cTCK1C6iYhDI2f_z__eczudAWmTRvjDiZg6gzlXmPXrDV4KJS5cFxagFE$
> >.
> >
> > Regards
> > Venkata krishnan
> >
> >
> > On Tue, Aug 29, 2023 at 9:18 PM Venkatakrishnan Sowrirajan <
> > vsowr...@asu.edu>
> > wrote:
> >
> > > Sure, will reference this discussion to resume where we started as part
> > of
> > > the flip to refactor SupportsProjectionPushDown.
> > >
> > > On Tue, Aug 29, 2023, 7:22 PM Jark Wu <imj...@gmail.com> wrote:
> > >
> > >> I'm fine with this. `ReferenceExpression` and
> > `SupportsProjectionPushDown`
> > >> can be another FLIP. However, could you summarize the design of this
> > part
> > >> in the future part of the FLIP? This can be easier to get started with
> > in
> > >> the future.
> > >>
> > >>
> > >> Best,
> > >> Jark
> > >>
> > >>
> > >> On Wed, 30 Aug 2023 at 02:45, Venkatakrishnan Sowrirajan <
> > >> vsowr...@asu.edu>
> > >> wrote:
> > >>
> > >> > Thanks Jark. Sounds good.
> > >> >
> > >> > One more thing, earlier in my summary I mentioned,
> > >> >
> > >> > Introduce a new *ReferenceExpression* (or *BaseReferenceExpression*)
> > >> > > abstract class which will be extended by both
> > >> *FieldReferenceExpression*
> > >> > >  and *NestedFieldReferenceExpression* (to be introduced as part of
> > >> this
> > >> > > FLIP)
> > >> >
> > >> > This can be punted for now and can be handled as part of refactoring
> > >> > SupportsProjectionPushDown.
> > >> >
> > >> > Also I made minor changes to the *NestedFieldReferenceExpression,
> > >> *instead
> > >> > of *fieldIndexArray* we can just do away with *fieldNames *array
> that
> > >> > includes fieldName at every level for the nested field.
> > >> >
> > >> > Updated the FLIP-357
> > >> > <
> > >> >
> > >>
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!YAk6kV4CYvUSPfpoUDQRs6VlbmJXVX8KOKqFxKbNDkUWKzShvwpkLRGkAV1tgV3EqClNrjGS-Ij86Q$
> > >> > >
> > >> > wiki as well.
> > >> >
> > >> > Regards
> > >> > Venkata krishnan
> > >> >
> > >> >
> > >> > On Tue, Aug 29, 2023 at 5:21 AM Jark Wu <imj...@gmail.com> wrote:
> > >> >
> > >> > > Hi Venkata,
> > >> > >
> > >> > > Your summary looks good to me. +1 to start a vote.
> > >> > >
> > >> > > I think we don't need "inputIndex" in
> > NestedFieldReferenceExpression.
> > >> > > Actually, I think it is also not needed in
> FieldReferenceExpression,
> > >> > > and we should try to remove it (another topic). The RexInputRef in
> > >> > Calcite
> > >> > > also doesn't require an inputIndex because the field index should
> > >> > represent
> > >> > > index of the field in the underlying row type. Field references
> > >> shouldn't
> > >> > > be
> > >> > >  aware of the number of inputs.
> > >> > >
> > >> > > Best,
> > >> > > Jark
> > >> > >
> > >> > >
> > >> > > On Tue, 29 Aug 2023 at 02:24, Venkatakrishnan Sowrirajan <
> > >> > vsowr...@asu.edu
> > >> > > >
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Jinsong,
> > >> > > >
> > >> > > > Thanks for your comments.
> > >> > > >
> > >> > > > What is inputIndex in NestedFieldReferenceExpression?
> > >> > > >
> > >> > > >
> > >> > > > I haven't looked at it before. Do you mean, given that it is now
> > >> only
> > >> > > used
> > >> > > > to push filters it won't be subsequently used in further
> > >> > > > planning/optimization and therefore it is not required at this
> > time?
> > >> > > >
> > >> > > > So if NestedFieldReferenceExpression doesn't need inputIndex, is
> > >> there
> > >> > > > > a need to introduce a base class `ReferenceExpression`?
> > >> > > >
> > >> > > > For SupportsFilterPushDown itself, *ReferenceExpression* base
> > class
> > >> is
> > >> > > not
> > >> > > > needed. But there were discussions around cleaning up and
> > >> standardizing
> > >> > > the
> > >> > > > API for Supports*PushDown. SupportsProjectionPushDown currently
> > >> pushes
> > >> > > the
> > >> > > > projects as a 2-d array, instead it would be better to use the
> > >> standard
> > >> > > API
> > >> > > > which seems to be the *ResolvedExpression*. For
> > >> > > SupportsProjectionPushDown
> > >> > > > either FieldReferenceExpression (top level fields) or
> > >> > > > NestedFieldReferenceExpression (nested fields) is enough, in
> order
> > >> to
> > >> > > > provide a single API that handles both top level and nested
> > fields,
> > >> > > > ReferenceExpression will be introduced as a base class.
> > >> > > >
> > >> > > > Eventually, *SupportsProjectionPushDown#applyProjections* would
> > >> evolve
> > >> > as
> > >> > > > applyProjection(List<ReferenceExpression> projectedFields) and
> > >> nested
> > >> > > > fields would be pushed only if *supportsNestedProjections*
> returns
> > >> > true.
> > >> > > >
> > >> > > > Regards
> > >> > > > Venkata krishnan
> > >> > > >
> > >> > > >
> > >> > > > On Sun, Aug 27, 2023 at 11:12 PM Jingsong Li <
> > >> jingsongl...@gmail.com>
> > >> > > > wrote:
> > >> > > >
> > >> > > > > So if NestedFieldReferenceExpression doesn't need inputIndex,
> is
> > >> > there
> > >> > > > > a need to introduce a base class `ReferenceExpression`?
> > >> > > > >
> > >> > > > > Best,
> > >> > > > > Jingsong
> > >> > > > >
> > >> > > > > On Mon, Aug 28, 2023 at 2:09 PM Jingsong Li <
> > >> jingsongl...@gmail.com>
> > >> > > > > wrote:
> > >> > > > > >
> > >> > > > > > Hi thanks all for your discussion.
> > >> > > > > >
> > >> > > > > > What is inputIndex in NestedFieldReferenceExpression?
> > >> > > > > >
> > >> > > > > > I know inputIndex has special usage in
> > FieldReferenceExpression,
> > >> > but
> > >> > > > > > it is only for Join operators, and it is only for SQL
> > >> optimization.
> > >> > > It
> > >> > > > > > looks like there is no requirement for Nested.
> > >> > > > > >
> > >> > > > > > Best,
> > >> > > > > > Jingsong
> > >> > > > > >
> > >> > > > > > On Mon, Aug 28, 2023 at 1:13 PM Venkatakrishnan Sowrirajan
> > >> > > > > > <vsowr...@asu.edu> wrote:
> > >> > > > > > >
> > >> > > > > > > Thanks for all the feedback and discussion everyone. Looks
> > >> like
> > >> > we
> > >> > > > have
> > >> > > > > > > reached a consensus here.
> > >> > > > > > >
> > >> > > > > > > Just to summarize:
> > >> > > > > > >
> > >> > > > > > > 1. Introduce a new *ReferenceExpression* (or
> > >> > > > *BaseReferenceExpression*)
> > >> > > > > > > abstract class which will be extended by both
> > >> > > > > *FieldReferenceExpression*
> > >> > > > > > > and *NestedFieldReferenceExpression* (to be introduced as
> > >> part of
> > >> > > > this
> > >> > > > > FLIP)
> > >> > > > > > > 2. No need of *supportsNestedFilters *check as the current
> > >> > > > > > > *SupportsFilterPushDown* should already ignore unknown
> > >> > expressions
> > >> > > (
> > >> > > > > > > *NestedFieldReferenceExpression* for example) and return
> > them
> > >> as
> > >> > > > > > > *remainingFilters.
> > >> > > > > > > *Maybe this should be clarified explicitly in the Javadoc
> of
> > >> > > > > > > *SupportsFilterPushDown.
> > >> > > > > > > *I will file a separate JIRA to fix the documentation.
> > >> > > > > > > 3. Refactor *SupportsProjectionPushDown* to use
> > >> > > *ReferenceExpression
> > >> > > > > *instead
> > >> > > > > > > of existing 2-d arrays to consolidate and be consistent
> with
> > >> > other
> > >> > > > > > > Supports*PushDown APIs - *outside the scope of this FLIP*
> > >> > > > > > > 4. Similarly *SupportsAggregatePushDown* should also be
> > >> evolved
> > >> > > > > whenever
> > >> > > > > > > nested fields support is added to use the
> > >> *ReferenceExpression -
> > >> > > > > **outside
> > >> > > > > > > the scope of this FLIP*
> > >> > > > > > >
> > >> > > > > > > Does this sound good? Please let me know if I have missed
> > >> > anything
> > >> > > > > here. If
> > >> > > > > > > there are no concerns, I will start a vote tomorrow. I
> will
> > >> also
> > >> > > get
> > >> > > > > the
> > >> > > > > > > FLIP-356 wiki updated. Thanks everyone once again!
> > >> > > > > > >
> > >> > > > > > > Regards
> > >> > > > > > > Venkata krishnan
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Thu, Aug 24, 2023 at 8:19 PM Becket Qin <
> > >> becket....@gmail.com
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hi Jark,
> > >> > > > > > > >
> > >> > > > > > > > How about having a separate
> > NestedFieldReferenceExpression,
> > >> and
> > >> > > > > > > > > abstracting a common base class "ReferenceExpression"
> > for
> > >> > > > > > > > > NestedFieldReferenceExpression and
> > >> FieldReferenceExpression?
> > >> > > This
> > >> > > > > makes
> > >> > > > > > > > > unifying expressions in
> > >> > > > > > > > >
> > >> > > > >
> > >> >
> "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > >> > > > > > > > > ...)"
> > >> > > > > > > > > possible.
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > I'd be fine with this. It at least provides a consistent
> > API
> > >> > > style
> > >> > > > /
> > >> > > > > > > > formality.
> > >> > > > > > > >
> > >> > > > > > > >  Re: Yunhong,
> > >> > > > > > > >
> > >> > > > > > > > 3. Finally, I think we need to look at the costs and
> > >> benefits
> > >> > of
> > >> > > > > unifying
> > >> > > > > > > > > the SupportsFilterPushDown and
> > SupportsProjectionPushDown
> > >> (or
> > >> > > > > others)
> > >> > > > > > > > from
> > >> > > > > > > > > the perspective of interface implementers. A stable
> API
> > >> can
> > >> > > > reduce
> > >> > > > > user
> > >> > > > > > > > > development and change costs, if the current API can
> > fully
> > >> > meet
> > >> > > > the
> > >> > > > > > > > > functional requirements at the framework level, I
> > personal
> > >> > > > suggest
> > >> > > > > > > > reducing
> > >> > > > > > > > > the impact on connector developers.
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > I agree that the cost and benefit should be measured.
> And
> > >> the
> > >> > > > > measurement
> > >> > > > > > > > should be in the long term instead of short term. That
> is
> > >> why
> > >> > we
> > >> > > > > always
> > >> > > > > > > > need to align on the ideal end state first.
> > >> > > > > > > > Meeting functionality requirements is the bare minimum
> bar
> > >> for
> > >> > an
> > >> > > > > API.
> > >> > > > > > > > Simplicity, intuitiveness, robustness and evolvability
> are
> > >> also
> > >> > > > > important.
> > >> > > > > > > > In addition, for projects with many APIs, such as
> Flink, a
> > >> > > > > consistent API
> > >> > > > > > > > style is also critical for the user adoption as well as
> > bug
> > >> > > > > avoidance. It
> > >> > > > > > > > is very helpful for the community to agree on some API
> > >> design
> > >> > > > > conventions /
> > >> > > > > > > > principles.
> > >> > > > > > > > For example, in this particular case, via our
> discussion,
> > >> > > hopefully
> > >> > > > > we sort
> > >> > > > > > > > of established the following API design conventions /
> > >> > principles
> > >> > > > for
> > >> > > > > all
> > >> > > > > > > > the Supports*PushDown interfaces.
> > >> > > > > > > >
> > >> > > > > > > > 1. By default, expressions should be used if applicable
> > >> instead
> > >> > > of
> > >> > > > > other
> > >> > > > > > > > representations.
> > >> > > > > > > > 2. In general, the pushdown method should not assume all
> > the
> > >> > > > > pushdowns will
> > >> > > > > > > > succeed. So the applyX() method should return a boolean
> or
> > >> > > List<X>,
> > >> > > > > to
> > >> > > > > > > > handle the cases that some of the pushdowns cannot be
> > >> fulfilled
> > >> > > by
> > >> > > > > the
> > >> > > > > > > > implementation.
> > >> > > > > > > >
> > >> > > > > > > > Establishing such conventions and principles demands
> > careful
> > >> > > > > thinking for
> > >> > > > > > > > the aspects I mentioned earlier in addition to the API
> > >> > > > > functionalities.
> > >> > > > > > > > This helps lower the bar of understanding, reduces the
> > >> chance
> > >> > of
> > >> > > > > having
> > >> > > > > > > > loose ends in the API, and will benefit all the
> > >> participants in
> > >> > > the
> > >> > > > > project
> > >> > > > > > > > over time. I think this is the right way to achieve real
> > API
> > >> > > > > stability.
> > >> > > > > > > > Otherwise, we may end up chasing our tails to find ways
> > not
> > >> to
> > >> > > > > change the
> > >> > > > > > > > existing non-ideal APIs.
> > >> > > > > > > >
> > >> > > > > > > > Thanks,
> > >> > > > > > > >
> > >> > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > >
> > >> > > > > > > > On Fri, Aug 25, 2023 at 9:33 AM yh z <
> > >> zhengyunhon...@gmail.com
> > >> > >
> > >> > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Hi, Venkat,
> > >> > > > > > > > >
> > >> > > > > > > > > Thanks for the FLIP, it sounds good to support nested
> > >> fields
> > >> > > > filter
> > >> > > > > > > > > pushdown. Based on the design of flip and the above
> > >> options,
> > >> > I
> > >> > > > > would like
> > >> > > > > > > > > to make a few suggestions:
> > >> > > > > > > > >
> > >> > > > > > > > > 1.  At present, introducing
> > NestedFieldReferenceExpression
> > >> > > looks
> > >> > > > > like a
> > >> > > > > > > > > better solution, which can fully meet our requirements
> > >> while
> > >> > > > > reducing
> > >> > > > > > > > > modifications to base class FieldReferenceExpression.
> In
> > >> the
> > >> > > long
> > >> > > > > run, I
> > >> > > > > > > > > tend to abstract a basic class for
> > >> > > NestedFieldReferenceExpression
> > >> > > > > and
> > >> > > > > > > > > FieldReferenceExpression as u suggested.
> > >> > > > > > > > >
> > >> > > > > > > > > 2. Personally, I don't recommend introducing
> > >> > > > > *supportsNestedFilters() in
> > >> > > > > > > > > supportsFilterPushdown. We just need to better declare
> > the
> > >> > > return
> > >> > > > > value
> > >> > > > > > > > of
> > >> > > > > > > > > the method *applyFilters.
> > >> > > > > > > > >
> > >> > > > > > > > > 3. Finally, I think we need to look at the costs and
> > >> benefits
> > >> > > of
> > >> > > > > unifying
> > >> > > > > > > > > the SupportsFilterPushDown and
> > SupportsProjectionPushDown
> > >> (or
> > >> > > > > others)
> > >> > > > > > > > from
> > >> > > > > > > > > the perspective of interface implementers. A stable
> API
> > >> can
> > >> > > > reduce
> > >> > > > > user
> > >> > > > > > > > > development and change costs, if the current API can
> > fully
> > >> > meet
> > >> > > > the
> > >> > > > > > > > > functional requirements at the framework level, I
> > personal
> > >> > > > suggest
> > >> > > > > > > > reducing
> > >> > > > > > > > > the impact on connector developers.
> > >> > > > > > > > >
> > >> > > > > > > > > Regards,
> > >> > > > > > > > > Yunhong Zheng (Swuferhong)
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > Venkatakrishnan Sowrirajan <vsowr...@asu.edu>
> > >> 于2023年8月25日周五
> > >> > > > > 01:25写道:
> > >> > > > > > > > >
> > >> > > > > > > > > > To keep it backwards compatible, introduce another
> API
> > >> > > > > *applyAggregates
> > >> > > > > > > > > > *with
> > >> > > > > > > > > > *List<ReferenceExpression> *when nested field
> support
> > is
> > >> > > added
> > >> > > > > and
> > >> > > > > > > > > > deprecate the current API. This will by default
> throw
> > an
> > >> > > > > exception. In
> > >> > > > > > > > > > flink planner, *applyAggregates *with nested fields
> > and
> > >> if
> > >> > it
> > >> > > > > throws
> > >> > > > > > > > > > exception then *applyAggregates* without nested
> > fields.
> > >> > > > > > > > > >
> > >> > > > > > > > > > Regards
> > >> > > > > > > > > > Venkata krishnan
> > >> > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > > > On Thu, Aug 24, 2023 at 10:13 AM Venkatakrishnan
> > >> > Sowrirajan <
> > >> > > > > > > > > > vsowr...@asu.edu> wrote:
> > >> > > > > > > > > >
> > >> > > > > > > > > > > Jark,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > How about having a separate
> > >> > NestedFieldReferenceExpression,
> > >> > > > and
> > >> > > > > > > > > > >> abstracting a common base class
> > "ReferenceExpression"
> > >> > for
> > >> > > > > > > > > > >> NestedFieldReferenceExpression and
> > >> > > FieldReferenceExpression?
> > >> > > > > This
> > >> > > > > > > > > makes
> > >> > > > > > > > > > >> unifying expressions in
> > >> > > > > > > > > > >>
> > >> > > > > > > >
> > >> > > > >
> > >> >
> "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > >> > > > > > > > > > >> ...)"
> > >> > > > > > > > > > >> possible.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > This should be fine for
> *SupportsProjectionPushDown*
> > >> and
> > >> > > > > > > > > > > *SupportsFilterPushDown*. One concern in the case
> of
> > >> > > > > > > > > > > *SupportsAggregatePushDown* with nested fields
> > support
> > >> > (to
> > >> > > be
> > >> > > > > added
> > >> > > > > > > > in
> > >> > > > > > > > > > > the future), with this proposal, the API will
> become
> > >> > > > backwards
> > >> > > > > > > > > > incompatible
> > >> > > > > > > > > > > as the *args *for the aggregate function is
> > >> > > > > > > > > > *List<FieldReferenceExpression>
> > >> > > > > > > > > > > *that needs to change to
> > *List<ReferenceExpression>*.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Regards
> > >> > > > > > > > > > > Venkata krishnan
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > On Thu, Aug 24, 2023 at 1:18 AM Jark Wu <
> > >> > imj...@gmail.com>
> > >> > > > > wrote:
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >> Hi Becket,
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> I think it is the second case, that a
> > >> > > > > FieldReferenceExpression is
> > >> > > > > > > > > > >> constructed
> > >> > > > > > > > > > >> by the framework and passed to the connector
> > >> (interfaces
> > >> > > > > listed by
> > >> > > > > > > > > > >> Venkata[1]
> > >> > > > > > > > > > >> and Catalog#listPartitionsByFilter). Besides,
> > >> > > understanding
> > >> > > > > the
> > >> > > > > > > > nested
> > >> > > > > > > > > > >> field
> > >> > > > > > > > > > >> is optional for users/connectors (just treat it
> as
> > an
> > >> > > > unknown
> > >> > > > > > > > > expression
> > >> > > > > > > > > > >> if
> > >> > > > > > > > > > >> the
> > >> > > > > > > > > > >> connector doesn't want to support it).
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> If we extend FieldReferenceExpression, in the
> case
> > of
> > >> > > "where
> > >> > > > > > > > > col.nested
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >> 10",
> > >> > > > > > > > > > >> for the connectors already supported
> filter/delete
> > >> > > pushdown,
> > >> > > > > they
> > >> > > > > > > > may
> > >> > > > > > > > > > >> wrongly
> > >> > > > > > > > > > >> pushdown "col > 10" instead of "nested > 10"
> > because
> > >> > they
> > >> > > > > still
> > >> > > > > > > > treat
> > >> > > > > > > > > > >> FieldReferenceExpression as a top-level column.
> > This
> > >> > > problem
> > >> > > > > can be
> > >> > > > > > > > > > >> resolved
> > >> > > > > > > > > > >> by introducing an additional
> > >> "supportedNestedPushdown"
> > >> > for
> > >> > > > > each
> > >> > > > > > > > > > interface,
> > >> > > > > > > > > > >> but that method is not elegant and is hard to
> > remove
> > >> in
> > >> > > the
> > >> > > > > future,
> > >> > > > > > > > > and
> > >> > > > > > > > > > >> this could
> > >> > > > > > > > > > >> be avoided if we have a separate
> > >> > > > > NestedFieldReferenceExpression.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> If we want to extend FieldReferenceExpression, we
> > >> have
> > >> > to
> > >> > > > add
> > >> > > > > > > > > > protections
> > >> > > > > > > > > > >> for every related API in one shot. Besides,
> > >> > > > > FieldReferenceExpression
> > >> > > > > > > > > is
> > >> > > > > > > > > > a
> > >> > > > > > > > > > >> fundamental class in the planner, we have to go
> > >> through
> > >> > > all
> > >> > > > > the code
> > >> > > > > > > > > > that
> > >> > > > > > > > > > >> is using it to make sure it properly handling it
> if
> > >> it
> > >> > is
> > >> > > a
> > >> > > > > nested
> > >> > > > > > > > > field
> > >> > > > > > > > > > >> which
> > >> > > > > > > > > > >> is a big effort for the community.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> If we were designing this API on day 1, I fully
> > >> support
> > >> > > > > merging them
> > >> > > > > > > > > in
> > >> > > > > > > > > > a
> > >> > > > > > > > > > >> FieldReferenceExpression. But in this case, I'm
> > >> thinking
> > >> > > > > about how
> > >> > > > > > > > to
> > >> > > > > > > > > > >> provide
> > >> > > > > > > > > > >> users with a smooth migration path, and allow the
> > >> > > community
> > >> > > > to
> > >> > > > > > > > > gradually
> > >> > > > > > > > > > >> put efforts into evolving the API, and not block
> > the
> > >> > > "Nested
> > >> > > > > Fields
> > >> > > > > > > > > > Filter
> > >> > > > > > > > > > >> Pushdown"
> > >> > > > > > > > > > >> requirement.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> How about having a separate
> > >> > > NestedFieldReferenceExpression,
> > >> > > > > and
> > >> > > > > > > > > > >> abstracting a common base class
> > "ReferenceExpression"
> > >> > for
> > >> > > > > > > > > > >> NestedFieldReferenceExpression and
> > >> > > FieldReferenceExpression?
> > >> > > > > This
> > >> > > > > > > > > makes
> > >> > > > > > > > > > >> unifying expressions in
> > >> > > > > > > > > > >>
> > >> > > > > > > >
> > >> > > > >
> > >> >
> "SupportsProjectionPushdown#applyProjections(List<ReferenceExpression>
> > >> > > > > > > > > > >> ...)"
> > >> > > > > > > > > > >> possible.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> Best,
> > >> > > > > > > > > > >> Jark
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> On Thu, 24 Aug 2023 at 07:00, Venkatakrishnan
> > >> > Sowrirajan <
> > >> > > > > > > > > > >> vsowr...@asu.edu>
> > >> > > > > > > > > > >> wrote:
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> > Becket and Jark,
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> >  Deprecate all the other
> > >> > > > > > > > > > >> > > methods except tryApplyFilters() and
> > >> > > > > tryApplyProjections().
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > For *SupportsProjectionPushDown*, we still
> need a
> > >> > > > > > > > > > >> > *supportsNestedProjections* API on the table
> > >> source as
> > >> > > > some
> > >> > > > > of the
> > >> > > > > > > > > > table
> > >> > > > > > > > > > >> > sources might not be able to handle nested
> fields
> > >> and
> > >> > > > > therefore
> > >> > > > > > > > the
> > >> > > > > > > > > > >> Flink
> > >> > > > > > > > > > >> > planner should not push down the nested
> > >> projections or
> > >> > > > else
> > >> > > > > the
> > >> > > > > > > > > > >> > *applyProjection
> > >> > > > > > > > > > >> > *API has to be appropriately changed to return
> > >> > > > > > > > > > >> > *unconvertibleProjections *similar
> > >> > > > > > > > > > >> > to *SupportsFilterPushDown*.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > Or we have to introduce two different
> > >> > applyProjections()
> > >> > > > > > > > > > >> > > methods for FieldReferenceExpression /
> > >> > > > > > > > > > NestedFieldReferenceExpression
> > >> > > > > > > > > > >> > > respectively.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > Agree this is not preferred. Given that
> > >> > > > > *supportNestedProjections
> > >> > > > > > > > > > >> *cannot
> > >> > > > > > > > > > >> > be deprecated/removed based on the current API
> > >> form,
> > >> > > > > extending
> > >> > > > > > > > > > >> > *FieldReferenceExpression* to support nested
> > fields
> > >> > > should
> > >> > > > > be
> > >> > > > > > > > okay.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > Another alternative could be to change
> > >> > *applyProjections
> > >> > > > > *to take
> > >> > > > > > > > > > >> > List<ResolvedExpression> and on the connector
> > side
> > >> > they
> > >> > > > > choose to
> > >> > > > > > > > > > handle
> > >> > > > > > > > > > >> > *FieldReferenceExpression* and
> > >> > > > > *NestedFieldReferenceExpression *as
> > >> > > > > > > > > > >> > applicable and return the remainingProjections.
> > In
> > >> the
> > >> > > > case
> > >> > > > > of
> > >> > > > > > > > > nested
> > >> > > > > > > > > > >> field
> > >> > > > > > > > > > >> > projections not supported, it should return
> them
> > >> back
> > >> > > but
> > >> > > > > only
> > >> > > > > > > > > > >> projecting
> > >> > > > > > > > > > >> > the top level fields. IMO, this is also *not
> > >> > preferred*.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > *SupportsAggregatePushDown*
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > *AggregateExpression *currently takes in a list
> > of
> > >> > > > > > > > > > >> > *FieldReferenceExpression* as args for the
> > >> aggregate
> > >> > > > > function, if
> > >> > > > > > > > in
> > >> > > > > > > > > > >> future
> > >> > > > > > > > > > >> > *SupportsAggregatePushDown* adds support for
> > >> aggregate
> > >> > > > > pushdown on
> > >> > > > > > > > > > >> nested
> > >> > > > > > > > > > >> > fields then the AggregateExpression API also
> has
> > to
> > >> > > change
> > >> > > > > if a
> > >> > > > > > > > new
> > >> > > > > > > > > > >> > NestedFieldReferenceExpression is introduced
> for
> > >> > nested
> > >> > > > > fields.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > If we add a
> > >> > > > > > > > > > >> > > flag for each new filter,
> > >> > > > > > > > > > >> > > the interface will be filled with lots of
> flags
> > >> > (e.g.,
> > >> > > > > > > > > > >> supportsBetween,
> > >> > > > > > > > > > >> > > supportsIN)
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > In an ideal situation, I completely agree with
> > you.
> > >> > But
> > >> > > in
> > >> > > > > the
> > >> > > > > > > > > current
> > >> > > > > > > > > > >> > state, *supportsNestedFilters* can act as a
> > bridge
> > >> to
> > >> > > > reach
> > >> > > > > the
> > >> > > > > > > > > > eventual
> > >> > > > > > > > > > >> > desired state which is to have a clean and
> > >> consistent
> > >> > > set
> > >> > > > > of APIs
> > >> > > > > > > > > > >> > throughout all Supports*PushDown.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > Also shared some thoughts on the end state API
> > >> > > > > > > > > > >> > <
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >>
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://urldefense.com/v3/__https://docs.google.com/document/d/1stLRPKOcxlEv8eHblkrOh0Zf5PLM-h76WMhEINHOyPY/edit?usp=sharing__;!!IKRxdwAv5BmarQ!ZZ2nS1PYlXLnEGFcikS3NsYG7tMaV3wU_z7FmvihNwQBmoLZk2WmcpuRWszK0FFmsInh9A6cndkJrQ$
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > with extension to the
> *FieldReferenceExpression*
> > to
> > >> > > > support
> > >> > > > > nested
> > >> > > > > > > > > > >> fields.
> > >> > > > > > > > > > >> > Please take a look.
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > Regards
> > >> > > > > > > > > > >> > Venkata krishnan
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > On Tue, Aug 22, 2023 at 5:02 PM Becket Qin <
> > >> > > > > becket....@gmail.com>
> > >> > > > > > > > > > >> wrote:
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >> > > Hi Jark,
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > Regarding the migration path, it would be
> > useful
> > >> to
> > >> > > > > scrutinize
> > >> > > > > > > > the
> > >> > > > > > > > > > use
> > >> > > > > > > > > > >> > case
> > >> > > > > > > > > > >> > > of FiledReferenceExpression and
> > >> ResolvedExpressions.
> > >> > > > > There are
> > >> > > > > > > > two
> > >> > > > > > > > > > >> kinds
> > >> > > > > > > > > > >> > of
> > >> > > > > > > > > > >> > > use cases:
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > 1. A ResolvedExpression is constructed by the
> > >> user
> > >> > or
> > >> > > > > connector
> > >> > > > > > > > /
> > >> > > > > > > > > > >> plugin
> > >> > > > > > > > > > >> > > developers.
> > >> > > > > > > > > > >> > > 2. A ResolvedExpression is constructed by the
> > >> > > framework
> > >> > > > > and
> > >> > > > > > > > passed
> > >> > > > > > > > > > to
> > >> > > > > > > > > > >> > user
> > >> > > > > > > > > > >> > > or connector / plugin developers.
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > For the first case, both of the approaches
> > >> provide
> > >> > the
> > >> > > > > same
> > >> > > > > > > > > > migration
> > >> > > > > > > > > > >> > > experience.
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > For the second case, generally speaking,
> > >> introducing
> > >> > > > > > > > > > >> > > NestedFieldReferenceExpression and extending
> > >> > > > > > > > > > FieldReferenceExpression
> > >> > > > > > > > > > >> > would
> > >> > > > > > > > > > >> > > have the same impact for backwards
> > compatibility.
> > >> > > > > > > > > > >> SupportsFilterPushDown
> > >> > > > > > > > > > >> > is
> > >> > > > > > > > > > >> > > a special case here because understanding the
> > >> filter
> > >> > > > > expressions
> > >> > > > > > > > > is
> > >> > > > > > > > > > >> > > optional for the source implementation. In
> > other
> > >> use
> > >> > > > > cases, if
> > >> > > > > > > > > > >> > > understanding the reference to a nested field
> > is
> > >> a
> > >> > > must
> > >> > > > > have,
> > >> > > > > > > > the
> > >> > > > > > > > > > user
> > >> > > > > > > > > > >> > code
> > >> > > > > > > > > > >> > > has to be changed, regardless of which
> approach
> > >> we
> > >> > > take
> > >> > > > to
> > >> > > > > > > > support
> > >> > > > > > > > > > >> nested
> > >> > > > > > > > > > >> > > fields.
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > Therefore, I think we have to check each
> public
> > >> API
> > >> > > > where
> > >> > > > > the
> > >> > > > > > > > > nested
> > >> > > > > > > > > > >> > field
> > >> > > > > > > > > > >> > > reference is exposed. If we have many public
> > APIs
> > >> > > where
> > >> > > > > > > > > > understanding
> > >> > > > > > > > > > >> > > nested fields is optional for the user  /
> > plugin
> > >> /
> > >> > > > > connector
> > >> > > > > > > > > > >> developers,
> > >> > > > > > > > > > >> > > having a separate
> > NestedFieldReferenceExpression
> > >> > would
> > >> > > > > have a
> > >> > > > > > > > more
> > >> > > > > > > > > > >> smooth
> > >> > > > > > > > > > >> > > migration. Otherwise, there seems to be no
> > >> > difference
> > >> > > > > between
> > >> > > > > > > > the
> > >> > > > > > > > > > two
> > >> > > > > > > > > > >> > > approaches.
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > Migration path aside, the main reason I
> prefer
> > >> > > extending
> > >> > > > > > > > > > >> > > FieldReferenceExpression over a new
> > >> > > > > > > > NestedFieldReferenceExpression
> > >> > > > > > > > > > is
> > >> > > > > > > > > > >> > > because this makes the
> > SupportsProjectionPushDown
> > >> > > > > interface
> > >> > > > > > > > > simpler.
> > >> > > > > > > > > > >> > > Otherwise, we have to treat it as a special
> > case
> > >> > that
> > >> > > > > does not
> > >> > > > > > > > > match
> > >> > > > > > > > > > >> the
> > >> > > > > > > > > > >> > > overall API style. Or we have to introduce
> two
> > >> > > different
> > >> > > > > > > > > > >> > applyProjections()
> > >> > > > > > > > > > >> > > methods for FieldReferenceExpression /
> > >> > > > > > > > > > NestedFieldReferenceExpression
> > >> > > > > > > > > > >> > > respectively. This issue further extends to
> > >> > > > > implementation in
> > >> > > > > > > > > > >> addition to
> > >> > > > > > > > > > >> > > public API. A single FieldReferenceExpression
> > >> might
> > >> > > help
> > >> > > > > > > > simplify
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> > > implementation code a little bit. For
> example,
> > >> in a
> > >> > > > > recursive
> > >> > > > > > > > > > >> processing
> > >> > > > > > > > > > >> > of
> > >> > > > > > > > > > >> > > a row with nested rows, we may not need to
> > switch
> > >> > > > between
> > >> > > > > > > > > > >> > > FieldReferenceExpression and
> > >> > > > > NestedFieldReferenceExpression
> > >> > > > > > > > > > depending
> > >> > > > > > > > > > >> on
> > >> > > > > > > > > > >> > > whether the record being processed is a top
> > level
> > >> > > record
> > >> > > > > or
> > >> > > > > > > > nested
> > >> > > > > > > > > > >> > record.
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > Thanks,
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > On Tue, Aug 22, 2023 at 11:43 PM Jark Wu <
> > >> > > > > imj...@gmail.com>
> > >> > > > > > > > > wrote:
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > > Hi Becket,
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > > > I totally agree we should try to have a
> > >> consistent
> > >> > > API
> > >> > > > > for a
> > >> > > > > > > > > final
> > >> > > > > > > > > > >> > state.
> > >> > > > > > > > > > >> > > > The only concern I have mentioned is the
> > >> "smooth"
> > >> > > > > migration
> > >> > > > > > > > > path.
> > >> > > > > > > > > > >> > > > The FiledReferenceExpression is widely used
> > in
> > >> > many
> > >> > > > > public
> > >> > > > > > > > APIs,
> > >> > > > > > > > > > >> > > > not only in the SupportsFilterPushDown.
> Yes,
> > we
> > >> > can
> > >> > > > > change
> > >> > > > > > > > every
> > >> > > > > > > > > > >> > > > methods in 2-steps, but is it good to
> change
> > >> API
> > >> > > back
> > >> > > > > and
> > >> > > > > > > > forth
> > >> > > > > > > > > > for
> > >> > > > > > > > > > >> > this?
> > >> > > > > > > > > > >> > > > Personally, I'm fine with a separate
> > >> > > > > > > > > > NestedFieldReferenceExpression
> > >> > > > > > > > > > >> > > class.
> > >> > > > > > > > > > >> > > > TBH, I prefer the separated way because it
> > >> makes
> > >> > the
> > >> > > > > reference
> > >> > > > > > > > > > >> > expression
> > >> > > > > > > > > > >> > > > more clear and concise.
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > > > Best,
> > >> > > > > > > > > > >> > > > Jark
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > > > On Tue, 22 Aug 2023 at 16:53, Becket Qin <
> > >> > > > > > > > becket....@gmail.com>
> > >> > > > > > > > > > >> wrote:
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > > > > Thanks for the reply, Jark.
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > I think it will be helpful to understand
> > the
> > >> > final
> > >> > > > > state we
> > >> > > > > > > > > want
> > >> > > > > > > > > > >> to
> > >> > > > > > > > > > >> > > > > eventually achieve first, then we can
> > discuss
> > >> > the
> > >> > > > > steps
> > >> > > > > > > > > towards
> > >> > > > > > > > > > >> that
> > >> > > > > > > > > > >> > > > final
> > >> > > > > > > > > > >> > > > > state.
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > It looks like there are two proposed end
> > >> states
> > >> > > now:
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > 1. Have a separate
> > >> > NestedFieldReferenceExpression
> > >> > > > > class;
> > >> > > > > > > > keep
> > >> > > > > > > > > > >> > > > > SupportsFilterPushDown and
> > >> > > > SupportsProjectionPushDown
> > >> > > > > the
> > >> > > > > > > > > same.
> > >> > > > > > > > > > >> It is
> > >> > > > > > > > > > >> > > > just
> > >> > > > > > > > > > >> > > > > a one step change.
> > >> > > > > > > > > > >> > > > >    - Regarding the
> > >> > supportsNestedFilterPushDown()
> > >> > > > > method, if
> > >> > > > > > > > > our
> > >> > > > > > > > > > >> > > contract
> > >> > > > > > > > > > >> > > > > with the connector developer today is
> "The
> > >> > > > > implementation
> > >> > > > > > > > > should
> > >> > > > > > > > > > >> > ignore
> > >> > > > > > > > > > >> > > > > unrecognized expressions by putting them
> > into
> > >> > the
> > >> > > > > remaining
> > >> > > > > > > > > > >> filters,
> > >> > > > > > > > > > >> > > > > instead of throwing exceptions". Then
> there
> > >> is
> > >> > no
> > >> > > > > need for
> > >> > > > > > > > > this
> > >> > > > > > > > > > >> > > method. I
> > >> > > > > > > > > > >> > > > > am not sure about the current contract.
> We
> > >> > should
> > >> > > > > probably
> > >> > > > > > > > > make
> > >> > > > > > > > > > it
> > >> > > > > > > > > > >> > > clear
> > >> > > > > > > > > > >> > > > in
> > >> > > > > > > > > > >> > > > > the interface Java doc.
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > 2. Extend the existing
> > >> FiledReferenceExpression
> > >> > > > class
> > >> > > > > to
> > >> > > > > > > > > support
> > >> > > > > > > > > > >> > nested
> > >> > > > > > > > > > >> > > > > fields; SupportsFilterPushDown only has
> one
> > >> > method
> > >> > > > of
> > >> > > > > > > > > > >> > > > > applyFilters(List<ResolvedExpression>);
> > >> > > > > > > > > > SupportsProjectionPushDown
> > >> > > > > > > > > > >> > only
> > >> > > > > > > > > > >> > > > has
> > >> > > > > > > > > > >> > > > > one method of
> > >> > > > > > > > applyProjections(List<FieldReferenceExpression>,
> > >> > > > > > > > > > >> > > DataType).
> > >> > > > > > > > > > >> > > > > It could just be two steps if we are not
> > too
> > >> > > > obsessed
> > >> > > > > with
> > >> > > > > > > > the
> > >> > > > > > > > > > >> exact
> > >> > > > > > > > > > >> > > > names
> > >> > > > > > > > > > >> > > > > of "applyFilters" and "applyProjections".
> > >> More
> > >> > > > > specifically,
> > >> > > > > > > > > it
> > >> > > > > > > > > > >> takes
> > >> > > > > > > > > > >> > > two
> > >> > > > > > > > > > >> > > > > steps to achieve this final state:
> > >> > > > > > > > > > >> > > > >     a. introduce a new method
> > >> > > > > > > > > > >> > tryApplyFilters(List<ResolvedExpression>)
> > >> > > > > > > > > > >> > > > to
> > >> > > > > > > > > > >> > > > > SupportsFilterPushDown, which may have
> > >> > > > > > > > > FiledReferenceExpression
> > >> > > > > > > > > > >> with
> > >> > > > > > > > > > >> > > > nested
> > >> > > > > > > > > > >> > > > > fields. The default implementation throws
> > an
> > >> > > > > exception. The
> > >> > > > > > > > > > >> runtime
> > >> > > > > > > > > > >> > > will
> > >> > > > > > > > > > >> > > > > first call tryApplyFilters() with nested
> > >> fields.
> > >> > > In
> > >> > > > > case of
> > >> > > > > > > > > > >> > exception,
> > >> > > > > > > > > > >> > > it
> > >> > > > > > > > > > >> > > > > calls the existing applyFilters() without
> > >> > > including
> > >> > > > > the
> > >> > > > > > > > nested
> > >> > > > > > > > > > >> > filters.
> > >> > > > > > > > > > >> > > > > Similarly, in SupportsProjectionPushDown,
> > >> > > introduce
> > >> > > > a
> > >> > > > > > > > > > >> > > > >
> > >> tryApplyProjections<List<NestedFieldReference>
> > >> > > > method
> > >> > > > > > > > > returning
> > >> > > > > > > > > > a
> > >> > > > > > > > > > >> > > Result.
> > >> > > > > > > > > > >> > > > > The Result also contains the accepted and
> > >> > > > unapplicable
> > >> > > > > > > > > > >> projections.
> > >> > > > > > > > > > >> > The
> > >> > > > > > > > > > >> > > > > default implementation also throws an
> > >> exception.
> > >> > > > > Deprecate
> > >> > > > > > > > all
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> > > other
> > >> > > > > > > > > > >> > > > > methods except tryApplyFilters() and
> > >> > > > > tryApplyProjections().
> > >> > > > > > > > > > >> > > > >     b. remove the deprecated methods in
> the
> > >> next
> > >> > > > major
> > >> > > > > > > > version
> > >> > > > > > > > > > >> bump.
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > Now the question is putting the migration
> > >> steps
> > >> > > > > aside, which
> > >> > > > > > > > > end
> > >> > > > > > > > > > >> > state
> > >> > > > > > > > > > >> > > do
> > >> > > > > > > > > > >> > > > > we prefer? While the first end state is
> > >> > acceptable
> > >> > > > > for me,
> > >> > > > > > > > > > >> > personally,
> > >> > > > > > > > > > >> > > I
> > >> > > > > > > > > > >> > > > > prefer the latter if we are designing
> from
> > >> > > scratch.
> > >> > > > > It is
> > >> > > > > > > > > clean,
> > >> > > > > > > > > > >> > > > consistent
> > >> > > > > > > > > > >> > > > > and intuitive. Given the size of Flink,
> > >> keeping
> > >> > > APIs
> > >> > > > > in the
> > >> > > > > > > > > same
> > >> > > > > > > > > > >> > style
> > >> > > > > > > > > > >> > > > over
> > >> > > > > > > > > > >> > > > > time is important. The migration is also
> > not
> > >> > that
> > >> > > > > > > > complicated.
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > Thanks,
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > On Tue, Aug 22, 2023 at 2:23 PM Jark Wu <
> > >> > > > > imj...@gmail.com>
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > > > > Hi Venkat,
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > Thanks for the proposal.
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > I have some minor comments about the
> > FLIP.
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > 1. I think we don't need to
> > >> > > > > > > > > > >> > > > > > add
> > >> > > SupportsFilterPushDown#supportsNestedFilters()
> > >> > > > > method,
> > >> > > > > > > > > > >> > > > > > because connectors can skip nested
> > filters
> > >> by
> > >> > > > > putting them
> > >> > > > > > > > > in
> > >> > > > > > > > > > >> > > > > > Result#remainingFilters().
> > >> > > > > > > > > > >> > > > > > And this is backward-compatible because
> > >> > unknown
> > >> > > > > > > > expressions
> > >> > > > > > > > > > were
> > >> > > > > > > > > > >> > > added
> > >> > > > > > > > > > >> > > > to
> > >> > > > > > > > > > >> > > > > > the remaining filters.
> > >> > > > > > > > > > >> > > > > > Planner should push predicate
> expressions
> > >> as
> > >> > > more
> > >> > > > as
> > >> > > > > > > > > possible.
> > >> > > > > > > > > > >> If
> > >> > > > > > > > > > >> > we
> > >> > > > > > > > > > >> > > > add
> > >> > > > > > > > > > >> > > > > a
> > >> > > > > > > > > > >> > > > > > flag for each new filter,
> > >> > > > > > > > > > >> > > > > > the interface will be filled with lots
> of
> > >> > flags
> > >> > > > > (e.g.,
> > >> > > > > > > > > > >> > > supportsBetween,
> > >> > > > > > > > > > >> > > > > > supportsIN).
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > 2.
> > >> > > NestedFieldReferenceExpression#nestedFieldName
> > >> > > > > should
> > >> > > > > > > > be
> > >> > > > > > > > > an
> > >> > > > > > > > > > >> > array
> > >> > > > > > > > > > >> > > of
> > >> > > > > > > > > > >> > > > > > field names?
> > >> > > > > > > > > > >> > > > > > Each string represents a field name
> part
> > of
> > >> > the
> > >> > > > > field
> > >> > > > > > > > path.
> > >> > > > > > > > > > Just
> > >> > > > > > > > > > >> > keep
> > >> > > > > > > > > > >> > > > > > aligning with `nestedFieldIndexArray`.
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > 3. My concern about making
> > >> > > > FieldReferenceExpression
> > >> > > > > > > > support
> > >> > > > > > > > > > >> nested
> > >> > > > > > > > > > >> > > > fields
> > >> > > > > > > > > > >> > > > > > is the compatibility.
> > >> > > > > > > > > > >> > > > > > It is a public API and users/connectors
> > are
> > >> > > > already
> > >> > > > > using
> > >> > > > > > > > > it.
> > >> > > > > > > > > > >> > People
> > >> > > > > > > > > > >> > > > > > assumed it is a top-level column
> > >> > > > > > > > > > >> > > > > > reference, and applied logic on it. But
> > >> that's
> > >> > > not
> > >> > > > > true
> > >> > > > > > > > now
> > >> > > > > > > > > > and
> > >> > > > > > > > > > >> > this
> > >> > > > > > > > > > >> > > > may
> > >> > > > > > > > > > >> > > > > > lead to unexpected errors.
> > >> > > > > > > > > > >> > > > > > Having a separate
> > >> > NestedFieldReferenceExpression
> > >> > > > > sounds
> > >> > > > > > > > > safer
> > >> > > > > > > > > > to
> > >> > > > > > > > > > >> > me.
> > >> > > > > > > > > > >> > > > > Mixing
> > >> > > > > > > > > > >> > > > > > them in a class may
> > >> > > > > > > > > > >> > > > > >  confuse users what's the meaning of
> > >> > > > getFieldName()
> > >> > > > > and
> > >> > > > > > > > > > >> > > > getFieldIndex().
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > Regarding using
> > >> NestedFieldReferenceExpression
> > >> > > in
> > >> > > > > > > > > > >> > > > > > SupportsProjectionPushDown, do you
> > >> > > > > > > > > > >> > > > > > have any concerns @Timo Walther <
> > >> > > > twal...@apache.org>
> > >> > > > > ?
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > Best,
> > >> > > > > > > > > > >> > > > > > Jark
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > On Tue, 22 Aug 2023 at 05:55,
> > >> Venkatakrishnan
> > >> > > > > Sowrirajan <
> > >> > > > > > > > > > >> > > > > vsowr...@asu.edu
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > wrote:
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > > > > Sounds like a great suggestion,
> Becket.
> > >> +1.
> > >> > > > Agree
> > >> > > > > with
> > >> > > > > > > > > > >> cleaning
> > >> > > > > > > > > > >> > up
> > >> > > > > > > > > > >> > > > the
> > >> > > > > > > > > > >> > > > > > APIs
> > >> > > > > > > > > > >> > > > > > > and making it consistent in all the
> > >> pushdown
> > >> > > > APIs.
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > Your suggested approach seems fine to
> > me,
> > >> > > unless
> > >> > > > > anyone
> > >> > > > > > > > > else
> > >> > > > > > > > > > >> has
> > >> > > > > > > > > > >> > > any
> > >> > > > > > > > > > >> > > > > > other
> > >> > > > > > > > > > >> > > > > > > concerns. Just have couple of
> > clarifying
> > >> > > > > questions:
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > 1. Do you think we should standardize
> > the
> > >> > APIs
> > >> > > > > across
> > >> > > > > > > > all
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> > > > pushdown
> > >> > > > > > > > > > >> > > > > > > supports like
> > SupportsPartitionPushdown,
> > >> > > > > > > > > > >> SupportsDynamicFiltering
> > >> > > > > > > > > > >> > > etc
> > >> > > > > > > > > > >> > > > > in
> > >> > > > > > > > > > >> > > > > > > the end state?
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > The current proposal works if we do
> not
> > >> want
> > >> > > to
> > >> > > > > migrate
> > >> > > > > > > > > > >> > > > > > > > SupportsFilterPushdown to also use
> > >> > > > > > > > > > >> > NestedFieldReferenceExpression
> > >> > > > > > > > > > >> > > > in
> > >> > > > > > > > > > >> > > > > > the
> > >> > > > > > > > > > >> > > > > > > > long term.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > Did you mean
> *FieldReferenceExpression*
> > >> > > instead
> > >> > > > of
> > >> > > > > > > > > > >> > > > > > > *NestedFieldReferenceExpression*?
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > 2. Extend the
> FieldReferenceExpression
> > to
> > >> > > > support
> > >> > > > > nested
> > >> > > > > > > > > > >> fields.
> > >> > > > > > > > > > >> > > > > > > >     - Change the index field type
> > from
> > >> int
> > >> > > to
> > >> > > > > int[].
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > >     - Add a new method int[]
> > >> > > > getFieldIndexArray().
> > >> > > > > > > > > > >> > > > > > > >     - Deprecate the int
> > getFieldIndex()
> > >> > > > method,
> > >> > > > > the
> > >> > > > > > > > code
> > >> > > > > > > > > > >> will
> > >> > > > > > > > > > >> > be
> > >> > > > > > > > > > >> > > > > > removed
> > >> > > > > > > > > > >> > > > > > > in
> > >> > > > > > > > > > >> > > > > > > > the next major version bump.
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > I assume getFieldIndex would return
> > >> > > > > fieldIndexArray[0],
> > >> > > > > > > > > > right?
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > Thanks
> > >> > > > > > > > > > >> > > > > > > Venkat
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > On Fri, Aug 18, 2023 at 4:47 PM
> Becket
> > >> Qin <
> > >> > > > > > > > > > >> becket....@gmail.com
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> > > > > wrote:
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Thanks for the proposal, Venkata.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > The current proposal works if we do
> > not
> > >> > want
> > >> > > > to
> > >> > > > > > > > migrate
> > >> > > > > > > > > > >> > > > > > > > SupportsFilterPushdown to also use
> > >> > > > > > > > > > >> > NestedFieldReferenceExpression
> > >> > > > > > > > > > >> > > > in
> > >> > > > > > > > > > >> > > > > > the
> > >> > > > > > > > > > >> > > > > > > > long term.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > Did you mean
> *FieldReferenceExpression*
> > >> > > instead
> > >> > > > of
> > >> > > > > > > > > > >> > > > > > > *NestedFieldReferenceExpression*?
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Otherwise, the alternative solution
> > >> > briefly
> > >> > > > > mentioned
> > >> > > > > > > > in
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> > > > rejected
> > >> > > > > > > > > > >> > > > > > > > alternatives would be the
> following:
> > >> > > > > > > > > > >> > > > > > > > Phase 1:
> > >> > > > > > > > > > >> > > > > > > > 1. Introduce a
> > supportsNestedFilters()
> > >> > > method
> > >> > > > > to the
> > >> > > > > > > > > > >> > > > > > > SupportsFilterPushdown
> > >> > > > > > > > > > >> > > > > > > > interface. (same as current
> > proposal).
> > >> > > > > > > > > > >> > > > > > > > 2. Extend the
> > FieldReferenceExpression
> > >> to
> > >> > > > > support
> > >> > > > > > > > nested
> > >> > > > > > > > > > >> > fields.
> > >> > > > > > > > > > >> > > > > > > >     - Change the index field type
> > from
> > >> int
> > >> > > to
> > >> > > > > int[].
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > >     - Add a new method int[]
> > >> > > > getFieldIndexArray().
> > >> > > > > > > > > > >> > > > > > > >     - Deprecate the int
> > getFieldIndex()
> > >> > > > method,
> > >> > > > > the
> > >> > > > > > > > code
> > >> > > > > > > > > > >> will
> > >> > > > > > > > > > >> > be
> > >> > > > > > > > > > >> > > > > > removed
> > >> > > > > > > > > > >> > > > > > > in
> > >> > > > > > > > > > >> > > > > > > > the next major version bump.
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > > > 3. In the SupportsProjectionPushDown
> > >> > interface
> > >> > > > > > > > > > >> > > > > > > >     - add a new method
> > >> > > > > > > > > > >> > > > >
> > >> applyProjection(List<FieldReferenceExpression>,
> > >> > > > > > > > > > >> > > > > > > > DataType), with default
> > implementation
> > >> > > > invoking
> > >> > > > > > > > > > >> > > > > > applyProjection(int[][],
> > >> > > > > > > > > > >> > > > > > > > DataType)
> > >> > > > > > > > > > >> > > > > > > >     - deprecate the current
> > >> > > > > applyProjection(int[][],
> > >> > > > > > > > > > >> DataType)
> > >> > > > > > > > > > >> > > > method
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Phase 2 (in the next major version
> > >> bump)
> > >> > > > > > > > > > >> > > > > > > > 1. remove the deprecated methods.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Phase 3 (optional)
> > >> > > > > > > > > > >> > > > > > > > 1. deprecate and remove the
> > >> > > > > supportsNestedFilters() /
> > >> > > > > > > > > > >> > > > > > > > supportsNestedProjection() methods
> > from
> > >> > the
> > >> > > > > > > > > > >> > > SupportsFilterPushDown
> > >> > > > > > > > > > >> > > > /
> > >> > > > > > > > > > >> > > > > > > > SupportsProjectionPushDown
> > interfaces.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Personally I prefer this
> alternative.
> > >> It
> > >> > > takes
> > >> > > > > longer
> > >> > > > > > > > to
> > >> > > > > > > > > > >> finish
> > >> > > > > > > > > > >> > > the
> > >> > > > > > > > > > >> > > > > > work,
> > >> > > > > > > > > > >> > > > > > > > but the API eventually becomes
> clean
> > >> and
> > >> > > > > consistent.
> > >> > > > > > > > > But I
> > >> > > > > > > > > > >> can
> > >> > > > > > > > > > >> > > live
> > >> > > > > > > > > > >> > > > > > with
> > >> > > > > > > > > > >> > > > > > > > the current proposal.
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Thanks,
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > On Sat, Aug 19, 2023 at 12:09 AM
> > >> > > > Venkatakrishnan
> > >> > > > > > > > > > Sowrirajan
> > >> > > > > > > > > > >> <
> > >> > > > > > > > > > >> > > > > > > > vsowr...@asu.edu> wrote:
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > Gentle ping for reviews/feedback.
> > >> > > > > > > > > > >> > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > On Tue, Aug 15, 2023, 5:37 PM
> > >> > > > Venkatakrishnan
> > >> > > > > > > > > > Sowrirajan <
> > >> > > > > > > > > > >> > > > > > > > vsowr...@asu.edu
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > wrote:
> > >> > > > > > > > > > >> > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > > Hi All,
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > > I am opening this thread to
> > discuss
> > >> > > > > FLIP-356:
> > >> > > > > > > > > Support
> > >> > > > > > > > > > >> > Nested
> > >> > > > > > > > > > >> > > > > Fields
> > >> > > > > > > > > > >> > > > > > > > > > Filter Pushdown. The FLIP can
> be
> > >> found
> > >> > > at
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >>
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://urldefense.com/v3/__https://cwiki.apache.org/confluence/display/FLINK/FLIP-356*3A*Support*Nested*Fields*Filter*Pushdown__;JSsrKysr!!IKRxdwAv5BmarQ!clxXJwshKpn559SAkQiieqgGe0ZduXCzUKCmYLtFIbQLmrmEEgdmuEIM8ZM1M3O_uGqOploU4ailqGpukAg$
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > > This FLIP adds support for
> > pushing
> > >> > down
> > >> > > > > nested
> > >> > > > > > > > > fields
> > >> > > > > > > > > > >> > filters
> > >> > > > > > > > > > >> > > > to
> > >> > > > > > > > > > >> > > > > > the
> > >> > > > > > > > > > >> > > > > > > > > > underlying TableSource. In our
> > data
> > >> > > lake,
> > >> > > > > we find
> > >> > > > > > > > a
> > >> > > > > > > > > > lot
> > >> > > > > > > > > > >> of
> > >> > > > > > > > > > >> > > > > datasets
> > >> > > > > > > > > > >> > > > > > > > have
> > >> > > > > > > > > > >> > > > > > > > > > nested fields and also user
> > queries
> > >> > with
> > >> > > > > filters
> > >> > > > > > > > > > >> defined on
> > >> > > > > > > > > > >> > > the
> > >> > > > > > > > > > >> > > > > > > nested
> > >> > > > > > > > > > >> > > > > > > > > > fields. This would drastically
> > >> improve
> > >> > > the
> > >> > > > > > > > > performance
> > >> > > > > > > > > > >> for
> > >> > > > > > > > > > >> > > > those
> > >> > > > > > > > > > >> > > > > > sets
> > >> > > > > > > > > > >> > > > > > > > of
> > >> > > > > > > > > > >> > > > > > > > > > queries.
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > > Appreciate any comments or
> > feedback
> > >> > you
> > >> > > > may
> > >> > > > > have
> > >> > > > > > > > on
> > >> > > > > > > > > > this
> > >> > > > > > > > > > >> > > > > proposal.
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > > > Regards
> > >> > > > > > > > > > >> > > > > > > > > > Venkata krishnan
> > >> > > > > > > > > > >> > > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > > >
> > >> > > > > > > > > > >> > > > > > > >
> > >> > > > > > > > > > >> > > > > > >
> > >> > > > > > > > > > >> > > > > >
> > >> > > > > > > > > > >> > > > >
> > >> > > > > > > > > > >> > > >
> > >> > > > > > > > > > >> > >
> > >> > > > > > > > > > >> >
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > >
> >
>

Reply via email to