Just a friendly reminder that PR 2094 resolved some of the issue mentioned
here, but a bit remains in terms of fully consolidating the semantics.
Merging the PR as soon as Travis comes in green.

The reminder is documented in [1].

[1] https://issues.apache.org/jira/browse/FLINK-5156

On Thu, Nov 3, 2016 at 4:42 PM, Márton Balassi <balassi.mar...@gmail.com>
wrote:

> Thanks for bringing up the issue, Fabian.
>
> I am also for unifying the access patterns between the FieldAccessor and
> ExpressionKeys logic.
>
> In terms of Fabian's suggestion to refactor the ExpressionKeys parsing
> logic and rely on it in the FieldAccessors I think that is nice first step.
> It would be great if we could provide even tighter integration though.
>
> On Fri, Oct 28, 2016 at 6:17 PM, Fabian Hueske <fhue...@gmail.com> wrote:
>
>> IMO, FieldAccessors are certainly valuable and should exist.
>> I only want to make sure that there is a common syntax for field
>> references.
>>
>> Maybe we can refactor the parsing code for field references in
>> ExpressionKeys and expose it in static public methods that the
>> FieldAccessors can call.
>>
>> Best,
>> Fabian
>>
>>
>>
>> 2016-10-27 14:52 GMT+02:00 Gábor Gévay <gga...@gmail.com>:
>>
>> > Hello,
>> >
>> > Thanks Fabian for starting this discussion.
>> >
>> > I would just like to add a few thougths about why does the
>> > FieldAccessors even exist. One could say that we should instead just
>> > re-use ExpressionKeys for the aggregations, and we are done. As far as
>> > I can see, the way ExpressionKeys is often used is roughly to call
>> > computeLogicalKeyPositions on it, then call
>> > CompositeType.createComparator with these key positions, and then we
>> > can call extractKeys on the comparator to get the values of the fields
>> > of a record.
>> >
>> > However, the problem with this is that in the aggregations we also
>> > need to write fields and not just read them. So with the above
>> > procedure, we would need an "intractKey" method on the comparators
>> > that would be the opposite of extractKeys, i.e. which would set the
>> > key fields.
>> >
>> > Another difference between the FieldAccessor world and the
>> > ExpressionKeys world is that in the aggregations we need to deal with
>> > only one field. This means that enhancing the comparators with the
>> > exact opposite of extractKeys would be overkill, because we don't need
>> > to set multiple fields at a time.
>> >
>> > It certainly seems to be a good idea to unify these at least
>> > externally (from the side of the users), but I'm not sure what would
>> > be a neat way to also internally unify them.
>> >
>> > Best,
>> > Gábor
>> >
>> >
>> > 2016-10-26 20:54 GMT+02:00 Fabian Hueske <fhue...@gmail.com>:
>> > > Hi everybody,
>> > >
>> > > while reviewing PR #2094 [1] I noticed that the field reference syntax
>> > for
>> > > FieldAccessors is not compatible with the syntax supported for key
>> > > definitions (ExpressionKeys) used in groupBy(), keyBy(),
>> > > join().where().equalTo(), etc.
>> > >
>> > > FieldAccessors are only used for build-in aggregations in the
>> DataStream
>> > > API (sum(), min(), max(), ...).
>> > >
>> > > In particular I identified the following inconsistencies:
>> > >
>> > > - FieldAccessors allow to address array cells. ExpressionKeys treat
>> > arrays
>> > > as AtomicTypes (Array TypeInfos do not extend CompositeType). Hence,
>> it
>> > is
>> > > not possible to address array cells.
>> > > - ExpressionKeys do only support Integer keys for tuples. An atomic
>> type
>> > > can only be addressed with "*". FieldAccessors allow to address
>> > AtomicTypes
>> > > with 0 in addtion to "*".
>> > > - ExpressionKeys support to address fields of Java tuples with "f2"
>> and
>> > > Scala tuple fields with "_3". FieldAccessors do not support the "f" or
>> > "_"
>> > > prefix.
>> > >
>> > > I would like to propose to adapt the syntax of both mechanisms
>> (ideally,
>> > > both should use the same code for validation). IMO, the ExpressionKey
>> > > syntax much more widely used and is well designed. Therefore, I would
>> > adopt
>> > > it for FieldAccessors as well. However, that would mean to restrict
>> the
>> > > syntax of the FieldAccessors and might break existing code.
>> > >
>> > > What do others think?
>> > >
>> > > [1] https://github.com/apache/flink/pull/2094
>> >
>>
>
>

Reply via email to