Just a friendly reminder that PR 2094 resolved some of the issue mentioned here, but a bit remains in terms of fully consolidating the semantics. Merging the PR as soon as Travis comes in green.
The reminder is documented in [1]. [1] https://issues.apache.org/jira/browse/FLINK-5156 On Thu, Nov 3, 2016 at 4:42 PM, Márton Balassi <balassi.mar...@gmail.com> wrote: > Thanks for bringing up the issue, Fabian. > > I am also for unifying the access patterns between the FieldAccessor and > ExpressionKeys logic. > > In terms of Fabian's suggestion to refactor the ExpressionKeys parsing > logic and rely on it in the FieldAccessors I think that is nice first step. > It would be great if we could provide even tighter integration though. > > On Fri, Oct 28, 2016 at 6:17 PM, Fabian Hueske <fhue...@gmail.com> wrote: > >> IMO, FieldAccessors are certainly valuable and should exist. >> I only want to make sure that there is a common syntax for field >> references. >> >> Maybe we can refactor the parsing code for field references in >> ExpressionKeys and expose it in static public methods that the >> FieldAccessors can call. >> >> Best, >> Fabian >> >> >> >> 2016-10-27 14:52 GMT+02:00 Gábor Gévay <gga...@gmail.com>: >> >> > Hello, >> > >> > Thanks Fabian for starting this discussion. >> > >> > I would just like to add a few thougths about why does the >> > FieldAccessors even exist. One could say that we should instead just >> > re-use ExpressionKeys for the aggregations, and we are done. As far as >> > I can see, the way ExpressionKeys is often used is roughly to call >> > computeLogicalKeyPositions on it, then call >> > CompositeType.createComparator with these key positions, and then we >> > can call extractKeys on the comparator to get the values of the fields >> > of a record. >> > >> > However, the problem with this is that in the aggregations we also >> > need to write fields and not just read them. So with the above >> > procedure, we would need an "intractKey" method on the comparators >> > that would be the opposite of extractKeys, i.e. which would set the >> > key fields. >> > >> > Another difference between the FieldAccessor world and the >> > ExpressionKeys world is that in the aggregations we need to deal with >> > only one field. This means that enhancing the comparators with the >> > exact opposite of extractKeys would be overkill, because we don't need >> > to set multiple fields at a time. >> > >> > It certainly seems to be a good idea to unify these at least >> > externally (from the side of the users), but I'm not sure what would >> > be a neat way to also internally unify them. >> > >> > Best, >> > Gábor >> > >> > >> > 2016-10-26 20:54 GMT+02:00 Fabian Hueske <fhue...@gmail.com>: >> > > Hi everybody, >> > > >> > > while reviewing PR #2094 [1] I noticed that the field reference syntax >> > for >> > > FieldAccessors is not compatible with the syntax supported for key >> > > definitions (ExpressionKeys) used in groupBy(), keyBy(), >> > > join().where().equalTo(), etc. >> > > >> > > FieldAccessors are only used for build-in aggregations in the >> DataStream >> > > API (sum(), min(), max(), ...). >> > > >> > > In particular I identified the following inconsistencies: >> > > >> > > - FieldAccessors allow to address array cells. ExpressionKeys treat >> > arrays >> > > as AtomicTypes (Array TypeInfos do not extend CompositeType). Hence, >> it >> > is >> > > not possible to address array cells. >> > > - ExpressionKeys do only support Integer keys for tuples. An atomic >> type >> > > can only be addressed with "*". FieldAccessors allow to address >> > AtomicTypes >> > > with 0 in addtion to "*". >> > > - ExpressionKeys support to address fields of Java tuples with "f2" >> and >> > > Scala tuple fields with "_3". FieldAccessors do not support the "f" or >> > "_" >> > > prefix. >> > > >> > > I would like to propose to adapt the syntax of both mechanisms >> (ideally, >> > > both should use the same code for validation). IMO, the ExpressionKey >> > > syntax much more widely used and is well designed. Therefore, I would >> > adopt >> > > it for FieldAccessors as well. However, that would mean to restrict >> the >> > > syntax of the FieldAccessors and might break existing code. >> > > >> > > What do others think? >> > > >> > > [1] https://github.com/apache/flink/pull/2094 >> > >> > >