OK, I think we got a clear picture here. I'll update the corresponding JIRA issue FLINK-1045.
Thanks, Fabian 2016-01-14 18:53 GMT+01:00 Henry Saputra <henry.sapu...@gmail.com>: > +1 for approach #1 > > > > - Henry > > On Thu, Jan 14, 2016 at 5:35 AM, Chiwan Park <chiwanp...@apache.org> > wrote: > > > I’m also for approach #1. Now is nice time to apply some API-breaking > > changes. > > > > > On Jan 14, 2016, at 1:19 AM, Aljoscha Krettek <aljos...@apache.org> > > wrote: > > > > > > I’m also for Approach #1. I like simplifying things. > > >> On 13 Jan 2016, at 14:25, Vasiliki Kalavri <vasilikikala...@gmail.com > > > > wrote: > > >> > > >> Hi, > > >> > > >> +1 for removing the Combinable annotation. Approach 1 sounds like > the > > >> best option to me. > > >> > > >> > > >> On 13 January 2016 at 14:11, Till Rohrmann <trohrm...@apache.org> > > wrote: > > >> > > >>> Hi Fabian, > > >>> > > >>> thanks for bringing this issue up. I agree with you that it would be > > nice > > >>> to remove the Combinable annotation if it is not really needed. Now > if > > >>> people are not aware of the Combinable interface then they might > think > > that > > >>> they are actually using combiners by simply implementing > > CombineFunction. > > >>> > > >>> > > >> has happened to me :S > > >> > > >> > > >> > > >>> I would also be in favour of approach 1 combined with a migration > guide > > >>> where we highlight possible problems with a faulty combine > > implementation. > > >>> > > >>> > > >> Migration guide is a good idea! > > >> > > >> > > >> > > >>> Cheers, > > >>> Till > > >>> > > >>> > > >>> > > >> -Vasia. > > >> > > >> > > >> > > >>> On Wed, Jan 13, 2016 at 1:53 PM, Fabian Hueske <fhue...@gmail.com> > > wrote: > > >>> > > >>>> Hi everybody, > > >>>> > > >>>> > > >>>> > > >>>> As part of the upcoming 1.0 release we are stabilizing Flink's APIs. > > >>>> > > >>>> I would like to start a discussion about removing the Combinable > > >>> annotation > > >>>> from the DataSet API. > > >>>> > > >>>> > > >>>> > > >>>> # The Current State: > > >>>> > > >>>> The DataSet API offers two interface for combine functions, > > >>> CombineFunction > > >>>> and GroupCombineFunction, which can be added to a > GroupReduceFunctions > > >>>> (GRF). > > >>>> > > >>>> > > >>>> However, implementing a combine interface does not make a GRF > > combinable. > > >>>> In addition, the GRF class needs to be annotated with the Combinable > > >>>> annotation. The RichGroupReduceFunction has a default implementation > > of > > >>>> combine, which forwards the combine parameters to the reduce method. > > This > > >>>> default implementation is not used, unless the class that extends > RGRF > > >>> has > > >>>> the Combinable annotation. > > >>>> > > >>>> > > >>>> > > >>>> In addition to the Combinable annotation, the DataSet API > > >>>> GroupReduceOperator features the setCombinable(boolean) method to > > >>> override > > >>>> a missing or existing Combinable annotation. > > >>>> > > >>>> > > >>>> > > >>>> # My Proposal: > > >>>> > > >>>> I propose to remove the Combinable annotation because it is not > > required > > >>>> and complicates the definition of combiners. Instead, the combine > > method > > >>> of > > >>>> a GroupReduceFunction should be executed if it implements one of the > > >>>> combine function interfaces. This would require to remove the > default > > >>>> combine implementation of the RichGroupReduceFunction as well. > > >>>> > > >>>> This would be an API breaking change and has a few implications. > > >>>> > > >>>> > > >>>> > > >>>> # Possible Implementation: > > >>>> > > >>>> There are a few ways to implement this change. > > >>>> > > >>>> - Remove Combinable annotation or mark it deprecated (and keep > effect) > > >>>> > > >>>> - Remove default combine method from RichGroupReduceFunction or > > deprecate > > >>>> it > > >>>> > > >>>> > > >>>> > > >>>> Approach 1: > > >>>> - Remove Combinable annotation > > >>>> - Remove default combine() method from RichGroupReduceFunction > > >>>> - Effect: > > >>>> - All RichGroupReduceFunctions that do either have the Combinable > > >>>> annotation or implement the combine method do not compile anymore > > >>>> - GroupReduceFunctions that have the Combinable annotation do not > > >>>> compile anymore > > >>>> - GroupReduceFunctions that implement a combine interface without > > >>>> having the annotation (and not being set to combinable during > program > > >>>> construction) will execute the previously not executed combine > method. > > >>> This > > >>>> might change the behavior of the program. In worst case, the program > > >>> might > > >>>> silently produce wrong results (or crash) if the combiner > > implementation > > >>>> was faulty. In best case, the program executes faster. > > >>>> > > >>>> > > >>>> > > >>>> Approach 2: > > >>>> - As Approach 1 > > >>>> - In addition extend both combine interfaces with a deprecated > marker > > >>>> method. This will ensure that all functions that implement a > > combinable > > >>>> interface do not compile anymore and need to be fixed. This could > > prevent > > >>>> the silent failure as in Approach 1, but would also cause an > > additional > > >>> API > > >>>> breaking change once the deprecated marker method is removed again. > > >>>> > > >>>> > > >>>> > > >>>> Approach 3: > > >>>> - Mark Combinable annotation deprecated > > >>>> - Mark combine() method in RichGroupReduceFunction as deprecated > > >>>> - Effect: > > >>>> - There'll be a couple of deprecation warnings. > > >>>> - We face the same problem with silent failures as in Approach 1. > > >>>> - We have to check if RichGroupReduceFunction's override combine > or > > >>> not > > >>>> (can be done with reflection). If the method is not overridden we do > > not > > >>>> execute it (unless there is a Combinable annotation) and we are > fine. > > If > > >>> it > > >>>> is overridden and no Combinable annotation has been defined, we have > > the > > >>>> same problem with silent failures as before. > > >>>> - After we remove the deprecated annotation and method, we have > the > > >>>> same effect as with Approach 1. > > >>>> > > >>>> > > >>>> > > >>>> There are more alternatives, but these are the most viable, IMO. > > >>>> > > >>>> > > >>>> > > >>>> I think, if we want to remove the combinable annotation, we should > do > > it > > >>>> now. > > >>>> > > >>>> Given the three options, would go for Approach 1. Yes, breaks a lot > of > > >>> code > > >>>> and yes there is the possibility of computing incorrect results. > > >>> Approach 2 > > >>>> is safer but would mean another API breaking change in the future. > > >>> Approach > > >>>> 3 comes with fewer breaking changes but has the same problem of > silent > > >>>> failures. > > >>>> > > >>>> IMO, the breaking API changes of Approach 1 are even desirable > because > > >>> they > > >>>> will make users aware that this feature changed. > > >>>> > > >>>> > > >>>> > > >>>> What do you think? > > >>>> > > >>>> > > >>>> > > >>>> Cheers, Fabian > > >>>> > > > > Regards, > > Chiwan Park > > > > > > >