+1 for approach #1
- Henry On Thu, Jan 14, 2016 at 5:35 AM, Chiwan Park <chiwanp...@apache.org> wrote: > I’m also for approach #1. Now is nice time to apply some API-breaking > changes. > > > On Jan 14, 2016, at 1:19 AM, Aljoscha Krettek <aljos...@apache.org> > wrote: > > > > I’m also for Approach #1. I like simplifying things. > >> On 13 Jan 2016, at 14:25, Vasiliki Kalavri <vasilikikala...@gmail.com> > wrote: > >> > >> Hi, > >> > >> +1 for removing the Combinable annotation. Approach 1 sounds like the > >> best option to me. > >> > >> > >> On 13 January 2016 at 14:11, Till Rohrmann <trohrm...@apache.org> > wrote: > >> > >>> Hi Fabian, > >>> > >>> thanks for bringing this issue up. I agree with you that it would be > nice > >>> to remove the Combinable annotation if it is not really needed. Now if > >>> people are not aware of the Combinable interface then they might think > that > >>> they are actually using combiners by simply implementing > CombineFunction. > >>> > >>> > >> has happened to me :S > >> > >> > >> > >>> I would also be in favour of approach 1 combined with a migration guide > >>> where we highlight possible problems with a faulty combine > implementation. > >>> > >>> > >> Migration guide is a good idea! > >> > >> > >> > >>> Cheers, > >>> Till > >>> > >>> > >>> > >> -Vasia. > >> > >> > >> > >>> On Wed, Jan 13, 2016 at 1:53 PM, Fabian Hueske <fhue...@gmail.com> > wrote: > >>> > >>>> Hi everybody, > >>>> > >>>> > >>>> > >>>> As part of the upcoming 1.0 release we are stabilizing Flink's APIs. > >>>> > >>>> I would like to start a discussion about removing the Combinable > >>> annotation > >>>> from the DataSet API. > >>>> > >>>> > >>>> > >>>> # The Current State: > >>>> > >>>> The DataSet API offers two interface for combine functions, > >>> CombineFunction > >>>> and GroupCombineFunction, which can be added to a GroupReduceFunctions > >>>> (GRF). > >>>> > >>>> > >>>> However, implementing a combine interface does not make a GRF > combinable. > >>>> In addition, the GRF class needs to be annotated with the Combinable > >>>> annotation. The RichGroupReduceFunction has a default implementation > of > >>>> combine, which forwards the combine parameters to the reduce method. > This > >>>> default implementation is not used, unless the class that extends RGRF > >>> has > >>>> the Combinable annotation. > >>>> > >>>> > >>>> > >>>> In addition to the Combinable annotation, the DataSet API > >>>> GroupReduceOperator features the setCombinable(boolean) method to > >>> override > >>>> a missing or existing Combinable annotation. > >>>> > >>>> > >>>> > >>>> # My Proposal: > >>>> > >>>> I propose to remove the Combinable annotation because it is not > required > >>>> and complicates the definition of combiners. Instead, the combine > method > >>> of > >>>> a GroupReduceFunction should be executed if it implements one of the > >>>> combine function interfaces. This would require to remove the default > >>>> combine implementation of the RichGroupReduceFunction as well. > >>>> > >>>> This would be an API breaking change and has a few implications. > >>>> > >>>> > >>>> > >>>> # Possible Implementation: > >>>> > >>>> There are a few ways to implement this change. > >>>> > >>>> - Remove Combinable annotation or mark it deprecated (and keep effect) > >>>> > >>>> - Remove default combine method from RichGroupReduceFunction or > deprecate > >>>> it > >>>> > >>>> > >>>> > >>>> Approach 1: > >>>> - Remove Combinable annotation > >>>> - Remove default combine() method from RichGroupReduceFunction > >>>> - Effect: > >>>> - All RichGroupReduceFunctions that do either have the Combinable > >>>> annotation or implement the combine method do not compile anymore > >>>> - GroupReduceFunctions that have the Combinable annotation do not > >>>> compile anymore > >>>> - GroupReduceFunctions that implement a combine interface without > >>>> having the annotation (and not being set to combinable during program > >>>> construction) will execute the previously not executed combine method. > >>> This > >>>> might change the behavior of the program. In worst case, the program > >>> might > >>>> silently produce wrong results (or crash) if the combiner > implementation > >>>> was faulty. In best case, the program executes faster. > >>>> > >>>> > >>>> > >>>> Approach 2: > >>>> - As Approach 1 > >>>> - In addition extend both combine interfaces with a deprecated marker > >>>> method. This will ensure that all functions that implement a > combinable > >>>> interface do not compile anymore and need to be fixed. This could > prevent > >>>> the silent failure as in Approach 1, but would also cause an > additional > >>> API > >>>> breaking change once the deprecated marker method is removed again. > >>>> > >>>> > >>>> > >>>> Approach 3: > >>>> - Mark Combinable annotation deprecated > >>>> - Mark combine() method in RichGroupReduceFunction as deprecated > >>>> - Effect: > >>>> - There'll be a couple of deprecation warnings. > >>>> - We face the same problem with silent failures as in Approach 1. > >>>> - We have to check if RichGroupReduceFunction's override combine or > >>> not > >>>> (can be done with reflection). If the method is not overridden we do > not > >>>> execute it (unless there is a Combinable annotation) and we are fine. > If > >>> it > >>>> is overridden and no Combinable annotation has been defined, we have > the > >>>> same problem with silent failures as before. > >>>> - After we remove the deprecated annotation and method, we have the > >>>> same effect as with Approach 1. > >>>> > >>>> > >>>> > >>>> There are more alternatives, but these are the most viable, IMO. > >>>> > >>>> > >>>> > >>>> I think, if we want to remove the combinable annotation, we should do > it > >>>> now. > >>>> > >>>> Given the three options, would go for Approach 1. Yes, breaks a lot of > >>> code > >>>> and yes there is the possibility of computing incorrect results. > >>> Approach 2 > >>>> is safer but would mean another API breaking change in the future. > >>> Approach > >>>> 3 comes with fewer breaking changes but has the same problem of silent > >>>> failures. > >>>> > >>>> IMO, the breaking API changes of Approach 1 are even desirable because > >>> they > >>>> will make users aware that this feature changed. > >>>> > >>>> > >>>> > >>>> What do you think? > >>>> > >>>> > >>>> > >>>> Cheers, Fabian > >>>> > > Regards, > Chiwan Park > > >