+1 for approach #1


- Henry

On Thu, Jan 14, 2016 at 5:35 AM, Chiwan Park <chiwanp...@apache.org> wrote:

> I’m also for approach #1. Now is nice time to apply some API-breaking
> changes.
>
> > On Jan 14, 2016, at 1:19 AM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
> >
> > I’m also for Approach #1. I like simplifying things.
> >> On 13 Jan 2016, at 14:25, Vasiliki Kalavri <vasilikikala...@gmail.com>
> wrote:
> >>
> >> Hi,
> >>
> >> ​+1 for removing the Combinable annotation​. Approach 1 sounds like the
> >> best option to me.
> >>
> >>
> >> On 13 January 2016 at 14:11, Till Rohrmann <trohrm...@apache.org>
> wrote:
> >>
> >>> Hi Fabian,
> >>>
> >>> thanks for bringing this issue up. I agree with you that it would be
> nice
> >>> to remove the Combinable annotation if it is not really needed. Now if
> >>> people are not aware of the Combinable interface then they might think
> that
> >>> they are actually using combiners by simply implementing
> CombineFunction.
> >>>
> >>>
> >> ​has happened to me :S​
> >>
> >>
> >>
> >>> I would also be in favour of approach 1 combined with a migration guide
> >>> where we highlight possible problems with a faulty combine
> implementation.
> >>>
> >>>
> >> Migration guide is a ​good idea​!
> >>
> >>
> >>
> >>> Cheers,
> >>> Till
> >>> ​
> >>>
> >>>
> >> ​-Vasia.​
> >>
> >>
> >>
> >>> On Wed, Jan 13, 2016 at 1:53 PM, Fabian Hueske <fhue...@gmail.com>
> wrote:
> >>>
> >>>> Hi everybody,
> >>>>
> >>>>
> >>>>
> >>>> As part of the upcoming 1.0 release we are stabilizing Flink's APIs.
> >>>>
> >>>> I would like to start a discussion about removing the Combinable
> >>> annotation
> >>>> from the DataSet API.
> >>>>
> >>>>
> >>>>
> >>>> # The Current State:
> >>>>
> >>>> The DataSet API offers two interface for combine functions,
> >>> CombineFunction
> >>>> and GroupCombineFunction, which can be added to a GroupReduceFunctions
> >>>> (GRF).
> >>>>
> >>>>
> >>>> However, implementing a combine interface does not make a GRF
> combinable.
> >>>> In addition, the GRF class needs to be annotated with the Combinable
> >>>> annotation. The RichGroupReduceFunction has a default implementation
> of
> >>>> combine, which forwards the combine parameters to the reduce method.
> This
> >>>> default implementation is not used, unless the class that extends RGRF
> >>> has
> >>>> the Combinable annotation.
> >>>>
> >>>>
> >>>>
> >>>> In addition to the Combinable annotation, the DataSet API
> >>>> GroupReduceOperator features the setCombinable(boolean) method to
> >>> override
> >>>> a missing or existing Combinable annotation.
> >>>>
> >>>>
> >>>>
> >>>> # My Proposal:
> >>>>
> >>>> I propose to remove the Combinable annotation because it is not
> required
> >>>> and complicates the definition of combiners. Instead, the combine
> method
> >>> of
> >>>> a GroupReduceFunction should be executed if it implements one of the
> >>>> combine function interfaces. This would require to remove the default
> >>>> combine implementation of the RichGroupReduceFunction as well.
> >>>>
> >>>> This would be an API breaking change and has a few implications.
> >>>>
> >>>>
> >>>>
> >>>> # Possible Implementation:
> >>>>
> >>>> There are a few ways to implement this change.
> >>>>
> >>>> - Remove Combinable annotation or mark it deprecated (and keep effect)
> >>>>
> >>>> - Remove default combine method from RichGroupReduceFunction or
> deprecate
> >>>> it
> >>>>
> >>>>
> >>>>
> >>>> Approach 1:
> >>>> - Remove Combinable annotation
> >>>> - Remove default combine() method from RichGroupReduceFunction
> >>>> - Effect:
> >>>>   - All RichGroupReduceFunctions that do either have the Combinable
> >>>> annotation or implement the combine method do not compile anymore
> >>>>   - GroupReduceFunctions that have the Combinable annotation do not
> >>>> compile anymore
> >>>>   - GroupReduceFunctions that implement a combine interface without
> >>>> having the annotation (and not being set to combinable during program
> >>>> construction) will execute the previously not executed combine method.
> >>> This
> >>>> might change the behavior of the program. In worst case, the program
> >>> might
> >>>> silently produce wrong results (or crash) if the combiner
> implementation
> >>>> was faulty. In best case, the program executes faster.
> >>>>
> >>>>
> >>>>
> >>>> Approach 2:
> >>>> - As Approach 1
> >>>> - In addition extend both combine interfaces with a deprecated marker
> >>>> method. This will ensure that all functions that implement a
> combinable
> >>>> interface do not compile anymore and need to be fixed. This could
> prevent
> >>>> the silent failure as in Approach 1, but would also cause an
> additional
> >>> API
> >>>> breaking change once the deprecated marker method is removed again.
> >>>>
> >>>>
> >>>>
> >>>> Approach 3:
> >>>> - Mark Combinable annotation deprecated
> >>>> - Mark combine() method in RichGroupReduceFunction as deprecated
> >>>> - Effect:
> >>>>   - There'll be a couple of deprecation warnings.
> >>>>   - We face the same problem with silent failures as in Approach 1.
> >>>>   - We have to check if RichGroupReduceFunction's override combine or
> >>> not
> >>>> (can be done with reflection). If the method is not overridden we do
> not
> >>>> execute it (unless there is a Combinable annotation) and we are fine.
> If
> >>> it
> >>>> is overridden and no Combinable annotation has been defined, we have
> the
> >>>> same problem with silent failures as before.
> >>>>   - After we remove the deprecated annotation and method, we have the
> >>>> same effect as with Approach 1.
> >>>>
> >>>>
> >>>>
> >>>> There are more alternatives, but these are the most viable, IMO.
> >>>>
> >>>>
> >>>>
> >>>> I think, if we want to remove the combinable annotation, we should do
> it
> >>>> now.
> >>>>
> >>>> Given the three options, would go for Approach 1. Yes, breaks a lot of
> >>> code
> >>>> and yes there is the possibility of computing incorrect results.
> >>> Approach 2
> >>>> is safer but would mean another API breaking change in the future.
> >>> Approach
> >>>> 3 comes with fewer breaking changes but has the same problem of silent
> >>>> failures.
> >>>>
> >>>> IMO, the breaking API changes of Approach 1 are even desirable because
> >>> they
> >>>> will make users aware that this feature changed.
> >>>>
> >>>>
> >>>>
> >>>> What do you think?
> >>>>
> >>>>
> >>>>
> >>>> Cheers, Fabian
> >>>>
>
> Regards,
> Chiwan Park
>
>
>

Reply via email to