I've also had a look. +1 to merge from my side.
On Tue, Dec 15, 2015 at 3:27 PM, Robert Metzger <rmetz...@apache.org> wrote: > The PR has been open for a while, Stephan, Aljoscha and Henry looked over > it and I've addressed all comments by them: > https://github.com/apache/flink/pull/1427 > > I would like to merge the pull request adding the "flink-annotations" > module and annotating all classes in "flink-core" soon (say, next 24-48 > hours). > We can still make changes once its merged, but I would like to get this one > done and start discussing the remaining modules and merge them as well > afterwards. > > > > On Tue, Dec 1, 2015 at 2:20 PM, Robert Metzger <rmetz...@apache.org> wrote: > >> I left out the classes in memory except for the Input/Output views. >> >> Or course, we should try to minimize the stable classes, but user programs >> get a lot of extension points in Flink ;) >> >> I've opened a pull request with my current suggestion: >> https://github.com/apache/flink/pull/1426 >> >> >> On Tue, Dec 1, 2015 at 2:13 PM, Maximilian Michels <m...@apache.org> wrote: >> >>> Thanks for the explanation. I was just wondering how you approached >>> the problem. Going through class-by-class makes sense. >>> >>> Not sure whether we want to make "org.apache.flink.core.memory" and >>> "org.apache.flink.api.common.distributions" stable interfaces. I would >>> think these qualify more as experimental public. >>> >>> Shouldn't we try to minimize the number of classes we declare stable? >>> Of course it is nice to offer a stable interface to developers but it >>> might also come in the way of the Flink developers.. >>> >>> >>> On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <rmetz...@apache.org> >>> wrote: >>> > Hi Max, >>> > >>> > classes in flink-scala are annotated as well, and its also in the list >>> :) >>> > >>> > I considered classes in flink-core, flink-runtime, flink-scala, >>> > flink-streaming-java, flink-streaming-scala, >>> > flink-connector-kafka, flink-connector-filesystem, flink-avro and >>> > flink-hadoop-compatibility. >>> > I think there is no clear definition for a public interface, that's why >>> I >>> > just decided on a class-by-class basis. >>> > >>> > Classes I left out / I was uncertain with: >>> > >>> > - org.apache.flink.api.common.distributions >>> > - only some Input/output classes in org.apache.flink.api.common.io >>> > - org.apache.flink.api.common.operators >>> > - only the TypeInformation in org.apache.flink.api.common.typeinfo >>> (not >>> > the Atomic, basic, integer, .. type infos) >>> > - most in org.apache.flink.core.memory (except Input/output view) >>> > - I didn’t add the parsers in org.apache.flink.types.parser >>> > >>> > >>> > >>> > >>> > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <m...@apache.org> >>> wrote: >>> > >>> >> Thank for your getting us started on annotating the API. The list >>> >> looks good so far. I have the feeling it could even be extended a bit. >>> >> Just curious, how did you choose which classes you annotate? Did you >>> >> go through all the classes in flink-core, flink-java, and >>> >> flink-clients Maven projects? >>> >> >>> >> What about flink-scala? Shouldn't it be annotated as well? >>> >> >>> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <rmetz...@apache.org> >>> >> wrote: >>> >> > Okay, I'll introduce an annotation for experimental interfaces and >>> I'll >>> >> > make everything we have deprecated experimental. >>> >> > >>> >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek < >>> aljos...@apache.org> >>> >> > wrote: >>> >> > >>> >> >> I still think we also need an annotation to mark public interfaces >>> as >>> >> >> experimental. For example for the windowing/triggers I would like >>> to use >>> >> >> that. >>> >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <rmetz...@apache.org> >>> wrote: >>> >> >> > >>> >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh script >>> to >>> >> see >>> >> >> > which tools are used. >>> >> >> > I think finding breaking API changes immediately is a better >>> process >>> >> then >>> >> >> > reworking the APIs before a release. >>> >> >> > >>> >> >> > As you can see from my email response times (2 days since your >>> email), >>> >> >> I'm >>> >> >> > probably too overloaded right now to participate in the Yetus >>> project >>> >> ... >>> >> >> > Sadly. >>> >> >> > >>> >> >> > >>> >> >> > @others: I know its not the most interesting thing to go through >>> my >>> >> list >>> >> >> of >>> >> >> > stable interfaces, but keep in mind that we have to maintain the >>> stuff >>> >> >> for >>> >> >> > probably quite some time, so it would be good to have more than >>> one >>> >> pair >>> >> >> of >>> >> >> > eyes looking at it :) >>> >> >> > >>> >> >> > >>> >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <ndimi...@gmail.com >>> > >>> >> >> wrote: >>> >> >> > >>> >> >> >>> >>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to >>> fail a >>> >> >> build >>> >> >> >> on >>> >> >> >>> breaking API changes? I would really like to have such a >>> >> functionality >>> >> >> in >>> >> >> >>> Flink, because we can spot breaking changes very early. >>> >> >> >> >>> >> >> >> >>> >> >> >> I don't think we have maven integration for this as of yet. We >>> >> release >>> >> >> >> managers run a script $HBASE/dev-support/check_compatibility.sh >>> that >>> >> >> >> generates a source and binary compatibility report. Issues are >>> then >>> >> >> >> resolved during the period leading up to the release candidate. >>> >> >> >> >>> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches from >>> JIRA >>> >> >> and >>> >> >> >>> then does these >>> >> >> >>> checks? >>> >> >> >>> >>> >> >> >> >>> >> >> >> The "QA bot" is just a collection of shell scripts used during >>> "Patch >>> >> >> >> Available" status when a patch has been attached to JIRA or when >>> a PR >>> >> >> has >>> >> >> >> been submitted through github. The check_compatibility script >>> could >>> >> be >>> >> >> >> included in that automation, I don't see why not. Maybe you'd >>> like to >>> >> >> open >>> >> >> >> a YETUS ticket :) >>> >> >> >> >>> >> >> >> I've pushed a branch to my own GitHub account with all classes I >>> >> would >>> >> >> make >>> >> >> >>> public annotated: >>> >> >> >>> >>> >> >> >>> >>> >> >> >> >>> >> >> >>> >> >>> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1 >>> >> >> >>> Since this is really hard to read, I (half-automated) generated >>> the >>> >> >> >>> following list of annotated classes: >>> >> >> >>> >>> >> >> >>> >>> >> >> >> >>> >> >> >>> >> >>> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md >>> >> >> >>> >>> >> >> >>> Please let me know if you would like to include or exclude >>> classes >>> >> from >>> >> >> >>> that list. >>> >> >> >>> Also, let me know which methods (in stable classes) you would >>> mark >>> >> as >>> >> >> >>> experimental. >>> >> >> >>> >>> >> >> >>> >>> >> >> >>> >>> >> >> >>> >>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek < >>> >> >> aljos...@apache.org> >>> >> >> >>> wrote: >>> >> >> >>> >>> >> >> >>>> +1 for some way of declaring public interfaces as experimental. >>> >> >> >>>> >>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <se...@apache.org> >>> wrote: >>> >> >> >>>>> >>> >> >> >>>>> I think we need anyways an annotation "@PublicExperimental". >>> >> >> >>>>> >>> >> >> >>>>> We can make this annotation such that it can be added to >>> methods >>> >> and >>> >> >> >>> can >>> >> >> >>>>> use that to declare >>> >> >> >>>>> Methods in an otherwise public class (such as DataSet) as >>> >> >> >> experimental. >>> >> >> >>>>> >>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske < >>> >> fhue...@gmail.com> >>> >> >> >>>> wrote: >>> >> >> >>>>> >>> >> >> >>>>>> I am not sure if we always should declare complete classes as >>> >> >> >>>>>> @PublicInterface. >>> >> >> >>>>>> This does definitely make sense for interfaces and abstract >>> >> classes >>> >> >> >>>> such as >>> >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes >>> such >>> >> as >>> >> >> >>>> DataSet >>> >> >> >>>>>> that we might want to extend by methods which should not >>> >> immediately >>> >> >> >>> be >>> >> >> >>>>>> considered as stable. >>> >> >> >>>>>> >>> >> >> >>>>>> >>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri < >>> >> >> >>> vasilikikala...@gmail.com >>> >> >> >>>>> : >>> >> >> >>>>>> >>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly API >>> >> frozen >>> >> >> >>> yet. >>> >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release, but >>> if >>> >> >> >>>> possible, >>> >> >> >>>>>> I >>> >> >> >>>>>>> would give it some more time. >>> >> >> >>>>>>> >>> >> >> >>>>>>> -V. >>> >> >> >>>>>>> >>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen < >>> se...@apache.org> >>> >> >> >> wrote: >>> >> >> >>>>>>> >>> >> >> >>>>>>>> I think no component should be forced to be stable. It >>> should >>> >> be >>> >> >> >> an >>> >> >> >>>>>>>> individual decision for each component, and in some cases >>> even >>> >> for >>> >> >> >>>>>>>> individual classes. >>> >> >> >>>>>>>> >>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared >>> >> interface-frozen, >>> >> >> >>>> then >>> >> >> >>>>>>>> this is a good point to raise and this should definitely be >>> >> >> >>> reflected. >>> >> >> >>>>>>>> There is no point in declaring certain APIs as frozen when >>> we >>> >> are >>> >> >> >>> not >>> >> >> >>>>>> yet >>> >> >> >>>>>>>> confident they have converged. >>> >> >> >>>>>>>> >>> >> >> >>>>>>>> >>> >> >> >>>>>>>> >>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri < >>> >> >> >>>>>>>> vasilikikala...@gmail.com >>> >> >> >>>>>>>>> wrote: >>> >> >> >>>>>>>> >>> >> >> >>>>>>>>> Hi Robert, >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>>> thanks for bringing this up! >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to >>> annotate the >>> >> >> >>> Gelly >>> >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm >>> quite >>> >> sure >>> >> >> >>>>>> we'll >>> >> >> >>>>>>>> find >>> >> >> >>>>>>>>> things to improve as it gets more exposure. >>> >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also e.g. >>> ML, >>> >> >> >> Table) >>> >> >> >>>>>> to >>> >> >> >>>>>>> a >>> >> >> >>>>>>>>> "1.0" status (from an API stability point of view) since >>> >> they're >>> >> >> >>>>>> really >>> >> >> >>>>>>>>> young compared to the other Flink APIs. >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>>> Cheers, >>> >> >> >>>>>>>>> Vasia. >>> >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" < >>> >> rmetz...@apache.org> >>> >> >> >>>>>> wrote: >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>>>> Hi everyone, >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> I would like to bring this discussion back to your >>> attention >>> >> as >>> >> >> >> we >>> >> >> >>>>>>> seem >>> >> >> >>>>>>>>> to >>> >> >> >>>>>>>>>> approach the 1.0 release of Flink. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all >>> classes, >>> >> but I >>> >> >> >>>>>>> think >>> >> >> >>>>>>>>>> it'll be more feasible to just annotate public classes. >>> >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such as: >>> >> DataSet, >>> >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat, >>> MapFunction, >>> >> >> >>>>>>> FileSystems >>> >> >> >>>>>>>>> but >>> >> >> >>>>>>>>>> also Gelly for example. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> I would not annotate as public components such as ML, >>> Storm >>> >> >> >>>>>>>>> compatibility, >>> >> >> >>>>>>>>>> internals from runtime, yarn, optimizer. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different >>> maven >>> >> >> >>>>>> plugins >>> >> >> >>>>>>>> and >>> >> >> >>>>>>>>>> java libraries and I found >>> https://github.com/siom79/japicmp >>> >> to >>> >> >> >>> be >>> >> >> >>>>>>> the >>> >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request >>> to the >>> >> >> >>>>>> project >>> >> >> >>>>>>>> to >>> >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on >>> annotations. >>> >> Lets >>> >> >> >>>>>> hope >>> >> >> >>>>>>> it >>> >> >> >>>>>>>>>> gets merged. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> Does everybody agree with adding just the >>> @PublicInterface >>> >> >> >>>>>>> annotation? >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level, >>> making >>> >> the >>> >> >> >>>>>> entire >>> >> >> >>>>>>>>> class >>> >> >> >>>>>>>>>> either public or private (from a stability point of >>> view). >>> >> If we >>> >> >> >>>>>>> need a >>> >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second >>> >> >> >>>>>>> @PrivateInterface >>> >> >> >>>>>>>>>> annotation which we'll only apply to certain methods. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request >>> with >>> >> all >>> >> >> >>>>>>> classes >>> >> >> >>>>>>>>>> annotated that I consider public. >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra < >>> >> >> >>>>>>>> henry.sapu...@gmail.com> >>> >> >> >>>>>>>>>> wrote: >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>>>> I like the idea. But would love to have different name >>> for >>> >> the >>> >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish. >>> >> >> >>>>>>>>>>> How about "Module" or "Package" ? >>> >> >> >>>>>>>>>>> >>> >> >> >>>>>>>>>>> - Henry >>> >> >> >>>>>>>>>>> >>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger < >>> >> >> >>>>>>> rmetz...@apache.org >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>>>>> wrote: >>> >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the >>> different >>> >> >> >>>>>>>>> components >>> >> >> >>>>>>>>>> of >>> >> >> >>>>>>>>>>>> the project. >>> >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn"). >>> >> >> >>>>>>>>>>>> Here is a very good documentation on the topic: >>> >> >> >>>>>>>>>>>> >>> >> >> >>>>>>>>>>> >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>> >>> >> >> >>>>>>> >>> >> >> >>>>>> >>> >> >> >>>> >>> >> >> >>> >>> >> >> >> >>> >> >> >>> >> >>> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html >>> >> >> >>>>>>>>>>>> >>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander Alexandrov < >>> >> >> >>>>>>>>>>>> alexander.s.alexand...@gmail.com> wrote: >>> >> >> >>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and >>> >> >> >> LimitedPrivate, >>> >> >> >>>>>>> but >>> >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea. >>> >> >> >>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what these >>> tags >>> >> >> >>>>>>> actually >>> >> >> >>>>>>>>>> mean >>> >> >> >>>>>>>>>>> and >>> >> >> >>>>>>>>>>>>> add this meaning to the documentation. >>> >> >> >>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger < >>> >> >> >>>>>> rmetz...@apache.org >>> >> >> >>>>>>>> : >>> >> >> >>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> Hi, >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability and >>> >> >> >>>>>> audience >>> >> >> >>>>>>> of >>> >> >> >>>>>>>>>>> classes >>> >> >> >>>>>>>>>>>>>> and methods. >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> Through that, you can have @InterfaceAudience.Public, >>> >> >> >>>>>> Private, >>> >> >> >>>>>>>>>>>>>> LimitedPrivate >>> >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable, and >>> >> Stable. >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to automatically >>> >> check >>> >> >> >>>>>> if >>> >> >> >>>>>>>>>>> interfaces, >>> >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed between >>> >> >> >>>>>> releases >>> >> >> >>>>>>>> (or >>> >> >> >>>>>>>>> by >>> >> >> >>>>>>>>>>> pull >>> >> >> >>>>>>>>>>>>>> requests). >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to >>> >> guarantee >>> >> >> >>>>>>>>>> interface >>> >> >> >>>>>>>>>>>>>> stability between releases. >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those annotations? >>> Which >>> >> >> >>>>>> one >>> >> >> >>>>>>>>> would >>> >> >> >>>>>>>>>>> you >>> >> >> >>>>>>>>>>>>>> like to add? >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>>> Robert >>> >> >> >>>>>>>>>>>>>> >>> >> >> >>>>>>>>>>>>> >>> >> >> >>>>>>>>>>> >>> >> >> >>>>>>>>>> >>> >> >> >>>>>>>>> >>> >> >> >>>>>>>> >>> >> >> >>>>>>> >>> >> >> >>>>>> >>> >> >> >>>> >>> >> >> >>>> >>> >> >> >>> >>> >> >> >> >>> >> >> >>> >> >> >>> >> >>> >> >>