Thanks for the explanation. I was just wondering how you approached the problem. Going through class-by-class makes sense.
Not sure whether we want to make "org.apache.flink.core.memory" and "org.apache.flink.api.common.distributions" stable interfaces. I would think these qualify more as experimental public. Shouldn't we try to minimize the number of classes we declare stable? Of course it is nice to offer a stable interface to developers but it might also come in the way of the Flink developers.. On Mon, Nov 30, 2015 at 10:31 AM, Robert Metzger <rmetz...@apache.org> wrote: > Hi Max, > > classes in flink-scala are annotated as well, and its also in the list :) > > I considered classes in flink-core, flink-runtime, flink-scala, > flink-streaming-java, flink-streaming-scala, > flink-connector-kafka, flink-connector-filesystem, flink-avro and > flink-hadoop-compatibility. > I think there is no clear definition for a public interface, that's why I > just decided on a class-by-class basis. > > Classes I left out / I was uncertain with: > > - org.apache.flink.api.common.distributions > - only some Input/output classes in org.apache.flink.api.common.io > - org.apache.flink.api.common.operators > - only the TypeInformation in org.apache.flink.api.common.typeinfo (not > the Atomic, basic, integer, .. type infos) > - most in org.apache.flink.core.memory (except Input/output view) > - I didn’t add the parsers in org.apache.flink.types.parser > > > > > On Mon, Nov 30, 2015 at 10:19 AM, Maximilian Michels <m...@apache.org> wrote: > >> Thank for your getting us started on annotating the API. The list >> looks good so far. I have the feeling it could even be extended a bit. >> Just curious, how did you choose which classes you annotate? Did you >> go through all the classes in flink-core, flink-java, and >> flink-clients Maven projects? >> >> What about flink-scala? Shouldn't it be annotated as well? >> >> On Fri, Nov 27, 2015 at 4:47 PM, Robert Metzger <rmetz...@apache.org> >> wrote: >> > Okay, I'll introduce an annotation for experimental interfaces and I'll >> > make everything we have deprecated experimental. >> > >> > On Fri, Nov 27, 2015 at 10:40 AM, Aljoscha Krettek <aljos...@apache.org> >> > wrote: >> > >> >> I still think we also need an annotation to mark public interfaces as >> >> experimental. For example for the windowing/triggers I would like to use >> >> that. >> >> > On 25 Nov 2015, at 01:23, Robert Metzger <rmetz...@apache.org> wrote: >> >> > >> >> > Thank you Nick. I'll look into the check_compatiblilty.sh script to >> see >> >> > which tools are used. >> >> > I think finding breaking API changes immediately is a better process >> then >> >> > reworking the APIs before a release. >> >> > >> >> > As you can see from my email response times (2 days since your email), >> >> I'm >> >> > probably too overloaded right now to participate in the Yetus project >> ... >> >> > Sadly. >> >> > >> >> > >> >> > @others: I know its not the most interesting thing to go through my >> list >> >> of >> >> > stable interfaces, but keep in mind that we have to maintain the stuff >> >> for >> >> > probably quite some time, so it would be good to have more than one >> pair >> >> of >> >> > eyes looking at it :) >> >> > >> >> > >> >> > On Mon, Nov 23, 2015 at 6:20 PM, Nick Dimiduk <ndimi...@gmail.com> >> >> wrote: >> >> > >> >> >>> >> >> >>> Do you know if Hadoop/HBase is also using a maven plugin to fail a >> >> build >> >> >> on >> >> >>> breaking API changes? I would really like to have such a >> functionality >> >> in >> >> >>> Flink, because we can spot breaking changes very early. >> >> >> >> >> >> >> >> >> I don't think we have maven integration for this as of yet. We >> release >> >> >> managers run a script $HBASE/dev-support/check_compatibility.sh that >> >> >> generates a source and binary compatibility report. Issues are then >> >> >> resolved during the period leading up to the release candidate. >> >> >> >> >> >> I think Hadoop is relying on a "QA bot" which reads patches from JIRA >> >> and >> >> >>> then does these >> >> >>> checks? >> >> >>> >> >> >> >> >> >> The "QA bot" is just a collection of shell scripts used during "Patch >> >> >> Available" status when a patch has been attached to JIRA or when a PR >> >> has >> >> >> been submitted through github. The check_compatibility script could >> be >> >> >> included in that automation, I don't see why not. Maybe you'd like to >> >> open >> >> >> a YETUS ticket :) >> >> >> >> >> >> I've pushed a branch to my own GitHub account with all classes I >> would >> >> make >> >> >>> public annotated: >> >> >>> >> >> >>> >> >> >> >> >> >> https://github.com/apache/flink/compare/master...rmetzger:interface_stability_revapi?expand=1 >> >> >>> Since this is really hard to read, I (half-automated) generated the >> >> >>> following list of annotated classes: >> >> >>> >> >> >>> >> >> >> >> >> >> https://github.com/rmetzger/flink/blob/interface_stability_revapi/annotations.md >> >> >>> >> >> >>> Please let me know if you would like to include or exclude classes >> from >> >> >>> that list. >> >> >>> Also, let me know which methods (in stable classes) you would mark >> as >> >> >>> experimental. >> >> >>> >> >> >>> >> >> >>> >> >> >>> >> >> >>> On Wed, Nov 11, 2015 at 10:17 AM, Aljoscha Krettek < >> >> aljos...@apache.org> >> >> >>> wrote: >> >> >>> >> >> >>>> +1 for some way of declaring public interfaces as experimental. >> >> >>>> >> >> >>>>> On 10 Nov 2015, at 22:24, Stephan Ewen <se...@apache.org> wrote: >> >> >>>>> >> >> >>>>> I think we need anyways an annotation "@PublicExperimental". >> >> >>>>> >> >> >>>>> We can make this annotation such that it can be added to methods >> and >> >> >>> can >> >> >>>>> use that to declare >> >> >>>>> Methods in an otherwise public class (such as DataSet) as >> >> >> experimental. >> >> >>>>> >> >> >>>>> On Tue, Nov 10, 2015 at 10:19 PM, Fabian Hueske < >> fhue...@gmail.com> >> >> >>>> wrote: >> >> >>>>> >> >> >>>>>> I am not sure if we always should declare complete classes as >> >> >>>>>> @PublicInterface. >> >> >>>>>> This does definitely make sense for interfaces and abstract >> classes >> >> >>>> such as >> >> >>>>>> MapFunction or InputFormat but not necessarily for classes such >> as >> >> >>>> DataSet >> >> >>>>>> that we might want to extend by methods which should not >> immediately >> >> >>> be >> >> >>>>>> considered as stable. >> >> >>>>>> >> >> >>>>>> >> >> >>>>>> 2015-11-10 21:36 GMT+01:00 Vasiliki Kalavri < >> >> >>> vasilikikala...@gmail.com >> >> >>>>> : >> >> >>>>>> >> >> >>>>>>> Yes, my opinion is that we shouldn't declare the Gelly API >> frozen >> >> >>> yet. >> >> >>>>>>> We can reconsider when we're closer to the 1.0 release, but if >> >> >>>> possible, >> >> >>>>>> I >> >> >>>>>>> would give it some more time. >> >> >>>>>>> >> >> >>>>>>> -V. >> >> >>>>>>> >> >> >>>>>>> On 10 November 2015 at 21:06, Stephan Ewen <se...@apache.org> >> >> >> wrote: >> >> >>>>>>> >> >> >>>>>>>> I think no component should be forced to be stable. It should >> be >> >> >> an >> >> >>>>>>>> individual decision for each component, and in some cases even >> for >> >> >>>>>>>> individual classes. >> >> >>>>>>>> >> >> >>>>>>>> @Vasia If you think Gelly should not be declared >> interface-frozen, >> >> >>>> then >> >> >>>>>>>> this is a good point to raise and this should definitely be >> >> >>> reflected. >> >> >>>>>>>> There is no point in declaring certain APIs as frozen when we >> are >> >> >>> not >> >> >>>>>> yet >> >> >>>>>>>> confident they have converged. >> >> >>>>>>>> >> >> >>>>>>>> >> >> >>>>>>>> >> >> >>>>>>>> On Tue, Nov 10, 2015 at 8:39 PM, Vasiliki Kalavri < >> >> >>>>>>>> vasilikikala...@gmail.com >> >> >>>>>>>>> wrote: >> >> >>>>>>>> >> >> >>>>>>>>> Hi Robert, >> >> >>>>>>>>> >> >> >>>>>>>>> thanks for bringing this up! >> >> >>>>>>>>> >> >> >>>>>>>>> I generally like the idea, but I wouldn't rush to annotate the >> >> >>> Gelly >> >> >>>>>>>>> classes yet. Gelly hasn't had that many users and I'm quite >> sure >> >> >>>>>> we'll >> >> >>>>>>>> find >> >> >>>>>>>>> things to improve as it gets more exposure. >> >> >>>>>>>>> TBH, I think it's quite unfair to force Gelly (also e.g. ML, >> >> >> Table) >> >> >>>>>> to >> >> >>>>>>> a >> >> >>>>>>>>> "1.0" status (from an API stability point of view) since >> they're >> >> >>>>>> really >> >> >>>>>>>>> young compared to the other Flink APIs. >> >> >>>>>>>>> >> >> >>>>>>>>> Cheers, >> >> >>>>>>>>> Vasia. >> >> >>>>>>>>> On Nov 10, 2015 8:04 PM, "Robert Metzger" < >> rmetz...@apache.org> >> >> >>>>>> wrote: >> >> >>>>>>>>> >> >> >>>>>>>>>> Hi everyone, >> >> >>>>>>>>>> >> >> >>>>>>>>>> I would like to bring this discussion back to your attention >> as >> >> >> we >> >> >>>>>>> seem >> >> >>>>>>>>> to >> >> >>>>>>>>>> approach the 1.0 release of Flink. >> >> >>>>>>>>>> >> >> >>>>>>>>>> My suggestion back in January was to annotate all classes, >> but I >> >> >>>>>>> think >> >> >>>>>>>>>> it'll be more feasible to just annotate public classes. >> >> >>>>>>>>>> So how about adding an annotation @PublicInterface >> >> >>>>>>>>>> >> >> >>>>>>>>>> For @PublicInterface, I would annotate classes such as: >> DataSet, >> >> >>>>>>>>>> DataStream, ExecutionEnvironment, InputFormat, MapFunction, >> >> >>>>>>> FileSystems >> >> >>>>>>>>> but >> >> >>>>>>>>>> also Gelly for example. >> >> >>>>>>>>>> >> >> >>>>>>>>>> I would not annotate as public components such as ML, Storm >> >> >>>>>>>>> compatibility, >> >> >>>>>>>>>> internals from runtime, yarn, optimizer. >> >> >>>>>>>>>> >> >> >>>>>>>>>> >> >> >>>>>>>>>> From a tooling perspective, I've looked into different maven >> >> >>>>>> plugins >> >> >>>>>>>> and >> >> >>>>>>>>>> java libraries and I found https://github.com/siom79/japicmp >> to >> >> >>> be >> >> >>>>>>> the >> >> >>>>>>>>>> closest to our needs. I actually opened a pull request to the >> >> >>>>>> project >> >> >>>>>>>> to >> >> >>>>>>>>>> allow inclusion/exclusion of classes based on annotations. >> Lets >> >> >>>>>> hope >> >> >>>>>>> it >> >> >>>>>>>>>> gets merged. >> >> >>>>>>>>>> >> >> >>>>>>>>>> Does everybody agree with adding just the @PublicInterface >> >> >>>>>>> annotation? >> >> >>>>>>>>>> >> >> >>>>>>>>>> Note that I'll add the annotation on a class-level, making >> the >> >> >>>>>> entire >> >> >>>>>>>>> class >> >> >>>>>>>>>> either public or private (from a stability point of view). >> If we >> >> >>>>>>> need a >> >> >>>>>>>>>> more fine-grained annotation, we have to add a second >> >> >>>>>>> @PrivateInterface >> >> >>>>>>>>>> annotation which we'll only apply to certain methods. >> >> >>>>>>>>>> >> >> >>>>>>>>>> The next step is that I'm going to open a pull request with >> all >> >> >>>>>>> classes >> >> >>>>>>>>>> annotated that I consider public. >> >> >>>>>>>>>> >> >> >>>>>>>>>> >> >> >>>>>>>>>> On Fri, Jan 30, 2015 at 7:10 PM, Henry Saputra < >> >> >>>>>>>> henry.sapu...@gmail.com> >> >> >>>>>>>>>> wrote: >> >> >>>>>>>>>> >> >> >>>>>>>>>>> I like the idea. But would love to have different name for >> the >> >> >>>>>>>>>>> "LimitedPrivate" to make it easier to distinguish. >> >> >>>>>>>>>>> How about "Module" or "Package" ? >> >> >>>>>>>>>>> >> >> >>>>>>>>>>> - Henry >> >> >>>>>>>>>>> >> >> >>>>>>>>>>> On Wed, Jan 28, 2015 at 1:29 AM, Robert Metzger < >> >> >>>>>>> rmetz...@apache.org >> >> >>>>>>>>> >> >> >>>>>>>>>>> wrote: >> >> >>>>>>>>>>>> I think in Hadoop they use LimitedPrivate for the different >> >> >>>>>>>>> components >> >> >>>>>>>>>> of >> >> >>>>>>>>>>>> the project. >> >> >>>>>>>>>>>> For example LimitedPrivate("yarn"). >> >> >>>>>>>>>>>> Here is a very good documentation on the topic: >> >> >>>>>>>>>>>> >> >> >>>>>>>>>>> >> >> >>>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>> >> >> >>>>>>> >> >> >>>>>> >> >> >>>> >> >> >>> >> >> >> >> >> >> https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/InterfaceClassification.html >> >> >>>>>>>>>>>> >> >> >>>>>>>>>>>> On Tue, Jan 27, 2015 at 3:58 PM, Alexander Alexandrov < >> >> >>>>>>>>>>>> alexander.s.alexand...@gmail.com> wrote: >> >> >>>>>>>>>>>> >> >> >>>>>>>>>>>>> I don't get the difference between Private and >> >> >> LimitedPrivate, >> >> >>>>>>> but >> >> >>>>>>>>>>>>> otherwise seems like quite a nice idea. >> >> >>>>>>>>>>>>> >> >> >>>>>>>>>>>>> It will be also good if we can agree upon what these tags >> >> >>>>>>> actually >> >> >>>>>>>>>> mean >> >> >>>>>>>>>>> and >> >> >>>>>>>>>>>>> add this meaning to the documentation. >> >> >>>>>>>>>>>>> >> >> >>>>>>>>>>>>> 2015-01-27 15:46 GMT+01:00 Robert Metzger < >> >> >>>>>> rmetz...@apache.org >> >> >>>>>>>> : >> >> >>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> Hi, >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> Hadoop has annotations for tagging the stability and >> >> >>>>>> audience >> >> >>>>>>> of >> >> >>>>>>>>>>> classes >> >> >>>>>>>>>>>>>> and methods. >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> Through that, you can have @InterfaceAudience.Public, >> >> >>>>>> Private, >> >> >>>>>>>>>>>>>> LimitedPrivate >> >> >>>>>>>>>>>>>> and also @InterfaceStability.Evolving, Unstable, and >> Stable. >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> I guess there are tools which allow to automatically >> check >> >> >>>>>> if >> >> >>>>>>>>>>> interfaces, >> >> >>>>>>>>>>>>>> which are marked as Stable have been changed between >> >> >>>>>> releases >> >> >>>>>>>> (or >> >> >>>>>>>>> by >> >> >>>>>>>>>>> pull >> >> >>>>>>>>>>>>>> requests). >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> I think such annotations are crucial if we want to >> guarantee >> >> >>>>>>>>>> interface >> >> >>>>>>>>>>>>>> stability between releases. >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> What do you think? Should we add those annotations? Which >> >> >>>>>> one >> >> >>>>>>>>> would >> >> >>>>>>>>>>> you >> >> >>>>>>>>>>>>>> like to add? >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>>> Robert >> >> >>>>>>>>>>>>>> >> >> >>>>>>>>>>>>> >> >> >>>>>>>>>>> >> >> >>>>>>>>>> >> >> >>>>>>>>> >> >> >>>>>>>> >> >> >>>>>>> >> >> >>>>>> >> >> >>>> >> >> >>>> >> >> >>> >> >> >> >> >> >> >> >>