It sounds like there is agreement in eliminating the experimental annotation. Should we stop using them in new code? Or should we do a pass to remove those annotations?
On Mon, Apr 17, 2023 at 11:24 AM Kenneth Knowles <k...@apache.org> wrote: > > > On Mon, Apr 17, 2023 at 9:34 AM Kerry Donny-Clark via dev < > dev@beam.apache.org> wrote: > >> +1 to eliminating @Experimental as a Beam level annotation. >> I think the main point is that if no one pays attention to such >> annotations, then they are only noise and deliver negative value. >> > > Yes. Consider these two scenarios > > 1. We change an "experimental" API that is widely used. This causes a pain > for many users. We would probably not do it, and we would catch it in code > review. > 2. We change a non-"experimental" API that is fairly new. This applies to > many APIs, since we rarely remember to annotate new APIs. This causes just > minor pain for just a few users. TBH I would be OK with this. Rigidity in > rejecting such changes just means your first draft is your final draft. Try > that in any other endeavor and see how it works for you :-) > > And it is worse than noise - there are some users who do pay attention to > the annotations and are not using things even though they are super safe. > That was the main reason I started this thread. The rest of my proposal was > just to try to recover some flexibility, but it seems too hard and no > immediate consensus on how/if we could manage it. > > Kenn > > PS I do agree with Kerry's PS and would love to have that discussion. > Perhaps separately, since it will start from square one either way. Every > time someone says "Beam 3.0" we should really be thinking "how can we > iterate". One big breaking version change doesn't work. > +1 - Thinking about "How can we iterate" would allow us to build something users' want in shorter timelines. > > > > Kerry >> >> PS- Kenn says " the point about the culture of stagnation came from my >> recent experiences as code reviewer where there was some idea that we >> couldn't change things even when they were plainly wrong and the change was >> plainly a fix." This seems like a major point that deserves a more focused >> discussion. >> >> On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev < >> dev@beam.apache.org> wrote: >> >>> I think we've been using the Java Experimental tags in two ways. >>> >>> * New APIs >>> * Any APIs that use specific features identified by pre-defined >>> experimental Kind types defined in [1] (for example, I/O connectors APIs >>> that use Beam Schemas). >>> >>> Removing the experimental tag has the effect of finalizing a number of >>> APIs we've been reluctant to call stable (for example, Beam Schemas, >>> portability, metrics related APIs). These APIs have been around for a long >>> time and I don't see them changing so probably this is the right thing to >>> do. But I just wanted to call it out. >>> >>> Thanks, >>> Cham >>> >>> [1] >>> https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48 >>> >>> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev <dev@beam.apache.org> >>> wrote: >>> >>>> >>>> >>>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles <k...@apache.org> >>>> wrote: >>>> >>>>> >>>>> Thanks for the discussion. Many good points. Probably just removing >>>>> all the annotations is a noop to users, and will solve the "afraid to use >>>>> experimental features" problem. >>>>> >>>>> Regarding stability, the capabilities of Java (and Python is much much >>>>> worse) make it infeasible to produce quality software with the rule "once >>>>> it is public it is frozen forever". But on the other hand, there isn't >>>>> much >>>>> of a practical alternative. Most projects just make breaking changes at >>>>> minor releases quite often, in my experience. I don't want to follow that >>>>> pattern, for sure. >>>>> >>>>> Regarding Danny's comment of not seeing this culture - check out any >>>>> of our more mature IOs, which all have very high cyclomatic complexity due >>>>> to never being significantly refactored. Adhering to in-place state >>>>> compatibility for update instead of focusing on blue/green deployment is >>>>> also a culprit here. I don't have examples to mind, but the point about >>>>> the >>>>> culture of stagnation came from my recent experiences as code >>>>> reviewer where there was some idea that we couldn't change things even >>>>> when >>>>> they were plainly wrong and the change was plainly a fix. >>>>> >>>>> Often, it comes from corners like triggered side inputs where we >>>>> simply never had a clear concept and so bringing things into alignment >>>>> with >>>>> a spec will break someone, by necessity. To be clear: I have not received >>>>> pushback on that one (yet). Some other examples are >>>>> https://s.apache.org/finishing-triggers-drop-data (breaking change >>>>> necessary to eliminate data loss risk) >>>>> https://github.com/apache/beam/issues/20528 (fix was too slow because >>>>> we were hesitant to commit a breaking fix) >>>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801 >>>>> (left unsafe API in place, applied doc-only fix). >>>>> >>>>> But indeed, of all the issues I raised, the customer concern with >>>>> `@Experimental` was the most important. We have had a few threads about it >>>>> in the past, too, and it hasn't gotten better. >>>>> >>>>> 1. It does not have the intended effect (making users OK with >>>>> evolving APIs and behavior to allow us to reach a high level of quality) >>>>> 2. It has an unintended effect (making users afraid to use things >>>>> which they should be happy to use) >>>>> 3. We don't use it consistently (many less-safe things are not >>>>> experimental, many totally stable things are experimental) >>>>> >>>>> Because of 3, if we don't have a feasible way to move to >>>>> "evolving/unstable by default" in a way that users know and are OK with, >>>>> then 1 is impossible. And so the only way to fix 2 is to just eliminate >>>>> the >>>>> annotation approach entirely and go with language conventions. >>>>> >>>> >>>> +1 to eliminating @Experimental as a Beam level annotation. That is the >>>> simplest approach that will get us to a consistent state, and it will align >>>> the goals and intentions of us with users'. >>>> >>>> >>>>> >>>>> Kenn >>>>> >>>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev < >>>>> dev@beam.apache.org> wrote: >>>>> >>>>>> I agree with Alexey and Byron. >>>>>> 1. We do not have any concrete evidence of our users paying attention >>>>>> to any of those annotations. Experimental API that were in that state >>>>>> for a >>>>>> long while are good examples. A possible exception is a deprecated >>>>>> annotation. My preference would be to simplify annotations to nothing >>>>>> (stable enough for use and will evolve backward compatibility), and maybe >>>>>> deprecated annotations. >>>>>> 2. If you all think that Experimental annotation is needed, Byron's >>>>>> suggestion (more or less what we do today) but with some concrete life >>>>>> cycle definitions of those annotations would be useful to our users. (An >>>>>> example could be: experimental APIs either need to graduate or be removed >>>>>> in X releases.) >>>>>> >>>>>> >>>>>> >>>>>> On Tue, Apr 4, 2023 at 9:01 AM Alexey Romanenko < >>>>>> aromanenko....@gmail.com> wrote: >>>>>> >>>>>>> Great and long-to-wait topic to discuss. >>>>>>> >>>>>>> My personal opinion based on what I saw on different open-source >>>>>>> projects is that all such annotations, like @Experimental or @Stable, >>>>>>> are >>>>>>> not usefull along the time and even rather useless and misleading. What >>>>>>> actually play roles is artifacts publishing and public API despite how >>>>>>> it >>>>>>> was annotated. Once a class/method was published and available for >>>>>>> users to >>>>>>> use, it should be considered as “stable" (even if it’s not yet stable >>>>>>> from >>>>>>> its developers point of view) and can’t be easily removed/changed in the >>>>>>> next releases. >>>>>>> >>>>>>> At Beam, we have a “good" example with @Experimental that was used >>>>>>> to annotate many parts of code in the beginning of its creation but then >>>>>>> perhaps forgotten to be removed whenever this code is already used by >>>>>>> many >>>>>>> users and API can’t be just changed despite of this annotation. >>>>>>> >>>>>>> So, I’m pro to dismiss such annotations and consider all public and >>>>>>> user-available API as “stable”. If it’s needed to change/remove a public >>>>>>> API then we should follow the procedure of API deprecation and final >>>>>>> removing, at least, after 3 major (x.y) Beam releases. It should help to >>>>>>> have the clear rules for API changes and avoiding breaking changes for >>>>>>> users. >>>>>>> >>>>>>> — >>>>>>> Alexey >>>>>>> >>>>>>> >>>>>>> On 3 Apr 2023, at 17:04, Byron Ellis via dev <dev@beam.apache.org> >>>>>>> wrote: >>>>>>> >>>>>>> Honestly, I think APIs could be pretty simply defined if you think >>>>>>> of it in terms of the user: >>>>>>> >>>>>>> @Deprecated = this was either stable or evolve but the >>>>>>> functionality/interface will go away at a future date >>>>>>> >>>>>>> @Stable = the user of this API opting out of changes to >>>>>>> functionality and interface. For example, default options don't change >>>>>>> for >>>>>>> a transform annotated this way. >>>>>>> >>>>>>> Evolving (No Annotation) = the user is opting in to changes to >>>>>>> functionality but not to interface. We should generally try to write >>>>>>> backwards compatible code, but on the other hand the release model does >>>>>>> not >>>>>>> force users into an upgrade >>>>>>> >>>>>>> @Experimental = this functionality / interface might be a bad idea >>>>>>> and could go away at any time >>>>>>> >>>>>>> >>>>>>> On Mon, Apr 3, 2023 at 7:22 AM Danny McCormick via dev < >>>>>>> dev@beam.apache.org> wrote: >>>>>>> >>>>>>>> *;tldr - I'd like "evolving" to be further defined, specifically >>>>>>>> around how we will make decisions about breaking behavior and API >>>>>>>> changes* >>>>>>>> >>>>>>>> I don't particularly care what tags we use as long as they're well >>>>>>>> documented. With that said, I think the following framing needs to be >>>>>>>> documented with more definition to flesh out the underlying philosophy: >>>>>>>> >>>>>>>> *> - new code is changeable/evolving by default (so we don't have >>>>>>>> to always remember to annotate it) but users have confidence they can >>>>>>>> use >>>>>>>> it in production (because we have good software engineering practices)* >>>>>>>> >>>>>>>> * > - Experimental would be reserved for more risky things* >>>>>>>> * > - after we are confident an API is stable, because it has been >>>>>>>> the same across a couple releases, we mark it* >>>>>>>> >>>>>>>> Here, we have 3 classes of APIs - "experimental", "stable", and >>>>>>>> "evolving" (or alternately "undefined"). >>>>>>>> >>>>>>>> "Experimental" seems clear - we can make any changes we want. >>>>>>>> "Stable" is reasonably straightforward as well - we will only make >>>>>>>> non-breaking changes except in exceptional cases (e.g. security hole, >>>>>>>> total >>>>>>>> failure of functionality, etc...) >>>>>>>> >>>>>>>> With "evolving" is the idea that we can still make any changes we >>>>>>>> want, but we think it's less likely we'll need to? Are silent behavior >>>>>>>> changes acceptable here (my vote would be no)? What about breaking API >>>>>>>> changes (my vote would be rarely)? >>>>>>>> >>>>>>>> I think being able to change our APIs is an ok goal, but outside of >>>>>>>> a true experimental context we should still be weighing the cost of API >>>>>>>> changes against the benefit; we have a problem of people not updating >>>>>>>> to >>>>>>>> newer SDKs, and introducing more breaking changes will just exacerbate >>>>>>>> that >>>>>>>> problem. Maybe my concerns are just a consequence of me not really >>>>>>>> seeing >>>>>>>> the same things that you're seeing, specifically: "*I'm seeing a >>>>>>>> culture of being afraid to change things, even when it would be good >>>>>>>> for >>>>>>>> users, because our API surface area is far too large and not explicitly >>>>>>>> chosen.*" Mostly what I've seen is a healthy concern about making >>>>>>>> it hard for users to upgrade versions, but my view is probably just >>>>>>>> limited >>>>>>>> here. >>>>>>>> >>>>>>>> My ideal framing for "evolving" is: an *evolving* API can make >>>>>>>> breaking API changes between versions, but this will be rare and >>>>>>>> weighed >>>>>>>> against the cost of slowing users' upgrade process. All breaking >>>>>>>> changes >>>>>>>> will be communicated in our change log. An *evolving* API will not >>>>>>>> make silent behavior changes except in exceptional cases (e.g. >>>>>>>> patching a >>>>>>>> security gap, fixing total failures of functionality). >>>>>>>> >>>>>>>> Thanks, >>>>>>>> Danny >>>>>>>> >>>>>>>> On Mon, Apr 3, 2023 at 9:02 AM Jan Lukavský <je...@seznam.cz> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Hi, >>>>>>>>> >>>>>>>>> removing @Experimental and adding explicit @Stable annotation >>>>>>>>> makes >>>>>>>>> sense to me. FWIW, when we were designing Euphoria API, we adopted >>>>>>>>> the >>>>>>>>> following convention: >>>>>>>>> >>>>>>>>> - the default stability of "evolving", @Experimental for really >>>>>>>>> experimental code [1] >>>>>>>>> >>>>>>>>> - target @Audience of API [2] (pipeline author, runner, >>>>>>>>> internal, test) >>>>>>>>> >>>>>>>>> - and @StateComplexity of operators (PTransforms) [3] >>>>>>>>> >>>>>>>>> The last part is something that was planned to be used by tools >>>>>>>>> that can >>>>>>>>> analyze the Pipeline for performance or visualize which >>>>>>>>> transform(s) are >>>>>>>>> most state-consuming. But this ended only as plans. :) >>>>>>>>> >>>>>>>>> Jan >>>>>>>>> >>>>>>>>> [1] >>>>>>>>> >>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/stability/Experimental.java >>>>>>>>> >>>>>>>>> [2] >>>>>>>>> >>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/audience/Audience.java >>>>>>>>> >>>>>>>>> [3] >>>>>>>>> >>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/operator/StateComplexity.java >>>>>>>>> >>>>>>>>> >>>>>>>>> On 3/31/23 23:05, Kenneth Knowles wrote: >>>>>>>>> > Hi all, >>>>>>>>> > >>>>>>>>> > Long ago, we adopted two annotations in Beam to communicate to >>>>>>>>> users: >>>>>>>>> > >>>>>>>>> > - `@Experimental` indicates that an API might change >>>>>>>>> > - `@Internal` indicates that an API is not meant for users. >>>>>>>>> > >>>>>>>>> > I've seen some real problems with this approach: >>>>>>>>> > >>>>>>>>> > - Users are afraid to use `@Experimental` APIs, because they >>>>>>>>> are >>>>>>>>> > worried they are not production-ready. But it really just means >>>>>>>>> they >>>>>>>>> > might change, and has nothing to do with that. >>>>>>>>> > - People write new code and do not put `@Experimental` >>>>>>>>> annotations on >>>>>>>>> > it, even though it really should be able to change for a while, >>>>>>>>> so we >>>>>>>>> > can do a good job. >>>>>>>>> > - I'm seeing a culture of being afraid to change things, even >>>>>>>>> when it >>>>>>>>> > would be good for users, because our API surface area is far too >>>>>>>>> large >>>>>>>>> > and not explicitly chosen. >>>>>>>>> > - `@Internal` is not that well-known. And now we have many >>>>>>>>> target >>>>>>>>> > audiences: Beam devs, PTransform devs, tool devs, pipeline >>>>>>>>> authors. >>>>>>>>> > Some of them probably want to use `@Internal` stuff! >>>>>>>>> > >>>>>>>>> > I looked at a couple sibling projects and what they have >>>>>>>>> > - Flink: >>>>>>>>> > - Spark: >>>>>>>>> > >>>>>>>>> > They have many more tags, and some of them seem to have reverse >>>>>>>>> > defaults to Beam. >>>>>>>>> > >>>>>>>>> > Flink: >>>>>>>>> > >>>>>>>>> https://github.com/apache/flink/tree/master/flink-annotations/src/main/java/org/apache/flink/annotation >>>>>>>>> > >>>>>>>>> > - Experimental >>>>>>>>> > - Internal.java >>>>>>>>> > - Public >>>>>>>>> > - PublicEvolving >>>>>>>>> > - VisibleForTesting >>>>>>>>> > >>>>>>>>> > Spark: >>>>>>>>> > >>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/java/org/apache/spark/annotation >>>>>>>>> and >>>>>>>>> >>>>>>>>> > >>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/scala/org/apache/spark/annotation >>>>>>>>> > >>>>>>>>> > - AlphaComponent >>>>>>>>> > - DeveloperApi >>>>>>>>> > - Evolving >>>>>>>>> > - Experimental >>>>>>>>> > - Private >>>>>>>>> > - Stable >>>>>>>>> > - Unstable >>>>>>>>> > - Since >>>>>>>>> > >>>>>>>>> > I think it would help users to understand Beam with some simple, >>>>>>>>> > though possibly large-scale changes. My goal would be: >>>>>>>>> > >>>>>>>>> > - new code is changeable/evolving by default (so we don't have >>>>>>>>> to >>>>>>>>> > always remember to annotate it) but users have confidence they >>>>>>>>> can use >>>>>>>>> > it in production (because we have good software engineering >>>>>>>>> practices) >>>>>>>>> > - Experimental would be reserved for more risky things >>>>>>>>> > - after we are confident an API is stable, because it has been >>>>>>>>> the >>>>>>>>> > same across a couple releases, we mark it >>>>>>>>> > >>>>>>>>> > A concrete proposal to achieve this would be: >>>>>>>>> > >>>>>>>>> > - Add a @Stable annotation and use it as appropriate on our >>>>>>>>> primary APIs >>>>>>>>> > - [Possibly] add an @Evolving annotation that would also be the >>>>>>>>> default. >>>>>>>>> > - Remove most `@Experimental` annotations or change them to >>>>>>>>> `@Evolving` >>>>>>>>> > - Communicate about this (somehow). If possible, surface the >>>>>>>>> > `@Evolving` default in documentation. >>>>>>>>> > >>>>>>>>> > The last bit is the hardest. >>>>>>>>> > >>>>>>>>> > Kenn >>>>>>>>> >>>>>>>> >>>>>>>