Since this thread had a few different topics, I think I will just do a lazy consensus vote of the one thing that we seem to agree on - eliminate Experimental. If I have misunderstood, it is fine to raise it in the vote thread. If there are no objections I can make the crosscutting change.
Kenn On Tue, Apr 18, 2023 at 10:05 AM Robert Burke <rob...@frantil.com> wrote: > While this thread is beginning to move off topic, I think any real Beam > 3.0 effort largely should start with "what we know we're going to keep", > and what else a refined/simplified surface looks like for each SDK. But I'm > sure there's some known things to cut too. > > And that's ignoring any real breaking changes that would be healthy to > make for Beam. > > But critically, what are the concrete benefits to pipeline authors in such > a move. Otherwise it's self serving churn. > > > > On Tue, Apr 18, 2023, 9:08 AM Alexey Romanenko <aromanenko....@gmail.com> > wrote: > >> >> On 17 Apr 2023, at 21:14, Robert Burke <rob...@frantil.com> wrote: >> >> +1 on how to iterate without a Beam 3.0 >> >> Often that just means, write the new thing, "support both for a >> while",make it clear how to migrate to the new thing, and the next Major >> Version just drops everything that doesn't cut the mustard anymore. >> >> >> Exactly! If we are all agree with this process of >> adding/deprecating/removing new/old API then I think we need to add it into >> Beam documentation to make it clear for developers and users (if not yet). >> >> The only issue here is that we don’t do Major releases often (v2.0.0 is >> dated 2017-05-17). I think we even don’t have a public roadmap for that and >> we almost never discussed “what" Beam 3.x should be and, the most important >> question, “when” it will happen (sorry if I missed that). >> >> — >> Alexey >> >> >> >> On Mon, Apr 17, 2023, 11:54 AM Ahmet Altay via dev <dev@beam.apache.org> >> wrote: >> >>> It sounds like there is agreement in eliminating the >>> experimental annotation. Should we stop using them in new code? Or should >>> we do a pass to remove those annotations? >>> >>> On Mon, Apr 17, 2023 at 11:24 AM Kenneth Knowles <k...@apache.org> >>> wrote: >>> >>>> >>>> >>>> On Mon, Apr 17, 2023 at 9:34 AM Kerry Donny-Clark via dev < >>>> dev@beam.apache.org> wrote: >>>> >>>>> +1 to eliminating @Experimental as a Beam level annotation. >>>>> I think the main point is that if no one pays attention to such >>>>> annotations, then they are only noise and deliver negative value. >>>>> >>>> >>>> Yes. Consider these two scenarios >>>> >>>> 1. We change an "experimental" API that is widely used. This causes a >>>> pain for many users. We would probably not do it, and we would catch it in >>>> code review. >>>> 2. We change a non-"experimental" API that is fairly new. This applies >>>> to many APIs, since we rarely remember to annotate new APIs. This causes >>>> just minor pain for just a few users. TBH I would be OK with this. Rigidity >>>> in rejecting such changes just means your first draft is your final draft. >>>> Try that in any other endeavor and see how it works for you :-) >>>> >>>> And it is worse than noise - there are some users who do pay attention >>>> to the annotations and are not using things even though they are super >>>> safe. That was the main reason I started this thread. The rest of my >>>> proposal was just to try to recover some flexibility, but it seems too hard >>>> and no immediate consensus on how/if we could manage it. >>>> >>>> Kenn >>>> >>>> PS I do agree with Kerry's PS and would love to have that discussion. >>>> Perhaps separately, since it will start from square one either way. Every >>>> time someone says "Beam 3.0" we should really be thinking "how can we >>>> iterate". One big breaking version change doesn't work. >>>> >>> >>> +1 - Thinking about "How can we iterate" would allow us to build >>> something users' want in shorter timelines. >>> >>> >>>> >>>> >>>> >>>> Kerry >>>>> >>>>> PS- Kenn says " the point about the culture of stagnation came from my >>>>> recent experiences as code reviewer where there was some idea that we >>>>> couldn't change things even when they were plainly wrong and the change >>>>> was >>>>> plainly a fix." This seems like a major point that deserves a more focused >>>>> discussion. >>>>> >>>>> On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev < >>>>> dev@beam.apache.org> wrote: >>>>> >>>>>> I think we've been using the Java Experimental tags in two ways. >>>>>> >>>>>> * New APIs >>>>>> * Any APIs that use specific features identified by pre-defined >>>>>> experimental Kind types defined in [1] (for example, I/O connectors APIs >>>>>> that use Beam Schemas). >>>>>> >>>>>> Removing the experimental tag has the effect of finalizing a number >>>>>> of APIs we've been reluctant to call stable (for example, Beam Schemas, >>>>>> portability, metrics related APIs). These APIs have been around for a >>>>>> long >>>>>> time and I don't see them changing so probably this is the right thing to >>>>>> do. But I just wanted to call it out. >>>>>> >>>>>> Thanks, >>>>>> Cham >>>>>> >>>>>> [1] >>>>>> https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48 >>>>>> >>>>>> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev < >>>>>> dev@beam.apache.org> wrote: >>>>>> >>>>>>> >>>>>>> >>>>>>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles <k...@apache.org> >>>>>>> wrote: >>>>>>> >>>>>>>> >>>>>>>> Thanks for the discussion. Many good points. Probably just removing >>>>>>>> all the annotations is a noop to users, and will solve the "afraid to >>>>>>>> use >>>>>>>> experimental features" problem. >>>>>>>> >>>>>>>> Regarding stability, the capabilities of Java (and Python is much >>>>>>>> much worse) make it infeasible to produce quality software with the >>>>>>>> rule >>>>>>>> "once it is public it is frozen forever". But on the other hand, there >>>>>>>> isn't much of a practical alternative. Most projects just make breaking >>>>>>>> changes at minor releases quite often, in my experience. I don't want >>>>>>>> to >>>>>>>> follow that pattern, for sure. >>>>>>>> >>>>>>>> Regarding Danny's comment of not seeing this culture - check out >>>>>>>> any of our more mature IOs, which all have very high cyclomatic >>>>>>>> complexity >>>>>>>> due to never being significantly refactored. Adhering to in-place state >>>>>>>> compatibility for update instead of focusing on blue/green deployment >>>>>>>> is >>>>>>>> also a culprit here. I don't have examples to mind, but the point >>>>>>>> about the >>>>>>>> culture of stagnation came from my recent experiences as code >>>>>>>> reviewer where there was some idea that we couldn't change things even >>>>>>>> when >>>>>>>> they were plainly wrong and the change was plainly a fix. >>>>>>>> >>>>>>>> Often, it comes from corners like triggered side inputs where we >>>>>>>> simply never had a clear concept and so bringing things into alignment >>>>>>>> with >>>>>>>> a spec will break someone, by necessity. To be clear: I have not >>>>>>>> received >>>>>>>> pushback on that one (yet). Some other examples are >>>>>>>> https://s.apache.org/finishing-triggers-drop-data (breaking change >>>>>>>> necessary to eliminate data loss risk) >>>>>>>> https://github.com/apache/beam/issues/20528 (fix was too slow >>>>>>>> because we were hesitant to commit a breaking fix) >>>>>>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801 >>>>>>>> (left unsafe API in place, applied doc-only fix). >>>>>>>> >>>>>>>> But indeed, of all the issues I raised, the customer concern with >>>>>>>> `@Experimental` was the most important. We have had a few threads >>>>>>>> about it >>>>>>>> in the past, too, and it hasn't gotten better. >>>>>>>> >>>>>>>> 1. It does not have the intended effect (making users OK with >>>>>>>> evolving APIs and behavior to allow us to reach a high level of >>>>>>>> quality) >>>>>>>> 2. It has an unintended effect (making users afraid to use things >>>>>>>> which they should be happy to use) >>>>>>>> 3. We don't use it consistently (many less-safe things are not >>>>>>>> experimental, many totally stable things are experimental) >>>>>>>> >>>>>>>> Because of 3, if we don't have a feasible way to move to >>>>>>>> "evolving/unstable by default" in a way that users know and are OK >>>>>>>> with, >>>>>>>> then 1 is impossible. And so the only way to fix 2 is to just >>>>>>>> eliminate the >>>>>>>> annotation approach entirely and go with language conventions. >>>>>>>> >>>>>>> >>>>>>> +1 to eliminating @Experimental as a Beam level annotation. That is >>>>>>> the simplest approach that will get us to a consistent state, and it >>>>>>> will >>>>>>> align the goals and intentions of us with users'. >>>>>>> >>>>>>> >>>>>>>> >>>>>>>> Kenn >>>>>>>> >>>>>>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev < >>>>>>>> dev@beam.apache.org> wrote: >>>>>>>> >>>>>>>>> I agree with Alexey and Byron. >>>>>>>>> 1. We do not have any concrete evidence of our users paying >>>>>>>>> attention to any of those annotations. Experimental API that were in >>>>>>>>> that >>>>>>>>> state for a long while are good examples. A possible exception is a >>>>>>>>> deprecated annotation. My preference would be to simplify annotations >>>>>>>>> to >>>>>>>>> nothing (stable enough for use and will evolve backward >>>>>>>>> compatibility), and >>>>>>>>> maybe deprecated annotations. >>>>>>>>> 2. If you all think that Experimental annotation is needed, >>>>>>>>> Byron's suggestion (more or less what we do today) but with some >>>>>>>>> concrete >>>>>>>>> life cycle definitions of those annotations would be useful to our >>>>>>>>> users. >>>>>>>>> (An example could be: experimental APIs either need to graduate or be >>>>>>>>> removed in X releases.) >>>>>>>>> >>>>>>>>> >>>>>>>>> >>>>>>>>> On Tue, Apr 4, 2023 at 9:01 AM Alexey Romanenko < >>>>>>>>> aromanenko....@gmail.com> wrote: >>>>>>>>> >>>>>>>>>> Great and long-to-wait topic to discuss. >>>>>>>>>> >>>>>>>>>> My personal opinion based on what I saw on different open-source >>>>>>>>>> projects is that all such annotations, like @Experimental or >>>>>>>>>> @Stable, are >>>>>>>>>> not usefull along the time and even rather useless and misleading. >>>>>>>>>> What >>>>>>>>>> actually play roles is artifacts publishing and public API despite >>>>>>>>>> how it >>>>>>>>>> was annotated. Once a class/method was published and available for >>>>>>>>>> users to >>>>>>>>>> use, it should be considered as “stable" (even if it’s not yet >>>>>>>>>> stable from >>>>>>>>>> its developers point of view) and can’t be easily removed/changed in >>>>>>>>>> the >>>>>>>>>> next releases. >>>>>>>>>> >>>>>>>>>> At Beam, we have a “good" example with @Experimental that was >>>>>>>>>> used to annotate many parts of code in the beginning of its creation >>>>>>>>>> but >>>>>>>>>> then perhaps forgotten to be removed whenever this code is already >>>>>>>>>> used by >>>>>>>>>> many users and API can’t be just changed despite of this annotation. >>>>>>>>>> >>>>>>>>>> So, I’m pro to dismiss such annotations and consider all public >>>>>>>>>> and user-available API as “stable”. If it’s needed to change/remove a >>>>>>>>>> public API then we should follow the procedure of API deprecation >>>>>>>>>> and final >>>>>>>>>> removing, at least, after 3 major (x.y) Beam releases. It should >>>>>>>>>> help to >>>>>>>>>> have the clear rules for API changes and avoiding breaking changes >>>>>>>>>> for >>>>>>>>>> users. >>>>>>>>>> >>>>>>>>>> — >>>>>>>>>> Alexey >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On 3 Apr 2023, at 17:04, Byron Ellis via dev <dev@beam.apache.org> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>> Honestly, I think APIs could be pretty simply defined if you >>>>>>>>>> think of it in terms of the user: >>>>>>>>>> >>>>>>>>>> @Deprecated = this was either stable or evolve but the >>>>>>>>>> functionality/interface will go away at a future date >>>>>>>>>> >>>>>>>>>> @Stable = the user of this API opting out of changes to >>>>>>>>>> functionality and interface. For example, default options don't >>>>>>>>>> change for >>>>>>>>>> a transform annotated this way. >>>>>>>>>> >>>>>>>>>> Evolving (No Annotation) = the user is opting in to changes to >>>>>>>>>> functionality but not to interface. We should generally try to write >>>>>>>>>> backwards compatible code, but on the other hand the release model >>>>>>>>>> does not >>>>>>>>>> force users into an upgrade >>>>>>>>>> >>>>>>>>>> @Experimental = this functionality / interface might be a bad >>>>>>>>>> idea and could go away at any time >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Mon, Apr 3, 2023 at 7:22 AM Danny McCormick via dev < >>>>>>>>>> dev@beam.apache.org> wrote: >>>>>>>>>> >>>>>>>>>>> *;tldr - I'd like "evolving" to be further defined, specifically >>>>>>>>>>> around how we will make decisions about breaking behavior and API >>>>>>>>>>> changes* >>>>>>>>>>> >>>>>>>>>>> I don't particularly care what tags we use as long as they're >>>>>>>>>>> well documented. With that said, I think the following framing >>>>>>>>>>> needs to be >>>>>>>>>>> documented with more definition to flesh out the underlying >>>>>>>>>>> philosophy: >>>>>>>>>>> >>>>>>>>>>> *> - new code is changeable/evolving by default (so we don't >>>>>>>>>>> have to always remember to annotate it) but users have confidence >>>>>>>>>>> they can >>>>>>>>>>> use it in production (because we have good software engineering >>>>>>>>>>> practices)* >>>>>>>>>>> >>>>>>>>>>> * > - Experimental would be reserved for more risky things* >>>>>>>>>>> * > - after we are confident an API is stable, because it has >>>>>>>>>>> been the same across a couple releases, we mark it* >>>>>>>>>>> >>>>>>>>>>> Here, we have 3 classes of APIs - "experimental", "stable", and >>>>>>>>>>> "evolving" (or alternately "undefined"). >>>>>>>>>>> >>>>>>>>>>> "Experimental" seems clear - we can make any changes we want. >>>>>>>>>>> "Stable" is reasonably straightforward as well - we will only make >>>>>>>>>>> non-breaking changes except in exceptional cases (e.g. security >>>>>>>>>>> hole, total >>>>>>>>>>> failure of functionality, etc...) >>>>>>>>>>> >>>>>>>>>>> With "evolving" is the idea that we can still make any changes >>>>>>>>>>> we want, but we think it's less likely we'll need to? Are silent >>>>>>>>>>> behavior >>>>>>>>>>> changes acceptable here (my vote would be no)? What about breaking >>>>>>>>>>> API >>>>>>>>>>> changes (my vote would be rarely)? >>>>>>>>>>> >>>>>>>>>>> I think being able to change our APIs is an ok goal, but outside >>>>>>>>>>> of a true experimental context we should still be weighing the cost >>>>>>>>>>> of API >>>>>>>>>>> changes against the benefit; we have a problem of people not >>>>>>>>>>> updating to >>>>>>>>>>> newer SDKs, and introducing more breaking changes will just >>>>>>>>>>> exacerbate that >>>>>>>>>>> problem. Maybe my concerns are just a consequence of me not really >>>>>>>>>>> seeing >>>>>>>>>>> the same things that you're seeing, specifically: "*I'm seeing >>>>>>>>>>> a culture of being afraid to change things, even when it would be >>>>>>>>>>> good for >>>>>>>>>>> users, because our API surface area is far too large and not >>>>>>>>>>> explicitly >>>>>>>>>>> chosen.*" Mostly what I've seen is a healthy concern about >>>>>>>>>>> making it hard for users to upgrade versions, but my view is >>>>>>>>>>> probably just >>>>>>>>>>> limited here. >>>>>>>>>>> >>>>>>>>>>> My ideal framing for "evolving" is: an *evolving* API can make >>>>>>>>>>> breaking API changes between versions, but this will be rare and >>>>>>>>>>> weighed >>>>>>>>>>> against the cost of slowing users' upgrade process. All breaking >>>>>>>>>>> changes >>>>>>>>>>> will be communicated in our change log. An *evolving* API will >>>>>>>>>>> not make silent behavior changes except in exceptional cases (e.g. >>>>>>>>>>> patching >>>>>>>>>>> a security gap, fixing total failures of functionality). >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Danny >>>>>>>>>>> >>>>>>>>>>> On Mon, Apr 3, 2023 at 9:02 AM Jan Lukavský <je...@seznam.cz> >>>>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi, >>>>>>>>>>>> >>>>>>>>>>>> removing @Experimental and adding explicit @Stable annotation >>>>>>>>>>>> makes >>>>>>>>>>>> sense to me. FWIW, when we were designing Euphoria API, we >>>>>>>>>>>> adopted the >>>>>>>>>>>> following convention: >>>>>>>>>>>> >>>>>>>>>>>> - the default stability of "evolving", @Experimental for >>>>>>>>>>>> really >>>>>>>>>>>> experimental code [1] >>>>>>>>>>>> >>>>>>>>>>>> - target @Audience of API [2] (pipeline author, runner, >>>>>>>>>>>> internal, test) >>>>>>>>>>>> >>>>>>>>>>>> - and @StateComplexity of operators (PTransforms) [3] >>>>>>>>>>>> >>>>>>>>>>>> The last part is something that was planned to be used by tools >>>>>>>>>>>> that can >>>>>>>>>>>> analyze the Pipeline for performance or visualize which >>>>>>>>>>>> transform(s) are >>>>>>>>>>>> most state-consuming. But this ended only as plans. :) >>>>>>>>>>>> >>>>>>>>>>>> Jan >>>>>>>>>>>> >>>>>>>>>>>> [1] >>>>>>>>>>>> >>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/stability/Experimental.java >>>>>>>>>>>> >>>>>>>>>>>> [2] >>>>>>>>>>>> >>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/audience/Audience.java >>>>>>>>>>>> >>>>>>>>>>>> [3] >>>>>>>>>>>> >>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/operator/StateComplexity.java >>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>> On 3/31/23 23:05, Kenneth Knowles wrote: >>>>>>>>>>>> > Hi all, >>>>>>>>>>>> > >>>>>>>>>>>> > Long ago, we adopted two annotations in Beam to communicate >>>>>>>>>>>> to users: >>>>>>>>>>>> > >>>>>>>>>>>> > - `@Experimental` indicates that an API might change >>>>>>>>>>>> > - `@Internal` indicates that an API is not meant for users. >>>>>>>>>>>> > >>>>>>>>>>>> > I've seen some real problems with this approach: >>>>>>>>>>>> > >>>>>>>>>>>> > - Users are afraid to use `@Experimental` APIs, because they >>>>>>>>>>>> are >>>>>>>>>>>> > worried they are not production-ready. But it really just >>>>>>>>>>>> means they >>>>>>>>>>>> > might change, and has nothing to do with that. >>>>>>>>>>>> > - People write new code and do not put `@Experimental` >>>>>>>>>>>> annotations on >>>>>>>>>>>> > it, even though it really should be able to change for a >>>>>>>>>>>> while, so we >>>>>>>>>>>> > can do a good job. >>>>>>>>>>>> > - I'm seeing a culture of being afraid to change things, >>>>>>>>>>>> even when it >>>>>>>>>>>> > would be good for users, because our API surface area is far >>>>>>>>>>>> too large >>>>>>>>>>>> > and not explicitly chosen. >>>>>>>>>>>> > - `@Internal` is not that well-known. And now we have many >>>>>>>>>>>> target >>>>>>>>>>>> > audiences: Beam devs, PTransform devs, tool devs, pipeline >>>>>>>>>>>> authors. >>>>>>>>>>>> > Some of them probably want to use `@Internal` stuff! >>>>>>>>>>>> > >>>>>>>>>>>> > I looked at a couple sibling projects and what they have >>>>>>>>>>>> > - Flink: >>>>>>>>>>>> > - Spark: >>>>>>>>>>>> > >>>>>>>>>>>> > They have many more tags, and some of them seem to have >>>>>>>>>>>> reverse >>>>>>>>>>>> > defaults to Beam. >>>>>>>>>>>> > >>>>>>>>>>>> > Flink: >>>>>>>>>>>> > >>>>>>>>>>>> https://github.com/apache/flink/tree/master/flink-annotations/src/main/java/org/apache/flink/annotation >>>>>>>>>>>> > >>>>>>>>>>>> > - Experimental >>>>>>>>>>>> > - Internal.java >>>>>>>>>>>> > - Public >>>>>>>>>>>> > - PublicEvolving >>>>>>>>>>>> > - VisibleForTesting >>>>>>>>>>>> > >>>>>>>>>>>> > Spark: >>>>>>>>>>>> > >>>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/java/org/apache/spark/annotation >>>>>>>>>>>> and >>>>>>>>>>>> >>>>>>>>>>>> > >>>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/scala/org/apache/spark/annotation >>>>>>>>>>>> > >>>>>>>>>>>> > - AlphaComponent >>>>>>>>>>>> > - DeveloperApi >>>>>>>>>>>> > - Evolving >>>>>>>>>>>> > - Experimental >>>>>>>>>>>> > - Private >>>>>>>>>>>> > - Stable >>>>>>>>>>>> > - Unstable >>>>>>>>>>>> > - Since >>>>>>>>>>>> > >>>>>>>>>>>> > I think it would help users to understand Beam with some >>>>>>>>>>>> simple, >>>>>>>>>>>> > though possibly large-scale changes. My goal would be: >>>>>>>>>>>> > >>>>>>>>>>>> > - new code is changeable/evolving by default (so we don't >>>>>>>>>>>> have to >>>>>>>>>>>> > always remember to annotate it) but users have confidence >>>>>>>>>>>> they can use >>>>>>>>>>>> > it in production (because we have good software engineering >>>>>>>>>>>> practices) >>>>>>>>>>>> > - Experimental would be reserved for more risky things >>>>>>>>>>>> > - after we are confident an API is stable, because it has >>>>>>>>>>>> been the >>>>>>>>>>>> > same across a couple releases, we mark it >>>>>>>>>>>> > >>>>>>>>>>>> > A concrete proposal to achieve this would be: >>>>>>>>>>>> > >>>>>>>>>>>> > - Add a @Stable annotation and use it as appropriate on our >>>>>>>>>>>> primary APIs >>>>>>>>>>>> > - [Possibly] add an @Evolving annotation that would also be >>>>>>>>>>>> the default. >>>>>>>>>>>> > - Remove most `@Experimental` annotations or change them to >>>>>>>>>>>> `@Evolving` >>>>>>>>>>>> > - Communicate about this (somehow). If possible, surface the >>>>>>>>>>>> > `@Evolving` default in documentation. >>>>>>>>>>>> > >>>>>>>>>>>> > The last bit is the hardest. >>>>>>>>>>>> > >>>>>>>>>>>> > Kenn >>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>