+1 on how to iterate without a Beam 3.0. Often that just means: write the new thing, "support both for a while", make it clear how to migrate to the new thing, and have the next major version drop everything that doesn't cut the mustard anymore.

On Mon, Apr 17, 2023, 11:54 AM Ahmet Altay via dev <dev@beam.apache.org> wrote:

> It sounds like there is agreement in eliminating the experimental annotation. Should we stop using them in new code? Or should we do a pass to remove those annotations?
>
> On Mon, Apr 17, 2023 at 11:24 AM Kenneth Knowles <k...@apache.org> wrote:
>
>> On Mon, Apr 17, 2023 at 9:34 AM Kerry Donny-Clark via dev <dev@beam.apache.org> wrote:
>>
>>> +1 to eliminating @Experimental as a Beam level annotation.
>>> I think the main point is that if no one pays attention to such annotations, then they are only noise and deliver negative value.
>>
>> Yes. Consider these two scenarios
>>
>> 1. We change an "experimental" API that is widely used. This causes a pain for many users. We would probably not do it, and we would catch it in code review.
>> 2. We change a non-"experimental" API that is fairly new. This applies to many APIs, since we rarely remember to annotate new APIs. This causes just minor pain for just a few users. TBH I would be OK with this. Rigidity in rejecting such changes just means your first draft is your final draft. Try that in any other endeavor and see how it works for you :-)
>>
>> And it is worse than noise - there are some users who do pay attention to the annotations and are not using things even though they are super safe. That was the main reason I started this thread. The rest of my proposal was just to try to recover some flexibility, but it seems too hard and no immediate consensus on how/if we could manage it.
>>
>> Kenn
>>
>> PS I do agree with Kerry's PS and would love to have that discussion. Perhaps separately, since it will start from square one either way. Every time someone says "Beam 3.0" we should really be thinking "how can we iterate". One big breaking version change doesn't work.
>>
> +1 - Thinking about "How can we iterate" would allow us to build something users want in shorter timelines.
>
>>> Kerry
>>>
>>> PS- Kenn says "the point about the culture of stagnation came from my recent experiences as code reviewer where there was some idea that we couldn't change things even when they were plainly wrong and the change was plainly a fix." This seems like a major point that deserves a more focused discussion.
>>>
>>> On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev <dev@beam.apache.org> wrote:
>>>
>>>> I think we've been using the Java Experimental tags in two ways.
>>>>
>>>> * New APIs
>>>> * Any APIs that use specific features identified by pre-defined experimental Kind types defined in [1] (for example, I/O connector APIs that use Beam Schemas).
>>>>
>>>> Removing the experimental tag has the effect of finalizing a number of APIs we've been reluctant to call stable (for example, Beam Schemas, portability, metrics related APIs). These APIs have been around for a long time and I don't see them changing so probably this is the right thing to do. But I just wanted to call it out.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> [1] https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48
>>>>
>>>> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev <dev@beam.apache.org> wrote:
>>>>
>>>>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles <k...@apache.org> wrote:
>>>>>
>>>>>> Thanks for the discussion. Many good points. Probably just removing all the annotations is a noop to users, and will solve the "afraid to use experimental features" problem.
>>>>>>
>>>>>> Regarding stability, the capabilities of Java (and Python is much much worse) make it infeasible to produce quality software with the rule "once it is public it is frozen forever". But on the other hand, there isn't much of a practical alternative. Most projects just make breaking changes at minor releases quite often, in my experience. I don't want to follow that pattern, for sure.
>>>>>>
>>>>>> Regarding Danny's comment of not seeing this culture - check out any of our more mature IOs, which all have very high cyclomatic complexity due to never being significantly refactored. Adhering to in-place state compatibility for update instead of focusing on blue/green deployment is also a culprit here. I don't have examples in mind, but the point about the culture of stagnation came from my recent experiences as code reviewer where there was some idea that we couldn't change things even when they were plainly wrong and the change was plainly a fix.
>>>>>>
>>>>>> Often, it comes from corners like triggered side inputs where we simply never had a clear concept and so bringing things into alignment with a spec will break someone, by necessity. To be clear: I have not received pushback on that one (yet). Some other examples are
>>>>>> https://s.apache.org/finishing-triggers-drop-data (breaking change necessary to eliminate data loss risk)
>>>>>> https://github.com/apache/beam/issues/20528 (fix was too slow because we were hesitant to commit a breaking fix)
>>>>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801 (left unsafe API in place, applied doc-only fix).
>>>>>>
>>>>>> But indeed, of all the issues I raised, the customer concern with `@Experimental` was the most important. We have had a few threads about it in the past, too, and it hasn't gotten better.
>>>>>>
>>>>>> 1. It does not have the intended effect (making users OK with evolving APIs and behavior to allow us to reach a high level of quality)
>>>>>> 2. It has an unintended effect (making users afraid to use things which they should be happy to use)
>>>>>> 3. We don't use it consistently (many less-safe things are not experimental, many totally stable things are experimental)
>>>>>>
>>>>>> Because of 3, if we don't have a feasible way to move to "evolving/unstable by default" in a way that users know and are OK with, then 1 is impossible. And so the only way to fix 2 is to just eliminate the annotation approach entirely and go with language conventions.
>>>>>
>>>>> +1 to eliminating @Experimental as a Beam level annotation. That is the simplest approach that will get us to a consistent state, and it will align our goals and intentions with users'.
>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev <dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> I agree with Alexey and Byron.
>>>>>>> 1. We do not have any concrete evidence of our users paying attention to any of those annotations. Experimental APIs that were in that state for a long while are good examples. A possible exception is a deprecated annotation. My preference would be to simplify annotations to nothing (stable enough for use and will evolve backward compatibility), and maybe deprecated annotations.
>>>>>>> 2. If you all think that Experimental annotation is needed, Byron's suggestion (more or less what we do today) but with some concrete life cycle definitions of those annotations would be useful to our users. (An example could be: experimental APIs either need to graduate or be removed in X releases.)
>>>>>>>
>>>>>>> On Tue, Apr 4, 2023 at 9:01 AM Alexey Romanenko <aromanenko....@gmail.com> wrote:
>>>>>>>
>>>>>>>> Great and long-awaited topic to discuss.
>>>>>>>>
>>>>>>>> My personal opinion, based on what I saw on different open-source projects, is that all such annotations, like @Experimental or @Stable, are not useful over time and are even rather useless and misleading. What actually plays a role is artifact publishing and the public API, regardless of how it was annotated. Once a class/method was published and available for users to use, it should be considered as “stable” (even if it’s not yet stable from its developers' point of view) and can’t be easily removed/changed in the next releases.
>>>>>>>>
>>>>>>>> At Beam, we have a “good” example with @Experimental, which was used to annotate many parts of the code at the beginning of its creation but then was perhaps forgotten and never removed, even though this code is already used by many users and the API can’t just be changed despite this annotation.
>>>>>>>>
>>>>>>>> So, I’m in favor of dismissing such annotations and considering all public and user-available API as “stable”. If it’s needed to change/remove a public API then we should follow the procedure of API deprecation and final removal, at least after 3 major (x.y) Beam releases. It should help to have clear rules for API changes and to avoid breaking changes for users.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Alexey
>>>>>>>>
>>>>>>>> On 3 Apr 2023, at 17:04, Byron Ellis via dev <dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>> Honestly, I think APIs could be pretty simply defined if you think of it in terms of the user:
>>>>>>>>
>>>>>>>> @Deprecated = this was either stable or evolving but the functionality/interface will go away at a future date
>>>>>>>>
>>>>>>>> @Stable = the user of this API is opting out of changes to functionality and interface. For example, default options don't change for a transform annotated this way.
>>>>>>>>
>>>>>>>> Evolving (No Annotation) = the user is opting in to changes to functionality but not to interface. We should generally try to write backwards compatible code, but on the other hand the release model does not force users into an upgrade
>>>>>>>>
>>>>>>>> @Experimental = this functionality / interface might be a bad idea and could go away at any time
>>>>>>>>
>>>>>>>> On Mon, Apr 3, 2023 at 7:22 AM Danny McCormick via dev <dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> *tl;dr - I'd like "evolving" to be further defined, specifically around how we will make decisions about breaking behavior and API changes*
>>>>>>>>>
>>>>>>>>> I don't particularly care what tags we use as long as they're well documented.
>>>>>>>>> With that said, I think the following framing needs to be documented with more definition to flesh out the underlying philosophy:
>>>>>>>>>
>>>>>>>>> *> - new code is changeable/evolving by default (so we don't have to always remember to annotate it) but users have confidence they can use it in production (because we have good software engineering practices)*
>>>>>>>>> *> - Experimental would be reserved for more risky things*
>>>>>>>>> *> - after we are confident an API is stable, because it has been the same across a couple releases, we mark it*
>>>>>>>>>
>>>>>>>>> Here, we have 3 classes of APIs - "experimental", "stable", and "evolving" (or alternately "undefined").
>>>>>>>>>
>>>>>>>>> "Experimental" seems clear - we can make any changes we want. "Stable" is reasonably straightforward as well - we will only make non-breaking changes except in exceptional cases (e.g. security hole, total failure of functionality, etc...)
>>>>>>>>>
>>>>>>>>> With "evolving" is the idea that we can still make any changes we want, but we think it's less likely we'll need to? Are silent behavior changes acceptable here (my vote would be no)? What about breaking API changes (my vote would be rarely)?
>>>>>>>>>
>>>>>>>>> I think being able to change our APIs is an ok goal, but outside of a true experimental context we should still be weighing the cost of API changes against the benefit; we have a problem of people not updating to newer SDKs, and introducing more breaking changes will just exacerbate that problem.
>>>>>>>>> Maybe my concerns are just a consequence of me not really seeing the same things that you're seeing, specifically: "*I'm seeing a culture of being afraid to change things, even when it would be good for users, because our API surface area is far too large and not explicitly chosen.*" Mostly what I've seen is a healthy concern about making it hard for users to upgrade versions, but my view is probably just limited here.
>>>>>>>>>
>>>>>>>>> My ideal framing for "evolving" is: an *evolving* API can make breaking API changes between versions, but this will be rare and weighed against the cost of slowing users' upgrade process. All breaking changes will be communicated in our change log. An *evolving* API will not make silent behavior changes except in exceptional cases (e.g. patching a security gap, fixing total failures of functionality).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danny
>>>>>>>>>
>>>>>>>>> On Mon, Apr 3, 2023 at 9:02 AM Jan Lukavský <je...@seznam.cz> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> removing @Experimental and adding explicit @Stable annotation makes sense to me. FWIW, when we were designing Euphoria API, we adopted the following convention:
>>>>>>>>>>
>>>>>>>>>> - the default stability of "evolving", @Experimental for really experimental code [1]
>>>>>>>>>>
>>>>>>>>>> - target @Audience of API [2] (pipeline author, runner, internal, test)
>>>>>>>>>>
>>>>>>>>>> - and @StateComplexity of operators (PTransforms) [3]
>>>>>>>>>>
>>>>>>>>>> The last part is something that was planned to be used by tools that can analyze the Pipeline for performance or visualize which transform(s) are most state-consuming. But this ended only as plans.
>>>>>>>>>> :)
>>>>>>>>>>
>>>>>>>>>> Jan
>>>>>>>>>>
>>>>>>>>>> [1] https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/stability/Experimental.java
>>>>>>>>>>
>>>>>>>>>> [2] https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/audience/Audience.java
>>>>>>>>>>
>>>>>>>>>> [3] https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/operator/StateComplexity.java
>>>>>>>>>>
>>>>>>>>>> On 3/31/23 23:05, Kenneth Knowles wrote:
>>>>>>>>>> > Hi all,
>>>>>>>>>> >
>>>>>>>>>> > Long ago, we adopted two annotations in Beam to communicate to users:
>>>>>>>>>> >
>>>>>>>>>> > - `@Experimental` indicates that an API might change
>>>>>>>>>> > - `@Internal` indicates that an API is not meant for users.
>>>>>>>>>> >
>>>>>>>>>> > I've seen some real problems with this approach:
>>>>>>>>>> >
>>>>>>>>>> > - Users are afraid to use `@Experimental` APIs, because they are worried they are not production-ready. But it really just means they might change, and has nothing to do with that.
>>>>>>>>>> > - People write new code and do not put `@Experimental` annotations on it, even though it really should be able to change for a while, so we can do a good job.
>>>>>>>>>> > - I'm seeing a culture of being afraid to change things, even when it would be good for users, because our API surface area is far too large and not explicitly chosen.
>>>>>>>>>> > - `@Internal` is not that well-known. And now we have many target audiences: Beam devs, PTransform devs, tool devs, pipeline authors.
>>>>>>>>>> > Some of them probably want to use `@Internal` stuff!
>>>>>>>>>> >
>>>>>>>>>> > I looked at a couple sibling projects and what they have
>>>>>>>>>> > - Flink:
>>>>>>>>>> > - Spark:
>>>>>>>>>> >
>>>>>>>>>> > They have many more tags, and some of them seem to have reverse defaults to Beam.
>>>>>>>>>> >
>>>>>>>>>> > Flink: https://github.com/apache/flink/tree/master/flink-annotations/src/main/java/org/apache/flink/annotation
>>>>>>>>>> >
>>>>>>>>>> > - Experimental
>>>>>>>>>> > - Internal.java
>>>>>>>>>> > - Public
>>>>>>>>>> > - PublicEvolving
>>>>>>>>>> > - VisibleForTesting
>>>>>>>>>> >
>>>>>>>>>> > Spark: https://github.com/apache/spark/tree/master/common/tags/src/main/java/org/apache/spark/annotation and https://github.com/apache/spark/tree/master/common/tags/src/main/scala/org/apache/spark/annotation
>>>>>>>>>> >
>>>>>>>>>> > - AlphaComponent
>>>>>>>>>> > - DeveloperApi
>>>>>>>>>> > - Evolving
>>>>>>>>>> > - Experimental
>>>>>>>>>> > - Private
>>>>>>>>>> > - Stable
>>>>>>>>>> > - Unstable
>>>>>>>>>> > - Since
>>>>>>>>>> >
>>>>>>>>>> > I think it would help users to understand Beam with some simple, though possibly large-scale changes.
>>>>>>>>>> > My goal would be:
>>>>>>>>>> >
>>>>>>>>>> > - new code is changeable/evolving by default (so we don't have to always remember to annotate it) but users have confidence they can use it in production (because we have good software engineering practices)
>>>>>>>>>> > - Experimental would be reserved for more risky things
>>>>>>>>>> > - after we are confident an API is stable, because it has been the same across a couple releases, we mark it
>>>>>>>>>> >
>>>>>>>>>> > A concrete proposal to achieve this would be:
>>>>>>>>>> >
>>>>>>>>>> > - Add a @Stable annotation and use it as appropriate on our primary APIs
>>>>>>>>>> > - [Possibly] add an @Evolving annotation that would also be the default.
>>>>>>>>>> > - Remove most `@Experimental` annotations or change them to `@Evolving`
>>>>>>>>>> > - Communicate about this (somehow). If possible, surface the `@Evolving` default in documentation.
>>>>>>>>>> >
>>>>>>>>>> > The last bit is the hardest.
>>>>>>>>>> >
>>>>>>>>>> > Kenn