While this thread is beginning to move off topic, I think any real Beam 3.0
effort largely should start with "what we know we're going to keep", and
what else a refined/simplified surface looks like for each SDK. But I'm
sure there's some known things to cut too.

And that's ignoring any real breaking changes that would be healthy to make
for Beam.

But critically, what are the concrete benefits to pipeline authors in such
a move. Otherwise it's self serving churn.



On Tue, Apr 18, 2023, 9:08 AM Alexey Romanenko <aromanenko....@gmail.com>
wrote:

>
> On 17 Apr 2023, at 21:14, Robert Burke <rob...@frantil.com> wrote:
>
> +1 on how to iterate without a Beam 3.0
>
> Often that just means, write the new thing, "support both for a
> while",make it clear how to migrate to the new thing, and the next Major
> Version just drops everything that doesn't cut the mustard anymore.
>
>
> Exactly! If we are all agree with this process of
> adding/deprecating/removing new/old API then I think we need to add it into
> Beam documentation to make it clear for developers and users (if not yet).
>
> The only issue here is that we don’t do Major releases often (v2.0.0 is
> dated 2017-05-17). I think we even don’t have a public roadmap for that and
> we almost never discussed “what" Beam 3.x should be and, the most important
> question, “when” it will happen (sorry if I missed that).
>
> —
> Alexey
>
>
>
> On Mon, Apr 17, 2023, 11:54 AM Ahmet Altay via dev <dev@beam.apache.org>
> wrote:
>
>> It sounds like there is agreement in eliminating the
>> experimental annotation. Should we stop using them in new code? Or should
>> we do a pass to remove those annotations?
>>
>> On Mon, Apr 17, 2023 at 11:24 AM Kenneth Knowles <k...@apache.org> wrote:
>>
>>>
>>>
>>> On Mon, Apr 17, 2023 at 9:34 AM Kerry Donny-Clark via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> +1 to eliminating @Experimental as a Beam level annotation.
>>>> I think the main point is that if no one pays attention to such
>>>> annotations, then they are only noise and deliver negative value.
>>>>
>>>
>>> Yes. Consider these two scenarios
>>>
>>> 1. We change an "experimental" API that is widely used. This causes a
>>> pain for many users. We would probably not do it, and we would catch it in
>>> code review.
>>> 2. We change a non-"experimental" API that is fairly new. This applies
>>> to many APIs, since we rarely remember to annotate new APIs. This causes
>>> just minor pain for just a few users. TBH I would be OK with this. Rigidity
>>> in rejecting such changes just means your first draft is your final draft.
>>> Try that in any other endeavor and see how it works for you :-)
>>>
>>> And it is worse than noise - there are some users who do pay attention
>>> to the annotations and are not using things even though they are super
>>> safe. That was the main reason I started this thread. The rest of my
>>> proposal was just to try to recover some flexibility, but it seems too hard
>>> and no immediate consensus on how/if we could manage it.
>>>
>>> Kenn
>>>
>>> PS I do agree with Kerry's PS and would love to have that discussion.
>>> Perhaps separately, since it will start from square one either way. Every
>>> time someone says "Beam 3.0" we should really be thinking "how can we
>>> iterate". One big breaking version change doesn't work.
>>>
>>
>> +1 - Thinking about "How can we iterate" would allow us to build
>> something users' want in shorter timelines.
>>
>>
>>>
>>>
>>>
>>> Kerry
>>>>
>>>> PS- Kenn says " the point about the culture of stagnation came from my
>>>> recent experiences as code reviewer where there was some idea that we
>>>> couldn't change things even when they were plainly wrong and the change was
>>>> plainly a fix." This seems like a major point that deserves a more focused
>>>> discussion.
>>>>
>>>> On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>> I think we've been using the Java Experimental tags in two ways.
>>>>>
>>>>> * New APIs
>>>>> * Any APIs that use specific features identified by pre-defined
>>>>> experimental Kind types defined in [1] (for example, I/O connectors APIs
>>>>> that use Beam Schemas).
>>>>>
>>>>> Removing the experimental tag has the effect of finalizing a number of
>>>>> APIs we've been reluctant to call stable (for example, Beam Schemas,
>>>>> portability, metrics related APIs). These APIs have been around for a long
>>>>> time and I don't see them changing so probably this is the right thing to
>>>>> do. But I just wanted to call it out.
>>>>>
>>>>> Thanks,
>>>>> Cham
>>>>>
>>>>> [1]
>>>>> https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48
>>>>>
>>>>> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev <
>>>>> dev@beam.apache.org> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles <k...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Thanks for the discussion. Many good points. Probably just removing
>>>>>>> all the annotations is a noop to users, and will solve the "afraid to 
>>>>>>> use
>>>>>>> experimental features" problem.
>>>>>>>
>>>>>>> Regarding stability, the capabilities of Java (and Python is much
>>>>>>> much worse) make it infeasible to produce quality software with the rule
>>>>>>> "once it is public it is frozen forever". But on the other hand, there
>>>>>>> isn't much of a practical alternative. Most projects just make breaking
>>>>>>> changes at minor releases quite often, in my experience. I don't want to
>>>>>>> follow that pattern, for sure.
>>>>>>>
>>>>>>> Regarding Danny's comment of not seeing this culture - check out any
>>>>>>> of our more mature IOs, which all have very high cyclomatic complexity 
>>>>>>> due
>>>>>>> to never being significantly refactored. Adhering to in-place state
>>>>>>> compatibility for update instead of focusing on blue/green deployment is
>>>>>>> also a culprit here. I don't have examples to mind, but the point about 
>>>>>>> the
>>>>>>> culture of stagnation came from my recent experiences as code
>>>>>>> reviewer where there was some idea that we couldn't change things even 
>>>>>>> when
>>>>>>> they were plainly wrong and the change was plainly a fix.
>>>>>>>
>>>>>>> Often, it comes from corners like triggered side inputs where we
>>>>>>> simply never had a clear concept and so bringing things into alignment 
>>>>>>> with
>>>>>>> a spec will break someone, by necessity. To be clear: I have not 
>>>>>>> received
>>>>>>> pushback on that one (yet). Some other examples are
>>>>>>> https://s.apache.org/finishing-triggers-drop-data (breaking change
>>>>>>> necessary to eliminate data loss risk)
>>>>>>> https://github.com/apache/beam/issues/20528 (fix was too slow
>>>>>>> because we were hesitant to commit a breaking fix)
>>>>>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801
>>>>>>> (left unsafe API in place, applied doc-only fix).
>>>>>>>
>>>>>>> But indeed, of all the issues I raised, the customer concern with
>>>>>>> `@Experimental` was the most important. We have had a few threads about 
>>>>>>> it
>>>>>>> in the past, too, and it hasn't gotten better.
>>>>>>>
>>>>>>>  1. It does not have the intended effect (making users OK with
>>>>>>> evolving APIs and behavior to allow us to reach a high level of quality)
>>>>>>>  2. It has an unintended effect (making users afraid to use things
>>>>>>> which they should be happy to use)
>>>>>>>  3. We don't use it consistently (many less-safe things are not
>>>>>>> experimental, many totally stable things are experimental)
>>>>>>>
>>>>>>> Because of 3, if we don't have a feasible way to move to
>>>>>>> "evolving/unstable by default" in a way that users know and are OK with,
>>>>>>> then 1 is impossible. And so the only way to fix 2 is to just eliminate 
>>>>>>> the
>>>>>>> annotation approach entirely and go with language conventions.
>>>>>>>
>>>>>>
>>>>>> +1 to eliminating @Experimental as a Beam level annotation. That is
>>>>>> the simplest approach that will get us to a consistent state, and it will
>>>>>> align the goals and intentions of us with users'.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev <
>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>
>>>>>>>> I agree with Alexey and Byron.
>>>>>>>> 1. We do not have any concrete evidence of our users paying
>>>>>>>> attention to any of those annotations. Experimental API that were in 
>>>>>>>> that
>>>>>>>> state for a long while are good examples. A possible exception is a
>>>>>>>> deprecated annotation. My preference would be to simplify annotations 
>>>>>>>> to
>>>>>>>> nothing (stable enough for use and will evolve backward 
>>>>>>>> compatibility), and
>>>>>>>> maybe deprecated annotations.
>>>>>>>> 2. If you all think that Experimental annotation is needed, Byron's
>>>>>>>> suggestion (more or less what we do today) but with some concrete life
>>>>>>>> cycle definitions of those annotations would be useful to our users. 
>>>>>>>> (An
>>>>>>>> example could be: experimental APIs either need to graduate or be 
>>>>>>>> removed
>>>>>>>> in X releases.)
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 4, 2023 at 9:01 AM Alexey Romanenko <
>>>>>>>> aromanenko....@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Great and long-to-wait topic to discuss.
>>>>>>>>>
>>>>>>>>> My personal opinion based on what I saw on different open-source
>>>>>>>>> projects is that all such annotations, like @Experimental or @Stable, 
>>>>>>>>> are
>>>>>>>>> not usefull along the time and even rather useless and misleading. 
>>>>>>>>> What
>>>>>>>>> actually play roles is artifacts publishing and public API despite 
>>>>>>>>> how it
>>>>>>>>> was annotated. Once a class/method was published and available for 
>>>>>>>>> users to
>>>>>>>>> use, it should be considered as “stable" (even if it’s not yet stable 
>>>>>>>>> from
>>>>>>>>> its developers point of view) and can’t be easily removed/changed in 
>>>>>>>>> the
>>>>>>>>> next releases.
>>>>>>>>>
>>>>>>>>> At Beam, we have a “good" example with @Experimental that was used
>>>>>>>>> to annotate many parts of code in the beginning of its creation but 
>>>>>>>>> then
>>>>>>>>> perhaps forgotten to be removed whenever this code is already used by 
>>>>>>>>> many
>>>>>>>>> users and API can’t be just changed despite of this annotation.
>>>>>>>>>
>>>>>>>>> So, I’m pro to dismiss such annotations and consider all public
>>>>>>>>> and user-available API as “stable”. If it’s needed to change/remove a
>>>>>>>>> public API then we should follow the procedure of API deprecation and 
>>>>>>>>> final
>>>>>>>>> removing, at least, after 3 major (x.y) Beam releases. It should help 
>>>>>>>>> to
>>>>>>>>> have the clear rules for API changes and avoiding breaking changes for
>>>>>>>>> users.
>>>>>>>>>
>>>>>>>>> —
>>>>>>>>> Alexey
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 3 Apr 2023, at 17:04, Byron Ellis via dev <dev@beam.apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Honestly, I think APIs could be pretty simply defined if you think
>>>>>>>>> of it in terms of the user:
>>>>>>>>>
>>>>>>>>> @Deprecated = this was either stable or evolve but the
>>>>>>>>> functionality/interface will go away at a future date
>>>>>>>>>
>>>>>>>>> @Stable = the user of this API opting out of changes to
>>>>>>>>> functionality and interface. For example, default options don't 
>>>>>>>>> change for
>>>>>>>>> a transform annotated this way.
>>>>>>>>>
>>>>>>>>> Evolving (No Annotation) = the user is opting in to changes to
>>>>>>>>> functionality but not to interface. We should generally try to write
>>>>>>>>> backwards compatible code, but on the other hand the release model 
>>>>>>>>> does not
>>>>>>>>> force users into an upgrade
>>>>>>>>>
>>>>>>>>> @Experimental = this functionality / interface might be a bad idea
>>>>>>>>> and could go away at any time
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Apr 3, 2023 at 7:22 AM Danny McCormick via dev <
>>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>>
>>>>>>>>>> *;tldr - I'd like "evolving" to be further defined, specifically
>>>>>>>>>> around how we will make decisions about breaking behavior and API 
>>>>>>>>>> changes*
>>>>>>>>>>
>>>>>>>>>> I don't particularly care what tags we use as long as they're
>>>>>>>>>> well documented. With that said, I think the following framing needs 
>>>>>>>>>> to be
>>>>>>>>>> documented with more definition to flesh out the underlying 
>>>>>>>>>> philosophy:
>>>>>>>>>>
>>>>>>>>>> *>  - new code is changeable/evolving by default (so we don't
>>>>>>>>>> have to always remember to annotate it) but users have confidence 
>>>>>>>>>> they can
>>>>>>>>>> use it in production (because we have good software engineering 
>>>>>>>>>> practices)*
>>>>>>>>>>
>>>>>>>>>> * > - Experimental would be reserved for more risky things*
>>>>>>>>>> * > - after we are confident an API is stable, because it has
>>>>>>>>>> been the same across a couple releases, we mark it*
>>>>>>>>>>
>>>>>>>>>> Here, we have 3 classes of APIs - "experimental", "stable", and
>>>>>>>>>> "evolving" (or alternately "undefined").
>>>>>>>>>>
>>>>>>>>>> "Experimental" seems clear - we can make any changes we want.
>>>>>>>>>> "Stable" is reasonably straightforward as well - we will only make
>>>>>>>>>> non-breaking changes except in exceptional cases (e.g. security 
>>>>>>>>>> hole, total
>>>>>>>>>> failure of functionality, etc...)
>>>>>>>>>>
>>>>>>>>>> With "evolving" is the idea that we can still make any changes we
>>>>>>>>>> want, but we think it's less likely we'll need to? Are silent 
>>>>>>>>>> behavior
>>>>>>>>>> changes acceptable here (my vote would be no)? What about breaking 
>>>>>>>>>> API
>>>>>>>>>> changes (my vote would be rarely)?
>>>>>>>>>>
>>>>>>>>>> I think being able to change our APIs is an ok goal, but outside
>>>>>>>>>> of a true experimental context we should still be weighing the cost 
>>>>>>>>>> of API
>>>>>>>>>> changes against the benefit; we have a problem of people not 
>>>>>>>>>> updating to
>>>>>>>>>> newer SDKs, and introducing more breaking changes will just 
>>>>>>>>>> exacerbate that
>>>>>>>>>> problem. Maybe my concerns are just a consequence of me not really 
>>>>>>>>>> seeing
>>>>>>>>>> the same things that you're seeing, specifically: "*I'm seeing a
>>>>>>>>>> culture of being afraid to change things, even when it would be good 
>>>>>>>>>> for
>>>>>>>>>> users, because our API surface area is far too large and not 
>>>>>>>>>> explicitly
>>>>>>>>>> chosen.*" Mostly what I've seen is a healthy concern about
>>>>>>>>>> making it hard for users to upgrade versions, but my view is 
>>>>>>>>>> probably just
>>>>>>>>>> limited here.
>>>>>>>>>>
>>>>>>>>>> My ideal framing for "evolving" is: an *evolving* API can make
>>>>>>>>>> breaking API changes between versions, but this will be rare and 
>>>>>>>>>> weighed
>>>>>>>>>> against the cost of slowing users' upgrade process. All breaking 
>>>>>>>>>> changes
>>>>>>>>>> will be communicated in our change log. An *evolving* API will
>>>>>>>>>> not make silent behavior changes except in exceptional cases (e.g. 
>>>>>>>>>> patching
>>>>>>>>>> a security gap, fixing total failures of functionality).
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Danny
>>>>>>>>>>
>>>>>>>>>> On Mon, Apr 3, 2023 at 9:02 AM Jan Lukavský <je...@seznam.cz>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> removing @Experimental and adding explicit @Stable annotation
>>>>>>>>>>> makes
>>>>>>>>>>> sense to me. FWIW, when we were designing Euphoria API, we
>>>>>>>>>>> adopted the
>>>>>>>>>>> following convention:
>>>>>>>>>>>
>>>>>>>>>>>   - the default stability of "evolving", @Experimental for
>>>>>>>>>>> really
>>>>>>>>>>> experimental code [1]
>>>>>>>>>>>
>>>>>>>>>>>   - target @Audience of API [2] (pipeline author, runner,
>>>>>>>>>>> internal, test)
>>>>>>>>>>>
>>>>>>>>>>>   - and @StateComplexity of operators (PTransforms) [3]
>>>>>>>>>>>
>>>>>>>>>>> The last part is something that was planned to be used by tools
>>>>>>>>>>> that can
>>>>>>>>>>> analyze the Pipeline for performance or visualize which
>>>>>>>>>>> transform(s) are
>>>>>>>>>>> most state-consuming. But this ended only as plans. :)
>>>>>>>>>>>
>>>>>>>>>>>   Jan
>>>>>>>>>>>
>>>>>>>>>>> [1]
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/stability/Experimental.java
>>>>>>>>>>>
>>>>>>>>>>> [2]
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/audience/Audience.java
>>>>>>>>>>>
>>>>>>>>>>> [3]
>>>>>>>>>>>
>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/operator/StateComplexity.java
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 3/31/23 23:05, Kenneth Knowles wrote:
>>>>>>>>>>> > Hi all,
>>>>>>>>>>> >
>>>>>>>>>>> > Long ago, we adopted two annotations in Beam to communicate to
>>>>>>>>>>> users:
>>>>>>>>>>> >
>>>>>>>>>>> >  - `@Experimental` indicates that an API might change
>>>>>>>>>>> >  - `@Internal` indicates that an API is not meant for users.
>>>>>>>>>>> >
>>>>>>>>>>> > I've seen some real problems with this approach:
>>>>>>>>>>> >
>>>>>>>>>>> >  - Users are afraid to use `@Experimental` APIs, because they
>>>>>>>>>>> are
>>>>>>>>>>> > worried they are not production-ready. But it really just
>>>>>>>>>>> means they
>>>>>>>>>>> > might change, and has nothing to do with that.
>>>>>>>>>>> >  - People write new code and do not put `@Experimental`
>>>>>>>>>>> annotations on
>>>>>>>>>>> > it, even though it really should be able to change for a
>>>>>>>>>>> while, so we
>>>>>>>>>>> > can do a good job.
>>>>>>>>>>> >  - I'm seeing a culture of being afraid to change things, even
>>>>>>>>>>> when it
>>>>>>>>>>> > would be good for users, because our API surface area is far
>>>>>>>>>>> too large
>>>>>>>>>>> > and not explicitly chosen.
>>>>>>>>>>> >  - `@Internal` is not that well-known. And now we have many
>>>>>>>>>>> target
>>>>>>>>>>> > audiences: Beam devs, PTransform devs, tool devs, pipeline
>>>>>>>>>>> authors.
>>>>>>>>>>> > Some of them probably want to use `@Internal` stuff!
>>>>>>>>>>> >
>>>>>>>>>>> > I looked at a couple sibling projects and what they have
>>>>>>>>>>> >  - Flink:
>>>>>>>>>>> >  - Spark:
>>>>>>>>>>> >
>>>>>>>>>>> > They have many more tags, and some of them seem to have
>>>>>>>>>>> reverse
>>>>>>>>>>> > defaults to Beam.
>>>>>>>>>>> >
>>>>>>>>>>> > Flink:
>>>>>>>>>>> >
>>>>>>>>>>> https://github.com/apache/flink/tree/master/flink-annotations/src/main/java/org/apache/flink/annotation
>>>>>>>>>>> >
>>>>>>>>>>> >  - Experimental
>>>>>>>>>>> >  - Internal.java
>>>>>>>>>>> >  - Public
>>>>>>>>>>> >  - PublicEvolving
>>>>>>>>>>> >  - VisibleForTesting
>>>>>>>>>>> >
>>>>>>>>>>> > Spark:
>>>>>>>>>>> >
>>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/java/org/apache/spark/annotation
>>>>>>>>>>>  and
>>>>>>>>>>>
>>>>>>>>>>> >
>>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/scala/org/apache/spark/annotation
>>>>>>>>>>> >
>>>>>>>>>>> >  - AlphaComponent
>>>>>>>>>>> >  - DeveloperApi
>>>>>>>>>>> >  - Evolving
>>>>>>>>>>> >  - Experimental
>>>>>>>>>>> >  - Private
>>>>>>>>>>> >  - Stable
>>>>>>>>>>> >  - Unstable
>>>>>>>>>>> >  - Since
>>>>>>>>>>> >
>>>>>>>>>>> > I think it would help users to understand Beam with some
>>>>>>>>>>> simple,
>>>>>>>>>>> > though possibly large-scale changes. My goal would be:
>>>>>>>>>>> >
>>>>>>>>>>> >  - new code is changeable/evolving by default (so we don't
>>>>>>>>>>> have to
>>>>>>>>>>> > always remember to annotate it) but users have confidence they
>>>>>>>>>>> can use
>>>>>>>>>>> > it in production (because we have good software engineering
>>>>>>>>>>> practices)
>>>>>>>>>>> >  - Experimental would be reserved for more risky things
>>>>>>>>>>> >  - after we are confident an API is stable, because it has
>>>>>>>>>>> been the
>>>>>>>>>>> > same across a couple releases, we mark it
>>>>>>>>>>> >
>>>>>>>>>>> > A concrete proposal to achieve this would be:
>>>>>>>>>>> >
>>>>>>>>>>> >  - Add a @Stable annotation and use it as appropriate on our
>>>>>>>>>>> primary APIs
>>>>>>>>>>> >  - [Possibly] add an @Evolving annotation that would also be
>>>>>>>>>>> the default.
>>>>>>>>>>> >  - Remove most `@Experimental` annotations or change them to
>>>>>>>>>>> `@Evolving`
>>>>>>>>>>> >  - Communicate about this (somehow). If possible, surface the
>>>>>>>>>>> > `@Evolving` default in documentation.
>>>>>>>>>>> >
>>>>>>>>>>> > The last bit is the hardest.
>>>>>>>>>>> >
>>>>>>>>>>> > Kenn
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>

Reply via email to