+1 on how to iterate without a Beam 3.0

Often that just means, write the new thing, "support both for a while",make
it clear how to migrate to the new thing, and the next Major Version just
drops everything that doesn't cut the mustard anymore.


On Mon, Apr 17, 2023, 11:54 AM Ahmet Altay via dev <dev@beam.apache.org>
wrote:

> It sounds like there is agreement in eliminating the
> experimental annotation. Should we stop using them in new code? Or should
> we do a pass to remove those annotations?
>
> On Mon, Apr 17, 2023 at 11:24 AM Kenneth Knowles <k...@apache.org> wrote:
>
>>
>>
>> On Mon, Apr 17, 2023 at 9:34 AM Kerry Donny-Clark via dev <
>> dev@beam.apache.org> wrote:
>>
>>> +1 to eliminating @Experimental as a Beam level annotation.
>>> I think the main point is that if no one pays attention to such
>>> annotations, then they are only noise and deliver negative value.
>>>
>>
>> Yes. Consider these two scenarios
>>
>> 1. We change an "experimental" API that is widely used. This causes a
>> pain for many users. We would probably not do it, and we would catch it in
>> code review.
>> 2. We change a non-"experimental" API that is fairly new. This applies to
>> many APIs, since we rarely remember to annotate new APIs. This causes just
>> minor pain for just a few users. TBH I would be OK with this. Rigidity in
>> rejecting such changes just means your first draft is your final draft. Try
>> that in any other endeavor and see how it works for you :-)
>>
>> And it is worse than noise - there are some users who do pay attention to
>> the annotations and are not using things even though they are super safe.
>> That was the main reason I started this thread. The rest of my proposal was
>> just to try to recover some flexibility, but it seems too hard and no
>> immediate consensus on how/if we could manage it.
>>
>> Kenn
>>
>> PS I do agree with Kerry's PS and would love to have that discussion.
>> Perhaps separately, since it will start from square one either way. Every
>> time someone says "Beam 3.0" we should really be thinking "how can we
>> iterate". One big breaking version change doesn't work.
>>
>
> +1 - Thinking about "How can we iterate" would allow us to build something
> users' want in shorter timelines.
>
>
>>
>>
>>
>> Kerry
>>>
>>> PS- Kenn says " the point about the culture of stagnation came from my
>>> recent experiences as code reviewer where there was some idea that we
>>> couldn't change things even when they were plainly wrong and the change was
>>> plainly a fix." This seems like a major point that deserves a more focused
>>> discussion.
>>>
>>> On Fri, Apr 14, 2023 at 5:47 PM Chamikara Jayalath via dev <
>>> dev@beam.apache.org> wrote:
>>>
>>>> I think we've been using the Java Experimental tags in two ways.
>>>>
>>>> * New APIs
>>>> * Any APIs that use specific features identified by pre-defined
>>>> experimental Kind types defined in [1] (for example, I/O connectors APIs
>>>> that use Beam Schemas).
>>>>
>>>> Removing the experimental tag has the effect of finalizing a number of
>>>> APIs we've been reluctant to call stable (for example, Beam Schemas,
>>>> portability, metrics related APIs). These APIs have been around for a long
>>>> time and I don't see them changing so probably this is the right thing to
>>>> do. But I just wanted to call it out.
>>>>
>>>> Thanks,
>>>> Cham
>>>>
>>>> [1]
>>>> https://github.com/apache/beam/blob/b9f27f9da2e63b564feecaeb593d7b12783192b0/sdks/java/core/src/main/java/org/apache/beam/sdk/annotations/Experimental.java#L48
>>>>
>>>> On Fri, Apr 14, 2023 at 1:26 PM Ahmet Altay via dev <
>>>> dev@beam.apache.org> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 14, 2023 at 1:15 PM Kenneth Knowles <k...@apache.org>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Thanks for the discussion. Many good points. Probably just removing
>>>>>> all the annotations is a noop to users, and will solve the "afraid to use
>>>>>> experimental features" problem.
>>>>>>
>>>>>> Regarding stability, the capabilities of Java (and Python is much
>>>>>> much worse) make it infeasible to produce quality software with the rule
>>>>>> "once it is public it is frozen forever". But on the other hand, there
>>>>>> isn't much of a practical alternative. Most projects just make breaking
>>>>>> changes at minor releases quite often, in my experience. I don't want to
>>>>>> follow that pattern, for sure.
>>>>>>
>>>>>> Regarding Danny's comment of not seeing this culture - check out any
>>>>>> of our more mature IOs, which all have very high cyclomatic complexity 
>>>>>> due
>>>>>> to never being significantly refactored. Adhering to in-place state
>>>>>> compatibility for update instead of focusing on blue/green deployment is
>>>>>> also a culprit here. I don't have examples to mind, but the point about 
>>>>>> the
>>>>>> culture of stagnation came from my recent experiences as code
>>>>>> reviewer where there was some idea that we couldn't change things even 
>>>>>> when
>>>>>> they were plainly wrong and the change was plainly a fix.
>>>>>>
>>>>>> Often, it comes from corners like triggered side inputs where we
>>>>>> simply never had a clear concept and so bringing things into alignment 
>>>>>> with
>>>>>> a spec will break someone, by necessity. To be clear: I have not received
>>>>>> pushback on that one (yet). Some other examples are
>>>>>> https://s.apache.org/finishing-triggers-drop-data (breaking change
>>>>>> necessary to eliminate data loss risk)
>>>>>> https://github.com/apache/beam/issues/20528 (fix was too slow
>>>>>> because we were hesitant to commit a breaking fix)
>>>>>> https://github.com/apache/beam/pull/8134#pullrequestreview-218592801
>>>>>> (left unsafe API in place, applied doc-only fix).
>>>>>>
>>>>>> But indeed, of all the issues I raised, the customer concern with
>>>>>> `@Experimental` was the most important. We have had a few threads about 
>>>>>> it
>>>>>> in the past, too, and it hasn't gotten better.
>>>>>>
>>>>>>  1. It does not have the intended effect (making users OK with
>>>>>> evolving APIs and behavior to allow us to reach a high level of quality)
>>>>>>  2. It has an unintended effect (making users afraid to use things
>>>>>> which they should be happy to use)
>>>>>>  3. We don't use it consistently (many less-safe things are not
>>>>>> experimental, many totally stable things are experimental)
>>>>>>
>>>>>> Because of 3, if we don't have a feasible way to move to
>>>>>> "evolving/unstable by default" in a way that users know and are OK with,
>>>>>> then 1 is impossible. And so the only way to fix 2 is to just eliminate 
>>>>>> the
>>>>>> annotation approach entirely and go with language conventions.
>>>>>>
>>>>>
>>>>> +1 to eliminating @Experimental as a Beam level annotation. That is
>>>>> the simplest approach that will get us to a consistent state, and it will
>>>>> align the goals and intentions of us with users'.
>>>>>
>>>>>
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Wed, Apr 12, 2023 at 5:10 PM Ahmet Altay via dev <
>>>>>> dev@beam.apache.org> wrote:
>>>>>>
>>>>>>> I agree with Alexey and Byron.
>>>>>>> 1. We do not have any concrete evidence of our users paying
>>>>>>> attention to any of those annotations. Experimental API that were in 
>>>>>>> that
>>>>>>> state for a long while are good examples. A possible exception is a
>>>>>>> deprecated annotation. My preference would be to simplify annotations to
>>>>>>> nothing (stable enough for use and will evolve backward compatibility), 
>>>>>>> and
>>>>>>> maybe deprecated annotations.
>>>>>>> 2. If you all think that Experimental annotation is needed, Byron's
>>>>>>> suggestion (more or less what we do today) but with some concrete life
>>>>>>> cycle definitions of those annotations would be useful to our users. (An
>>>>>>> example could be: experimental APIs either need to graduate or be 
>>>>>>> removed
>>>>>>> in X releases.)
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 4, 2023 at 9:01 AM Alexey Romanenko <
>>>>>>> aromanenko....@gmail.com> wrote:
>>>>>>>
>>>>>>>> Great and long-to-wait topic to discuss.
>>>>>>>>
>>>>>>>> My personal opinion based on what I saw on different open-source
>>>>>>>> projects is that all such annotations, like @Experimental or @Stable, 
>>>>>>>> are
>>>>>>>> not usefull along the time and even rather useless and misleading. What
>>>>>>>> actually play roles is artifacts publishing and public API despite how 
>>>>>>>> it
>>>>>>>> was annotated. Once a class/method was published and available for 
>>>>>>>> users to
>>>>>>>> use, it should be considered as “stable" (even if it’s not yet stable 
>>>>>>>> from
>>>>>>>> its developers point of view) and can’t be easily removed/changed in 
>>>>>>>> the
>>>>>>>> next releases.
>>>>>>>>
>>>>>>>> At Beam, we have a “good" example with @Experimental that was used
>>>>>>>> to annotate many parts of code in the beginning of its creation but 
>>>>>>>> then
>>>>>>>> perhaps forgotten to be removed whenever this code is already used by 
>>>>>>>> many
>>>>>>>> users and API can’t be just changed despite of this annotation.
>>>>>>>>
>>>>>>>> So, I’m pro to dismiss such annotations and consider all public and
>>>>>>>> user-available API as “stable”. If it’s needed to change/remove a 
>>>>>>>> public
>>>>>>>> API then we should follow the procedure of API deprecation and final
>>>>>>>> removing, at least, after 3 major (x.y) Beam releases. It should help 
>>>>>>>> to
>>>>>>>> have the clear rules for API changes and avoiding breaking changes for
>>>>>>>> users.
>>>>>>>>
>>>>>>>> —
>>>>>>>> Alexey
>>>>>>>>
>>>>>>>>
>>>>>>>> On 3 Apr 2023, at 17:04, Byron Ellis via dev <dev@beam.apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Honestly, I think APIs could be pretty simply defined if you think
>>>>>>>> of it in terms of the user:
>>>>>>>>
>>>>>>>> @Deprecated = this was either stable or evolve but the
>>>>>>>> functionality/interface will go away at a future date
>>>>>>>>
>>>>>>>> @Stable = the user of this API opting out of changes to
>>>>>>>> functionality and interface. For example, default options don't change 
>>>>>>>> for
>>>>>>>> a transform annotated this way.
>>>>>>>>
>>>>>>>> Evolving (No Annotation) = the user is opting in to changes to
>>>>>>>> functionality but not to interface. We should generally try to write
>>>>>>>> backwards compatible code, but on the other hand the release model 
>>>>>>>> does not
>>>>>>>> force users into an upgrade
>>>>>>>>
>>>>>>>> @Experimental = this functionality / interface might be a bad idea
>>>>>>>> and could go away at any time
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Apr 3, 2023 at 7:22 AM Danny McCormick via dev <
>>>>>>>> dev@beam.apache.org> wrote:
>>>>>>>>
>>>>>>>>> *;tldr - I'd like "evolving" to be further defined, specifically
>>>>>>>>> around how we will make decisions about breaking behavior and API 
>>>>>>>>> changes*
>>>>>>>>>
>>>>>>>>> I don't particularly care what tags we use as long as they're well
>>>>>>>>> documented. With that said, I think the following framing needs to be
>>>>>>>>> documented with more definition to flesh out the underlying 
>>>>>>>>> philosophy:
>>>>>>>>>
>>>>>>>>> *>  - new code is changeable/evolving by default (so we don't have
>>>>>>>>> to always remember to annotate it) but users have confidence they can 
>>>>>>>>> use
>>>>>>>>> it in production (because we have good software engineering 
>>>>>>>>> practices)*
>>>>>>>>>
>>>>>>>>> * > - Experimental would be reserved for more risky things*
>>>>>>>>> * > - after we are confident an API is stable, because it has been
>>>>>>>>> the same across a couple releases, we mark it*
>>>>>>>>>
>>>>>>>>> Here, we have 3 classes of APIs - "experimental", "stable", and
>>>>>>>>> "evolving" (or alternately "undefined").
>>>>>>>>>
>>>>>>>>> "Experimental" seems clear - we can make any changes we want.
>>>>>>>>> "Stable" is reasonably straightforward as well - we will only make
>>>>>>>>> non-breaking changes except in exceptional cases (e.g. security hole, 
>>>>>>>>> total
>>>>>>>>> failure of functionality, etc...)
>>>>>>>>>
>>>>>>>>> With "evolving" is the idea that we can still make any changes we
>>>>>>>>> want, but we think it's less likely we'll need to? Are silent behavior
>>>>>>>>> changes acceptable here (my vote would be no)? What about breaking API
>>>>>>>>> changes (my vote would be rarely)?
>>>>>>>>>
>>>>>>>>> I think being able to change our APIs is an ok goal, but outside
>>>>>>>>> of a true experimental context we should still be weighing the cost 
>>>>>>>>> of API
>>>>>>>>> changes against the benefit; we have a problem of people not updating 
>>>>>>>>> to
>>>>>>>>> newer SDKs, and introducing more breaking changes will just 
>>>>>>>>> exacerbate that
>>>>>>>>> problem. Maybe my concerns are just a consequence of me not really 
>>>>>>>>> seeing
>>>>>>>>> the same things that you're seeing, specifically: "*I'm seeing a
>>>>>>>>> culture of being afraid to change things, even when it would be good 
>>>>>>>>> for
>>>>>>>>> users, because our API surface area is far too large and not 
>>>>>>>>> explicitly
>>>>>>>>> chosen.*" Mostly what I've seen is a healthy concern about making
>>>>>>>>> it hard for users to upgrade versions, but my view is probably just 
>>>>>>>>> limited
>>>>>>>>> here.
>>>>>>>>>
>>>>>>>>> My ideal framing for "evolving" is: an *evolving* API can make
>>>>>>>>> breaking API changes between versions, but this will be rare and 
>>>>>>>>> weighed
>>>>>>>>> against the cost of slowing users' upgrade process. All breaking 
>>>>>>>>> changes
>>>>>>>>> will be communicated in our change log. An *evolving* API will
>>>>>>>>> not make silent behavior changes except in exceptional cases (e.g. 
>>>>>>>>> patching
>>>>>>>>> a security gap, fixing total failures of functionality).
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Danny
>>>>>>>>>
>>>>>>>>> On Mon, Apr 3, 2023 at 9:02 AM Jan Lukavský <je...@seznam.cz>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> removing @Experimental and adding explicit @Stable annotation
>>>>>>>>>> makes
>>>>>>>>>> sense to me. FWIW, when we were designing Euphoria API, we
>>>>>>>>>> adopted the
>>>>>>>>>> following convention:
>>>>>>>>>>
>>>>>>>>>>   - the default stability of "evolving", @Experimental for really
>>>>>>>>>> experimental code [1]
>>>>>>>>>>
>>>>>>>>>>   - target @Audience of API [2] (pipeline author, runner,
>>>>>>>>>> internal, test)
>>>>>>>>>>
>>>>>>>>>>   - and @StateComplexity of operators (PTransforms) [3]
>>>>>>>>>>
>>>>>>>>>> The last part is something that was planned to be used by tools
>>>>>>>>>> that can
>>>>>>>>>> analyze the Pipeline for performance or visualize which
>>>>>>>>>> transform(s) are
>>>>>>>>>> most state-consuming. But this ended only as plans. :)
>>>>>>>>>>
>>>>>>>>>>   Jan
>>>>>>>>>>
>>>>>>>>>> [1]
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/stability/Experimental.java
>>>>>>>>>>
>>>>>>>>>> [2]
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/audience/Audience.java
>>>>>>>>>>
>>>>>>>>>> [3]
>>>>>>>>>>
>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/extensions/euphoria/src/main/java/org/apache/beam/sdk/extensions/euphoria/core/annotation/operator/StateComplexity.java
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 3/31/23 23:05, Kenneth Knowles wrote:
>>>>>>>>>> > Hi all,
>>>>>>>>>> >
>>>>>>>>>> > Long ago, we adopted two annotations in Beam to communicate to
>>>>>>>>>> users:
>>>>>>>>>> >
>>>>>>>>>> >  - `@Experimental` indicates that an API might change
>>>>>>>>>> >  - `@Internal` indicates that an API is not meant for users.
>>>>>>>>>> >
>>>>>>>>>> > I've seen some real problems with this approach:
>>>>>>>>>> >
>>>>>>>>>> >  - Users are afraid to use `@Experimental` APIs, because they
>>>>>>>>>> are
>>>>>>>>>> > worried they are not production-ready. But it really just means
>>>>>>>>>> they
>>>>>>>>>> > might change, and has nothing to do with that.
>>>>>>>>>> >  - People write new code and do not put `@Experimental`
>>>>>>>>>> annotations on
>>>>>>>>>> > it, even though it really should be able to change for a while,
>>>>>>>>>> so we
>>>>>>>>>> > can do a good job.
>>>>>>>>>> >  - I'm seeing a culture of being afraid to change things, even
>>>>>>>>>> when it
>>>>>>>>>> > would be good for users, because our API surface area is far
>>>>>>>>>> too large
>>>>>>>>>> > and not explicitly chosen.
>>>>>>>>>> >  - `@Internal` is not that well-known. And now we have many
>>>>>>>>>> target
>>>>>>>>>> > audiences: Beam devs, PTransform devs, tool devs, pipeline
>>>>>>>>>> authors.
>>>>>>>>>> > Some of them probably want to use `@Internal` stuff!
>>>>>>>>>> >
>>>>>>>>>> > I looked at a couple sibling projects and what they have
>>>>>>>>>> >  - Flink:
>>>>>>>>>> >  - Spark:
>>>>>>>>>> >
>>>>>>>>>> > They have many more tags, and some of them seem to have reverse
>>>>>>>>>> > defaults to Beam.
>>>>>>>>>> >
>>>>>>>>>> > Flink:
>>>>>>>>>> >
>>>>>>>>>> https://github.com/apache/flink/tree/master/flink-annotations/src/main/java/org/apache/flink/annotation
>>>>>>>>>> >
>>>>>>>>>> >  - Experimental
>>>>>>>>>> >  - Internal.java
>>>>>>>>>> >  - Public
>>>>>>>>>> >  - PublicEvolving
>>>>>>>>>> >  - VisibleForTesting
>>>>>>>>>> >
>>>>>>>>>> > Spark:
>>>>>>>>>> >
>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/java/org/apache/spark/annotation
>>>>>>>>>>  and
>>>>>>>>>>
>>>>>>>>>> >
>>>>>>>>>> https://github.com/apache/spark/tree/master/common/tags/src/main/scala/org/apache/spark/annotation
>>>>>>>>>> >
>>>>>>>>>> >  - AlphaComponent
>>>>>>>>>> >  - DeveloperApi
>>>>>>>>>> >  - Evolving
>>>>>>>>>> >  - Experimental
>>>>>>>>>> >  - Private
>>>>>>>>>> >  - Stable
>>>>>>>>>> >  - Unstable
>>>>>>>>>> >  - Since
>>>>>>>>>> >
>>>>>>>>>> > I think it would help users to understand Beam with some
>>>>>>>>>> simple,
>>>>>>>>>> > though possibly large-scale changes. My goal would be:
>>>>>>>>>> >
>>>>>>>>>> >  - new code is changeable/evolving by default (so we don't have
>>>>>>>>>> to
>>>>>>>>>> > always remember to annotate it) but users have confidence they
>>>>>>>>>> can use
>>>>>>>>>> > it in production (because we have good software engineering
>>>>>>>>>> practices)
>>>>>>>>>> >  - Experimental would be reserved for more risky things
>>>>>>>>>> >  - after we are confident an API is stable, because it has been
>>>>>>>>>> the
>>>>>>>>>> > same across a couple releases, we mark it
>>>>>>>>>> >
>>>>>>>>>> > A concrete proposal to achieve this would be:
>>>>>>>>>> >
>>>>>>>>>> >  - Add a @Stable annotation and use it as appropriate on our
>>>>>>>>>> primary APIs
>>>>>>>>>> >  - [Possibly] add an @Evolving annotation that would also be
>>>>>>>>>> the default.
>>>>>>>>>> >  - Remove most `@Experimental` annotations or change them to
>>>>>>>>>> `@Evolving`
>>>>>>>>>> >  - Communicate about this (somehow). If possible, surface the
>>>>>>>>>> > `@Evolving` default in documentation.
>>>>>>>>>> >
>>>>>>>>>> > The last bit is the hardest.
>>>>>>>>>> >
>>>>>>>>>> > Kenn
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>

Reply via email to