Hi Chesnay,

I think that there are two things we are discussing here:
1. The API stability story we WANT to have.
2. The API stability guarantees we CAN have.

We can only design for what we want. Good API stability with affordable
maintenance overhead does demand careful design from the high level
architecture to specific APIs. I also believe that the proposed stability
guarantees are achievable with good practices. I understand the
concern that there might be some existing code that makes it more difficult
for us to get where we want to be. In that case, I think we should discuss
how to improve the code, instead of compromising on what we want.

So I think it is valuable to bring up the parts of the code that
block us and see whether we can fix them.

> On the current 2.0 agenda is potentially dropping support for Java 8/11,
> which may very well be a problem for our current users.

Java 8 support was deprecated in May 2022 with release 1.15. Assuming 2.0
is released by the end of 2024, that gives users a migration window of
about 2.5 years. If our survey shows that most people have migrated off
Java 8, I think it is reasonable to drop Java 8 support. Doing so basically
means users on Java 8 will be stuck on 1.x and will not get new features,
because all the new features are going to be in the 2.x releases.
Personally I think this is reasonable, given the long migration window.

> Technically yes, but look at how long it took to get us to 2.0. ;)
>
> There's a separate discussion to be had on the cadence of major releases
> going forward, and there seem to be different opinions on that.
>
> If we take the Kafka example of 2 minor releases between major ones, that
> for us means that users have to potentially deal with breaking changes
> every 6 months, which seems like a lot.
>
> Given our track record I would prefer a regular cycle (1-2 years) to force
> us to think about this whole topic, and not put it again to the wayside and
> giving us (and users) a clear expectation on when breaking changes can be
> made.
>
> But again, maybe this should be in a separate thread.
>
I agree it makes sense for us to review, on a regular cycle, whether a
major release is needed.


> For a concrete example, consider the job submission. A few releases back
> we made changes such that the initialization of the job master happens
> asynchronously.
> This meant the job submission call returns sooner, and the job state enum
> was extended to cover this state.
> API-wise we consider this a compatible change, but the observed behavior
> may be different.
> Metrics are another example; I believe over time we changed what some
> metrics returned a few times.


For the job submission example, here's what I think we should do:
1. If we consider the dispatcher gateway a public API as well, introduce a
new versioned RPC method submitJobV2() in the DispatcherGateway for the
async submission. Otherwise, we can change the RPC method in place, maybe
with an option for async or not.
2. On the client side, add a separate submitAsync() method while
keeping the original synchronous API. Whether the sync API should be
removed or not is debatable, as users may want to block on it to fail fast
in case of some failures. The implementations of submit() and submitAsync()
can potentially share most of the code.
3. Depending on how the JobStatus enum is exposed to the users, we may or
may not need to bump the API version of the related APIs as well. For
example, if we assume that users only query the job status after submit() /
submitAsync() returns, then we don't need to do anything, because the
existing users only invoke submit(), which only returns after the job
status becomes CREATED. Therefore the new status of INITIALIZING is not
exposed to them. On the other hand, if we think users might query the job
status before submit() / submitAsync() returns, then we may need to create
RestfulApi.requestJobStatusV2(), which may return the INITIALIZING status,
while we make sure RestfulApi.requestJobStatus() keeps the current behavior
and does not return the INITIALIZING status to the users.
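
To make points 2 and 3 a bit more concrete, here is a minimal Java sketch.
Every name in it (FakeCluster, JobClientSketch, requestJobStatusV2, ...) is
made up for illustration and is not the actual Flink client or REST API.
Mapping INITIALIZING to CREATED in the V1 call is also only one assumed way
of keeping the old observable behavior.

```java
import java.util.concurrent.CompletableFuture;

// Hypothetical sketch only; none of these types exist in Flink under these names.
enum JobStatus { INITIALIZING, CREATED, RUNNING }

// Stand-in for the cluster side: a job is accepted first and initialized later.
class FakeCluster {
    private volatile JobStatus status = JobStatus.INITIALIZING;
    private final CompletableFuture<Void> initialized = new CompletableFuture<>();

    // Returns immediately; the future completes once initialization is done.
    CompletableFuture<Void> accept() { return initialized; }

    // Simulates the asynchronous job master initialization finishing.
    void finishInitialization() {
        status = JobStatus.CREATED;
        initialized.complete(null);
    }

    JobStatus rawStatus() { return status; }
}

class JobClientSketch {
    private final FakeCluster cluster;

    JobClientSketch(FakeCluster cluster) { this.cluster = cluster; }

    // New API: does not block; the future completes after initialization.
    CompletableFuture<JobStatus> submitAsync() {
        return cluster.accept().thenApply(ignored -> cluster.rawStatus());
    }

    // Original API: same code path, but blocks until initialization is done.
    JobStatus submit() { return submitAsync().join(); }

    // V1 status call: never exposes INITIALIZING (assumed mapping to CREATED).
    JobStatus requestJobStatus() {
        JobStatus s = cluster.rawStatus();
        return s == JobStatus.INITIALIZING ? JobStatus.CREATED : s;
    }

    // V2 status call: exposes the new state to callers that opted in.
    JobStatus requestJobStatusV2() { return cluster.rawStatus(); }
}

class SubmitSketchDemo {
    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        JobClientSketch client = new JobClientSketch(cluster);
        CompletableFuture<JobStatus> pending = client.submitAsync();
        System.out.println("V2 sees: " + client.requestJobStatusV2());
        System.out.println("V1 sees: " + client.requestJobStatus());
        cluster.finishInitialization();
        System.out.println("submitAsync completed with: " + pending.join());
    }
}
```

In this sketch submit() is just submitAsync() plus a blocking join, which is
the "share most of the code" idea from point 2, and existing callers of the
V1 status call never observe the new enum value.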

Does this introduce maintenance overhead? Sure. But this is what Kafka has
been doing for the past 10 years. If you check the Kafka protocol guide[1],
it has all the versions of all the RPC requests/responses. Therefore the
client side behavior can be kept the same. Is it affordable? From my
experience, once you have this pattern set up, the maintenance overhead is
not that high.

> Metrics are another example; I believe over time we changed what some
> metrics returned a few times.

For metrics it is usually easier. We just need to add a new metric
and deprecate the previous one.
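
As a rough sketch of that pattern: the registry class and the metric names
below are hypothetical and do not correspond to Flink's metric API. The old
metric keeps its name and its old semantics, the new semantics get a new
name, and the registry remembers which metric replaces which.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.LongSupplier;

// Hypothetical registry; stands in for whatever metric group a component uses.
class MetricRegistrySketch {
    private final Map<String, LongSupplier> gauges = new LinkedHashMap<>();
    private final Map<String, String> replacements = new LinkedHashMap<>();

    // Registers a regular gauge under the given name.
    void gauge(String name, LongSupplier supplier) {
        gauges.put(name, supplier);
    }

    // Keeps the old gauge readable under its old name and records its successor.
    void deprecatedGauge(String oldName, String replacement, LongSupplier supplier) {
        gauges.put(oldName, supplier);
        replacements.put(oldName, replacement);
    }

    long read(String name) { return gauges.get(name).getAsLong(); }

    // Returns the replacement metric name, or null if the metric is not deprecated.
    String replacementFor(String name) { return replacements.get(name); }
}

class MetricSketchDemo {
    public static void main(String[] args) {
        long queued = 3;    // example values, not real measurements
        long inFlight = 2;

        MetricRegistrySketch registry = new MetricRegistrySketch();
        // Old metric: unchanged semantics (queued items only), now deprecated.
        registry.deprecatedGauge("queueSize", "queueSizeWithInFlight", () -> queued);
        // New metric: the changed semantics live under a new name.
        registry.gauge("queueSizeWithInFlight", () -> queued + inFlight);

        System.out.println("queueSize = " + registry.read("queueSize"));
        System.out.println("queueSizeWithInFlight = " + registry.read("queueSizeWithInFlight"));
        System.out.println("queueSize is replaced by " + registry.replacementFor("queueSize"));
    }
}
```

Dashboards built on the old name keep working until the deprecated metric
is removed in a later major version.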

Thanks again for raising these examples. This is a good discussion, as we
are getting to some root causes of our hesitation about API stability.

Thanks,

Jiangjie (Becket) Qin


On Fri, Jun 16, 2023 at 10:13 AM Xintong Song <tonysong...@gmail.com> wrote:

>> Public API is a well defined common concept, and one of its
>> conventions is that it only changes with a major version change.
>>
>
> I agree. And from my understanding, demoting a Public API is also a kind
> of such change, just like removing one, which can only happen with major
> version bumps. I'm not proposing to allow demoting Public APIs anytime, but
> only in the case major version bumps happen before reaching the
> 2-minor-release migration period. Actually, demoting would be a weaker
> change compared to removing the API immediately upon major version bumps,
> in order to keep the commitment about the 2-minor-release migration period.
> If the concern is that `@Public` -> `@PublicEvolving` sounds against
> conventions, we may introduce a new annotation if necessary, e.g.,
> `@PublicRetiring`, to avoid confusions.
>
>> But it should be
>> completely OK to bump up the major version if we really want to get rid of
>> a public API, right?
>>
>
> I'm not sure about this. Yes, it's completely "legal" that we bump up the
> major version whenever a breaking change is needed. However, this also
> weakens the value of the commitment that public APIs will stay stable
> within the major release series, as the series can end anytime. IMHO, short
> major release series are not something "make the end users happy", but
> backdoors that allow us as the developers to make frequent breaking
> changes. On the contrary, with the demoting approach, we can still have
> longer major release series, while only allowing Public APIs deprecated at
> the end of the previous major version to be removed in the next major
> version.
>
>> Given our track record I would prefer a regular cycle (1-2 years) to
>> force us to think about this whole topic, and not put it again to the
>> wayside and giving us (and users) a clear expectation on when breaking
>> changes can be made.
>>
>
> +1. I personally think 2-3 years would be a good time for new major
> versions, or longer if there's no breaking changes needed. That makes 1-2
> year a perfect time to revisit the topic, while leaving us more time to
> prepare the major release if needed.
>
> Best,
>
> Xintong
>
>
>
> On Thu, Jun 15, 2023 at 10:09 PM Chesnay Schepler <ches...@apache.org>
> wrote:
>
>> On 13/06/2023 17:26, Becket Qin wrote:
>> > It would be valuable if we can avoid releasing minor versions for
>> > previous major versions.
>>
>> On paper, /absolutely/ agree, but I'm not sure how viable that is in
>> practice.
>>
>> On the current 2.0 agenda is potentially dropping support for Java 8/11,
>> which may very well be a problem for our current users.
>>
>>
>> On 13/06/2023 17:26, Becket Qin wrote:
>> > Thanks for the feedback and sorry for the confusion about Public API
>> > deprecation. I just noticed that there was a mistake in the NOTES part
>> > for Public API due to a copy-paste error... I just fixed it.
>> I'm very relieved to hear that. Glad to hear that we are on the same
>> page on that note.
>>
>>
>> On 15/06/2023 15:20, Becket Qin wrote:
>> > But it should be
>> > completely OK to bump up the major version if we really want to get rid
>> > of a public API, right?
>>
>> Technically yes, but look at how long it took to get us to 2.0. ;)
>>
>> There's a separate discussion to be had on the cadence of major releases
>> going forward, and there seem to be different opinions on that.
>>
>> If we take the Kafka example of 2 minor releases between major ones,
>> that for us means that users have to potentially deal with breaking
>> changes every 6 months, which seems like a lot.
>>
>> Given our track record I would prefer a regular cycle (1-2 years) to
>> force us to think about this whole topic, and not put it again to the
>> wayside and giving us (and users) a clear expectation on when breaking
>> changes can be made.
>>
>> But again, maybe this should be in a separate thread.
>>
>> On 14/06/2023 11:37, Becket Qin wrote:
>> > Do you have an example of behavioral change in mind? Not sure I fully
>> > understand the concern for behavioral change here.
>>
>> This could be a lot of things. It can be performance in certain
>> edge-cases, a bug fix that users (maybe unknowingly) relied upon
>> (https://xkcd.com/1172/), a semantic change to some API.
>>
>> For a concrete example, consider the job submission. A few releases back
>> we made changes such that the initialization of the job master happens
>> asynchronously.
>> This meant the job submission call returns sooner, and the job state
>> enum was extended to cover this state.
>> API-wise we consider this a compatible change, but the observed behavior
>> may be different.
>>
>> Metrics are another example; I believe over time we changed what some
>> metrics returned a few times.
>>
>
