Hi Andrew,


“

An API that is not touched during a "make test" can not be moved

beyond the "in-progress" status. Adding the unit test implies the

commitment from the API contributor to follow the option (5) above

even for "trivial endianness bugs".

“



Absolutely essential! More so, after a change to the API.





Thanks!



--

Regards,

Balaji.





On 5/15/20, 7:20 AM, "vpp-dev@lists.fd.io on behalf of Andrew Yourtchenko" 
<vpp-dev@lists.fd.io on behalf of ayour...@gmail.com> wrote:



    There's a very interesting couple of gerrit changes that just came in

    today that is worth discussing,

    and they are a perfect case study to further clarify the process - so

    I tweaked the subject accordingly..

    The API message itself is relatively minor, but regardless of what is

    agreed that makes a good case study.



    Backstory:



    Once upon a time on Aug 20 2019, commit 053204ab changed

    sw_interface_set_rx_mode.mode from u8 to an enum, but an htonl

    conversion

    function didn't make it there (enums are u32 by default as far as I can 
see).



    This was after the 19.08 branch pull, and it wasn't ever tackled, so

    this (buggy) behavior ended being in 20.01, 20.01.1, and in the

    current 20.05-rc1.



    Fast forward a bit, today I was pinged about the two changes - one for

    master, one for stable/2001:



    https://gerrit.fd.io/r/c/vpp/+/26879 - in master - forces the enum to be a 
u8



    https://gerrit.fd.io/r/c/vpp/+/26915 - in stable/2001 - adds the

    htonl() call and changes the existing ("buggy")

     behavior in the 20.01.2 - thus would silently break any API consumers

    that coded against the previous "buggy" behavior.



    And then we have a question open about stable/2005, which "by the

    letter" can potentially accept only the second approach, since it is

    past the API freeze.



    Additional bit: this API is not touched in make test, so this bug had

    slipped through.



    So there are the following options:



    1) Merge both patches, and treat the 20.05 similar to 20.01, thus

    making a "silent change" in both, but making the master look closer to

    a 19.08 format.



    2) Leave the 20.05 and 20.01 alone with the "buggy" behavior, and

    merge the master patch that shrinks the enum down to 1 byte



    3) Merge the 20.01 and cherry-pick it to master and 2005 - fixing the

    endianness of the u32 enum everywhere, but making an effective "silent

    change" in 20.01&20.05



    4)  merge the patch in master that shrinks the enum down to one byte,

    and cherry-pick it to 20.01 and 20.05 - thus breaking the contract of

   "no api changes" but at least this gets to be explicitly visible early

    on.



    5) under the current proposal, if the API message is defined as

    "production" then there would need to be a new "in-progress" message

    in master with either of the two fixes, the buggy message would need

    to be marked as "deprecated".  And under the current proposal by the

    20.09 the "in-progress" message would become "production", the current

    message would be shown as "deprecated",  to be deleted in 21.01.



    So, the feedback that I would appreciate on the above:



    1) the least worst course of action "right now", for this particular

    issue. I discussed this with Ole and we have different opinions, so I

    would like the actual API users to chime in. Please tell me which is

    the least worst option from your point of view :-)



    2) What is the best course of action in the future. Note, this is also

    the simpler case in that there is a way to trigger a CRC-affecting

    change by forcing the enum to be a u8. What would have been the best

    course of action if it was simply a missing ntohl() for a field that

    noone complained about for 1.5 releases. Can we assume that no-one

    used that API and "fix the representation" ?



    3) would there be an interest in having a sort of registry of "who

    wants to know about things related to each API" - so that if a bug

    like this arises that requires a behavior change to fix, I could poll

    the affected folks and maybe be able to get away with (1) or (3) from

    above ?



    And a tweak to the process (and potentially tooling) with regards to

    going to "production API status":



    An API that is not touched during a "make test" can not be moved

    beyond the "in-progress" status. Adding the unit test implies the

    commitment from the API contributor to follow the option (5) above

    even for "trivial endianness bugs".



    Thoughts ?



    --a









    On 5/14/20, Christian Hopps <cho...@chopps.org> wrote:

    > API stability is borderline critical for a project like VPP I think. Yes, 
it

    > can be used stand-alone, but real value is added by building products on 
top

    > of it.

    >

    > Also important is having an API framework that allows for

    > backward-compatible changes to the API for making improvements.

    >

    > I'm not sure if VPP has the latter, but if there's too much pain with your

    > proposed rules I think the latter should addressed rather than sacrificing

    > the former. So I support your rules.

    >

    > Thanks,

    > Chris.

    >

    >> On May 14, 2020, at 8:28 AM, Ole Troan <otr...@employees.org> wrote:

    >>

    >> Andrew and I have discussed a process around API changes with the goal to

    >> limit the impact on VPP consumers.

    >> One big painpoint in tracking VPP releases for users of the VPP binary 
has

    >> been API changes.

    >>

    >> We want to ensure:

    >> - A production API never changes.

    >>

    >> - A production API can be deprecated with one release notice.

    >> E.g. show_version_2 is introduced in 20.01 and show_version_1 is marked 
as

    >> deprecated.

    >> show_version_1 is marked with "option delete_by = "v20.05";

    >>

    >> - APIs marked as experimental / not released can change with no notice.

    >> Added support for option status = "in_progress";

    >>

    >> This patch:

    >> https://gerrit.fd.io/r/c/vpp/+/26881

    >>

    >> has a tool "crcchecker" that we intend to run as part of the verify job.

    >> It will block modifications to existing production APIs.

    >> make checkstyle-api

    >> It also supports dumping API manifests and make API comparisions across

    >> releases/revisions/commits.

    >>

    >> A production API is defined by having .api semantic major version > 0.

    >> New messages can override that by setting status="in_progress".

    >>

    >> The consequence of this is that developers must maintain two (or more)

    >> versions of an API for some time.

    >> This is obviously painful, but at least it aligns cost and benefit better

    >> than when we forced the API users to do it.

    >>

    >> Looking back from 17.01 onwards, it looks like we on average have done

    >> about 50 backwards incompatible changes per release.

    >> That's ignoring the work on API typing, which have resulted in

    >> significantly more changes.

    >>

    >> The hope is that we can start v20.09 development with this process in

    >> place.

    >>

    >> What do people think?

    >>

    >> Best regards,

    >> Ole

    >

    >


-=-=-=-=-=-=-=-=-=-=-=-
Links: You receive all messages sent to this group.

View/Reply Online (#16417): https://lists.fd.io/g/vpp-dev/message/16417
Mute This Topic: https://lists.fd.io/mt/74228149/21656
Group Owner: vpp-dev+ow...@lists.fd.io
Unsubscribe: https://lists.fd.io/g/vpp-dev/unsub  [arch...@mail-archive.com]
-=-=-=-=-=-=-=-=-=-=-=-

Reply via email to