Quick idea: can we add a label in Jira ("API" or something like that) so that Josh can filter those tickets into the bi-weekly report? In the meantime, if people think a big change needs a DISCUSS thread, they can always open one. I will be happy to help with the filter/report. Also +1 on having a Contributing doc with broader discussion and direction around APIs.
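To make the label-and-filter idea a bit more concrete, here is a rough sketch of the kind of JQL query the report could run. The label name `api-change` and the 14-day window are assumptions for illustration only; nothing in this thread has settled on a label or a reporting window, and `build_api_change_jql` is a hypothetical helper, not an existing tool.

```python
def build_api_change_jql(project="CASSANDRA", label="api-change", days=14):
    """Build a JQL string for tickets carrying `label` updated in the last `days` days.

    The label name and the window are hypothetical defaults, not project policy.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        f'project = {project} AND labels = "{label}" '
        f"AND updated >= -{days}d ORDER BY updated DESC"
    )


if __name__ == "__main__":
    # Paste the printed query into Jira's advanced search to try it out.
    print(build_api_change_jql())
```

The query string can be pasted into Jira's advanced search or saved as a filter, so the report would not need any extra tooling beyond people applying the label.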

On Mon, 5 Dec 2022 at 8:32, Benedict <bened...@apache.org> wrote:

> I would be fine with a formal API change review period prior to release, but if we go that route people should expect to have to revisit work they completed a while back, and there should be no presumption that decisions taken without a DISCUSS thread should be preferred to alternative suggestions - and we should have a clear policy of reverting any work if it is not revisited based on the outcome of any discussion, since seeking broader input earlier was always an option. I expect this approach could lead to frustration, but it might actually be a better system than separate DISCUSS threads as the changes can be considered holistically.
>
> The idea that a DISCUSS thread for each change would be burdensome however is I think mistaken. Even if 70 were the true figure, it would have been around one per week, and they could easily have been batched. I’d also be fine with whitelisting some changes (eg JMX and warning messages) - but definitely not virtual tables or CQL. These APIs develop strong user dependencies, and are very hard to change.
>
> We should not restrict input on our main user experiences to the handful of people with time to closely monitor Jira, most of whom are not even users of Cassandra. We should be seeking the broadest visibility, including casual observers and non-contributors.
>
> On 5 Dec 2022, at 13:05, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> > It feels like a bit of overkill to me to require the addition of any new virtual table/JMX/configuration/knob to go through a DISCUSS thread. If this would have required 70 threads for the previous release, I think it would easily become spammy and counter-productive.
> >
> > I think the burden should be on the maintainer to keep up with changes being added to the database and chime in on any areas they feel responsible for, as has been the case and has worked relatively well.
> >
> > I think it makes sense to look into improving visibility of API changes, so people can more easily review a summary of API changes versus reading through the whole changelog (perhaps we need a summarized API change log?).
> >
> > It would also help to have more explicit guidelines on what kinds of API changes are riskier and might require additional visibility via a DISCUSS thread.
> >
> > Also, would it make sense to introduce a new API review stage during release validation, and agree to revert/update any API changes that may be controversial that were not caught during normal review?
> >
> > On Mon, 5 Dec 2022 at 06:49 Andrés de la Peña <adelap...@apache.org> wrote:
> >
>> Indeed that contribution policy should be clearer and not be on a page titled code style, thanks for bringing that up.
>>
>> If we consider all those things APIs, and additions are also considered changes that require a DISCUSS thread, it turns out that almost any non-bugfix ticket would require a mail list thread. In fact, if one goes through CHANGES.txt it's easy to see that most entries would have required a DISCUSS thread.
>>
>> I think that such a strict policy would only make us lose agility and increase the burden of almost any contribution. After all, it's not that changes without a DISCUSS thread happen in secret. Changes are publicly visible on their tickets, those tickets are notified on Slack so anyone can jump into the ticket discussions and set themselves as reviewers, and reviewers can ask for DISCUSS threads whenever they think more opinions or broader consensus are needed.
>>
>> Also, a previous DISCUSS thread is not going to prevent changes from being questioned later. We have seen changes that are proposed, discussed and approved as CEPs, reviewed for weeks or months, and finally committed, and still they are questioned shortly after that cycle, and asked to be changed or discussed again.
>> I don't think that an avalanche of DISCUSS threads is going to improve that, since usually the problem is that people don't have the time for deeply looking into the changes when they are happening. I doubt that more notification channels are going to improve that.
>>
>> Of course I'm not saying that there should never be DISCUSS threads before starting a change. Probably we can all agree that major changes and things that break compatibility would need previous discussion.
>>
>> On Mon, 5 Dec 2022 at 10:16, Benjamin Lerer <ble...@apache.org> wrote:
>>
>>> Thanks for opening this thread Josh,
>>>
>>> It seems perfectly normal to me that for important changes or questions we raise some discussion to the mailing list.
>>>
>>> My understanding of the current proposal is that for the 4.1 release we would have had to raise over 70 discussion threads. We have a minimum of 2 committers required for every patch. Should we not trust them to update nodetool, the virtual tables or other things on their own?
>>>
>>> There are already multiple existing ways to track changes in specific code areas. I am personally tracking the areas in which I am the most involved this way and I know that a lot of people do the same.
>>>
>>> To be transparent, it is not clear to me what the underlying issue is. Do we have some specific cases that illustrate the underlying problem? Thrift and JMX are from a different time in my opinion.
>>>
>>> On Mon, 5 Dec 2022 at 08:09, Berenguer Blasi <berenguerbl...@gmail.com> wrote:
>>>
>>>> +1 to moving that into its own section outside the coding style page.
>>>>
>>>> Dinesh I also thought in terms of backward compatibility here. But notice the discussion is about _any change_ to the API such as adding new CQL functions. Would adding or changing an exception type or a user warning qualify for a DISCUSS thread also? I wonder if we're talking ourselves into opening a DISCUSS for almost every ticket and something easy to miss.
>>>>
>>>> I wonder, you guys know the code better, if 'public APIs' could be matched to a reasonable set of files (cql parsing, yaml, etc) and have Jenkins send an email when changes are detected on them. Overkill? Bad idea? :thinking:...
>>>>
>>>> On 4/12/22 1:14, Dinesh Joshi wrote:
>>>>
>>>> We should also very clearly list out what is considered a public API. The current statement that we have is insufficient:
>>>>
>>>> > public APIs, including CQL, virtual tables, JMX, yaml, system properties, etc.
>>>>
>>>> The guidance on treatment of public APIs should also move out of the "Code Style" page as it isn't strictly related to code style. Backward compatibility of public APIs is a best practice & project policy.
>>>>
>>>> On Dec 2, 2022, at 2:08 PM, Benedict <bened...@apache.org> wrote:
>>>>
>>>> I think some of that text also got garbled by mixing up how you approach internal APIs and external APIs. We should probably clarify that there are different burdens for each. Which is all my fault as the formulator. I remember it being much clearer in my head.
>>>>
>>>> My view is the same as yours Josh. Evolving the database’s public APIs is something that needs community consensus. The more visibility these decisions get, the better the final outcome (usually). Even small API changes need to be carefully considered to ensure the API evolves coherently, and this is particularly true for something as complex and central as CQL.
>>>>
>>>> A DISCUSS thread is a good forcing function to think about what you’re trying to achieve and why, and to provide others a chance to spot potential flaws, alternatives and interactions with work you may not be aware of.
>>>>
>>>> It would be nice if there were an easy rubric for whether something needs feedback, but I don’t think there is. One person’s obvious change may be another’s obvious problem. So I think any decision that binds the project going forwards should have a lazy consensus DISCUSS thread at least.
>>>>
>>>> I don’t think it needs to be burdensome though - trivial API changes could begin while the DISCUSS thread is underway, expecting they usually won’t raise a murmur.
>>>>
>>>> On 2 Dec 2022, at 19:25, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>
>>>> Came up this morning / afternoon in dev slack:
>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189
>>>>
>>>> The gist of it: we're lacking clarity on whether the expectation on the project is to hit the dev ML w/a [DISCUSS] thread on _any_ API modification or only on modifications where the author feels they are adjusting a paradigm / strategy for an API.
>>>>
>>>> The code style section on Public APIs is actually a little unclear:
>>>> https://cassandra.apache.org/_/development/code_style.html
>>>>
>>>> > Public APIs
>>>> >
>>>> > These considerations are especially important for public APIs, including CQL, virtual tables, JMX, yaml, system properties, etc. Any planned additions must be carefully considered in the context of any existing APIs. Where possible the approach of any existing API should be followed. Where the existing API is poorly suited, a strategy should be developed to modify or replace the existing API with one that is more coherent in light of the changes - which should also carefully consider any planned or expected future changes to minimise churn. Any strategy for modifying APIs should be brought to dev@cassandra.apache.org for discussion.
>>>>
>>>> My .02:
>>>> 1. We should rename that page to a "code contribution guide" as discussed on the slack thread
>>>> 2. *All* publicly facing API changes (tool output, CQL semantics, JMX, vtables, .java interfaces targeting user extension, etc) should hit the dev ML w/a [DISCUSS] thread.
>>>>
>>>> This takes the burden of trying to determine if a change is consistent w/existing strategy or not etc. off the author in isolation and allows devs to work concurrently on API changes w/out risk of someone else working on something that may inform their work or vice versa.
>>>>
>>>> We've learned that APIs are *really really hard* to deprecate, disruptive to our users when we change or remove them, and can cause serious pain and ecosystem fragmentation when changed. See: Thrift, current discussions about JMX, etc. They're the definition of a "one-way-door" decision and represent a long-term maintenance burden commitment from the project.
>>>>
>>>> Lastly, I'd expect the vast majority of these discuss threads to be quick consensus checks resolved via lazy consensus or after some slight discussion; ideally this wouldn't represent a huge burden of coordination on folks working on changes.
>>>>
>>>> So that's 1 opinion. What other opinions are out there?
>>>>
>>>> ~Josh