> The lower the bar to participation on this stuff the better, as it isn’t
> deeply technical and we have lots of lurkers here that have relevant
> experience and knowledge that can chime in with valuable feedback if given
> the chance

Good point. Separate thread for API Revisions is the lowest bar for participation w/greatest inclusivity I've seen in the thread; I'm in favor of that on those grounds.
On Mon, Dec 5, 2022, at 4:58 PM, Benedict wrote:
>
> My view is simply that API discussion should be undertaken in a broader forum
> than Jira. The lower the bar to participation on this stuff the better, as it
> isn’t deeply technical and we have lots of lurkers here that have relevant
> experience and knowledge that can chime in with valuable feedback if given
> the chance. The transaction syntax thread demonstrated this with participants
> we do not often see.
>
> Jira is exceptionally noisy, and most discussion is not particularly
> important for broad consumption - implementation decisions can always be
> revisited, and are anyway fine to be suboptimal. So quite rationally most
> folk do not pay attention.
>
> We want their input for this stuff though. So, yes, I think we should ideally
> keep API discussions to this list, preferably exclusively (modulo preparatory
> discussions), as that retains a high signal:noise funnel of information for
> people to subscribe to and participate in at low cost.
>
> Links to a list of API topics in the status email just don’t serve this
> purpose.
>
>> On 5 Dec 2022, at 16:15, Josh McKenzie <jmcken...@apache.org> wrote:
>>
>>> can we add a label in Jira API or something like that and then Josh can
>>> filter those in the bi-weekly report?
>>
>>> I do not personally think the project status thread is the right venue for
>>> this, though a periodic dedicated “API Changes” thread might not be a bad
>>> approach.
>> My understanding of Ekaterina's suggestion is that I add a [Pending API
>> Changes] section to the email thread every 2-3 weeks w/everything that's
>> opened within that time frame with that tag. Batch processing by interested
>> parties periodically rather than bulk batch processing at the end of a
>> release cycle.
>>
>> I'm happy to take that on and I think that model could work.
>> I'm also happy with a DISCUSS thread once a week on some pending API change
>> someone has in flight; to me that doesn't seem like too big a burden,
>> especially if a solid majority of them are lazy consensus noops.
>>
>>> Would adding or changing an exception type or a user warning qualify for a
>>> DISCUSS thread also?
>> I'm an emphatic "Yes" on this. I've had to use some poorly fit exception
>> types in the past because of legacy coupling / commitment with the existing
>> APIs that make it harder both for us to work on things in the codebase and
>> harder for our users to understand what we're trying to communicate with
>> them. Further, users have Splunk queries and other automated parsing set up
>> to check logs for specific exceptions and warning texts; changing those
>> things breaks an unknown number of downstream consumers and creates toil, or
>> worse, introduces outages, for users.
>>
>> On Mon, Dec 5, 2022, at 10:34 AM, Andrés de la Peña wrote:
>>>> This doesn’t seem like the right approach to me, but if we do not come to
>>>> some policy approach here, I will try to schedule some time each quarter
>>>> to scan for topics I think should have had a DISCUSS thread, and open them
>>>> up for discussion.
>>>
>>> That after-the-fact review approach doesn't seem very dependent on whether
>>> we have a DISCUSS thread for every change or not, since those supervision
>>> scans will probably still happen. After all, anyone can ask for
>>> modifications on anything at any moment, previously agreed or not. For
>>> example, the CEP for guardrails was publicly discussed, voted, approved,
>>> reviewed and committed with nested config and a global enable flag, and
>>> still we had to revert those two things shortly after commit.
>>>
>>> On Mon, 5 Dec 2022 at 15:12, Ekaterina Dimitrova <e.dimitr...@gmail.com>
>>> wrote:
>>>> “I do not personally think the project status thread is the right venue
>>>> for this, though a periodic dedicated “API Changes” thread might not be a
>>>> bad approach.”
>>>> Just to clarify, I am not suggesting we use the status mail to respond
>>>> with concerns. It is more like having a link to the filter in the bi-weekly
>>>> report, so that everyone can open it, check the list of tickets, and
>>>> comment directly on the tickets or open a thread if they think the issue
>>>> deserves one.
>>>> Same as having a link to the tickets that are blockers or tickets that need
>>>> reviewers. Whoever wants will have an easy way to get to those and take
>>>> whatever actions they want.
>>>>
>>>> On Mon, 5 Dec 2022 at 10:07, Benedict <bened...@apache.org> wrote:
>>>>>
>>>>> Perhaps you misunderstand my concern? I think these decisions need
>>>>> broader input, not just my input.
>>>>>
>>>>> Are you therefore asking why I do not monitor these topics and propose
>>>>> DISCUSS threads based on activities others are undertaking? This doesn’t
>>>>> seem like the right approach to me, but if we do not come to some policy
>>>>> approach here, I will try to schedule some time each quarter to scan for
>>>>> topics I think should have had a DISCUSS thread, and open them up for
>>>>> discussion.
>>>>>
>>>>> I do not personally think the project status thread is the right venue
>>>>> for this, though a periodic dedicated “API Changes” thread might not be a
>>>>> bad approach.
>>>>>
>>>>>> On 5 Dec 2022, at 14:16, Benjamin Lerer <b.le...@gmail.com> wrote:
>>>>>>
>>>>>> Benedict, I am confused. If you are so concerned about virtual
>>>>>> tables or CQL, why do you not track changes to those components directly?
>>>>>> People usually label them correctly, I believe. That way you would be
>>>>>> able to provide feedback straight away rather than after the fact.
>>>>>> It would be a win for everybody, no?
>>>>>>
>>>>>> On Mon, 5 Dec 2022 at 15:10, Ekaterina Dimitrova <e.dimitr...@gmail.com>
>>>>>> wrote:
>>>>>>> Quick idea - can we add a label in Jira API or something like that and
>>>>>>> then Josh can filter those in the bi-weekly report? In the meantime, if
>>>>>>> there are big changes that people consider they need a DISCUSS thread
>>>>>>> for, they can always open one. I will be happy to help with the
>>>>>>> mentioned filter/report.
>>>>>>> Also +1 on having a Contributing doc with broader discussion and
>>>>>>> directions around API
>>>>>>>
>>>>>>> On Mon, 5 Dec 2022 at 8:32, Benedict <bened...@apache.org> wrote:
>>>>>>>>
>>>>>>>> I would be fine with a formal API change review period prior to
>>>>>>>> release, but if we go that route people should expect to have to
>>>>>>>> revisit work they completed a while back, and there should be no
>>>>>>>> presumption that decisions taken without a DISCUSS thread should be
>>>>>>>> preferred to alternative suggestions - and we should have a clear
>>>>>>>> policy of reverting any work if it is not revisited based on the
>>>>>>>> outcome of any discussion, since seeking broader input earlier was
>>>>>>>> always an option. I expect this approach could lead to frustration,
>>>>>>>> but it might actually be a better system than separate DISCUSS threads
>>>>>>>> as the changes can be considered holistically.
>>>>>>>>
>>>>>>>> The idea that a DISCUSS thread for each change would be burdensome
>>>>>>>> however is I think mistaken. Even if 70 were the true figure, it would
>>>>>>>> have been around one per week, and they could easily have been
>>>>>>>> batched. I’d also be fine with whitelisting some changes (eg JMX and
>>>>>>>> warning messages) - but definitely not virtual tables or CQL. These
>>>>>>>> APIs develop strong user dependencies, and are very hard to change.
>>>>>>>>
>>>>>>>> We should not restrict input on our main user experiences to the
>>>>>>>> handful of people with time to closely monitor Jira, most of whom are
>>>>>>>> not even users of Cassandra. We should be seeking the broadest
>>>>>>>> visibility, including casual observers and non-contributors.
>>>>>>>>
>>>>>>>>> On 5 Dec 2022, at 13:05, Paulo Motta <pauloricard...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> It feels like a bit of overkill to me to require the addition of any new
>>>>>>>>> virtual table/JMX/configuration/knob to go through a discuss thread. If
>>>>>>>>> this would require 70 threads for the previous release I think this would
>>>>>>>>> easily become spammy and counter-productive.
>>>>>>>>>
>>>>>>>>> I think the burden should be on the maintainers to keep up with
>>>>>>>>> changes being added to the database and chime in on any areas they feel
>>>>>>>>> responsible for, as has been the case and has worked relatively
>>>>>>>>> well.
>>>>>>>>>
>>>>>>>>> I think it makes sense to look into improving visibility of API
>>>>>>>>> changes, so people can more easily review a summary of API changes
>>>>>>>>> versus reading through the whole changelog (perhaps we need a
>>>>>>>>> summarized API change log?).
>>>>>>>>>
>>>>>>>>> It would also help to have more explicit guidelines on what kinds of
>>>>>>>>> API changes are riskier and might require additional visibility via
>>>>>>>>> a DISCUSS thread.
>>>>>>>>>
>>>>>>>>> Also, would it make sense to introduce a new API review stage during
>>>>>>>>> release validation, and agree to revert/update any API changes that
>>>>>>>>> may be controversial that were not caught during normal review?
>>>>>>>>>
>>>>>>>>> On Mon, 5 Dec 2022 at 06:49 Andrés de la Peña <adelap...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>> Indeed that contribution policy should be clearer and not be on a
>>>>>>>>>> page titled code style, thanks for bringing that up.
>>>>>>>>>>
>>>>>>>>>> If we consider all those things APIs, and additions are also
>>>>>>>>>> considered changes that require a DISCUSS thread, it turns out that
>>>>>>>>>> almost any non-bugfix ticket would require a mailing list thread. In
>>>>>>>>>> fact, if one goes through CHANGES.txt it's easy to see that most
>>>>>>>>>> entries would have required a DISCUSS thread.
>>>>>>>>>>
>>>>>>>>>> I think that such a strict policy would only make us lose agility
>>>>>>>>>> and increase the burden of almost any contribution. After all, it's
>>>>>>>>>> not that changes without a DISCUSS thread happen in secret. Changes
>>>>>>>>>> are publicly visible on their tickets, those tickets are notified on
>>>>>>>>>> Slack so anyone can jump into the ticket discussions and set
>>>>>>>>>> themselves as reviewers, and reviewers can ask for DISCUSS threads
>>>>>>>>>> whenever they think more opinions or broader consensus are needed.
>>>>>>>>>>
>>>>>>>>>> Also, a previous DISCUSS thread is not going to prevent changes
>>>>>>>>>> from being questioned later. We have seen changes that
>>>>>>>>>> are proposed, discussed and approved as CEPs, reviewed for weeks or
>>>>>>>>>> months, and finally committed, and still they are questioned shortly
>>>>>>>>>> after that cycle, and asked to be changed or discussed again. I
>>>>>>>>>> don't think that an avalanche of DISCUSS threads is going to improve
>>>>>>>>>> that, since usually the problem is that people don't have the time
>>>>>>>>>> for deeply looking into the changes when they are happening. I doubt
>>>>>>>>>> that more notification channels are going to improve that.
>>>>>>>>>>
>>>>>>>>>> Of course I'm not saying that there should never be DISCUSS threads
>>>>>>>>>> before starting a change. Probably we can all agree that major
>>>>>>>>>> changes and things that break compatibility would need previous
>>>>>>>>>> discussion.
>>>>>>>>>>
>>>>>>>>>> On Mon, 5 Dec 2022 at 10:16, Benjamin Lerer <ble...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>> Thanks for opening this thread Josh,
>>>>>>>>>>>
>>>>>>>>>>> It seems perfectly normal to me that for important changes or
>>>>>>>>>>> questions we raise some discussion to the mailing list.
>>>>>>>>>>>
>>>>>>>>>>> My understanding of the current proposal implies that for the 4.1
>>>>>>>>>>> release we should have had to raise over 70 discussion threads.
>>>>>>>>>>> We have a minimum of 2 committers required for every patch. Should
>>>>>>>>>>> we not trust them to update nodetool, the virtual tables or other
>>>>>>>>>>> things on their own?
>>>>>>>>>>>
>>>>>>>>>>> There are already multiple existing ways to track changes in
>>>>>>>>>>> specific code areas. I am personally tracking the areas in which I
>>>>>>>>>>> am the most involved this way and I know that a lot of people do
>>>>>>>>>>> the same.
>>>>>>>>>>>
>>>>>>>>>>> To be transparent, it is not clear to me what the underlying issue
>>>>>>>>>>> is. Do we have some specific cases that illustrate the underlying
>>>>>>>>>>> problem? Thrift and JMX are from a different time in my opinion.
>>>>>>>>>>>
>>>>>>>>>>> On Mon, 5 Dec 2022 at 08:09, Berenguer Blasi
>>>>>>>>>>> <berenguerbl...@gmail.com> wrote:
>>>>>>>>>>>> +1 to moving that into its own section outside the coding style
>>>>>>>>>>>> page.
>>>>>>>>>>>>
>>>>>>>>>>>> Dinesh, I also thought in terms of backward compatibility here. But
>>>>>>>>>>>> notice the discussion is about _any change_ to the API such as
>>>>>>>>>>>> adding new CQL functions. Would adding or changing an exception
>>>>>>>>>>>> type or a user warning qualify for a DISCUSS thread also? I wonder
>>>>>>>>>>>> if we're talking ourselves into opening a DISCUSS for almost every
>>>>>>>>>>>> ticket and something easy to miss.
>>>>>>>>>>>>
>>>>>>>>>>>> I wonder, you guys know the code better, if 'public APIs' could be
>>>>>>>>>>>> matched to a reasonable set of files (cql parsing, yaml, etc) and
>>>>>>>>>>>> have Jenkins send an email when changes are detected on them.
>>>>>>>>>>>> Overkill? Bad idea? :thinking:...
>>>>>>>>>>>>
>>>>>>>>>>>> On 4/12/22 1:14, Dinesh Joshi wrote:
>>>>>>>>>>>>> We should also very clearly list out what is considered a public
>>>>>>>>>>>>> API. The current statement that we have is insufficient:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> public APIs, including CQL, virtual tables, JMX, yaml, system
>>>>>>>>>>>>>> properties, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The guidance on treatment of public APIs should also move out of
>>>>>>>>>>>>> "Code Style" page as it isn't strictly related to code style.
>>>>>>>>>>>>> Backward compatibility of public APIs is a best practice &
>>>>>>>>>>>>> project policy.
>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Dec 2, 2022, at 2:08 PM, Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think some of that text also got garbled by mixing up how you
>>>>>>>>>>>>>> approach internal APIs and external APIs. We should probably
>>>>>>>>>>>>>> clarify that there are different burdens for each. Which is all
>>>>>>>>>>>>>> my fault as the formulator. I remember it being much clearer in
>>>>>>>>>>>>>> my head.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My view is the same as yours Josh. Evolving the database’s
>>>>>>>>>>>>>> public APIs is something that needs community consensus. The
>>>>>>>>>>>>>> more visibility these decisions get, the better the final
>>>>>>>>>>>>>> outcome (usually). Even small API changes need to be carefully
>>>>>>>>>>>>>> considered to ensure the API evolves coherently, and this is
>>>>>>>>>>>>>> particularly true for something as complex and central as CQL.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> A DISCUSS thread is a good forcing function to think about what
>>>>>>>>>>>>>> you’re trying to achieve and why, and to provide others a chance
>>>>>>>>>>>>>> to spot potential flaws, alternatives and interactions with work
>>>>>>>>>>>>>> you may not be aware of.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It would be nice if there were an easy rubric for whether
>>>>>>>>>>>>>> something needs feedback, but I don’t think there is. One
>>>>>>>>>>>>>> person’s obvious change may be another’s obvious problem. So I
>>>>>>>>>>>>>> think any decision that binds the project going forwards should
>>>>>>>>>>>>>> have a lazy consensus DISCUSS thread at least.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don’t think it needs to be burdensome though - trivial API
>>>>>>>>>>>>>> changes could begin while the DISCUSS thread is underway,
>>>>>>>>>>>>>> expecting they usually won’t raise a murmur.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On 2 Dec 2022, at 19:25, Josh McKenzie <jmcken...@apache.org>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Came up this morning / afternoon in dev slack:
>>>>>>>>>>>>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The gist of it: we're lacking clarity on whether the
>>>>>>>>>>>>>>> expectation on the project is to hit the dev ML w/a [DISCUSS]
>>>>>>>>>>>>>>> thread on _any_ API modification or only on modifications where
>>>>>>>>>>>>>>> the author feels they are adjusting a paradigm / strategy for
>>>>>>>>>>>>>>> an API.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The code style section on Public APIs is actually a little
>>>>>>>>>>>>>>> unclear:
>>>>>>>>>>>>>>> https://cassandra.apache.org/_/development/code_style.html
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Public APIs
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> These considerations are especially important for public APIs,
>>>>>>>>>>>>>>>> including CQL, virtual tables, JMX, yaml, system properties,
>>>>>>>>>>>>>>>> etc.
>>>>>>>>>>>>>>>> Any planned additions must be carefully considered in the
>>>>>>>>>>>>>>>> context of any existing APIs. Where possible the approach of
>>>>>>>>>>>>>>>> any existing API should be followed. Where the existing API is
>>>>>>>>>>>>>>>> poorly suited, a strategy should be developed to modify or
>>>>>>>>>>>>>>>> replace the existing API with one that is more coherent in
>>>>>>>>>>>>>>>> light of the changes - which should also carefully consider
>>>>>>>>>>>>>>>> any planned or expected future changes to minimise churn. Any
>>>>>>>>>>>>>>>> strategy for modifying APIs should be brought to
>>>>>>>>>>>>>>>> dev@cassandra.apache.org for discussion.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My .02:
>>>>>>>>>>>>>>> 1. We should rename that page to a "code contribution guide" as
>>>>>>>>>>>>>>> discussed on the slack thread
>>>>>>>>>>>>>>> 2. *All* publicly facing API changes (tool output, CQL
>>>>>>>>>>>>>>> semantics, JMX, vtables, .java interfaces targeting user
>>>>>>>>>>>>>>> extension, etc) should hit the dev ML w/a [DISCUSS] thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This takes the burden of trying to determine if a change is
>>>>>>>>>>>>>>> consistent w/existing strategy or not etc. off the author in
>>>>>>>>>>>>>>> isolation and allows devs to work concurrently on API changes
>>>>>>>>>>>>>>> w/out risk of someone else working on something that may inform
>>>>>>>>>>>>>>> their work or vice versa.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> We've learned that APIs are *really really hard* to deprecate,
>>>>>>>>>>>>>>> disruptive to our users when we change or remove them, and can
>>>>>>>>>>>>>>> cause serious pain and ecosystem fragmentation when changed.
>>>>>>>>>>>>>>> See: Thrift, current discussions about JMX, etc. They're the
>>>>>>>>>>>>>>> definition of a "one-way-door" decision and represent a
>>>>>>>>>>>>>>> long-term maintenance burden commitment from the project.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Lastly, I'd expect the vast majority of these discuss threads
>>>>>>>>>>>>>>> to be quick consensus checks resolved via lazy consensus or
>>>>>>>>>>>>>>> after some slight discussion; ideally this wouldn't represent a
>>>>>>>>>>>>>>> huge burden of coordination on folks working on changes.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So that's 1 opinion. What other opinions are out there?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> ~Josh