> The lower the bar to participation on this stuff the better, as it isn’t 
> deeply technical and we have lots of lurkers here that have relevant 
> experience and knowledge that can chime in with valuable feedback if given 
> the chance
Good point. Separate thread for API Revisions is the lowest bar for 
participation w/greatest inclusivity I've seen in the thread; I'm in favor of 
that on those grounds.

On Mon, Dec 5, 2022, at 4:58 PM, Benedict wrote:
> 
> My view is simply that API discussion should be undertaken in a broader forum 
> than Jira. The lower the bar to participation on this stuff the better, as it 
> isn’t deeply technical and we have lots of lurkers here that have relevant 
> experience and knowledge that can chime in with valuable feedback if given 
> the chance. The transaction syntax thread demonstrated this with participants 
> we do not often see.
> 
> Jira is exceptionally noisy, and most discussion is not particularly 
> important for broad consumption - implementation decisions can always be 
> revisited, and are anyway fine to be suboptimal. So quite rationally most 
> folk do not pay attention.
> 
> We want their input for this stuff though. So, yes, I think we should ideally 
> keep API discussions to this list, preferably exclusively (modulo preparatory 
> discussions), as that retains a high signal:noise funnel of information for 
> people to subscribe to and participate in at low cost.
> 
> Links to a list of API topics in the status email just don’t serve this 
> purpose.
> 
>> On 5 Dec 2022, at 16:15, Josh McKenzie <jmcken...@apache.org> wrote:
>> 
>>> can we add a label in Jira API or something like that and then Josh can 
>>> filter those in the bi-weekly report?
>> 
>>>  I do not personally think the project status thread is the right venue for 
>>> this, though a periodic dedicated “API Changes” thread might not be a bad 
>>> approach.
>> My understanding of Ekaterina's suggestion is that I add a [Pending API 
>> Changes] section to the email thread every 2-3 weeks w/everything that's 
>> opened within that time frame with that tag. Batch processing by interested 
>> parties periodically rather than bulk batch processing at the end of a 
>> release cycle.
>> 
>> I'm happy to take that on and I think that model could work. I'm also happy 
>> with a DISCUSS thread once a week on some pending API change someone has in 
>> flight; to me that doesn't seem like too big a burden, especially if a solid 
>> majority of them are lazy consensus noops.
>> 
>>> Would adding or changing an exception type or a user warning qualify for a 
>>> DISCUSS thread also?
>> I'm an emphatic "Yes" on this. I've had to use some poorly fit exception 
>> types in the past because of legacy coupling / commitment with the existing 
>> APIs that make it harder both for us to work on things in the codebase and 
>> harder for our users to understand what we're trying to communicate with 
>> them. Further, users have splunk queries and other automated parsing set up 
>> to check logs for specific exceptions and warning texts; changing those 
>> things breaks an unknown number of downstream consumers and creates toil, or 
>> worse, introduces outages, for users.
>> 
>> On Mon, Dec 5, 2022, at 10:34 AM, Andrés de la Peña wrote:
>>>> This doesn’t seem like the right approach to me, but if we do not come to 
>>>> some policy approach here, I will try to schedule some time each quarter 
>>>> to scan for topics I think should have had a DISCUSS thread, and open them 
>>>> up for discussion.
>>> 
>>> That after-the-fact review approach doesn't seem very dependent on whether 
>>> we have a DISCUSS thread for every change or not, since those supervision 
>>> scans will probably still happen. After all anyone can ask for 
>>> modifications on anything at any moment, previously agreed or not. For 
>>> example, the CEP for guardrails was publicly discussed, voted, approved, 
>>> reviewed and committed with nested config and a global enable flag, and 
>>> still we had to revert those two things shortly after commit.
>>> 
>>> On Mon, 5 Dec 2022 at 15:12, Ekaterina Dimitrova <e.dimitr...@gmail.com> 
>>> wrote:
>>>> “ I do not personally think the project status thread is the right venue 
>>>> for this, though a periodic dedicated “API Changes” thread might not be a 
>>>> bad approach.”
>>>> Just to clarify, I am not suggesting that we use the status mail to respond with 
>>>> concerns. But more like - having a link to the filter in the bi-weekly 
>>>> report and then everyone can open it and check the list of tickets and 
>>>> comment directly on the tickets or open a thread if they think the issue 
>>>> deserves one.
>>>> Same as having a link to the tickets that are blockers or tickets that need 
>>>> reviewers. Whoever wants will have an easy way to get to those and take 
>>>> whatever actions they want.
>>>> 
>>>> On Mon, 5 Dec 2022 at 10:07, Benedict <bened...@apache.org> wrote:
>>>>> 
>>>>> Perhaps you misunderstand my concern? I think these decisions need 
>>>>> broader input, not just my input.
>>>>> 
>>>>> Are you therefore asking why I do not monitor these topics and propose 
>>>>> DISCUSS threads based on activities others are undertaking? This doesn’t 
>>>>> seem like the right approach to me, but if we do not come to some policy 
>>>>> approach here, I will try to schedule some time each quarter to scan for 
>>>>> topics I think should have had a DISCUSS thread, and open them up for 
>>>>> discussion.
>>>>> 
>>>>> I do not personally think the project status thread is the right venue 
>>>>> for this, though a periodic dedicated “API Changes” thread might not be a 
>>>>> bad approach.
>>>>> 
>>>>> 
>>>>> 
>>>>>> On 5 Dec 2022, at 14:16, Benjamin Lerer <b.le...@gmail.com> wrote:
>>>>>> 
>>>>>> Benedict, I am confused. If you are so concerned about virtual 
>>>>>> tables or CQL, why do you not track changes to those components directly?  
>>>>>> People usually label them correctly I believe. Like that you would be 
>>>>>> able to provide feedback straight away rather than after the fact. It 
>>>>>> would be a win for everybody, no? 
>>>>>> 
>>>>>> On Mon, 5 Dec 2022 at 15:10, Ekaterina Dimitrova <e.dimitr...@gmail.com> 
>>>>>> wrote:
>>>>>>> Quick idea - can we add a label in Jira API or something like that and 
>>>>>>> then Josh can filter those in the bi-weekly report? In the meantime if 
>>>>>>> there are big changes that people consider they need a DISCUSS thread 
>>>>>>> for they can always open one? I will be happy to help with the 
>>>>>>> mentioned filter/report.
>>>>>>> Also +1 on having a Contributing doc with broader discussion and 
>>>>>>> directions around API
>>>>>>> 
>>>>>>> On Mon, 5 Dec 2022 at 8:32, Benedict <bened...@apache.org> wrote:
>>>>>>>> 
>>>>>>>> I would be fine with a formal API change review period prior to 
>>>>>>>> release, but if we go that route people should expect to have to 
>>>>>>>> revisit work they completed a while back, and there should be no 
>>>>>>>> presumption that decisions taken without a DISCUSS thread should be 
>>>>>>>> preferred to alternative suggestions - and we should have a clear 
>>>>>>>> policy of reverting any work if it is not revisited based on the 
>>>>>>>> outcome of any discussion, since seeking broader input earlier was 
>>>>>>>> always an option. I expect this approach could lead to frustration, 
>>>>>>>> but it might actually be a better system than separate DISCUSS threads 
>>>>>>>> as the changes can be considered holistically.
>>>>>>>> 
>>>>>>>> The idea that a DISCUSS thread for each change would be burdensome 
>>>>>>>> however is I think mistaken. Even if 70 were the true figure, it would 
>>>>>>>> have been around one per week, and they could easily have been 
>>>>>>>> batched. I’d also be fine with white listing some changes (eg JMX and 
>>>>>>>> warning messages) - but definitely not virtual tables or CQL. These 
>>>>>>>> APIs develop strong user dependencies, and are very hard to change.
>>>>>>>> 
>>>>>>>> We should not restrict input on our main user experiences to the 
>>>>>>>> handful of people with time to closely monitor Jira, most of whom are 
>>>>>>>> not even users of Cassandra. We should be seeking the broadest 
>>>>>>>> visibility, including casual observers and non-contributors.
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> On 5 Dec 2022, at 13:05, Paulo Motta <pauloricard...@gmail.com> wrote:
>>>>>>>>> 
>>>>>>>>> It feels like a bit of overkill to me to require addition of any new virtual 
>>>>>>>>> tables/JMX/configuration/knob to go through a discuss thread. If this 
>>>>>>>>> would require 70 threads for the previous release I think this would 
>>>>>>>>> easily become spammy and counter-productive. 
>>>>>>>>> 
>>>>>>>>> I think the burden should be on the maintainer to keep up with 
>>>>>>>>> changes being added to the database and chime in on any areas they feel 
>>>>>>>>> responsible for, as has been the case and has worked relatively 
>>>>>>>>> well.
>>>>>>>>> 
>>>>>>>>> I think it makes sense to look into improving visibility of API 
>>>>>>>>> changes, so people can more easily review a summary of API changes 
>>>>>>>>> versus reading through the whole changelog (perhaps we need a 
>>>>>>>>> summarized API change log?).
>>>>>>>>> 
>>>>>>>>> It would also help to have more explicit guidelines on what kinds of 
>>>>>>>>> API changes are riskier and might require additional  visibility via 
>>>>>>>>> a DISCUSS thread.
>>>>>>>>> 
>>>>>>>>> Also, would it make sense to introduce a new API review stage during 
>>>>>>>>> release validation, and agree to revert/update any API changes that 
>>>>>>>>> may be controversial that were not caught during normal review?
>>>>>>>>> 
>>>>>>>>> On Mon, 5 Dec 2022 at 06:49 Andrés de la Peña <adelap...@apache.org> 
>>>>>>>>> wrote:
>>>>>>>>>> Indeed that contribution policy should be clearer and not be on a 
>>>>>>>>>> page titled code style, thanks for bringing that up.
>>>>>>>>>> 
>>>>>>>>>> If we consider all those things APIs, and additions are also 
>>>>>>>>>> considered changes that require a DISCUSS thread, it turns out that 
>>>>>>>>>> almost any not-bugfix ticket would require a mail list thread. In 
>>>>>>>>>> fact, if one goes through CHANGES.txt it's easy to see that most 
>>>>>>>>>> entries would have required a DISCUSS thread.
>>>>>>>>>> 
>>>>>>>>>> I think that such a strict policy would only make us lose agility 
>>>>>>>>>> and increase the burden of almost any contribution. After all, it's 
>>>>>>>>>> not that changes without a DISCUSS thread happen in secret. Changes 
>>>>>>>>>> are publicly visible on their tickets, those tickets are notified on 
>>>>>>>>>> Slack so anyone can jump into the ticket discussions and set 
>>>>>>>>>> themselves as reviewers, and reviewers can ask for DISCUSS threads 
>>>>>>>>>> whenever they think more opinions or broader consensus are needed.
>>>>>>>>>> 
>>>>>>>>>> Also, a previous DISCUSS thread is not going to prevent any 
>>>>>>>>>> changes from being questioned later. We have seen changes that 
>>>>>>>>>> are proposed, discussed and approved as CEPs, reviewed for weeks or 
>>>>>>>>>> months, and finally committed, and still they are questioned shortly 
>>>>>>>>>> after that cycle, and asked to be changed or discussed again. I 
>>>>>>>>>> don't think that an avalanche of DISCUSS threads is going to improve 
>>>>>>>>>> that, since usually the problem is that people don't have the time 
>>>>>>>>>> for deeply looking into the changes when they are happening. I doubt 
>>>>>>>>>> that more notification channels are going to improve that.
>>>>>>>>>> 
>>>>>>>>>> Of course I'm not saying that there should never be DISCUSS threads 
>>>>>>>>>> before starting a change. Probably we can all agree that major 
>>>>>>>>>> changes and things that break compatibility would need previous 
>>>>>>>>>> discussion.
>>>>>>>>>> 
>>>>>>>>>> On Mon, 5 Dec 2022 at 10:16, Benjamin Lerer <ble...@apache.org> 
>>>>>>>>>> wrote:
>>>>>>>>>>> Thanks for opening this thread Josh,
>>>>>>>>>>> 
>>>>>>>>>>> It seems perfectly normal to me that for important changes or 
>>>>>>>>>>> questions we raise some discussion to the mailing list.
>>>>>>>>>>> 
>>>>>>>>>>> My understanding of the current proposal  implies that for the 4.1 
>>>>>>>>>>> release we should have had to raise over 70 discussion threads.
>>>>>>>>>>> We have a minimum of 2 committers required for every patch. Should 
>>>>>>>>>>> we not trust them to update nodetool, the virtual tables or other 
>>>>>>>>>>> things on their own?  
>>>>>>>>>>> 
>>>>>>>>>>> There are already multiple existing ways to track changes in 
>>>>>>>>>>> specific code areas. I am personally tracking the areas in which I 
>>>>>>>>>>> am the most involved this way and I know that a lot of people do 
>>>>>>>>>>> the same.
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> To be transparent, it is not clear to me what the underlying issue 
>>>>>>>>>>> is. Do we have some specific cases that illustrate the underlying 
>>>>>>>>>>> problem? Thrift and JMX are from a different time in my opinion.  
>>>>>>>>>>> 
>>>>>>>>>>> On Mon, 5 Dec 2022 at 08:09, Berenguer Blasi 
>>>>>>>>>>> <berenguerbl...@gmail.com> wrote:
>>>>>>>>>>>> +1 to moving that into its own section outside the coding style 
>>>>>>>>>>>> page.
>>>>>>>>>>>> 
>>>>>>>>>>>> Dinesh I also thought in terms of backward compatibility here. But 
>>>>>>>>>>>> notice the discussion is about _any change_ to the API such as 
>>>>>>>>>>>> adding new CQL functions. Would adding or changing an exception 
>>>>>>>>>>>> type or a user warning qualify for a DISCUSS thread also? I wonder 
>>>>>>>>>>>> if we're talking ourselves into opening a DISCUSS for almost every 
>>>>>>>>>>>> ticket and something easy to miss.
>>>>>>>>>>>> 
>>>>>>>>>>>> I wonder, you guys know the code better, if 'public APIs' could be 
>>>>>>>>>>>> matched to a reasonable set of files (cql parsing, yaml, etc) and 
>>>>>>>>>>>> have jenkins send an email when changes are detected on them. 
>>>>>>>>>>>> Overkill? bad idea? :thinking:...
>>>>>>>>>>>> 
>>>>>>>>>>>> On 4/12/22 1:14, Dinesh Joshi wrote:
>>>>>>>>>>>>> We should also very clearly list out what is considered a public 
>>>>>>>>>>>>> API. The current statement that we have is insufficient: 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> public APIs, including CQL, virtual tables, JMX, yaml, system 
>>>>>>>>>>>>>> properties, etc. 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> The guidance on treatment of public APIs should also move out of 
>>>>>>>>>>>>> "Code Style" page as it isn't strictly related to code style. 
>>>>>>>>>>>>> Backward compatibility of public APIs is a best practice & 
>>>>>>>>>>>>> project policy.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>>> On Dec 2, 2022, at 2:08 PM, Benedict <bened...@apache.org> wrote:
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I think some of that text also got garbled by mixing up how you 
>>>>>>>>>>>>>> approach internal APIs and external APIs. We should probably 
>>>>>>>>>>>>>> clarify that there are different burdens for each. Which is all 
>>>>>>>>>>>>>> my fault as the formulator. I remember it being much clearer in 
>>>>>>>>>>>>>> my head.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> My view is the same as yours Josh. Evolving the database’s 
>>>>>>>>>>>>>> public APIs is something that needs community consensus. The 
>>>>>>>>>>>>>> more visibility these decisions get, the better the final 
>>>>>>>>>>>>>> outcome (usually). Even small API changes need to be carefully 
>>>>>>>>>>>>>> considered to ensure the API evolves coherently, and this is 
>>>>>>>>>>>>>> particularly true for something as complex and central as CQL. 
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> A DISCUSS thread is a good forcing function to think about what 
>>>>>>>>>>>>>> you’re trying to achieve and why, and to provide others a chance 
>>>>>>>>>>>>>> to spot potential flaws, alternatives and interactions with work 
>>>>>>>>>>>>>> you may not be aware of.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> It would be nice if there were an easy rubric for whether 
>>>>>>>>>>>>>> something needs feedback, but I don’t think there is. One 
>>>>>>>>>>>>>> person’s obvious change may be another’s obvious problem. So I 
>>>>>>>>>>>>>> think any decision that binds the project going forwards should 
>>>>>>>>>>>>>> have a lazy consensus DISCUSS thread at least.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>> I don’t think it needs to be burdensome though - trivial API 
>>>>>>>>>>>>>> changes could begin while the DISCUSS thread is underway, 
>>>>>>>>>>>>>> expecting they usually won’t raise a murmur.
>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> On 2 Dec 2022, at 19:25, Josh McKenzie <jmcken...@apache.org> 
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>  
>>>>>>>>>>>>>>> Came up this morning / afternoon in dev slack: 
>>>>>>>>>>>>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The gist of it: we're lacking clarity on whether the 
>>>>>>>>>>>>>>> expectation on the project is to hit the dev ML w/a [DISCUSS] 
>>>>>>>>>>>>>>> thread on _any_ API modification or only on modifications where 
>>>>>>>>>>>>>>> the author feels they are adjusting a paradigm / strategy for 
>>>>>>>>>>>>>>> an API.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> The code style section on Public APIs is actually a little 
>>>>>>>>>>>>>>> unclear: 
>>>>>>>>>>>>>>> https://cassandra.apache.org/_/development/code_style.html
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> Public APIs
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>>> These considerations are especially important for public APIs, 
>>>>>>>>>>>>>>>> including CQL, virtual tables, JMX, yaml, system properties, 
>>>>>>>>>>>>>>>> etc. Any planned additions must be carefully considered in the 
>>>>>>>>>>>>>>>> context of any existing APIs. Where possible the approach of 
>>>>>>>>>>>>>>>> any existing API should be followed. Where the existing API is 
>>>>>>>>>>>>>>>> poorly suited, a strategy should be developed to modify or 
>>>>>>>>>>>>>>>> replace the existing API with one that is more coherent in 
>>>>>>>>>>>>>>>> light of the changes - which should also carefully consider 
>>>>>>>>>>>>>>>> any planned or expected future changes to minimise churn. Any 
>>>>>>>>>>>>>>>> strategy for modifying APIs should be brought to 
>>>>>>>>>>>>>>>> dev@cassandra.apache.org for discussion.
>>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> My .02:
>>>>>>>>>>>>>>> 1. We should rename that page to a "code contribution guide" as 
>>>>>>>>>>>>>>> discussed on the slack thread
>>>>>>>>>>>>>>> 2. *All* publicly facing API changes (tool output, CQL 
>>>>>>>>>>>>>>> semantics, JMX, vtables, .java interfaces targeting user 
>>>>>>>>>>>>>>> extension, etc) should hit the dev ML w/a [DISCUSS] thread.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> This takes the burden of trying to determine if a change is 
>>>>>>>>>>>>>>> consistent w/existing strategy or not etc. off the author in 
>>>>>>>>>>>>>>> isolation and allows devs to work concurrently on API changes 
>>>>>>>>>>>>>>> w/out risk of someone else working on something that may inform 
>>>>>>>>>>>>>>> their work or vice versa.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> We've learned that APIs are *really really hard* to deprecate, 
>>>>>>>>>>>>>>> disruptive to our users when we change or remove them, and can 
>>>>>>>>>>>>>>> cause serious pain and ecosystem fragmentation when changed. 
>>>>>>>>>>>>>>> See: Thrift, current discussions about JMX, etc. They're the 
>>>>>>>>>>>>>>> definition of a "one-way-door" decision and represent a 
>>>>>>>>>>>>>>> long-term maintenance burden commitment from the project.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Lastly, I'd expect the vast majority of these discuss threads 
>>>>>>>>>>>>>>> to be quick consensus checks resolved via lazy consensus or 
>>>>>>>>>>>>>>> after some slight discussion; ideally this wouldn't represent a 
>>>>>>>>>>>>>>> huge burden of coordination on folks working on changes.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> So that's 1 opinion. What other opinions are out there?
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> ~Josh
>> 
