Quick idea - can we add a label in Jira ("API" or something like that) and then
Josh can filter on it in the bi-weekly report? In the meantime, if there are
big changes that people think need a DISCUSS thread, they can
always open one. I will be happy to help with the mentioned filter/report.
Also +1 on having a Contributing doc with broader discussion and directions
around APIs.
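
A minimal sketch of the label-and-filter idea, assuming a hypothetical label
named `api-change` and a 14-day reporting window. The label name, the helper
names and the sample issue keys are all made up for illustration; nothing here
is an agreed project convention or an existing report script.

```python
from datetime import date, timedelta

# Hypothetical label name; the thread has not settled on one.
API_LABEL = "api-change"


def build_jql(project, label, days):
    """Build a JQL query for labelled issues updated in the last `days` days."""
    return (
        f'project = {project} AND labels = "{label}" '
        f"AND updated >= -{days}d ORDER BY updated DESC"
    )


def filter_by_label(issues, label, since):
    """Keep issues that carry `label` and were updated on or after `since`."""
    return [
        issue
        for issue in issues
        if label in issue["labels"] and issue["updated"] >= since
    ]


# Made-up sample data standing in for whatever the report already fetches.
sample_issues = [
    {"key": "EXAMPLE-1", "labels": ["api-change"], "updated": date(2022, 12, 1)},
    {"key": "EXAMPLE-2", "labels": [], "updated": date(2022, 12, 2)},
    {"key": "EXAMPLE-3", "labels": ["api-change"], "updated": date(2022, 10, 1)},
]

report_date = date(2022, 12, 5)
window_start = report_date - timedelta(days=14)
recent_api_changes = filter_by_label(sample_issues, API_LABEL, window_start)
```

The JQL string could be pasted into a saved Jira filter, while the in-memory
filter shows the same selection applied to issues already fetched for the
report.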

On Mon, 5 Dec 2022 at 8:32, Benedict <bened...@apache.org> wrote:

> I would be fine with a formal API change review period prior to release,
> but if we go that route people should expect to have to revisit work they
> completed a while back, and there should be no presumption that decisions
> taken without a DISCUSS thread should be preferred to alternative
> suggestions - and we should have a clear policy of reverting any work if it
> is not revisited based on the outcome of any discussion, since seeking
> broader input earlier was always an option. I expect this approach could
> lead to frustration, but it might actually be a better system than separate
> DISCUSS threads as the changes can be considered holistically.
>
> The idea that a DISCUSS thread for each change would be burdensome, however,
> is I think mistaken. Even if 70 were the true figure, it would have been
> around one per week, and they could easily have been batched. I’d also be
> fine with white listing some changes (eg JMX and warning messages) - but
> definitely not virtual tables or CQL. These APIs develop strong user
> dependencies, and are very hard to change.
>
> We should not restrict input on our main user experiences to the handful
> of people with time to closely monitor Jira, most of whom are not even
> users of Cassandra. We should be seeking the broadest visibility, including
> casual observers and non-contributors.
>
> On 5 Dec 2022, at 13:05, Paulo Motta <pauloricard...@gmail.com> wrote:
>
> It feels like a bit of overkill to me to require the addition of any new
> virtual table/JMX/configuration/knob to go through a DISCUSS thread. If this
> would have required 70 threads for the previous release, I think it would
> easily become spammy and counter-productive.
>
> I think the burden should be on the maintainers to keep up with changes
> being added to the database and chime in on any areas they feel responsible
> for, as has been the case and has worked relatively well.
>
> I think it makes sense to look into improving visibility of API changes,
> so people can more easily review a summary of API changes versus reading
> through the whole changelog (perhaps we need a summarized API change log?).
>
> It would also help to have more explicit guidelines on what kinds of API
> changes are riskier and might require additional visibility via a DISCUSS
> thread.
>
> Also, would it make sense to introduce a new API review stage during
> release validation, and agree to revert/update any API changes that may be
> controversial that were not caught during normal review?
>
> On Mon, 5 Dec 2022 at 06:49 Andrés de la Peña <adelap...@apache.org>
> wrote:
>
>> Indeed that contribution policy should be clearer and not be on a page
>> titled code style, thanks for bringing that up.
>>
>> If we consider all those things APIs, and additions are also considered
>> changes that require a DISCUSS thread, it turns out that almost any
>> not-bugfix ticket would require a mail list thread. In fact, if one goes
>> through CHANGES.txt it's easy to see that most entries would have required
>> a DISCUSS thread.
>>
>> I think that such a strict policy would only make us lose agility and
>> increase the burden of almost any contribution. After all, it's not that
>> changes without a DISCUSS thread happen in secret. Changes are publicly
>> visible on their tickets, those tickets are notified on Slack so anyone can
>> jump into the ticket discussions and set themselves as reviewers, and
>> reviewers can ask for DISCUSS threads whenever they think more opinions or
>> broader consensus are needed.
>>
>> Also, a prior DISCUSS thread is not going to prevent changes from
>> being questioned later. We have seen changes that are proposed,
>> discussed and approved as CEPs, reviewed for weeks or months, and finally
>> committed, and still they are questioned shortly after that cycle, and
>> asked to be changed or discussed again. I don't think that an avalanche of
>> DISCUSS threads is going to improve that, since usually the problem is that
>> people don't have the time for deeply looking into the changes when they
>> are happening. I doubt that more notification channels are going to improve
>> that.
>>
>> Of course I'm not saying that there should never be DISCUSS threads before
>> starting a change. Probably we can all agree that major changes and things
>> that break compatibility would need previous discussion.
>>
>> On Mon, 5 Dec 2022 at 10:16, Benjamin Lerer <ble...@apache.org> wrote:
>>
>>> Thanks for opening this thread Josh,
>>>
>>> It seems perfectly normal to me that for important changes or questions
>>> we raise some discussion to the mailing list.
>>>
>>> My understanding is that the current proposal implies that for the 4.1
>>> release we would have had to raise over 70 discussion threads.
>>> We have a minimum of 2 committers required for every patch. Should we not
>>> trust them to update nodetool, the virtual tables or other things on their
>>> own?
>>>
>>> There are already multiple ways to track changes in specific
>>> code areas. I am personally tracking the areas in which I am the most
>>> involved this way and I know that a lot of people do the same.
>>>
>>> To be transparent, it is not clear to me what the underlying issue is.
>>> Do we have some specific cases that illustrate the underlying problem?
>>> Thrift and JMX are from a different time in my opinion.
>>>
>>> On Mon, 5 Dec 2022 at 08:09, Berenguer Blasi <berenguerbl...@gmail.com>
>>> wrote:
>>>
>>>> +1 to moving that into its own section outside the coding style page.
>>>>
>>>> Dinesh I also thought in terms of backward compatibility here. But
>>>> notice the discussion is about _any change_ to the API such as adding new
>>>> CQL functions. Would adding or changing an exception type or a user warning
>>>> qualify for a DISCUSS thread also? I wonder if we're talking ourselves into
>>>> opening a DISCUSS for almost every ticket and something easy to miss.
>>>>
>>>> I wonder, you guys know the code better, if 'public APIs' could be
>>>> matched to a reasonable set of files (cql parsing, yaml, etc) and have
>>>> Jenkins send an email when changes are detected on them. Overkill? Bad
>>>> idea? :thinking:...
>>>> On 4/12/22 1:14, Dinesh Joshi wrote:
>>>>
>>>> We should also very clearly list out what is considered a public API.
>>>> The current statement that we have is insufficient:
>>>>
>>>> public APIs, including CQL, virtual tables, JMX, yaml, system
>>>> properties, etc.
>>>>
>>>>
>>>> The guidance on treatment of public APIs should also move out of "Code
>>>> Style" page as it isn't strictly related to code style. Backward
>>>> compatibility of public APIs is a best practice & project policy.
>>>>
>>>>
>>>> On Dec 2, 2022, at 2:08 PM, Benedict <bened...@apache.org> wrote:
>>>>
>>>> I think some of that text also got garbled by mixing up how you
>>>> approach internal APIs and external APIs. We should probably clarify that
>>>> there are different burdens for each. Which is all my fault as the
>>>> formulator. I remember it being much clearer in my head.
>>>>
>>>> My view is the same as yours Josh. Evolving the database’s public APIs
>>>> is something that needs community consensus. The more visibility these
>>>> decisions get, the better the final outcome (usually). Even small API
>>>> changes need to be carefully considered to ensure the API evolves
>>>> coherently, and this is particularly true for something as complex and
>>>> central as CQL.
>>>>
>>>> A DISCUSS thread is a good forcing function to think about what you’re
>>>> trying to achieve and why, and to provide others a chance to spot potential
>>>> flaws, alternatives and interactions with work you may not be aware of.
>>>>
>>>> It would be nice if there were an easy rubric for whether something
>>>> needs feedback, but I don’t think there is. One person’s obvious
>>>> change may be another’s obvious problem. So I think any decision that binds
>>>> the project going forwards should have a lazy consensus DISCUSS thread at
>>>> least.
>>>>
>>>> I don’t think it needs to be burdensome though - trivial API changes
>>>> could begin while the DISCUSS thread is underway, expecting they usually
>>>> won’t raise a murmur.
>>>>
>>>> On 2 Dec 2022, at 19:25, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>
>>>> Came up this morning / afternoon in dev slack:
>>>> https://the-asf.slack.com/archives/CK23JSY2K/p1669981168190189
>>>>
>>>> The gist of it: we're lacking clarity on whether the expectation on the
>>>> project is to hit the dev ML w/a [DISCUSS] thread on _any_ API modification
>>>> or only on modifications where the author feels they are adjusting a
>>>> paradigm / strategy for an API.
>>>>
>>>> The code style section on Public APIs is actually a little unclear:
>>>> https://cassandra.apache.org/_/development/code_style.html
>>>>
>>>> Public APIs
>>>>
>>>> These considerations are especially important for public APIs, including 
>>>> CQL, virtual tables, JMX, yaml, system properties, etc. Any planned 
>>>> additions must be carefully considered in the context of any existing 
>>>> APIs. Where possible the approach of any existing API should be followed. 
>>>> Where the existing API is poorly suited, a strategy should be developed to 
>>>> modify or replace the existing API with one that is more coherent in light 
>>>> of the changes - which should also carefully consider any planned or 
>>>> expected future changes to minimise churn. Any strategy for modifying APIs 
>>>> should be brought to dev@cassandra.apache.org for discussion.
>>>>
>>>>
>>>> My .02:
>>>> 1. We should rename that page to a "code contribution guide" as
>>>> discussed on the slack thread
>>>> 2. *All* publicly facing API changes (tool output, CQL semantics, JMX,
>>>> vtables, .java interfaces targeting user extension, etc) should hit the dev
>>>> ML w/a [DISCUSS] thread.
>>>>
>>>> This takes the burden of trying to determine if a change is consistent
>>>> w/existing strategy or not etc. off the author in isolation and allows devs
>>>> to work concurrently on API changes w/out risk of someone else working on
>>>> something that may inform their work or vice versa.
>>>>
>>>> We've learned that APIs are *really really hard* to deprecate,
>>>> disruptive to our users when we change or remove them, and can cause
>>>> serious pain and ecosystem fragmentation when changed. See: Thrift, current
>>>> discussions about JMX, etc. They're the definition of a "one-way-door"
>>>> decision and represent a long-term maintenance burden commitment from the
>>>> project.
>>>>
>>>> Lastly, I'd expect the vast majority of these discuss threads to be
>>>> quick consensus checks resolved via lazy consensus or after some slight
>>>> discussion; ideally this wouldn't represent a huge burden of coordination
>>>> on folks working on changes.
>>>>
>>>> So that's 1 opinion. What other opinions are out there?
>>>>
>>>> ~Josh
>>>>
>>>>
>>>>
