The governance
https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
says "Correcting typos, docs, website, and comments etc operate a “Commit
Then Review
<https://www.apache.org/foundation/glossary.html#CommitThenReview>” policy"

Kind Regards,
Brandon


On Thu, May 1, 2025 at 3:21 PM Jon Haddad <j...@rustyrazorblade.com> wrote:

> I propose we encourage committers to commit docs without review or a JIRA.
>
>
> On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> Stefan,
>>
>> Any feature developed for Cassandra is a collaborative effort.  The
>> public branches of accord have been available for months.  Have you
>> contributed to the accord docs?  You've had plenty of opportunity.
>>
>> It looks like you're turning your own thread into an airing of
>> grievances.  It's not particularly constructive.  You can be right (we need
>> more docs) without being hostile.
>>
>> Jon
>>
>>
>>
>>
>> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>> Yeah, no surprise, I expected the discussion would go in this direction.
>>> I am not completely sure who we are developing this for then. I see
>>> statements like this and I am pretty disappointed:
>>>
>>>
>>>
>>> "The project obviously aims to serve end users, but the developer
>>> community is the actual project and it is fine to serve that demographic
>>> first, or only. "
>>>
>>>
>>>
>>> What is the actual difference between working in a private fork and
>>> publicly publishing code that almost nobody understands?
>>>
>>>
>>> I get that people work for companies etc. but really, we should reflect
>>> quite hard on what we are doing here.
>>>
>>>
>>>
>>> Let's take Accord, for example. I cannot see any justification for
>>> working on something for 3 years and not documenting how to use it
>>> when it really comes to it. Is Accord documentation for users on the
>>> way or not? Is the documentation for CEP-45 going to be done or not? How
>>> are operators other than the authors of that change, whom one can count
>>> on one hand (and who work in one company), supposed to know how to use it?
>>> What is "open source" about that, except for it being online and publicly
>>> accessible?
>>>
>>>
>>>
>>> Over the last couple of years Cassandra has been getting more and more
>>> complex, which might alienate even the developers working on it daily.
>>> People might get out of touch with all of the new features being rolled out,
>>> and if this trend continues without documenting them along the way, I am
>>> very afraid that Cassandra will become an exclusive, self-serving club of
>>> elite programmers that an average / beginner user has no way to catch up
>>> with, consultants will not know how to consult on it, and so on. How is
>>> this good for anybody?
>>>
>>>
>>>
>>> What I am asking for is really not rocket science and nothing "rigid".
>>> I am open to lowering the requirements.
>>>
>>>
>>>
>>> I am not asking for the whole CEP to be rephrased and presented to a user.
>>> I am asking for a description of the most common usages and scenarios, with
>>> the most important consequences and all configuration parameters.
>>>
>>>
>>>
>>> Let's take a look at CEP-37.
>>>
>>>
>>>
>>> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>>>
>>>
>>>
>>> This is just wonderful and it is an example of how it should be done.
>>>
>>>
>>>
>>> Why can this not be done for other CEPs too? What was different about
>>> CEP-37, where the docs were written together with the code, such that it
>>> cannot be done similarly for other CEPs as well?
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>>
>>>
>>> *From: *Benedict <bened...@apache.org>
>>> *Date: *Thursday, 1 May 2025 at 14:37
>>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> *Cc: *Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <
>>> stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <
>>> dev@cassandra.apache.org>
>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>> releasing them
>>>
>>> *EXTERNAL EMAIL - USE CAUTION when clicking links or attachments *
>>>
>>>
>>>
>>> I am opposed to this. There’s too much imprecision in the “rule” while
>>> simultaneously being much too rigid, and it will be improperly enforced (we
>>> already have lots of rule breaking around modifying public APIs, that
>>> should have discuss threads and do not, for instance). This kind of
>>> arbitrary rule that is unaligned with contributors will likely lead to
>>> bad and inconsistent documentation, which is worse than no documentation.
>>>
>>>
>>>
>>> We could perhaps stipulate that for a feature to leave experimental
>>> status the community must vote and that documentation should be a
>>> consideration. But this will only capture big changes.
>>>
>>>
>>>
>>> We could perhaps try other ideas like moratoriums on contributions that
>>> are not documentation, to encourage improvements there.
>>>
>>>
>>>
>>> We could perhaps try having LLMs generate documentation that new
>>> contributors could take a first pass at editing for correctness, before a
>>> committer takes a final pass.
>>>
>>>
>>>
>>> At the end of the day though, we’re an OSS project and we do have
>>> features (big and small) designed, implemented and likely only used by the
>>> sole contributor of the feature. We also have features used primarily by
>>> active community members who understand it well enough. I don’t think this
>>> is a bug in the system. The project obviously aims to serve end users, but
>>> the developer community is the actual project and it is fine to serve that
>>> demographic first, or only.
>>>
>>>
>>>
>>> I agree we want to improve our documentation, but this is not the right
>>> way to go about it.
>>>
>>>
>>>
>>> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <
>>> dev@cassandra.apache.org> wrote:
>>>
>>> 
>>>
>>> I am not completely sure LLMs are the way to go here. Sure, to have
>>> something to further refine ... why not. But to just generate something
>>> via an LLM and commit that would be a no-no from me. These things can
>>> hallucinate quite quickly, then what? Who is going to proofread
>>> technical stuff like that? Fixing the hallucinations might take more time
>>> than just writing it from scratch.
>>>
>>>
>>>
>>> Anyway, I would really appreciate if we stayed on track and discussed
>>> the proposition mentioned in my first email - the end goal is to codify the
>>> need to provide documentation together with the feature. If not provided
>>> together, it might be in a separate ticket which will be a blocker for the
>>> next release.
>>>
>>>
>>>
>>> I might initiate the voting thread for that ...
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>>
>>>
>>> *From: *Rolo, Carlos <carlos.r...@netapp.com>
>>> *Date: *Thursday, 1 May 2025 at 12:30
>>> *To: *David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <
>>> dev@cassandra.apache.org>
>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>> releasing them
>>>
>>> I am a bit out of the loop on how/if this would extend to driver
>>> sub-projects.
>>>
>>> Because this makes 100% sense, and in the driver space as well. Looking
>>> into the Java driver docs and making the others similar would be great.
>>>
>>>
>>>
>>> Patrich, that LLM suggestion might be a lifesaver, let me try that!
>>> ------------------------------
>>>
>>> *From:* Miklosovic, Stefan via dev <dev@cassandra.apache.org>
>>> *Sent:* 01 May 2025 08:07
>>> *To:* David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <
>>> dev@cassandra.apache.org>
>>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> *Subject:* Re: [DISCUSS] Requirement to document features before
>>> releasing them
>>>
>>>
>>>
>>>
>>> Denser is better. In your oversimplified example of Accord, as a user
>>> who encounters this for the first time, I am definitely interested in what
>>> the limitations are. What might happen quite easily is that if it is not
>>> dense and we just announce it sparsely, then a user takes it all at face
>>> value, and if it starts to diverge from your proclamation then they might
>>> feel like they were lied to or become disappointed. You get
>>> me? Users do not like surprises that they discover themselves while
>>> trying it out (and often painfully). They just want to know what
>>> they are buying into.
>>>
>>>
>>>
>>> If there are super-corner-case details, those might be omitted as we have
>>> other channels of communication (Slack, mailing list ...) but in
>>> general I do not see how a lot of documentation would be bad.
>>>
>>>
>>>
>>> It also depends on who you are writing that documentation for. As said,
>>> we are talking about user-facing docs here. Documentation for developers,
>>> where we are trying to bootstrap them / to orient them in the code base, is
>>> going to be substantially different from user-facing documentation.
>>>
>>>
>>>
>>>
>>>
>>> *From: *David Capwell <dcapw...@apple.com>
>>> *Date: *Wednesday, 30 April 2025 at 23:35
>>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>> releasing them
>>>
>>>
>>> I wonder at what level we can enforce this.  What I mean is, in modeling
>>> testing I have found some odd behaviors that people were not aware of
>>> (BATCH cell resolution, NULL handling (emptiness…), etc.)… so if
>>> documentation is dense this can help force people to think through edge
>>> cases or how 2 features interact with each other… If documentation is
>>> sparse, then you lose this benefit…
>>>
>>>
>>>
>>> Simple example for Accord
>>>
>>>
>>>
>>> # Sparse
>>>
>>>
>>>
>>> Multiple key transaction support, bringing Apache Cassandra clusters to
>>> the RDBMS world!
>>>
>>>
>>>
>>> # Dense
>>>
>>>
>>>
>>> …
>>>
>>>
>>>
>>> Here are the current limitations, …
>>>
>>>
>>>
>>> Here is where we alter Apache Cassandra’s behavior to be more in line
>>> with SQL, ...
>>>
>>>
>>>
>>> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <
>>> dev@cassandra.apache.org> wrote:
>>>
>>>
>>>
>>>
>>>
>>> To extend the first e-mail to cover the practicalities:
>>>
>>>
>>>
>>>    1. changes introduced to nodetool would not be part of this because
>>>    they are self-documenting (the help docs are autogenerated)
>>>    2. changes to cassandra.yaml are already covered, as
>>>    that is autogenerated and on the website as well.
>>>    3. Applying common sense, if it is enough to just mention it in
>>>    NEWS.txt, that is also fine.
>>>    4. metrics - I bet there are some which are not documented; we
>>>    should find a way to autogenerate them onto the website.
>>>
>>>
>>>
>>> I am also to blame, and to show I am not a hypocrite: I have never
>>> delivered in-depth user documentation for CEP-24 with examples, use cases,
>>> and so on. I am trying to be more aware of the documentation when
>>> delivering features, to raise awareness about it, etc. It is easy not to
>>> think about this too much when developers are in a rush. If
>>> there were a hard requirement for documentation, I would have done it right
>>> away and I would not need to deal with this now.
>>>
>>>
>>>
>>> I understand that when delivering heavyweights like CEP-15 we cannot
>>> expect that all the docs will be done upon delivery, but I want to stress
>>> that providing usable documentation should definitely be something
>>> to think about when releasing it. The same goes for all other non-trivial
>>> features.
>>>
>>>
>>>
>>>
>>>
>>> *From: *Josh McKenzie <jmcken...@apache.org>
>>> *Date: *Wednesday, 30 April 2025 at 22:11
>>> *To: *dev <dev@cassandra.apache.org>
>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>> releasing them
>>>
>>>
>>> This makes intuitive sense to me.
>>>
>>>
>>>
>>> In our case we could tie documentation to the process of promoting a
>>> feature from “experimental” to production ready, though I fear that might
>>> leave wiggle room for primary authors of some features to leave them as
>>> experimental forever, not desiring to take on the burden of documenting
>>> something that’s already merged in and usable by experts.
>>>
>>>
>>>
>>> Curious what others think.
>>>
>>>
>>>
>>> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>>>
>>> I am at OpenSearchCon and there was a discussion about the documentation
>>> of features. In a nutshell, the policy they seem to have is that there are
>>> some minimal requirements for documentation in place for each feature
>>> introduced. That way, there is no way (or the chance is greatly minimised)
>>> that a feature would be released or some user-facing change introduced
>>> without any documentation on how to use it.
>>>
>>>
>>>
>>> By "documentation", in our case, I mean the docs which would end
>>> up in the cassandra.apache.org docs.
>>>
>>>
>>>
>>> In their case, the documentation is either part of the change or there
>>> is a documentation issue (in GitHub terms) created which basically blocks
>>> the release when not addressed.
>>>
>>>
>>>
>>> When there is no documentation about a feature or improvement, knob to
>>> tweak etc, there is virtually nobody who knows about that except the
>>> person who committed the code / people who participated in a review. I
>>> think this is detrimental to the project. I do not see the point in
>>> releasing something undocumented when the only people who know what is
>>> going on are the ones who wrote it.
>>>
>>>
>>>
>>> If somebody argued that we have them in CHANGES.txt and NEWS.txt:
>>> neither ends up on the website, and I do not think they are appropriate
>>> vehicles for user-facing documentation or for anything beyond a few sentences.
>>>
>>>
>>>
>>>
>>> Could we introduce a policy which would require developers to introduce
>>> at least minimal user-facing documentation (if applicable) before
>>> delivering it / before releasing it and it would be part of the reviews?
>>>
>>>
>>>
>>>
>>> For now, while we do add documentation, I feel it is a "best-effort"
>>> approach; it is not part of the official policy when delivering a feature.
>>>
>>>
>>>
>>> As of now, I cannot see any information about documentation among the "For
>>> Code Contributions" points:
>>>
>>>
>>>
>>>
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>>
>>>
>>>
>>> I would like to add a new point there:
>>>
>>>
>>>
>>> Code must not be committed when user-facing functionality is not
>>> documented and visible without code inspection.
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>>
