I propose we encourage committers to commit docs without review or a JIRA.

On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com> wrote:

> Stefan,
>
> Any feature developed for Cassandra is a collaborative effort.  The public
> branches of accord have been available for months.  Have you contributed to
> the accord docs?  You've had plenty of opportunity.
>
> It looks like you're turning your own thread into an airing of
> grievances.  It's not particularly constructive.  You can be right (we need
> more docs) without being hostile.
>
> Jon
>
>
>
>
> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <
> dev@cassandra.apache.org> wrote:
>
>> Yeah, no surprise, I was thinking the discussion would go in this direction. I
>> am not completely sure who we are developing this for then. I see the
>> statements like this and I am pretty disappointed:
>>
>>
>>
>> "The project obviously aims to serve end users, but the developer
>> community is the actual project and it is fine to serve that demographic
>> first, or only. "
>>
>>
>>
>> What is the actual difference between working in a private fork and
>> publicly publishing code that almost nobody understands?
>>
>>
>> I get that people work for companies etc. but really, we should reflect
>> quite hard on what we are doing here.
>>
>>
>>
>> Let's take Accord, for example. I can not see any justification for
>> working on something for 3 years and not documenting how to use that
>> when it really comes to it. Is Accord documentation for users on the way
>> or not? Is the documentation for CEP-45 going to be done or not? How are
>> operators, other than the authors of that change (whom one can count on one
>> hand and who work in one company), supposed to know how to use it? What is
>> "open source" about that, except for it being online and publicly accessible?
>>
>>
>>
>> Over the last couple of years Cassandra has been getting more and more
>> complex, which might alienate even the developers working on it daily.
>> People might get out of touch with all of the new features being rolled
>> out, and if this trend continues without documenting them along the way, I
>> am very afraid that Cassandra will become an exclusive, self-serving club
>> of elite programmers that an average or beginner user has no way to catch
>> up with, consultants will not know how to consult on it, and so on. How is
>> this good for anybody?
>>
>>
>>
>> What I am asking for is really not rocket science and nothing "rigid".
>> I am open to lowering the requirements.
>>
>>
>>
>> I am not asking anyone to rephrase a whole CEP and present it to a user. I
>> am asking for a description of the most common usages and scenarios, with
>> the most important consequences and all configuration parameters.
>>
>>
>>
>> Let's take a look at CEP-37.
>>
>>
>>
>> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>>
>>
>>
>> This is just wonderful, and it is an example of how it should be done.
>>
>>
>>
>> Why can this not be done for other CEPs too? What was different about
>> CEP-37, where the docs were written together with the code, that it cannot
>> be done similarly for other CEPs as well?
>>
>>
>>
>> Regards
>>
>>
>>
>>
>>
>> *From: *Benedict <bened...@apache.org>
>> *Date: *Thursday, 1 May 2025 at 14:37
>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc: *Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <
>> stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <
>> dev@cassandra.apache.org>
>> *Subject: *Re: [DISCUSS] Requirement to document features before
>> releasing them
>>
>> *EXTERNAL EMAIL - USE CAUTION when clicking links or attachments *
>>
>>
>>
>> I am opposed to this. There’s too much imprecision in the “rule” while
>> simultaneously being much too rigid, and it will be improperly enforced (we
>> already have lots of rule breaking around modifying public APIs, that
>> should have discuss threads and do not, for instance). This kind of
>> arbitrary rule that is unaligned with contributors will likely lead to
>> bad and inconsistent documentation, which is worse than no documentation.
>>
>>
>>
>> We could perhaps stipulate that for a feature to leave experimental
>> status the community must vote and that documentation should be a
>> consideration. But this will only capture big changes.
>>
>>
>>
>> We could perhaps try other ideas like moratoriums on contributions that
>> are not documentation, to encourage improvements there.
>>
>>
>>
>> We could perhaps try having LLMs generate documentation that new
>> contributors could take a first pass at editing for correctness, before a
>> committer takes a final pass.
>>
>>
>>
>> At the end of the day though, we’re an OSS project and we do have
>> features (big and small) designed, implemented and likely only used by the
>> sole contributor of the feature. We also have features used primarily by
>> active community members who understand it well enough. I don’t think this
>> is a bug in the system. The project obviously aims to serve end users, but
>> the developer community is the actual project and it is fine to serve that
>> demographic first, or only.
>>
>>
>>
>> I agree we want to improve our documentation, but this is not the right
>> way to go about it.
>>
>>
>>
>> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>
>>
>> I am not completely sure LLMs are the way to go here. Sure, to have something
>> to further refine ... why not. But to just generate something via an LLM and
>> commit that would be a no-no from me. These things can start hallucinating
>> quite quickly, and then what? Who is going to proofread technical material
>> like that? Fixing the hallucinations might take more time than just writing
>> it from scratch.
>>
>>
>>
>> Anyway, I would really appreciate if we stayed on track and discussed the
>> proposition mentioned in my first email - the end goal is to codify the
>> need to provide documentation together with the feature. If not provided
>> together, it might be in a separate ticket which will be a blocker for the
>> next release.
>>
>>
>>
>> I might initiate the voting thread for that ...
>>
>>
>>
>> Regards
>>
>>
>>
>>
>>
>> *From: *Rolo, Carlos <carlos.r...@netapp.com>
>> *Date: *Thursday, 1 May 2025 at 12:30
>> *To: *David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <
>> dev@cassandra.apache.org>
>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject: *Re: [DISCUSS] Requirement to document features before
>> releasing them
>>
>> I am a bit out of the loop on how/if this would extend to driver
>> sub-projects.
>>
>> Because this makes 100% sense, and in the driver space as well. Looking
>> into the Java driver docs and making the others similar would be great.
>>
>>
>>
>> Patrich that LLM suggestion might be a life saver, let me try that!
>> ------------------------------
>>
>> *From:* Miklosovic, Stefan via dev <dev@cassandra.apache.org>
>> *Sent:* 01 May 2025 08:07
>> *To:* David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <
>> dev@cassandra.apache.org>
>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject:* Re: [DISCUSS] Requirement to document features before
>> releasing them
>>
>>
>>
>>
>>
>>
>> Denser is better. In your oversimplified example of Accord, as a user who
>> encounters this for the first time, I am definitely interested in what the
>> limitations are. What might happen quite easily is that if it is not dense
>> and we just announce it sparsely, then a user takes it all at face value, and
>> if it starts to diverge from your proclamation then they might feel like
>> they were lied to, or they become disappointed. You got me? Users do
>> not like surprises that they discover themselves while trying it out (and
>> often painfully). They just want to know what they are buying into.
>>
>>
>>
>> If there are super-corner-case details, those might be omitted, as we have
>> other channels of communication (Slack, mailing list ...), but in
>> general I do not see how a lot of documentation would be bad.
>>
>>
>>
>> It also depends on who you are writing that documentation for. As said, we
>> are talking about user-facing docs here. Documentation for developers, where
>> we are trying to bootstrap them and orient them in the code base, is
>> going to be substantially different from user-facing documentation.
>>
>>
>>
>>
>>
>> *From: *David Capwell <dcapw...@apple.com>
>> *Date: *Wednesday, 30 April 2025 at 23:35
>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject: *Re: [DISCUSS] Requirement to document features before
>> releasing them
>>
>>
>>
>>
>> I wonder at what level we can enforce this.  What I mean is, in model
>> testing I have found some odd behaviors that people were not aware of
>> (BATCH cell resolution, NULL handling (emptiness…), etc.)… so if
>> documentation is dense this can help force people to think through edge
>> cases or how 2 features interact with each other…. If documentation is
>> sparse, then you lose this benefit…
>>
>>
>>
>> Simple example for Accord
>>
>>
>>
>> # Sparse
>>
>>
>>
>> Multiple key transaction support, bringing Apache Cassandra cluster to
>> the RDBMS world!
>>
>>
>>
>> # Dense
>>
>>
>>
>> …
>>
>>
>>
>> Here are the current limitations, …
>>
>>
>>
>> Here is where we alter Apache Cassandra’s behavior to be more in line with
>> SQL, ...
>>
>>
>>
>> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <
>> dev@cassandra.apache.org> wrote:
>>
>>
>>
>>
>>
>> To extend the first e-mail to cover the practicalities:
>>
>>
>>
>>    1. changes introduced to nodetool would not be part of this because
>>    they are self-documented (the help docs are autogenerated)
>>    2. changes to cassandra.yaml are already covered, as that is
>>    autogenerated and published on the website as well.
>>    3. Applying common sense, if it is enough just to mention a change in
>>    NEWS.txt, that is also fine.
>>    4. metrics - I bet there are some which are not documented; we should
>>    find a way to autogenerate them for the website.
>>
>>
>>
>> I am also to blame, and to show I am not a hypocrite: I have never
>> delivered in-depth user documentation of CEP-24 with examples, use cases,
>> and so on. I am trying to be more aware of documentation when
>> delivering features, to raise awareness about it, etc. It is easy not to
>> think about this too much when developers are in a rush. If
>> there had been a hard requirement for documentation, I would have done it
>> right away and I would not need to deal with this now.
>>
>>
>>
>> I understand that when delivering heavy-weights like CEP-15 we can not
>> expect that all the docs will be done upon delivery but I want to stress
>> the fact that providing usable documentation should be definitely something
>> to think about when releasing it. Same goes for all other non-trivial
>> features.
>>
>>
>>
>>
>>
>> *From: *Josh McKenzie <jmcken...@apache.org>
>> *Date: *Wednesday, 30 April 2025 at 22:11
>> *To: *dev <dev@cassandra.apache.org>
>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject: *Re: [DISCUSS] Requirement to document features before
>> releasing them
>>
>>
>>
>>
>> This makes intuitive sense to me.
>>
>>
>>
>> In our case we could tie documentation to the process of promoting a
>> feature from “experimental” to production ready, though I fear that might
>> leave wiggle room for primary authors of some features to leave them as
>> experimental forever, not desiring to take on the burden of documenting
>> something that’s already merged in and usable by experts.
>>
>>
>>
>> Curious what others think.
>>
>>
>>
>> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>>
>> I am at OpenSearchCon and there was a discussion about the documentation
>> of features. In a nutshell, the policy they seem to have is that there are
>> some minimal requirements for documentation in place for each feature
>> introduced. That way, there is no way (or it is greatly minimised) that
>> there would be a feature released or some user-facing change introduced
>> without any documentation on how to use it.
>>
>>
>>
>> By "documentation", in our case, I mean the docs which would end up in
>> the cassandra.apache.org docs.
>>
>>
>>
>> In their case, the documentation is either part of the change or there is
>> a documentation issue (in GitHub terms) created which basically blocks the
>> release when not addressed.
>>
>>
>>
>> When there is no documentation about a feature or improvement, knob to
>> tweak etc, there is virtually nobody who knows about that except the
>> person who committed the code / people who participated in a review. I
>> think this is detrimental to the project. I do not see the point in
>> releasing something undocumented when the only people who know what is
>> going on are the ones who wrote it.
>>
>>
>>
>> If somebody argued that we have them in CHANGES.txt and NEWS.txt, neither
>> ends up on the website and I do not think they are appropriate vehicles for
>> user-facing documentation or for anything beyond a few sentences.
>>
>>
>>
>> Could we introduce a policy which would require developers to introduce
>> at least minimal user-facing documentation (if applicable) before
>> delivering it / before releasing it and it would be part of the reviews?
>>
>>
>>
>> For now, while we do add documentation, I feel it is a "best-effort"
>> approach; it is not part of the official policy when delivering a feature.
>>
>>
>>
>> As of now, I can not see any information about documentation among "For
>> Code Contributions" points:
>>
>>
>>
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>
>>
>>
>> I would like to add a new point there:
>>
>>
>>
>> Code must not be committed when user-facing functionality is not
>> documented and visible without code inspection.
>>
>>
>>
>> Regards
>>
>>
>>
>>
