I propose we encourage committers to commit docs without review or a JIRA.
On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com> wrote:

> Stefan,
>
> Any feature developed for Cassandra is a collaborative effort. The public
> branches of accord have been available for months. Have you contributed to
> the accord docs? You've had plenty of opportunity.
>
> It looks like you're turning your own thread into an airing of
> grievances. It's not particularly constructive. You can be right (we need
> more docs) without being hostile.
>
> Jon
>
> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>
>> Yeah, no surprise, I was thinking the discussion will go this direction. I
>> am not completely sure who we are developing this for then. I see the
>> statements like this and I am pretty disappointed:
>>
>> "The project obviously aims to serve end users, but the developer
>> community is the actual project and it is fine to serve that demographic
>> first, or only."
>>
>> What is the actual difference between working in a private fork and
>> publishing code publicly that almost nobody understands?
>>
>> I get that people work for companies etc. but really, we should reflect
>> quite hard on what we are doing here.
>>
>> Let's take Accord, for example. I can not see any justification for
>> working on something for 3 years and not documenting how to use that
>> when it really comes to it. Is Accord documentation for users on the way
>> or not? Is the documentation for CEP-45 going to be done or not? How are
>> operators outside of the authors of that change one can count on one hand
>> (and working in one company) supposed to know how to use that? What is "open
>> source" about that except for it being online and publicly accessible?
>>
>> Over the last couple years Cassandra is getting more and more complex
>> which might alienate even the developers working on it daily.
>> People might get out of touch with all of the new features being rolled out and
>> if this trend continues without documenting it along the way I am very
>> afraid that Cassandra becomes an exclusive self-serving club of elite
>> programmers an average / beginner user has no way to catch up with,
>> consultants will not know how to consult it and so on and so on. How is
>> this good for anybody?
>>
>> What I am asking for is really not rocket science and nothing "rigid".
>> I am all open to lowering the requirements.
>>
>> I am not asking for rephrasing a whole CEP and presenting it to a user. I am
>> asking for the description of the most common usages and scenarios with
>> most important consequences and all configuration parameters.
>>
>> Let's take a look at CEP-37.
>>
>> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>>
>> This is just wonderful and it is an example of how it should be done.
>>
>> Why can this not be done for other CEPs too? What was different for
>> CEP-37 when docs were written together with the code but it can not be
>> done similarly for other CEPs as well?
>>
>> Regards
>>
>> *From:* Benedict <bened...@apache.org>
>> *Date:* Thursday, 1 May 2025 at 14:37
>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc:* Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>>
>> *EXTERNAL EMAIL - USE CAUTION when clicking links or attachments*
>>
>> I am opposed to this.
>> There’s too much imprecision in the “rule” while
>> simultaneously being much too rigid, and it will be improperly enforced (we
>> already have lots of rule breaking around modifying public APIs, that
>> should have discuss threads and do not, for instance). This kind of
>> arbitrary rule that is unaligned with contributors will likely lead to
>> bad and inconsistent documentation, which is worse than no documentation.
>>
>> We could perhaps stipulate that for a feature to leave experimental
>> status the community must vote and that documentation should be a
>> consideration. But this will only capture big changes.
>>
>> We could perhaps try other ideas like moratoriums on contributions that
>> are not documentation, to encourage improvements there.
>>
>> We could perhaps try having LLMs generate documentation that new
>> contributors could take a first pass at editing for correctness, before a
>> committer takes a final pass.
>>
>> At the end of the day though, we’re an OSS project and we do have
>> features (big and small) designed, implemented and likely only used by the
>> sole contributor of the feature. We also have features used primarily by
>> active community members who understand it well enough. I don’t think this
>> is a bug in the system. The project obviously aims to serve end users, but
>> the developer community is the actual project and it is fine to serve that
>> demographic first, or only.
>>
>> I agree we want to improve our documentation, but this is not the right
>> way to go about it.
>>
>> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>>
>> I am not completely sure LLMs are the way to go here. Sure, to have something
>> to further refine ... why not. But to just generate something via LLM and
>> commit that, that would be a no-no from me. These things can go hallucinate
>> quite quickly, then what? Who is going to proof-read technical stuff
>> like that?
>> Fixing the hallucinations might take more time than just writing
>> it from scratch.
>>
>> Anyway, I would really appreciate it if we stayed on track and discussed the
>> proposition mentioned in my first email - the end goal is to codify the
>> need to provide documentation together with the feature. If not provided
>> together, it might be in a separate ticket which will be a blocker for the
>> next release.
>>
>> I might initiate the voting thread for that ...
>>
>> Regards
>>
>> *From:* Rolo, Carlos <carlos.r...@netapp.com>
>> *Date:* Thursday, 1 May 2025 at 12:30
>> *To:* David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>>
>> I am a bit out of the loop on how/if this would extend to driver
>> sub-projects.
>>
>> Because this makes 100% sense, and in the driver space as well. Looking
>> into Java driver docs and making others similar would be great.
>>
>> Patrich that LLM suggestion might be a life saver, let me try that!
>> ------------------------------
>>
>> *From:* Miklosovic, Stefan via dev <dev@cassandra.apache.org>
>> *Sent:* 01 May 2025 08:07
>> *To:* David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>>
>> Denser is better. In your oversimplified example of Accord, as a user who
>> encounters this for the first time, I am definitely interested in what the
>> limitations are.
>> What might happen quite easily is that if it is not dense
>> and we just announce it sparsely, then a user takes it all at face value and
>> if it starts to diverge from your proclamation then they might feel like
>> they were lied to or they start to be disappointed. You got me? Users do
>> not like surprises they are discovering themselves on the way of trying it
>> out (and a lot of the time painfully). They just want to know what they are
>> buying themselves into.
>>
>> If there are super-cornercase details, that might be omitted as we have
>> other channels of communication (Slack, mailing list ...) but in
>> general I do not see how a lot of documentation would be bad.
>>
>> It also depends on who you are writing that documentation for. As said, we
>> talk about user-facing docs here. Documentation for developers, where we
>> are trying to bootstrap them / to make them oriented in the code base, is
>> going to be substantially different from a user-facing one.
>>
>> *From:* David Capwell <dcapw...@apple.com>
>> *Date:* Wednesday, 30 April 2025 at 23:35
>> *To:* dev@cassandra.apache.org <dev@cassandra.apache.org>
>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>>
>> I wonder at what level can we enforce this. What I mean, in modeling
>> testing I have found some odd behaviors that people were not aware of
>> (BATCH cell resolution, NULL handling (emptiness…..), etc.)… so if
>> documentation is dense this can help force people to think through edge
>> cases or how 2 features interact with each other…. If documentation is
>> sparse, then you lose this benefit…
>>
>> Simple example for Accord
>>
>> # Sparse
>>
>> Multiple key transaction support, bringing Apache Cassandra cluster to
>> the RDBMS world!
>>
>> # Dense
>>
>> …
>>
>> Here are the current limitations, …
>>
>> Here is where we alter Apache Cassandra’s behavior to be more in line with
>> SQL, ...
>>
>> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>>
>> To extend the first e-mail to cover the practicalities:
>>
>> 1. changes introduced to nodetool would not be part of this because
>>    they are self-documented (the help docs are autogenerated)
>> 2. introduction of changes into cassandra.yaml is already covered as
>>    that is what is autogenerated / on website also.
>> 3. Applying common sense, if it is just enough to mention in
>>    NEWS.txt, that is also fine.
>> 4. metrics - I bet there are some which are not documented, we should
>>    find a way to autogenerate them into the website.
>>
>> I am also to blame and, to show I am not a hypocrite, I have never
>> delivered in-depth user documentation of CEP-24 with examples, use cases,
>> and so on. I am trying to be more aware of the documentation when
>> delivering features, to raise awareness about that etc. It is easy to not
>> think about this too much when developers are in a rush and similar. If
>> there was a hard requirement for the documentation, I would do it right
>> away and I would not need to deal with this now.
>>
>> I understand that when delivering heavy-weights like CEP-15 we can not
>> expect that all the docs will be done upon delivery but I want to stress
>> the fact that providing usable documentation should definitely be something
>> to think about when releasing it. Same goes for all other non-trivial
>> features.
>>
>> *From:* Josh McKenzie <jmcken...@apache.org>
>> *Date:* Wednesday, 30 April 2025 at 22:11
>> *To:* dev <dev@cassandra.apache.org>
>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>> *Subject:* Re: [DISCUSS] Requirement to document features before releasing them
>>
>> This makes intuitive sense to me.
>>
>> In our case we could tie documentation to the process of promoting a
>> feature from “experimental” to production ready, though I fear that might
>> leave wiggle room for primary authors of some features to leave them as
>> experimental forever, not desiring to take on the burden of documenting
>> something that’s already merged in and usable by experts.
>>
>> Curious what others think.
>>
>> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>>
>> I am at OpenSearchCon and there was a discussion about the documentation
>> of features. In a nutshell, the policy they seem to have is that there are
>> some minimal requirements for documentation in place for each feature
>> introduced. That way, there is no way (or it is greatly minimised) that
>> there would be a feature released or some user-facing change introduced
>> without any documentation on how to use it.
>>
>> Under the "documentation", in our case, I mean the docs which would end
>> up in cassandra.apache.org docs.
>>
>> In their case, the documentation is either part of the change or there is
>> a documentation issue (in GitHub terms) created which basically blocks the
>> release when not addressed.
>>
>> When there is no documentation about a feature or improvement, knob to
>> tweak etc, there is virtually nobody who knows about that except the
>> person who committed the code / people who participated in a review. I
>> think this is detrimental to the project. I do not see the point in
>> releasing something undocumented when the only people who know what is
>> going on are the ones who wrote it.
>>
>> If somebody argued that we have them in CHANGES.txt and NEWS.txt, neither
>> ends up on the website and I do not think they are appropriate vehicles for
>> user-facing documentation or for anything beyond a few sentences.
>>
>> Could we introduce a policy which would require developers to introduce
>> at least minimal user-facing documentation (if applicable) before
>> delivering it / before releasing it and it would be part of the reviews?
>>
>> For now, while we also add documentation, I feel it is "the best-effort"
>> approach, it is not part of the official policy when delivering it.
>>
>> As of now, I can not see any information about documentation among "For
>> Code Contributions" points:
>>
>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>
>> I am looking to add a new point there:
>>
>> Code must not be committed when user-facing functionality is not
>> documented and visible without code inspection.
>>
>> Regards