The governance document (https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance) says: "Correcting typos, docs, website, and comments etc operate a “Commit Then Review <https://www.apache.org/foundation/glossary.html#CommitThenReview>” policy"
Kind Regards,
Brandon

On Thu, May 1, 2025 at 3:21 PM Jon Haddad <j...@rustyrazorblade.com> wrote:

> I propose we encourage committers to commit docs without review or a JIRA.
>
> On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> Stefan,
>>
>> Any feature developed for Cassandra is a collaborative effort. The public branches of accord have been available for months. Have you contributed to the accord docs? You've had plenty of opportunity.
>>
>> It looks like you're turning your own thread into an airing of grievances. It's not particularly constructive. You can be right (we need more docs) without being hostile.
>>
>> Jon
>>
>> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>>
>>> Yeah, no surprise, I was thinking the discussion would go in this direction. I am not completely sure who we are developing this for then. I see statements like this and I am pretty disappointed:
>>>
>>> "The project obviously aims to serve end users, but the developer community is the actual project and it is fine to serve that demographic first, or only."
>>>
>>> What is the actual difference between working in a private fork and publishing code publicly that almost nobody understands?
>>>
>>> I get that people work for companies etc. but really, we should reflect quite hard on what we are doing here.
>>>
>>> Let's take Accord, for example. I cannot see any justification for working on something for 3 years and not documenting how to use it when it really comes to it. Is Accord documentation for users on the way or not? Is the documentation for CEP-45 going to be done or not? How are operators outside of the authors of that change, whom one can count on one hand (and who work in one company), supposed to know how to use it? What is "open source" about that, except for it being online and publicly accessible?
>>>
>>> Over the last couple of years Cassandra has been getting more and more complex, which might alienate even the developers working on it daily. People might get out of touch with all of the new features being rolled out, and if this trend continues without documenting it along the way I am very afraid that Cassandra becomes an exclusive self-serving club of elite programmers that an average / beginner user has no way to catch up with, consultants will not know how to consult on it, and so on and so on. How is this good for anybody?
>>>
>>> What I am asking for is really not rocket science and nothing "rigid". I am all open to lowering the requirements.
>>>
>>> I am not asking for rephrasing a whole CEP and presenting it to a user. I am asking for a description of the most common usages and scenarios with the most important consequences and all configuration parameters.
>>>
>>> Let's take a look at CEP-37.
>>>
>>> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>>>
>>> This is just wonderful and it is an example of how it should be done.
>>>
>>> Why can this not be done for other CEPs too? What was different for CEP-37, where docs were written together with the code, that it cannot be done similarly for other CEPs as well?
>>>
>>> Regards
>>>
>>> From: Benedict <bened...@apache.org>
>>> Date: Thursday, 1 May 2025 at 14:37
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Cc: Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Subject: Re: [DISCUSS] Requirement to document features before releasing them
>>>
>>> EXTERNAL EMAIL - USE CAUTION when clicking links or attachments
>>>
>>> I am opposed to this. There’s too much imprecision in the “rule” while simultaneously being much too rigid, and it will be improperly enforced (we already have lots of rule breaking around modifying public APIs, that should have discuss threads and do not, for instance). This kind of arbitrary rule that is unaligned with contributors will likely lead to bad and inconsistent documentation, which is worse than no documentation.
>>>
>>> We could perhaps stipulate that for a feature to leave experimental status the community must vote and that documentation should be a consideration. But this will only capture big changes.
>>>
>>> We could perhaps try other ideas like moratoriums on contributions that are not documentation, to encourage improvements there.
>>>
>>> We could perhaps try having LLMs generate documentation that new contributors could take a first pass at editing for correctness, before a committer takes a final pass.
>>>
>>> At the end of the day though, we’re an OSS project and we do have features (big and small) designed, implemented and likely only used by the sole contributor of the feature. We also have features used primarily by active community members who understand it well enough. I don’t think this is a bug in the system.
>>> The project obviously aims to serve end users, but the developer community is the actual project and it is fine to serve that demographic first, or only.
>>>
>>> I agree we want to improve our documentation, but this is not the right way to go about it.
>>>
>>> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>>>
>>> I am not completely sure LLMs are the way to go here. Sure, to have something to further refine ... why not. But to just generate something via LLM and commit that, that would be a no-no from me. These things can hallucinate quite quickly, then what? Who is going to proofread technical stuff like that? Fixing the hallucinations might take more time than just writing it from scratch.
>>>
>>> Anyway, I would really appreciate it if we stayed on track and discussed the proposition mentioned in my first email - the end goal is to codify the need to provide documentation together with the feature. If not provided together, it might be in a separate ticket which will be a blocker for the next release.
>>>
>>> I might initiate the voting thread for that ...
>>>
>>> Regards
>>>
>>> From: Rolo, Carlos <carlos.r...@netapp.com>
>>> Date: Thursday, 1 May 2025 at 12:30
>>> To: David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> Subject: Re: [DISCUSS] Requirement to document features before releasing them
>>>
>>> I am a bit out of the loop on how/if this would extend to driver sub-projects.
>>>
>>> Because this makes 100% sense, and in the driver space as well. Looking into the Java driver docs and making others similar would be great.
>>>
>>> Patrich, that LLM suggestion might be a life saver, let me try that!
>>> ------------------------------
>>>
>>> From: Miklosovic, Stefan via dev <dev@cassandra.apache.org>
>>> Sent: 01 May 2025 08:07
>>> To: David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> Subject: Re: [DISCUSS] Requirement to document features before releasing them
>>>
>>> Denser is better. In your oversimplified example of Accord, as a user who encounters this for the first time, I am definitely interested in what the limitations are. What might happen quite easily is that if it is not dense and we just announce it sparsely, then a user takes it all at face value, and if it starts to diverge from your proclamation then they might feel like they were lied to or they start to be disappointed. You got me? Users do not like surprises they discover themselves while trying it out (and a lot of the time painfully). They just want to know what they are buying themselves into.
>>>
>>> If there are super-corner-case details, those might be omitted as we have other channels of communication (Slack, mailing list ...) but in general I do not see how a lot of documentation would be bad.
>>>
>>> It also depends on who you are writing that documentation for. As said, we talk about user-facing docs here. Documentation for developers, where we are trying to bootstrap them / orient them in the code base, is going to be substantially different from a user-facing one.
>>>
>>> From: David Capwell <dcapw...@apple.com>
>>> Date: Wednesday, 30 April 2025 at 23:35
>>> To: dev@cassandra.apache.org <dev@cassandra.apache.org>
>>> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> Subject: Re: [DISCUSS] Requirement to document features before releasing them
>>>
>>> I wonder at what level we can enforce this. What I mean is, in modeling testing I have found some odd behaviors that people were not aware of (BATCH cell resolution, NULL handling (emptiness…..), etc.)… so if documentation is dense this can help force people to think through edge cases or how 2 features interact with each other…. If documentation is sparse, then you lose this benefit…
>>>
>>> Simple example for Accord
>>>
>>> # Sparse
>>>
>>> Multiple key transaction support, bringing Apache Cassandra cluster to the RDBMS world!
>>>
>>> # Dense
>>>
>>> …
>>>
>>> Here are the current limitations, …
>>>
>>> Here is where we alter Apache Cassandra’s behavior to be more in line with SQL, ...
>>>
>>> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <dev@cassandra.apache.org> wrote:
>>>
>>> To extend the first e-mail to cover the practicalities:
>>>
>>> 1. changes introduced to nodetool would not be part of this because they are self-documented (the help docs are autogenerated)
>>> 2. changes introduced into cassandra.yaml are already covered, as that is autogenerated / on the website as well.
>>> 3. Applying common sense, if it is just enough to mention in NEWS.txt, that is also fine.
>>> 4. metrics - I bet there are some which are not documented, we should find a way to autogenerate them into the website.
>>>
>>> I am also to blame and, to show I am not a hypocrite, I have never delivered in-depth user documentation of CEP-24 with examples, use cases, and so on. I am trying to be more aware of the documentation when delivering features, to raise awareness about that etc. It is easy to not think about this too much when developers are in a rush and similar. If there was a hard requirement for the documentation, I would do it right away and I would not need to deal with this now.
>>>
>>> I understand that when delivering heavy-weights like CEP-15 we cannot expect that all the docs will be done upon delivery, but I want to stress the fact that providing usable documentation should definitely be something to think about when releasing it. Same goes for all other non-trivial features.
>>>
>>> From: Josh McKenzie <jmcken...@apache.org>
>>> Date: Wednesday, 30 April 2025 at 22:11
>>> To: dev <dev@cassandra.apache.org>
>>> Cc: Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>> Subject: Re: [DISCUSS] Requirement to document features before releasing them
>>>
>>> This makes intuitive sense to me.
>>>
>>> In our case we could tie documentation to the process of promoting a feature from “experimental” to production ready, though I fear that might leave wiggle room for primary authors of some features to leave them as experimental forever, not desiring to take on the burden of documenting something that’s already merged in and usable by experts.
>>>
>>> Curious what others think.
>>>
>>> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>>>
>>> I am at OpenSearchCon and there was a discussion about the documentation of features.
>>> In a nutshell, the policy they seem to have is that there are some minimal requirements for documentation in place for each feature introduced. That way, there is no way (or it is greatly minimised) that there would be a feature released or some user-facing change introduced without any documentation on how to use it.
>>>
>>> By "documentation", in our case, I mean the docs which would end up on cassandra.apache.org.
>>>
>>> In their case, the documentation is either part of the change or there is a documentation issue (in GitHub terms) created which basically blocks the release when not addressed.
>>>
>>> When there is no documentation about a feature or improvement, knob to tweak etc, there is virtually nobody who knows about that except the person who committed the code / people who participated in a review. I think this is detrimental to the project. I do not see the point in releasing something undocumented when the only people who know what is going on are the ones who wrote it.
>>>
>>> If somebody argued that we have them in CHANGES.txt and NEWS.txt, neither ends up on the website and I do not think they are appropriate vehicles for user-facing documentation or for anything beyond a few sentences.
>>>
>>> Could we introduce a policy which would require developers to introduce at least minimal user-facing documentation (if applicable) before delivering it / before releasing it, and it would be part of the reviews?
>>>
>>> For now, while we also add documentation, I feel it is a "best-effort" approach; it is not part of the official policy when delivering it.
>>>
>>> As of now, I cannot see any information about documentation among the "For Code Contributions" points:
>>>
>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>>
>>> I am looking to add a new point there:
>>>
>>> Code must not be committed when user-facing functionality is not documented and visible without code inspection.
>>>
>>> Regards