Oh, well then, that’s perfect. I looked at our contributing to the docs page and found it empty:
https://cassandra.apache.org/_/docdev/index.html

Will merge in my UCS changes tomorrow.

Jon

On Thu, May 1, 2025 at 1:41 PM Brandon Williams <dri...@gmail.com> wrote:

> The governance says "Correcting typos, docs, website, and comments etc
> operate a “Commit Then Review
> <https://www.apache.org/foundation/glossary.html#CommitThenReview>”
> policy"
>
> Kind Regards,
> Brandon
>
> On Thu, May 1, 2025 at 3:21 PM Jon Haddad <j...@rustyrazorblade.com> wrote:
>
>> I propose we encourage committers to commit docs without review or a JIRA.
>>
>> On Thu, May 1, 2025 at 8:06 AM Jon Haddad <j...@rustyrazorblade.com>
>> wrote:
>>
>>> Stefan,
>>>
>>> Any feature developed for Cassandra is a collaborative effort. The
>>> public branches of accord have been available for months. Have you
>>> contributed to the accord docs? You've had plenty of opportunity.
>>>
>>> It looks like you're turning your own thread into an airing of
>>> grievances. It's not particularly constructive. You can be right (we need
>>> more docs) without being hostile.
>>>
>>> Jon
>>>
>>> On Thu, May 1, 2025 at 7:49 AM Miklosovic, Stefan via dev <
>>> dev@cassandra.apache.org> wrote:
>>>
>>>> Yeah, no surprise, I was thinking the discussion would go in this direction.
>>>> I am not completely sure who we are developing this for then. I see
>>>> statements like this and I am pretty disappointed:
>>>>
>>>> "The project obviously aims to serve end users, but the developer
>>>> community is the actual project and it is fine to serve that demographic
>>>> first, or only. "
>>>>
>>>> What is the actual difference between working in a private fork and
>>>> publishing code publicly that almost nobody understands?
>>>>
>>>> I get that people work for companies etc. but really, we should reflect
>>>> quite hard on what we are doing here.
>>>>
>>>> Let's take Accord, for example.
>>>> I cannot see any justification for
>>>> working on something for 3 years and not documenting how to use it
>>>> when it really comes to it. Is Accord documentation for users on the
>>>> way or not? Is the documentation for CEP-45 going to be done or not? How
>>>> are operators, other than the authors of that change whom one can count on
>>>> one hand (and who work in one company), supposed to know how to use it?
>>>> What is "open source" about that except for it being online and publicly
>>>> accessible?
>>>>
>>>> Over the last couple of years Cassandra has been getting more and more
>>>> complex, which might alienate even the developers working on it daily. People
>>>> might get out of touch with all of the new features being rolled out
>>>> and if this trend continues without documenting it along the way I am
>>>> very afraid that Cassandra becomes an exclusive self-serving club of elite
>>>> programmers an average / beginner user has no way to catch up with,
>>>> consultants will not know how to consult on it and so on and so on. How is
>>>> this good for anybody?
>>>>
>>>> What I am asking for is really not rocket science and nothing
>>>> "rigid". I am open to lowering the requirements.
>>>>
>>>> I am not asking to rephrase a whole CEP and present it to a user. I am
>>>> asking for a description of the most common usages and scenarios with
>>>> the most important consequences and all configuration parameters.
>>>>
>>>> Let's take a look at CEP-37.
>>>>
>>>> https://cassandra.apache.org/doc/trunk/cassandra/managing/operating/auto_repair.html
>>>>
>>>> This is just wonderful and it is an example of how it should be done.
>>>>
>>>> Why can this not be done for other CEPs too?
>>>> What was different about
>>>> CEP-37, where docs were written together with the code, such that it cannot
>>>> be done similarly for other CEPs as well?
>>>>
>>>> Regards
>>>>
>>>> *From: *Benedict <bened...@apache.org>
>>>> *Date: *Thursday, 1 May 2025 at 14:37
>>>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>> *Cc: *Rolo, Carlos <carlos.r...@netapp.com>, Miklosovic, Stefan <
>>>> stefan.mikloso...@netapp.com>, dev@cassandra.apache.org <
>>>> dev@cassandra.apache.org>
>>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>>> releasing them
>>>>
>>>> *EXTERNAL EMAIL - USE CAUTION when clicking links or attachments *
>>>>
>>>> I am opposed to this. There’s too much imprecision in the “rule” while
>>>> simultaneously being much too rigid, and it will be improperly enforced (we
>>>> already have lots of rule breaking around modifying public APIs, that
>>>> should have discuss threads and do not, for instance). This kind of
>>>> arbitrary rule that is unaligned with contributors will likely lead to
>>>> bad and inconsistent documentation, which is worse than no documentation.
>>>>
>>>> We could perhaps stipulate that for a feature to leave experimental
>>>> status the community must vote and that documentation should be a
>>>> consideration. But this will only capture big changes.
>>>>
>>>> We could perhaps try other ideas like moratoriums on contributions that
>>>> are not documentation, to encourage improvements there.
>>>>
>>>> We could perhaps try having LLMs generate documentation that new
>>>> contributors could take a first pass at editing for correctness, before a
>>>> committer takes a final pass.
>>>>
>>>> At the end of the day though, we’re an OSS project and we do have
>>>> features (big and small) designed, implemented and likely only used by the
>>>> sole contributor of the feature.
>>>> We also have features used primarily by
>>>> active community members who understand them well enough. I don’t think this
>>>> is a bug in the system. The project obviously aims to serve end users, but
>>>> the developer community is the actual project and it is fine to serve that
>>>> demographic first, or only.
>>>>
>>>> I agree we want to improve our documentation, but this is not the right
>>>> way to go about it.
>>>>
>>>> On 1 May 2025, at 13:19, Miklosovic, Stefan via dev <
>>>> dev@cassandra.apache.org> wrote:
>>>>
>>>> I am not completely sure LLMs are the way to go here. Sure, to have
>>>> something
>>>> to further refine ... why not. But to just generate something via LLM
>>>> and commit that, that would be a no-no from me. These things can
>>>> hallucinate quite quickly, then what? Who is going to proof-read
>>>> technical stuff like that? Fixing the hallucinations might take more time
>>>> than just writing it from scratch.
>>>>
>>>> Anyway, I would really appreciate it if we stayed on track and discussed
>>>> the proposition mentioned in my first email - the end goal is to codify the
>>>> need to provide documentation together with the feature. If not provided
>>>> together, it might be in a separate ticket which will be a blocker for the
>>>> next release.
>>>>
>>>> I might initiate the voting thread for that ...
>>>>
>>>> Regards
>>>>
>>>> *From: *Rolo, Carlos <carlos.r...@netapp.com>
>>>> *Date: *Thursday, 1 May 2025 at 12:30
>>>> *To: *David Capwell <dcapw...@apple.com>, dev@cassandra.apache.org <
>>>> dev@cassandra.apache.org>
>>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>>> releasing them
>>>>
>>>> I am a bit out of the loop on how/if this would extend to driver
>>>> sub-projects.
>>>>
>>>> Because this makes 100% sense, and in the driver space as well.
>>>> Looking
>>>> into the Java driver docs and making the others similar would be great.
>>>>
>>>> Patrich that LLM suggestion might be a life saver, let me try that!
>>>> ------------------------------
>>>>
>>>> *From:* Miklosovic, Stefan via dev <dev@cassandra.apache.org>
>>>> *Sent:* 01 May 2025 08:07
>>>> *To:* David Capwell <dcapw...@apple.com>; dev@cassandra.apache.org <
>>>> dev@cassandra.apache.org>
>>>> *Cc:* Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>>> *Subject:* Re: [DISCUSS] Requirement to document features before
>>>> releasing them
>>>>
>>>> Denser is better. In your oversimplified example of Accord, as a user
>>>> who encounters this for the first time, I am definitely interested in what
>>>> the limitations are. What might happen quite easily is that if it is not
>>>> dense and we just announce it sparsely, then a user takes it all at face
>>>> value and if it starts to diverge from your proclamation then they might
>>>> feel like they were lied to or they start to be disappointed. You got
>>>> me? Users do not like surprises they discover themselves while
>>>> trying it out (and often painfully). They just want to know what
>>>> they are buying into.
>>>>
>>>> If there are super-cornercase details, those might be omitted as we have
>>>> other channels of communication (Slack, mailing list ...) but in
>>>> general I do not see how a lot of documentation would be bad.
>>>>
>>>> It also depends on who you are writing that documentation for. As said,
>>>> we talk about user-facing docs here. Documentation for developers, where
>>>> we are trying to bootstrap them / orient them in the code base, is
>>>> going to be substantially different from a user-facing one.
>>>>
>>>> *From: *David Capwell <dcapw...@apple.com>
>>>> *Date: *Wednesday, 30 April 2025 at 23:35
>>>> *To: *dev@cassandra.apache.org <dev@cassandra.apache.org>
>>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>>> releasing them
>>>>
>>>> I wonder at what level we can enforce this. What I mean, in modeling
>>>> testing I have found some odd behaviors that people were not aware of
>>>> (BATCH cell resolution, NULL handling (emptiness…), etc.)… so if
>>>> documentation is dense this can help force people to think through edge
>>>> cases or how 2 features interact with each other… If documentation is
>>>> sparse, then you lose this benefit…
>>>>
>>>> Simple example for Accord
>>>>
>>>> # Sparse
>>>>
>>>> Multiple key transaction support, bringing Apache Cassandra cluster to
>>>> the RDBMS world!
>>>>
>>>> # Dense
>>>>
>>>> …
>>>>
>>>> Here are the current limitations, …
>>>>
>>>> Here is where we alter Apache Cassandra’s behavior to be more in line
>>>> with SQL, ...
>>>>
>>>> On Apr 30, 2025, at 1:38 PM, Miklosovic, Stefan via dev <
>>>> dev@cassandra.apache.org> wrote:
>>>>
>>>> To extend the first e-mail to cover the practicalities:
>>>>
>>>> 1. changes introduced to nodetool would not be part of this because
>>>> they are self-documented (the help docs are autogenerated)
>>>> 2. introduction of changes into cassandra.yaml is already covered
>>>> as that is also autogenerated / on the website.
>>>> 3. Applying common sense, if it is just enough to mention in
>>>> NEWS.txt, that is also fine.
>>>> 4. metrics - I bet there are some which are not documented, we
>>>> should find a way to autogenerate them into the website.
>>>>
>>>> I am also to blame and, to show I am not a hypocrite: I have never
>>>> delivered in-depth user documentation of CEP-24 with examples, use cases,
>>>> and so on. I am trying to be more aware of the documentation when
>>>> delivering features, to raise awareness about that etc. It is easy to not
>>>> think about this too much when developers are in a rush and similar. If
>>>> there was a hard requirement for the documentation, I would do it right
>>>> away and I would not need to deal with this now.
>>>>
>>>> I understand that when delivering heavy-weights like CEP-15 we cannot
>>>> expect that all the docs will be done upon delivery but I want to stress
>>>> the fact that providing usable documentation should definitely be something
>>>> to think about when releasing it. Same goes for all other non-trivial
>>>> features.
>>>>
>>>> *From: *Josh McKenzie <jmcken...@apache.org>
>>>> *Date: *Wednesday, 30 April 2025 at 22:11
>>>> *To: *dev <dev@cassandra.apache.org>
>>>> *Cc: *Miklosovic, Stefan <stefan.mikloso...@netapp.com>
>>>> *Subject: *Re: [DISCUSS] Requirement to document features before
>>>> releasing them
>>>>
>>>> This makes intuitive sense to me.
>>>>
>>>> In our case we could tie documentation to the process of promoting a
>>>> feature from “experimental” to production ready, though I fear that might
>>>> leave wiggle room for primary authors of some features to leave them as
>>>> experimental forever, not desiring to take on the burden of documenting
>>>> something that’s already merged in and usable by experts.
>>>>
>>>> Curious what others think.
>>>>
>>>> On Wed, Apr 30, 2025, at 12:10 PM, Miklosovic, Stefan via dev wrote:
>>>>
>>>> I am at OpenSearchCon and there was a discussion about the
>>>> documentation of features.
>>>> In a nutshell, the policy they seem to have is
>>>> that there are some minimal requirements for documentation in place for
>>>> each feature introduced. That way, there is no way (or it is greatly
>>>> minimised) that there would be a feature released or some user-facing
>>>> change introduced without any documentation on how to use it.
>>>>
>>>> Under the "documentation", in our case, I mean the docs which would end
>>>> up in cassandra.apache.org
>>>> docs.
>>>>
>>>> In their case, the documentation is either part of the change or there
>>>> is a documentation issue (in GitHub terms) created which basically blocks
>>>> the release when not addressed.
>>>>
>>>> When there is no documentation about a feature or improvement, knob to
>>>> tweak etc, there is virtually nobody who knows about that except the
>>>> person who committed the code / people who participated in a review. I
>>>> think this is detrimental to the project. I do not see the point in
>>>> releasing something undocumented when the only people who know what is
>>>> going on are the ones who wrote it.
>>>>
>>>> If somebody argued that we have them in CHANGES.txt and NEWS.txt,
>>>> neither ends up on the website and I do not think they are appropriate
>>>> vehicles for user-facing documentation or for anything beyond a few
>>>> sentences.
>>>>
>>>> Could we introduce a policy which would require developers to introduce
>>>> at least minimal user-facing documentation (if applicable) before
>>>> delivering it / before releasing it and it would be part of the reviews?
>>>>
>>>> For now, while we also add documentation, I feel it is "the
>>>> best-effort" approach, it is not part of the official policy when
>>>> delivering it.
>>>>
>>>> As of now, I cannot see any information about documentation among the "For
>>>> Code Contributions" points:
>>>>
>>>> https://cwiki.apache.org/confluence/display/CASSANDRA/Cassandra+Project+Governance
>>>>
>>>> I am looking to add a new point there:
>>>>
>>>> Code must not be committed when user-facing functionality is not
>>>> documented and visible without code inspection.
>>>>
>>>> Regards
>>>>