I have spotted some deficiencies, particularly when reviewing large patches. I 
have an experiment running that might improve the situation. I’ll report as 
soon as I have a result.

On Thu, May 14, 2026, at 12:31 PM, Štefan Miklošovič wrote:
> I just merged (1) and created (2) for tracking Alex's patch. (1) and (2) 
> don't collide. 
> 
> It would be cool to include (2) in the upcoming weeks; let's just live with 
> what Alex provided for a while to evaluate that set of skills. If the general 
> vibe is OK, I would approach the merge. Let's give it, what ... a few weeks? 
> Until the end of the month at least.
> 
> (1) https://issues.apache.org/jira/browse/CASSANDRA-21301
> (2) https://issues.apache.org/jira/browse/CASSANDRA-21373
> 
> On Mon, May 11, 2026 at 3:21 PM Štefan Miklošovič <[email protected]> 
> wrote:
>> BTW I really appreciate the TLA+ machinery in that patch. I let it scan the 
>> compression dictionaries code and how we disperse notifications around the 
>> cluster when a dict is trained etc., and it spit out stuff like this (1). 
>> There is an IDEA plugin for TLA+; I ran the spec in it and it just worked 
>> and verified :) I can imagine these specs might theoretically be something 
>> we commit into the repo as well, when applicable. That way we would at least 
>> conceptually codify the protocols, could elaborate on them at a high level, 
>> and run some formal verification etc. ... Really appreciate this aspect of it.
>> 
>> (1) https://gist.github.com/smiklosovic/24b4db51f9ee2b64d76cb0bbb104e29a
>> 
>> On Mon, May 11, 2026 at 11:31 AM C. Scott Andreas <[email protected]> 
>> wrote:
>>> Alex - thanks so much for putting this together and sharing.
>>> 
>>> Here are three additional data loss / corruption bugs identified by Arjun 
>>> Ashok using this set of skills last week:
>>> 
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21356: 
>>> CursorBasedCompaction: ReusableLivenessInfo.isExpiring() incorrectly 
>>> returns true for tombstone cells, corrupting cursor-compacted SSTable 
>>> format and cell reconciliation
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21357: 
>>> CursorBasedCompaction: prevUnfilteredSize always written as 0 in 
>>> SSTableCursorWriter
>>> – https://issues.apache.org/jira/browse/CASSANDRA-21358: 
>>> CursorBasedCompaction: Final index block width off by one byte in 
>>> SSTableCursorWriter#appendBIGIndex()
>>> 
>>> Stepping back a bit --
>>> 
>>> This set of skills, combined with the Opus model, has enabled folks to find 
>>> 14 data loss, corruption, and correctness bugs in the project in the past 
>>> ~two weeks. These are bugs that likely would have gone undetected - and if 
>>> encountered in the wild, would have required extensive manual fuzz testing 
>>> to reproduce and identify.
>>> 
>>> In the case of the issue that I'd found and reported: 
>>> https://issues.apache.org/jira/browse/CASSANDRA-21340: GROUP BY queries 
>>> silently return incomplete results due to premature SRP abort
>>> 
>>> I found this by invoking the skill with the prompt "Review Cassandra's 
>>> implementation of GROUP BY for correctness. Identify edge cases that might 
>>> result in incorrect responses. After identifying candidate bugs, fan out 
>>> subagents to write unit tests and fuzz tests attempting to reproduce them. 
>>> Assess their veracity, and present them in order of concern."
>>> 
>>> In less than 30 minutes while sitting on the sofa, the model and skill 
>>> identified CASSANDRA-21340. In another hour, I was able to establish its 
>>> veracity, then set the model and prompt aside to work through the issue 
>>> and write up the Jira ticket by hand.
>>> 
>>> I'm *really* impressed by what this set of skills enables, and I think they 
>>> may be transformative for quality in Apache Cassandra – especially when 
>>> combined with the ability to write in-JVM dtests; Harry tests; and to use 
>>> the Simulator. These also make it a lot easier to use each of these tools.
>>> 
>>> Here's how I'm thinking about this work so far:
>>> 
>>> – The ensemble review skills are a great first-pass review that can be used 
>>> by anyone preparing a patch to identify potential issues.
>>> – They're incredible for pointing at existing and/or new + experimental 
>>> components in Cassandra to find serious correctness issues.
>>> – I'm sure we'd find latent issues if we directed the skills at interaction 
>>> between multiple components, like "range tombstones x short read protection 
>>> x reverse reads x compact storage" (etc).
>>> – I think these skills could be generalized to support bug-finding and 
>>> validation in other Apache projects.
>>> – I also think there is a generalization of these skills that could be 
>>> applied to CPU + allocation profiling and optimization.
>>> 
>>> For those who have access to a suitable model, I'd love to hear your 
>>> experience attempting to find a latent bug in the database.
>>> 
>>> I was shocked how easy it was, and am hopeful for what this might do for 
>>> quality and data integrity in the project.
>>> 
>>> – Scott
>>> 
>>>> On May 8, 2026, at 5:22 PM, Alex Petrov <[email protected]> wrote:
>>>> 
>>>> 
>>>> I would recommend Opus 4.6+ for /deep-review, but /shallow-review is 
>>>> probably fine with Sonnet.
>>>> 
>>>> Maybe time permitting, I can do evals for different models at some point.
>>>> 
>>>>> Review process is always a bottleneck and introducing such skills should 
>>>>> help to make it faster and more reliable. 
>>>> That is the hope here, but this is also just a start: we need to reduce 
>>>> false positives, and do more with specifications (P, TLA+) for critical 
>>>> parts of the code.
>>>> 
>>>> On Fri, May 8, 2026, at 5:56 PM, Dmitry Konstantinov wrote:
>>>>> Hi Alex, thank you a lot for sharing it. I have been using Claude Code 
>>>>> for reviews of my changes, but in a very basic ad-hoc way; it works for 
>>>>> simple issues. The skills look much, much more powerful. I am going to 
>>>>> read and try them in the upcoming weeks.
>>>>> Review process is always a bottleneck and introducing such skills should 
>>>>> help to make it faster and more reliable. 
>>>>> 
>>>>> A question: what model(s) do you use to run them? Is Sonnet 4.6 enough? 
>>>>> 
>>>>> Thanks,
>>>>> Dmitry
>>>>> 
>>>>> On Fri, 8 May 2026 at 14:03, Alex Petrov <[email protected]> wrote:
>>>>>> Hello folks,
>>>>>> 
>>>>>> We have been working on some tooling [1] around Apache Cassandra 
>>>>>> correctness, and wanted to share it with the Cassandra community. 
>>>>>> 
>>>>>> We approached this by "indexing" ~3k Cassandra issues, extracting common 
>>>>>> patterns from them and generalizing them, then running evals, tweaking, 
>>>>>> and extending them until we had a strong signal that they perform better 
>>>>>> than a run-of-the-mill code review skill. We have benchmarked them 
>>>>>> against some popular OSS skills (by presenting bugs we knew existed from 
>>>>>> "indexing" Apache Kafka, inferring the bug's source commit from the fix, 
>>>>>> and making sure the benchmarked skills actually find them).
>>>>>> 
>>>>>> In addition, I did my best to codify some things I knew about 
>>>>>> correctness, researching code, and writing repros, as well as what I 
>>>>>> could find in research papers and public blog posts. 
>>>>>> 
>>>>>> So far we have been able to find (at the very least) the following 
>>>>>> issues (in reality the number is higher, but my backlog of potential 
>>>>>> leads to investigate and reproduce is longer than the time I have 
>>>>>> available for these pursuits):
>>>>>>  • deep review + fuzzer:
>>>>>>    • CASSANDRA-21307 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21307>: Lower bound 
>>>>>> [SSTABLE_UPPER_BOUND(row000063)] is bigger than first returned value
>>>>>>    • CASSANDRA-21292 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21292>: Row re-inserted 
>>>>>> at the exact start of a range tombstone disappears after major compaction
>>>>>>    • CASSANDRA-21255 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21255>: Differentiate 
>>>>>> between legitimate cases where the first entry is the same as the last 
>>>>>> entry and empty bounds in SSTableCursorWriter#addIndexBlock()
>>>>>>  • shallow + deep review:
>>>>>>    • (latent) issue of unused keepFrom in linearSubtract 
>>>>>> https://github.com/apache/cassandra-accord/pull/272
>>>>>>    • CASSANDRA-21336 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21336>: 
>>>>>> CursorBasedCompaction: trailing present columns are silently dropped in 
>>>>>> encodeLargeColumnsSubset()
>>>>>>    • CASSANDRA-21340 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21340>: GROUP BY 
>>>>>> queries silently return incomplete results due to premature SRP abort
>>>>>>    • CASSANDRA-21352 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21352> TCM: 
>>>>>> AtomicLongBackedProcessor sort inversion
>>>>>>    • CASSANDRA-21353 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21353> putShortVolatile 
>>>>>> is not volatile in InMemoryTrie
>>>>>>  • Via specifications:
>>>>>>    • CASSANDRA-21337 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21337>: Difference in 
>>>>>> behavior between Cursor-Based compaction and "Regular" compaction
>>>>>>    • CASSANDRA-21336 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21336>: 
>>>>>> CursorBasedCompaction: trailing present columns are silently dropped in 
>>>>>> encodeLargeColumnsSubset()
>>>>>>    • CASSANDRA-21339 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21339>: 
>>>>>> CursorBasedCompaction: expiring cells, same timestamp, same ldt, 
>>>>>> different ttl
>>>>>>    • CASSANDRA-21338 
>>>>>> <https://issues.apache.org/jira/browse/CASSANDRA-21338>: value 
>>>>>> comparison direction reversed in CursorCompactor
>>>>>> A few folks were using this skill to test some subsystems, and might 
>>>>>> report more issues that I am not directly attributing here. I have also 
>>>>>> used these skills for self-review and have caught a couple of issues 
>>>>>> before they made it into the codebase.
>>>>>> 
>>>>>> Despite some early success, I still consider this a very raw set of 
>>>>>> prompts. But I think it has utility and, based on the success we have 
>>>>>> seen so far, it can be helpful and is (according to my measurement 
>>>>>> methodology) faring better than one-shot code review prompts that an 
>>>>>> LLM would generate on user request.
>>>>>> 
>>>>>> Since I was focusing on finding issues, running evals, and trying 
>>>>>> several other methodologies that did not make it into this version/cut, 
>>>>>> I did not have a chance to sit and re-read the entire final result just 
>>>>>> yet, which is why I am not suggesting merging this into the Cassandra 
>>>>>> codebase until we vet it better. But with your help and feedback maybe 
>>>>>> we can do this quicker. 
>>>>>> 
>>>>>> Hope you find this useful; please share your opinion, experience, and 
>>>>>> criticism.
>>>>>> 
>>>>>> Happy bug hunting!
>>>>>> --Alex
>>>>>> 
>>>>>> [1] https://github.com/apache/cassandra/pull/4794
>>>>>> 
>>>>>> 
>>>>>> On Mon, Apr 13, 2026, at 1:12 PM, Štefan Miklošovič wrote:
>>>>>>> I noticed this PR just landed. 
>>>>>>> 
>>>>>>> Volunteers reviewing / improving greatly appreciated!
>>>>>>> 
>>>>>>> (1) https://github.com/apache/cassandra/pull/4734
>>>>>>> 
>>>>>>> On Thu, Feb 26, 2026 at 5:43 PM Jon Haddad <[email protected]> 
>>>>>>> wrote:
>>>>>>>> I wanted to share a couple of other things I thought of.  I wrote this:
>>>>>>>> 
>>>>>>>> > C*'s technical debt will make using an agent in the codebase much 
>>>>>>>> > harder than using one in my own
>>>>>>>> 
>>>>>>>> I want to clarify my intent with this statement.  I was trying to 
>>>>>>>> convey that I've had the luxury of refactoring my code several times, 
>>>>>>>> because I don't have to worry about messing with other people's 
>>>>>>>> branches.  I usually write something, use it briefly, find its faults, 
>>>>>>>> redo it, and iterate several times.  I never consider anything done 
>>>>>>>> and am always looking to improve. This is very difficult with a 
>>>>>>>> project involving many people who have in-flight branches spanning 
>>>>>>>> several months.  Changes I consider no-brainers might be a headache 
>>>>>>>> for C*.  For example, I can just add a code formatter and rewrite 
>>>>>>>> every file in the codebase.  I make major changes regularly without 
>>>>>>>> any consequences. Here, it impacts dozens of people.  I proactively 
>>>>>>>> improve my code's architecture because there are few, if any, reasons 
>>>>>>>> not to.  It's enabled me to pay off a ton of technical debt 
>>>>>>>> that accumulated over the eight years I handwrote everything.
>>>>>>>> 
>>>>>>>> Another example: I've been working on an orchestration tool around 
>>>>>>>> easy-db-lab to automate running my tests across several clusters in 
>>>>>>>> parallel.  I recently refactored it to split the REST server code from 
>>>>>>>> the execution into Gradle submodules.  Now I can create different 
>>>>>>>> agents specializing in each module's content, which slims down the 
>>>>>>>> context for each agent.  Since I have a very clear boundary on each 
>>>>>>>> agent's responsibility, I avoid the overhead of having one agent 
>>>>>>>> manage one huge codebase.  I can specifically tell that one agent is 
>>>>>>>> responsible for this directory, and its expertise is in Ktor.  Another 
>>>>>>>> agent is a Gradle expert.  Another is Kubernetes.  When I work on 
>>>>>>>> tasks they can be decomposed into task lists for each specialized 
>>>>>>>> agent.
>>>>>>>> 
>>>>>>>> I've always thought this would be a great architectural improvement 
>>>>>>>> for the C* codebase regardless of LLMs. For example, putting the CQL 
>>>>>>>> parser in a standalone module would allow us to publish it so people 
>>>>>>>> could consume it in their own ecosystem without pulling in C*-all.  
>>>>>>>> Isolating a few of these subsystems could reduce cognitive overhead 
>>>>>>>> and simplify test design.  I'm sure making the commit log reader 
>>>>>>>> standalone would make it much easier to use in the sidecar. Easily 
>>>>>>>> using the SSTable readers and writers without all the other 
>>>>>>>> dependencies would reduce workarounds in bulk analytics and make these 
>>>>>>>> types of projects more feasible, benefiting the wider ecosystem.
>>>>>>>> 
>>>>>>>> Regardless of this approach, creating a devcontainer environment for 
>>>>>>>> the project and pushing the image to GHCR would also be beneficial.  I 
>>>>>>>> am now using one with each of my tools.  I don't trust Claude not to 
>>>>>>>> wipe my system, so I sandbox it in a container. It only has access to 
>>>>>>>> the local project and cannot push code or reach GitHub.  Devcontainers 
>>>>>>>> are supported directly in IDEA, Zed, and VSCode.  You can also launch 
>>>>>>>> them directly from GitHub or use the Claude mobile app.  I haven't 
>>>>>>>> spent much time on this yet, though; I still prefer two big 5K screens 
>>>>>>>> and a deafening mechanical keyboard.
>>>>>>>> 
>>>>>>>> Jon
>>>>>>>> 
>>>>>>>> [1] 
>>>>>>>> https://github.com/rustyrazorblade/easy-db-lab/blob/main/.devcontainer/devcontainer.json
>>>>>>>> [2] 
>>>>>>>> https://github.com/rustyrazorblade/easy-db-lab/blob/main/.devcontainer/Dockerfile
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> On Thu, Feb 26, 2026 at 12:58 AM Štefan Miklošovič 
>>>>>>>> <[email protected]> wrote:
>>>>>>>>> Thank you Jon for sharing; that was very helpful. All these insights 
>>>>>>>>> are invaluable.
>>>>>>>>> 
>>>>>>>>> On Wed, Feb 25, 2026 at 11:50 PM Jon Haddad 
>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>> Regarding ant, we'd probably want a wrapper shell script that is 
>>>>>>>>>> more LLM-friendly, hiding the excessive text and providing more 
>>>>>>>>>> actionable output.  You can also delegate any task to a subagent so 
>>>>>>>>>> you don't waste your context on the `ant` output, and use Claude's 
>>>>>>>>>> new Agent Teams [1] feature to have a "builder" agent run in its own 
>>>>>>>>>> process.  
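>>>>>>>>>> To make that concrete, here is a minimal sketch of such a wrapper. 
>>>>>>>>>> The name "ai-build" and the exact filter patterns are illustrative 
>>>>>>>>>> assumptions, not an existing script in the repo:

```shell
#!/usr/bin/env bash
# ai-build (hypothetical): run the build but surface only actionable
# lines, so an agent does not burn context on ant's progress chatter.
set -o pipefail

# Keep compiler errors/warnings and the final build status; drop the rest.
filter_actionable() {
  grep -E '\[javac\].*(error|warning)|BUILD (FAILED|SUCCESSFUL)'
}

# In real use this would be: ant jar 2>&1 | filter_actionable
# Here a canned transcript stands in for ant's output:
sample_output='Buildfile: build.xml
    [javac] Compiling 4000 source files
    [javac] /src/Foo.java:42: error: cannot find symbol
BUILD FAILED'

printf '%s\n' "$sample_output" | filter_actionable
```

>>>>>>>>>> A real wrapper would also want to cap output length (e.g. tail the 
>>>>>>>>>> last 50 matching lines) and preserve ant's exit code for the agent.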
>>>>>>>>>> Docs help Claude find code, big time.  You can give it your 
>>>>>>>>>> organizational structure and that institutional knowledge so it 
>>>>>>>>>> doesn't have to pull in many tokens from dozens of files.  It 
>>>>>>>>>> *definitely* works.  I've pushed over a quarter million LOC this 
>>>>>>>>>> month alone [2], and many of you may already know I'm obsessed with 
>>>>>>>>>> efficiency.  I constantly test new ideas and approaches to refine my 
>>>>>>>>>> process; I've found good documentation is *critical*.
>>>>>>>>>> 
>>>>>>>>>> I've recently started working with both Spec-Kit (Microsoft, but it 
>>>>>>>>>> looks abandoned) and OpenSpec, as both are designed to maintain 
>>>>>>>>>> long-term memory for a project's product requirements and technical 
>>>>>>>>>> decisions.  OpenSpec is supposed to work better for brownfield and 
>>>>>>>>>> iterative projects.  I haven't tried BMAD yet.  It seemed a bit more 
>>>>>>>>>> heavyweight, but it may be better for this project than my personal 
>>>>>>>>>> ones, where I don't collaborate with anyone.
>>>>>>>>>> 
>>>>>>>>>> I have found that the best results come from loosely coupled 
>>>>>>>>>> systems.  C*'s technical debt will make using an agent in the 
>>>>>>>>>> codebase much harder than using one in my own.  I haven't tried to 
>>>>>>>>>> work on a patch in C* yet with an agent, but when I do I'll be sure 
>>>>>>>>>> to share what I've learned.
>>>>>>>>>> 
>>>>>>>>>> Today I introduced OpenSpec to easy-db-lab; you can see what it 
>>>>>>>>>> looks like [3] if you're curious.  A number of markdown commands 
>>>>>>>>>> were added to the repo, and Spec-Kit was removed.  I haven't 
>>>>>>>>>> reviewed it yet.  By the time you read this I will have likely made 
>>>>>>>>>> some changes in a review. If you want to see the before and after, 
>>>>>>>>>> the pre-review commit is c6a94e1. 
>>>>>>>>>> 
>>>>>>>>>> Jon
>>>>>>>>>> 
>>>>>>>>>> [1] https://code.claude.com/docs/en/agent-teams
>>>>>>>>>> [2] my 2 main projects, not including client work:
>>>>>>>>>> git log --since="$(date +%Y-%m-01)" --numstat --pretty=tformat: | awk 'NF==3 {added+=$1; removed+=$2} END {print "Added:", added, "Removed:", removed}'
>>>>>>>>>> Added: 90339 Removed: 45222
>>>>>>>>>> 
>>>>>>>>>> git log --since="$(date +%Y-%m-01)" --numstat --pretty=tformat: | awk 'NF==3 {added+=$1; removed+=$2} END {print "Added:", added, "Removed:", removed}'
>>>>>>>>>> Added: 124863 Removed: 52923
>>>>>>>>>> 
>>>>>>>>>> 
>>>>>>>>>> [3] https://github.com/rustyrazorblade/easy-db-lab/pull/530/changes
>>>>>>>>>> 
>>>>>>>>>> On Wed, Feb 25, 2026 at 6:18 AM David Capwell <[email protected]> 
>>>>>>>>>> wrote:
>>>>>>>>>>> I’m not against memory / skills being added, but do want to request 
>>>>>>>>>>> we think / test to make sure we can quantify the gains
>>>>>>>>>>> 
>>>>>>>>>>> Evaluating AGENTS.md: Are Repository-Level Context Files Helpful 
>>>>>>>>>>> for Coding Agents? <https://arxiv.org/abs/2602.11988>
>>>>>>>>>>> 
>>>>>>>>>>> SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse 
>>>>>>>>>>> Tasks <https://arxiv.org/abs/2602.12670>
>>>>>>>>>>> 
>>>>>>>>>>> 
>>>>>>>>>>> These papers actually match my lived experience with this project 
>>>>>>>>>>> and others.  
>>>>>>>>>>> 
>>>>>>>>>>> 1) Using /init to create CLAUDE.md / AGENTS.md yields negative 
>>>>>>>>>>> results.  This is how I started and I have moved away from it.  
>>>>>>>>>>> What is the context you need 100% of the time? It’s things that 
>>>>>>>>>>> Claude can’t discover easily, such as tribal knowledge (like a link 
>>>>>>>>>>> to our style guide).  
>>>>>>>>>>> 2) Ant is horrible for agents, not for figuring out what to do 
>>>>>>>>>>> (Claude is good at that) but for context bloat… run “ant jar” and 
>>>>>>>>>>> you add like 10-20k tokens… you MUST have tooling to fix this (I 
>>>>>>>>>>> ban Claude from touching the ant command; it’s only allowed to run 
>>>>>>>>>>> “ai-build” and “ai-ci-test”, as these fix the context problems; rtk 
>>>>>>>>>>> “might” work here, not tested as I'm on leave).
>>>>>>>>>>> 3) Claude doesn’t need docs to find code, that actually confuses it 
>>>>>>>>>>> more.  When it needs to modify code it’s going to have to explore 
>>>>>>>>>>> and will most likely find what it needs.  I agree docs for humans 
>>>>>>>>>>> would help, but let’s keep it out of AI memory files.
>>>>>>>>>>> 4) I only really use sonnet/opus 4.5+, these claims might not be 
>>>>>>>>>>> true for older models or the open weight models.
>>>>>>>>>>> 
>>>>>>>>>>> As for skills, the following make sense to me, but I really hope a 
>>>>>>>>>>> human writes them, as AI doesn’t understand the WHY well and makes 
>>>>>>>>>>> bad assumptions: property testing, stateful property testing, 
>>>>>>>>>>> Harry, the Simulator.  I left out cqltester because I found Claude 
>>>>>>>>>>> doesn’t suck at it, so I'm not sure what a skill would add. The 
>>>>>>>>>>> others I found it struggles with and produces bad-quality tests.
>>>>>>>>>>> 
>>>>>>>>>>> Last comment: Stefan, your link about AI code in the project didn’t 
>>>>>>>>>>> take into account what happened in the PR.  Our global static state 
>>>>>>>>>>> world caused a single test to fail which required a complete 
>>>>>>>>>>> rewrite of the patch that I ended up doing by hand.  So that patch 
>>>>>>>>>>> ended up being 100% human.
>>>>>>>>>>> 
>>>>>>>>>>> Sent from my iPhone
>>>>>>>>>>> 
>>>>>>>>>>>> On Feb 18, 2026, at 6:29 PM, Štefan Miklošovič 
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> These are great points. I like how granular the approach of having
>>>>>>>>>>>> multiple files is. That means we do not need to craft one
>>>>>>>>>>>> "uber-claude.md" but we can do this iteratively and per specific
>>>>>>>>>>>> domain which is easier to handle.
>>>>>>>>>>>> 
>>>>>>>>>>>> One consequence of having these "context files" is that a 
>>>>>>>>>>>> contributor
>>>>>>>>>>>> does not even need to use any AI whatsoever in order to be more
>>>>>>>>>>>> productive and organized. There is a lot of time lost when a new
>>>>>>>>>>>> contributor wants to understand how the project "thinks", what the
>>>>>>>>>>>> do's and don'ts are, etc. All stuff which appears once a patch is
>>>>>>>>>>>> submitted. If we explained to everybody in plain English how this 
>>>>>>>>>>>> all
>>>>>>>>>>>> works on a detailed level, per domain, that would be tremendously
>>>>>>>>>>>> helpful even without AI.
>>>>>>>>>>>> 
>>>>>>>>>>>> It will be interesting to watch how these files are written. To
>>>>>>>>>>>> formalize and write it down is quite a task on its own.
>>>>>>>>>>>> 
>>>>>>>>>>>> 
>>>>>>>>>>>> On Wed, Feb 18, 2026 at 6:47 PM Patrick McFadin 
>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> 
>>>>>>>>>>>>> Context size is the hardest thing to manage right now in agentic 
>>>>>>>>>>>>> coding. I’ve stopped using MCP and switched to skills as a result.
>>>>>>>>>>>>> A couple of things worth noting. You can use multiple 
>>>>>>>>>>>>> CLAUDE.md/AGENTS.md files in a large code base. I’ve started 
>>>>>>>>>>>>> doing this and it is remarkable. For example, in the pylib 
>>>>>>>>>>>>> directory a CLAUDE.md file would provide the Python-specific info 
>>>>>>>>>>>>> if making changes. The standard layout for each should be:
>>>>>>>>>>>>> - What is this
>>>>>>>>>>>>> - Where do I get more information
>>>>>>>>>>>>> - How do I run or test
>>>>>>>>>>>>> - What are the non-negotiable rules
>>>>>>>>>>>>> - What does done look like
>>>>>>>>>>>>> Imagine one in all sorts of places: fqltool, sstableloader, 
>>>>>>>>>>>>> o.a.c.io.*, o.a.c.repair.*, etc. And they can evolve over time 
>>>>>>>>>>>>> as people use them.
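>>>>>>>>>>>>> As a sketch of what one of those per-directory files might look 
>>>>>>>>>>>>> like, here is a hypothetical generator (say, saved as 
>>>>>>>>>>>>> make-claude-md.sh); the pylib answers below are placeholders for 
>>>>>>>>>>>>> illustration, not real project guidance:

```shell
#!/usr/bin/env bash
# Generate a skeleton per-directory CLAUDE.md following the five-part
# layout described above. The example answers are placeholders, not
# actual Cassandra policy.
target_dir="${1:-$(mktemp -d)}"

cat > "$target_dir/CLAUDE.md" <<'EOF'
# pylib (illustrative example)

## What is this
Python tooling for cqlsh and its tests.

## Where do I get more information
See the README in this directory and the cqlsh user docs.

## How do I run or test
Run the cqlsh test script from the repository root.

## What are the non-negotiable rules
Match the existing code style; do not add new dependencies.

## What does done look like
Tests pass and any user-facing change is documented.
EOF

echo "$target_dir/CLAUDE.md"
```

>>>>>>>>>>>>> Something like `./make-claude-md.sh pylib` would drop the 
>>>>>>>>>>>>> skeleton into place, ready to be filled in with real answers for 
>>>>>>>>>>>>> that directory.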
>>>>>>>>>>>>> The other thing to bring up is Brokk, built by Jonathan Ellis. He 
>>>>>>>>>>>>> specifically built it for large code bases and tests it 
>>>>>>>>>>>>> specifically on the Cassandra code base. (I’ll let him jump in here)
>>>>>>>>>>>>> Patrick
>>>>>>>>>>>>> On Feb 18, 2026, at 8:51 AM, Josh McKenzie <[email protected]> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> I’ve had trouble using Claude effectively on C*’s large codebase 
>>>>>>>>>>>>> without a lot of repeated “repo discovery” prompting.
>>>>>>>>>>>>> Just to keep beating the drum: I've had trouble working in our 
>>>>>>>>>>>>> codebase effectively without a lot of repeated "repo discovery" 
>>>>>>>>>>>>> time. In fact, a huge portion of the time I spend working on the 
>>>>>>>>>>>>> codebase consists of reading into adjacent coupled classes and 
>>>>>>>>>>>>> modules since things are a) not consistently or thoroughly 
>>>>>>>>>>>>> documented, and b) generally not that decoupled.
>>>>>>>>>>>>> This is also / primarily a "human <-> information interfacing 
>>>>>>>>>>>>> efficiency problem", and it just so happens that LLMs and agents 
>>>>>>>>>>>>> being blocked from working on our codebase is giving us an 
>>>>>>>>>>>>> immediate short-term pain proxy for something I strongly believe 
>>>>>>>>>>>>> has been a long-term tax on us.
>>>>>>>>>>>>> On Wed, Feb 18, 2026, at 10:04 AM, Isaac Reath wrote:
>>>>>>>>>>>>> I'm a +1 for the same reason that Josh lays out. Markdown files 
>>>>>>>>>>>>> that detail the structure of the repo, how to build & run tests, 
>>>>>>>>>>>>> how to get checkstyle to pass, etc. are all very valuable to new 
>>>>>>>>>>>>> contributors even if LLMs went away today.
>>>>>>>>>>>>> On Tue, Feb 17, 2026 at 7:33 PM Jon Haddad 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> It's all part of the same topic, Yifan.  You're making a 
>>>>>>>>>>>>> distinction without a difference. We could just as easily be 
>>>>>>>>>>>>> discussing supporting certain MCP servers like serena, or baking 
>>>>>>>>>>>>> claude into a devcontainer.  It's all relevant. There's no need 
>>>>>>>>>>>>> to police the discussion.
>>>>>>>>>>>>> On Tue, Feb 17, 2026 at 4:25 PM Yifan Cai <[email protected]> 
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>> The original post was about adding AI tooling, prompts, commands, 
>>>>>>>>>>>>> or skills. The thread has shifted to AI memory files.
>>>>>>>>>>>>> I do not have an objection to any of these, but want to make sure 
>>>>>>>>>>>>> that we are still on the original topic.
>>>>>>>>>>>>> IMO, AI tooling has a clear scope / definition and is easier to 
>>>>>>>>>>>>> reach consensus on. Meanwhile, AI memory files are hard to 
>>>>>>>>>>>>> define clearly. Different developers in different domains could 
>>>>>>>>>>>>> have quite different preferences.
>>>>>>>>>>>>> - Yifan
>>>>>>>>>>>>> On Tue, Feb 17, 2026 at 3:37 PM Dmitry Konstantinov 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> I do not have my own, but here are a few examples from other 
>>>>>>>>>>>>> Apache projects:
>>>>>>>>>>>>> https://github.com/apache/camel/blob/main/AGENTS.md
>>>>>>>>>>>>> https://github.com/apache/ignite-3/blob/main/CLAUDE.md
>>>>>>>>>>>>> https://github.com/apache/superset/blob/master/superset/mcp_service/CLAUDE.md
>>>>>>>>>>>>> On Tue, 17 Feb 2026 at 23:22, Jon Haddad 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> I think a few folks are already using CLAUDE.md files in their 
>>>>>>>>>>>>> repos and they're just not committing them. Anyone want to share 
>>>>>>>>>>>>> what's already done?  I'm happy to help share what I know about 
>>>>>>>>>>>>> the agentic side of things, but since I don't do much in the way 
>>>>>>>>>>>>> of patching C*, it would be a lot of guessing.
>>>>>>>>>>>>> If I'm wrong and nobody shares one, I'll take a stab at it.
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> On Tue, Feb 17, 2026 at 3:08 PM Štefan Miklošovič 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>> Great feedback everybody! Really appreciate it!
>>>>>>>>>>>>> Reading what Jon posted ... Jon, I think you are the most 
>>>>>>>>>>>>> experienced in this based on what you wrote. Would you mind doing 
>>>>>>>>>>>>> some POC here for the Cassandra repo? For the trunk it is enough 
>>>>>>>>>>>>> ... Something we might build further on. I think we need to build 
>>>>>>>>>>>>> the foundations of that and put some structure into it, and all 
>>>>>>>>>>>>> things considered I think you are best for the job here.
>>>>>>>>>>>>> If the basics are there we can play with it more before merging; 
>>>>>>>>>>>>> this is not something which needs to be done "tomorrow", we can 
>>>>>>>>>>>>> collaborate on something together for some time and add things 
>>>>>>>>>>>>> into it as patches come. I think it takes some time to "tune" it.
>>>>>>>>>>>>> Everybody else, feel free to help! My experience in this space is 
>>>>>>>>>>>>> limited; I think there are people who are using it more often 
>>>>>>>>>>>>> than me for sure.
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> On Wed, Feb 18, 2026 at 12:59 AM Joel Shepherd 
>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>> There's been some momentum building for AGENTS.md files, both on 
>>>>>>>>>>>>>> the project and on the agent side:
>>>>>>>>>>>>>>     https://agents.md/
>>>>>>>>>>>>>> Same idea and benefits, but it might help to align folks on a 
>>>>>>>>>>>>>> "standard" that will work well across agents.
>>>>>>>>>>>>>> I also think that more and better code documentation can be very 
>>>>>>>>>>>>>> beneficial when using agents to help with working out 
>>>>>>>>>>>>>> implementation details. I spent a bunch of time in January 
>>>>>>>>>>>>>> writing an introduction to Apache Ratis (Raft as a library: 
>>>>>>>>>>>>>> https://github.com/apache/ratis/blob/master/ratis-docs/src/site/markdown/index.md). 
>>>>>>>>>>>>>> The code itself is pretty well-documented, but it was hard for me 
>>>>>>>>>>>>>> to build a mental model of how to integrate with it. AI was very 
>>>>>>>>>>>>>> effective in taking the granular in-code documentation and 
>>>>>>>>>>>>>> synthesizing an overview from it. Going the other way, the 
>>>>>>>>>>>>>> in-code documentation has made it possible for me to deep dive 
>>>>>>>>>>>>>> the Ratis code to root cause bugs, etc. Agents can get a lot out 
>>>>>>>>>>>>>> of good class- and method-level documentation.
>>>>>>>>>>>>>> -- Joel.
>>>>>>>>>>>>>> On 2/16/2026 8:03 PM, Bernardo Botella wrote:
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> Thanks for bringing this up Stefan!!
>>>>>>>>>>>>>>> A really interesting topic indeed.
>>>>>>>>>>>>>>> 
>>>>>>>>>>>>>>> I’ve also heard ideas around even having CLAUDE.md-type files 
>>>>>>>>>>>>>>> that help LLMs understand the code base without having to do a 
>>>>>>>>>>>>>>> full scan every time.
>>>>>>>>>>>>>>> So, all in all, putting together something that we as a 
>>>>>>>>>>>>>>> community think describes good practices + repository 
>>>>>>>>>>>>>>> information, not only for the main Cassandra repository but 
>>>>>>>>>>>>>>> also for its subprojects, will definitely help contributors 
>>>>>>>>>>>>>>> adhere to standards, and help us reviewers ensure that at least 
>>>>>>>>>>>>>>> some steps will have been considered.
>>>>>>>>>>>>>>> Things like:
>>>>>>>>>>>>>>> - Repository structure: what every folder is
>>>>>>>>>>>>>>> - Test suites and how they work and run
>>>>>>>>>>>>>>> - Git commit standards
>>>>>>>>>>>>>>> - Specific project lint rules (like braces on new lines!)
>>>>>>>>>>>>>>> - Preferred wording style for patches/documentation
>>>>>>>>>>>>>>> Committed to the projects, and accessible to LLMs, this sounds 
>>>>>>>>>>>>>>> like really useful context for those types of contributions 
>>>>>>>>>>>>>>> (which are going to keep happening regardless).
>>>>>>>>>>>>>>> So curious to read what others think.
>>>>>>>>>>>>>>> Bernardo
>>>>>>>>>>>>>>> PS. Totally agree that this should change nothing about the 
>>>>>>>>>>>>>>> quality bar for code reviews and merged code.
>>>>>>>>>>>>>>>> On Feb 16, 2026, at 6:27 PM, Štefan Miklošovič 
>>>>>>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>>>>>> Hey,
>>>>>>>>>>>>>>>> This happened recently in kernel space. (1), (2).
>>>>>>>>>>>>>>>> What that is doing, as I understand it, is that you can point 
>>>>>>>>>>>>>>>> an LLM to these resources and then it would be more capable 
>>>>>>>>>>>>>>>> when reviewing patches or even writing them. It is kind of a 
>>>>>>>>>>>>>>>> guide / context provided to the AI prompt.
>>>>>>>>>>>>>>>> I can imagine we would just compile something similar, merge 
>>>>>>>>>>>>>>>> it to the repo, and then if somebody is prompting it they 
>>>>>>>>>>>>>>>> would have an easier job etc., be less error prone ... adhere 
>>>>>>>>>>>>>>>> to the code style etc. ...
>>>>>>>>>>>>>>>> This might look like a controversial topic but I think we need 
>>>>>>>>>>>>>>>> to discuss this. The usage of AI is just more and more 
>>>>>>>>>>>>>>>> frequent. From Cassandra's perspective there is just this (3), 
>>>>>>>>>>>>>>>> but I do not think we reached any conclusions there (please 
>>>>>>>>>>>>>>>> correct me if I am wrong about where we are at with 
>>>>>>>>>>>>>>>> AI-generated patches).
>>>>>>>>>>>>>>>> This is becoming an elephant in the room; I am noticing that 
>>>>>>>>>>>>>>>> some patches for Cassandra were prompted by AI completely. I 
>>>>>>>>>>>>>>>> think it would be way better if we make it easy for everybody 
>>>>>>>>>>>>>>>> contributing like that.
>>>>>>>>>>>>>>>> This does not mean that we, as committers, would believe what 
>>>>>>>>>>>>>>>> AI generated blindly. Not at all. It would still need to go 
>>>>>>>>>>>>>>>> over the formal review as anything else. But acting like this 
>>>>>>>>>>>>>>>> is not happening, and that people are just not going to use AI 
>>>>>>>>>>>>>>>> when trying to contribute, is not right. We should embrace it 
>>>>>>>>>>>>>>>> in some form ...
>>>>>>>>>>>>>>>> 1) https://github.com/masoncl/review-prompts
>>>>>>>>>>>>>>>> 2) https://lore.kernel.org/lkml/[email protected]/
>>>>>>>>>>>>>>>> 3) https://lists.apache.org/thread/j90jn83oz9gy88g08yzv3rgyy0vdqrv7
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>>>>>>>>> --Dmitry Konstantinov
>>>>>>>>>>>>> 
>>>>>>>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Dmitry Konstantinov
>>>> 
>>> 
