Please can I have an invite to the Slack workspace for this email address? I'd like to take a look through some of the items for first-time contributors :-)
Thanks!

On Fri, 27 Oct 2023 at 18:10, Josh McKenzie <jmcken...@apache.org> wrote:

> In case you're keeping score on how frequently these are coming out: *please stop*. ;)
>
> Silver lining - looks like we have a lot to discuss this round! Last update was late July and we've been churning through the 5.0 freeze and stabilization phase.
>
> *[New Contributors Getting Started]*
> Check out https://the-asf.slack.com, channel #cassandra-dev. Reply directly to me on this email if you need an invite for your account, and reach out to the @cassandra_mentors alias in the channel if you need to get oriented.
>
> We have a list of curated "getting started" tickets you can find here, filtered to "ToDo" (i.e. not yet worked):
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2160&quickFilter=2162&quickFilter=2652
>
> *Helpful links:*
> - Getting Started with Development on C*: https://cassandra.apache.org/_/development/gettingstarted.html
> - Building and IDE integration (worktrees are your friend; msg me on slack if you need pointers): https://cassandra.apache.org/_/development/ide.html
> - Code Style: https://cassandra.apache.org/_/development/code_style.html
>
> *[Dev mailing list]*
> https://lists.apache.org/list?dev@cassandra.apache.org:dfr=2023-7-20%7Cdto=2023-10-27
>
> My last email of shame was 35 threads. Drumroll for this one... 91. *Yeesh*. Let me stick to highlights.
>
> Ekaterina pushed through dropping JDK8 support and adding JDK17 support... back in July. If you didn't know about it by now, consider yourself doubly notified. :) https://lists.apache.org/thread/9pwz3vtpf88fly27psc7yxvcv0lwbz8k I think I can speak on behalf of all of us when I say: *Thank You Ekaterina.*
>
> This came up recently on another thread about when to branch 5.1, but we discussed our freeze plans and exception rules for TCM and Accord here: https://lists.apache.org/thread/mzj3dq8b7mzf60k6mkby88b9n9ywmsgw.
> Mick was essentially looking for a similar waiver for Vector search since it was well abstracted, depended on SAI and external libs, and in general shouldn't be too big of a disruption to get into 5.0. General consensus at the time was "sure", and the work has since been completed. But here's the reminder and link for posterity (and in case you missed it).
>
> Jaydeep reached out about a potential short-term solution to detecting token-ownership mismatch while we don't yet have TCM; this seems more pressing now as we're looking at a 5.0 without yet having TCM in it. The dev ML thread is here: https://lists.apache.org/thread/4p0orhom42g36osnknqj3fqmqhvqml1g, and he created https://issues.apache.org/jira/browse/CASSANDRA-18758 dealing with the topic. There's a relatively modest (7 files, just over 300 lines) PR available here: https://github.com/apache/cassandra/pull/2595/files; I haven't looked into it, but it might be worth considering getting this into 5.0 since it looks like we're moving to cutting w/out TCM. Any thoughts?
>
> We had a pretty good discussion about automated repair scheduling, discussing whether it should live in the DB proper vs. in the sidecar, pros and cons, pressures, etc. Not sure if things moved beyond that; I know there's at least a few implementations out there that haven't yet made their way back to the ASF project proper. Thread: https://lists.apache.org/thread/glvmkwknf91rxc5l6w4d4m1kcvlr6mrv. My hope is we can avoid the gridlock we hit for a long time with the sidecar where there are multiple implementations with different tradeoffs and everyone's disincentivized from accepting a solution different from their own in-house one since it'd theoretically require re-tooling. Tough problem with no easy solutions, but would love to see this become a first class citizen in the ecosystem.
>
> Paulo brought up a discussion about moving to disk_access_mode = mmap_index_only on 5.0.
> Seemed to be a consensus there but I'm not sure we actually changed that in the 5.0 branch? Thread: https://lists.apache.org/thread/nhp6vftc4kc3dxskngxy5rpo1lp19drw. Just pulled on cassandra-5.0 and it looks like auto + hasLargeAddressSpace() == .mmap rather than .mmap_index_only.
>
> David Capwell worked on adding some retries to repair messages when they're failing to make the process more robust: https://lists.apache.org/thread/wxv6k6slljqcw73xcmpxj4kn5lz95jd1. Reception was positive enough that he went so far as to back-port it and also work on some for IR. Looks like he could use a reviewer here: https://issues.apache.org/jira/browse/CASSANDRA-18962 - and this is patch available.
>
> Mike Adamson reached out about adding / taking a dependency on jvector: https://lists.apache.org/thread/zkqg7mk9hp35zn0cf1tvywc2m3l63jrn. The general gist of it was "looks good, written by committer(s) / pmc members, permissively licensed. Go for it". Some discussion about copyright holders and whether that matters from an ASF perspective, and we've further had some good discussion about the application of generative AI tooling to not just code contributed to the ASF, but also in dependencies we bring into the project. If you're curious about more details, check out the Apache LEGAL-656 JIRA here: https://issues.apache.org/jira/browse/LEGAL-656. The TL;DR comment is from Roman here: https://issues.apache.org/jira/browse/LEGAL-656?focusedCommentId=17779813&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17779813
>
> Maxim Muzafarov keeps fighting the good fight of helping to clean up our codebase; he opened a thread about Cassandra's code style and source analysis here: https://lists.apache.org/thread/lr90ckt7scgs4tqjwd2t7928plngo5zl.
> We have a label for "code-polishing" that you can check that we're holding off on until after Accord and TCM merge so they don't take on a painful rebase burden mid-integration work (https://issues.apache.org/jira/issues/?jql=labels%20%3D%20code-polishing).
>
> Mick had some ideas around improving how we announce and handle having broken branches and merging to them: https://lists.apache.org/thread/n7zhzk4svdh1v3pswkrfwxw4o3g2f6xy. The gist of this: it's not great when a branch is straight up broken in ASF CI and then folks merge more code on top of that break; makes it harder to root out what's going on. We didn't _really_ get too far in closure on how we'd prevent this case in the future beyond "email the dev ML, post in #cassandra-dev slack, and... pray?". I'm in favor of a slack-bot that yells at us hourly if our builds are formally broken so we can't forget, with the assumption it _should_ be a pretty rare situation. If anyone else has input here that'd be helpful.
>
> Builds for 5.0 and trunk are now based on in-tree build scripts (found in .build). The scripts were moved from the cassandra-builds repo here: https://github.com/apache/cassandra-builds, where you can find build scripts used for other branches. Expect this to continue to evolve as we take some of the best learnings from circleci and other build systems and integrate them upstream.
>
> Claude discovered that our documentation for development dependencies is out of date: https://lists.apache.org/thread/91l7x7r0w7yycndslfc8kjs74s3jyqr2. Looks like Abe's working on an update there, but if anyone has opinions or cycles to help out this is high leverage work.
>
> Yifan Cai reached out about merging some changes for CQLSSTableWriter to 4.0 and up.
> Since this is offline tools only the general consensus was "go for it": https://lists.apache.org/thread/nwqdmqzoht2nyw9hg8o061vh6vk2oxd5
>
> Maxim could use a reviewer for allowing UPDATE on settings virtual tables (ML: https://lists.apache.org/thread/rsgtwdlg411d76kptkbxv292hnv1s1c5, original ML thread here: https://lists.apache.org/thread/8kywzv24n0dp07mhvch7hwhjypssoh0l, JIRA: https://issues.apache.org/jira/browse/CASSANDRA-15254). I have to imagine most users would prefer to use CQL to interact w/their node settings than JMX, though I assume most of us have some Stockholm Syndrome at this point.
>
> Amit Pawar reached out about how we're approaching our defaults for the CommitLog (mmap vs. the new DirectI/O they have a PR up for). The general consensus was "that looks and sounds great, and we shouldn't change defaults until it's had time to bake as an option". https://lists.apache.org/thread/t6v0p10737p0joob2vcsdt0r3g8zt94q
>
> *[CI]*
> https://butler.cassandra.apache.org/#/
>
> Since late July (~ 3 months):
>
> 3.0: 9 -> 18
> • Was hovering around 12 ish for a good while there
> 3.11: 16 -> 20
> • There's a lot more variance on this one. Curious why the delta from 3.0.
> 4.0: 24 -> 11
> • Looks like long-term trend is around the 8? mark
> 4.1: 12 -> 12
> • Pretty stable around 12 failures here
> 5.0: Averaging around 10
> • Do we have too many branches yet?
> trunk: 16 -> 12
> • One pretty big spike in there when CI was transitioning over, but on the whole in a pretty "tame" place.
>
> Low-grade noise on each of the branches. Spot-checking failures on 3.0, 4.0, and trunk, nothing really pops as being commonalities between them.
>
> *[What's been closed out]*
> Updated quick-filter with new, ridiculous 90 day duration: https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=484&quickFilter=2278
> JQL sorted by priority then type: https://issues.apache.org/jira/issues/?jql=project%20%3D%20CASSANDRA%20and%20resolution%20%3D%20fixed%20and%20resolved%20%3E%20-90d%20order%20by%20priority%20DESC%2C%20type%20DESC
>
> Due to the sheer volume of tickets (170 in the past 90 days!), I'll refrain from including them all in this email thread here. I should be considerably less "compressed for time" in the near future, so fingers crossed we can get back to a more digestible volume on these updates on a monthly cadence as we go into aggressive "release-mode".
>
> Being a part of an open-source community that's this mature, in a domain this complex, that's not only firing on all cylinders but going further and self-improving and accelerating is really gratifying and humbling for me. Thanks everyone for being a part of this.
>
> ~Josh