Thanks, Jeff, for the very comprehensive list of actions taken this year. Can't wait to get my hands on 4.0 once it's released.

On Fri, Dec 22, 2017 at 10:20 PM, Jeff Jirsa <jji...@gmail.com> wrote:

> Happy holidays all,
>
> I imagine most people are about to disappear to celebrate holidays, so I
> wanted to try to summarize the state of Cassandra dev for 2017, as I see
> it. Standard disclaimers apply (this is my personal opinion, not that of my
> employer, not officially endorsed by the Apache Cassandra PMC, or the ASF).
>
> Some quick stats about Cassandra development efforts in 2017 (using
> imperfect git log | awk/sed counting, only looking at trunk, buyer beware,
> it's probably off by a few):
>
> The first commit of 2017 was: Ben Manes, transforming the on-heap cache to
> Caffeine (
> https://github.com/apache/cassandra/commit/c607d76413be81a0e125c5780e068d7ab7594612
> )
> Alex Petrov removed the most code (~7500 lines, according to github)
> Benjamin Lerer added the most code (~8000 lines, according to github)
> We put to bed the tick/tock release cycle, but still cut 14 different
> releases across 5 different branches.
> We had a total of 136 different contributors, with 48 of those contributors
> contributing more than one patch during the year.
> We had a total of 47 different reviewers
> There were 661 non-merge commits to trunk
> There were 56 non-merge commits to docs/
> We end the year with roughly 173 pending changes for 4.0
> We resolved (either fixed or disqualified) 781 issues in JIRA
> I count something like 273 email threads to dev@, and 903 email threads to
> user@
> The project added Stefan Podkowinski, Joel Knighton, Ariel Weisberg, Alex
> Petrov, Blake Eggleston, and Philip Thompson as committers.
> The project added Josh McKenzie, Marcus Eriksson and Jon Haddad to the
> Apache Cassandra PMC
>
> At NGCC (which Eric and Gary managed to organize with the help of
> Instaclustr sponsoring, an achievement in itself), we had people talk
> about:
> - Two different talks (from Apple and FB/Instagram). I'm struggling to
> describe these in simple terms, they both sorta involve using hints and
> changing some of the consistency concepts to help deal with latency /
> durability / availability, especially in cross-DC workloads. Grouping these
> together isn't really fair, but no one-email summary is going to be fair to
> either of these talks. If you missed NGCC, I guess you get to wait for the
> JIRAs / patches.
> - A new storage engine (FB/Instagram) using RocksDB
> - Some notes on using CDC at scale (and some proposed changes to make it
> easier) from Uber (
> https://github.com/ngcc/ngcc2017/blob/master/CassandraDataIngestion.pdf )
> - Michael Shuler (Datastax / Cassandra PMC / release master / etc) spent
> some time talking about testing and CI.
>
> Some other big'ish development efforts worth mentioning (from personal
> memory, perhaps the worst possible way to create such a list):
> - We spent a fair amount of time talking about testing. Francois @
> Instagram led the way in codifying a new set of principles around testing
> and quality (
> https://lists.apache.org/thread.html/0854341ae3ab41ceed2ae8a03f2486cf2325e4fca6fd800bf4297dd4@%3Cdev.cassandra.apache.org%3E
> / https://issues.apache.org/jira/browse/CASSANDRA-13497 ).
> - We've also spent some time making tests work in CircleCI, which should
> make life much easier for occasional contributors - no need to figure out
> how to run tests in ASF Jenkins.
> - The internode messaging rewrite to use async/netty is probably the single
> largest that comes to mind. It went in earlier this year, and should make
> it easier to have HUGE clusters. All of you running thousand-instance
> clusters will probably benefit from this patch (I know you're out there,
> I've talked to you in IRC) - will be in 4.0 (
> https://issues.apache.org/jira/browse/CASSANDRA-8457 )
> - We have a company working on making Cassandra happy with proprietary
> flash storage and PPC64LE (IBM's recent patches,
> https://developer.ibm.com/linuxonpower/2017/03/31/using-capi-improve-performance-apache-cassandra-work-progress-update/
> )
> - We have a new commitlog mode added for the first time in quite some time
> - the GroupCommitLog will be in 4.0 (
> https://issues.apache.org/jira/browse/CASSANDRA-13530 )
> - Michael Kjellman spent some time porting dtests from nose to pytest, and
> from python 2.7 to python 3, removing dependencies on dead projects like
> pycassa and the old thrift-cql library. Still needs to be reviewed (
> https://issues.apache.org/jira/browse/CASSANDRA-14134 )
> - Robert Stupp spent some time porting to java9 - again, still needs to be
> reviewed ( https://issues.apache.org/jira/browse/CASSANDRA-9608 )
>
> Overall, the state of the project appears to be strong. We're seeing active
> contributions driven primarily by users (like you), the 8099/3.0 engine is
> looking pretty good here in December, and the code base is stabilizing
> towards a product all of us should be happy to run in production. Despite
> some irrationally skeptical sky-is-falling threads near the end of 2016, I
> feel confident in saying it was a pretty good year for Cassandra, and as
> the project continues to move forward, I'm looking forward to seeing 4.0
> launch in 2018 (hopefully with a real user conference!)
>
> - Jeff