Last Thursday, DataStax put together a meeting of the active Cassandra committers in San Mateo. Dave Brosius was unable to make it to the West Coast, but Brandon, Eric, Gary, Jason, Pavel, Sylvain, Vijay, Yuki, and I attended in person, with Aleksey and Jake joining part of the time over Google Hangout.
We started by asking each committer to outline his top 3 priorities for 2.0. There was pretty broad consensus around the following big items, which I will break out into separate threads:

* Streaming and repair
* Counters

There was also a lot of consensus that we'll be able to ship some form of Triggers [1] in 2.0. Gary's suggestion was to focus on nailing down the functionality first, then worry about the classloader voodoo needed to allow live reloading. There was also general agreement that we need to split jar loading from trigger definition, to allow a single trigger to be applied to multiple tables. (A sketch of the trigger interface follows at the end of this message.)

There was less consensus around CAS [2], primarily because of implementation difficulties. (I've since read up some more on Paxos and Spinnaker and posted my thoughts to the ticket; a sketch of the intended semantics is below.)

Other subjects discussed:

* A single Cassandra process does not scale well beyond 12 physical cores, and further research is needed to understand why. One possibility is GC overhead; Vijay is going to test Azul's Zing VM to confirm or refute this.
* Server-side aggregation functions [3]. These would remove the need to pull a lot of data over the wire to a client unnecessarily. There was some unease about moving beyond the relatively simple queries we've traditionally supported, but I think there was general agreement that this can be addressed by fencing aggregation to a single partition unless explicitly allowed otherwise, a la ALLOW FILTERING [4]. (Sketch below.)
* Extending cross-datacenter forwarding [5] to a "star" model. That is, with three or more datacenters, instead of the original coordinator in DC A sending to replicas in both DC B and DC C, A would forward to B, which would forward on to C. Thus the cross-DC bandwidth required of any one DC stays constant as more datacenters are added. (Back-of-envelope below.)
* Vnode improvements, such as a vnode-aware replication strategy [6].
* Cluster merging and splitting -- if I have multiple applications sharing a single Cassandra cluster, and one gets a lot more traffic than the others, I may want to split it out into its own cluster. I think there was a concrete proposal for how this could work, but someone else will have to fill that in because I didn't write it down.
* Auto-paging of SELECT queries for CQL [7] -- put another way, transparent cursors for the native CQL driver. (Sketch below.)
* Making the storage engine more CQL-aware. Low-hanging fruit here includes a prefix dictionary for all the composite cell names [8]. (Sketch below.)
* Resurrecting the StorageProxy API, aka the Fat Client. ("Does it even work right now?" "Not really.")
* Reducing context switches and increasing fairness in client connections. HSHA prefers accepting new connections over servicing existing ones, so overload situations are problematic.
* "Gossip is unreliable at 100s of nodes." Here again I missed any concrete proposal to address this.

Rough sketches for a few of the above follow. The names and syntax in them are my own paraphrase, not committed APIs.
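First, triggers. This is my paraphrase of the direction of Vijay's proof of concept in [1]; the class and method names are illustrative, not a committed API:

    import java.nio.ByteBuffer;
    import java.util.Collection;
    import java.util.Collections;
    import org.apache.cassandra.db.ColumnFamily;
    import org.apache.cassandra.db.RowMutation;

    // A trigger sees each update before it is applied and may return extra
    // mutations to apply along with it. Splitting jar loading from trigger
    // definition means one class like this could be attached to any number
    // of tables, e.g. with something like
    //   CREATE TRIGGER audit ON ks.tbl USING 'AuditTrigger'
    // (syntax illustrative, not settled).
    interface ITrigger
    {
        Collection<RowMutation> augment(ByteBuffer partitionKey, ColumnFamily update);
    }

    // Trivial example: mirror every incoming update into an audit keyspace.
    class AuditTrigger implements ITrigger
    {
        public Collection<RowMutation> augment(ByteBuffer key, ColumnFamily update)
        {
            RowMutation audit = new RowMutation("audit", key);
            audit.add(update); // copy the incoming columns as-is (paraphrased)
            return Collections.singletonList(audit);
        }
    }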
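Next, the CAS semantics I have in mind, independent of whether we implement it with Paxos, Spinnaker-style replication, or something else. The names here are hypothetical:

    import java.nio.ByteBuffer;
    import org.apache.cassandra.db.ColumnFamily;

    // Hypothetical coordinator-level contract for [2]: a linearizable
    // compare-and-set on a single row. If the row's current contents match
    // 'expected', apply 'update' and return true; otherwise apply nothing
    // and return false. The hard part is guaranteeing that no other write
    // lands between the read and the write -- hence Paxos/Spinnaker.
    interface CasSupport
    {
        boolean compareAndSet(ByteBuffer key, ColumnFamily expected, ColumnFamily update);
    }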
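The aggregation fencing could mirror the existing ALLOW FILTERING check [4]. All names below are hypothetical; the point is only that an aggregate spanning more than one partition is refused unless the user opts in:

    import org.apache.cassandra.exceptions.InvalidRequestException;

    // Hypothetical validation on the coordinator, run when the statement
    // is prepared. 'SelectStatement' stands in for whatever the parsed
    // query object ends up being.
    class AggregationGuard
    {
        static void validate(SelectStatement select) throws InvalidRequestException
        {
            if (select.hasAggregates()
                && !select.restrictedToSinglePartition()
                && !select.allowsFiltering())
            {
                throw new InvalidRequestException(
                    "Aggregation across partitions requires ALLOW FILTERING");
            }
        }
    }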
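The bandwidth claim in the forwarding item, as back-of-envelope arithmetic:

    class ForwardingMath
    {
        // Today: the coordinator's DC sends one cross-DC copy of each
        // mutation per remote DC, so its WAN traffic grows linearly with
        // the number of datacenters.
        static int copiesSentToday(int dcs) { return dcs - 1; }

        // Relay model: each DC forwards at most one copy onward, so any
        // single DC's WAN traffic is constant no matter how many DCs exist.
        static int copiesSentRelay(int dcs) { return Math.min(1, dcs - 1); }
    }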
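Here is what auto-paging [7] would look like from the application side, with hypothetical driver names. Today you would hand-roll this with token()-range queries; with cursors in the native protocol the driver can fetch pages transparently:

    // Hypothetical client code: the driver holds a server-side cursor and
    // pulls 1000 rows at a time; the application just iterates. None of
    // these classes exist yet in this form.
    Statement select = new Statement("SELECT * FROM events WHERE day = '2013-01-10'");
    select.setPageSize(1000);               // a page size, not a LIMIT
    for (Row row : session.execute(select)) // further pages fetched under the hood
        process(row);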
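And the prefix dictionary from [8], sketched. Every cell in a CQL row repeats the same clustering prefix in its composite name; storing each distinct prefix once (per sstable, say) and referencing it by a small id saves that repetition. Illustrative code only:

    import java.nio.ByteBuffer;
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    // Maps each distinct composite-name prefix to a compact integer id so
    // cells can store the id instead of repeating the prefix bytes.
    class PrefixDictionary
    {
        private final Map<ByteBuffer, Integer> ids = new HashMap<ByteBuffer, Integer>();
        private final List<ByteBuffer> prefixes = new ArrayList<ByteBuffer>();

        // Return the id for a prefix, assigning one on first sight.
        int idFor(ByteBuffer prefix)
        {
            Integer id = ids.get(prefix);
            if (id == null)
            {
                id = prefixes.size();
                ids.put(prefix, id);
                prefixes.add(prefix);
            }
            return id;
        }

        // Recover the prefix bytes for an id when reading cells back.
        ByteBuffer prefixFor(int id)
        {
            return prefixes.get(id);
        }
    }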
[1] https://issues.apache.org/jira/browse/CASSANDRA-1311. Start with https://issues.apache.org/jira/browse/CASSANDRA-1311?focusedCommentId=13492827&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13492827 for the parts relevant to Vijay's proof-of-concept patch.
[2] https://issues.apache.org/jira/browse/CASSANDRA-5062
[3] https://issues.apache.org/jira/browse/CASSANDRA-4914
[4] https://issues.apache.org/jira/browse/CASSANDRA-4915
[5] https://issues.apache.org/jira/browse/CASSANDRA-3577
[6] https://issues.apache.org/jira/browse/CASSANDRA-4123
[7] https://issues.apache.org/jira/browse/CASSANDRA-4415
[8] https://issues.apache.org/jira/browse/CASSANDRA-4175

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder, http://www.datastax.com
@spyced