But you already said that you have "very wide rows", so pulling massive amounts of data off a single node is very likely to completely dwarf the connect time. Again, doing the gets in parallel, with requests spread across multiple nodes, would be so much more performant. How many nodes are we talking about?
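As a rough sketch of the contrast being discussed (using the user table from the original question further down the thread; the concurrency itself lives in the client or driver, not in CQL):

-- One big multi-get: a single coordinator fans out to every node that owns one of the listed partitions
SELECT id, email FROM user WHERE id IN (?, ?, ?, ?);

-- The same data as several small single-partition reads; the client issues these concurrently
-- (e.g. via the driver's async execute), and each one can be routed straight to a replica for its key
SELECT id, email FROM user WHERE id = ?;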
One of the secrets of Cassandra is to use more, smaller requests in parallel, rather than massive requests to a single coordinator node.

-- Jack Krupansky

From: Drew Kutcharian
Sent: Friday, August 29, 2014 8:28 PM
To: user@cassandra.apache.org
Subject: Re: Data partitioning and composite partition key

Mainly lower latency and network overhead in multi-get requests (WHERE IN (…)). The coordinator needs to connect to only one node vs. potentially all the nodes in the cluster.

On Aug 29, 2014, at 5:23 PM, Jack Krupansky <j...@basetechnology.com> wrote:

Okay, but what benefit do you think you get from having the partitions on the same node, since they would be separate partitions anyway? I mean, what exactly do you think you're going to do with them that wouldn't be a whole lot more performant by being able to process data in parallel from separate nodes? I mean, the whole point of Cassandra is scalability and distributed processing, right?

-- Jack Krupansky

From: Drew Kutcharian
Sent: Friday, August 29, 2014 7:31 PM
To: user@cassandra.apache.org
Subject: Re: Data partitioning and composite partition key

Hi Jack,

I think you missed the point of my email, which was trying to avoid the problem of having very wide rows :)

In the sensorId-datetime notation, the datetime part is a datetime bucket, say a day. The CQL rows would still be keyed by the actual time of the event. So you'd end up with SensorId -> Datetime bucket (day/week/month) -> actual event. What I wanted to be able to do was to colocate all the events related to a sensor id on a single node (token). See "High Throughput Timelines" at http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra

- Drew

On Aug 29, 2014, at 3:58 PM, Jack Krupansky <j...@basetechnology.com> wrote:

With CQL3, you, the developer, get to decide whether to place a primary key column in the partition key or use it as a clustering column. So, make sensorId the partition key and datetime a clustering column.

-- Jack Krupansky

From: Drew Kutcharian
Sent: Friday, August 29, 2014 6:48 PM
To: user@cassandra.apache.org
Subject: Data partitioning and composite partition key

Hey Guys,

AFAIK, Cassandra currently partitions (thrift) rows using the row key: it uses hash(row_key) to decide which node a row is stored on. Now there are times when there is a need to shard a wide row, say when storing events per sensor, so you'd use a sensorId-datetime row key so you don't end up with very large rows. Is there a way to have the partitioner use only the "sensorId" part of the row key for the hash? This way we would be able to store all the data relating to a sensor on one node.

Another use case of this would be multi-tenancy: say we have accounts, and accounts have users. So we would have the following tables:

CREATE TABLE account (
  id timeuuid PRIMARY KEY,
  company text //timezone
);

CREATE TABLE user (
  id timeuuid PRIMARY KEY,
  accountId timeuuid,
  email text,
  password text
);

// Get users by account
CREATE TABLE user_account_index (
  accountId timeuuid,
  userId timeuuid,
  PRIMARY KEY(accountId, userId)
);

Say I want to get all the users that belong to an account. I would first have to get the results from user_account_index and then use a multi-get (WHERE IN) to get the records from the user table. Now this multi-get part could potentially query a lot of different nodes in the cluster. It'd be great if there was a way to limit storage of an account's users to a single node, so that the multi-get would only need to query a single node.
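For reference, the compound-primary-key alternative that the next paragraph rules out would look roughly like this (a sketch only, reusing the same columns as the user table above):

CREATE TABLE user (
  accountId timeuuid,
  id timeuuid,
  email text,
  password text,
  PRIMARY KEY (accountId, id)  -- accountId becomes the partition key, id a clustering column
);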
Note that the problem cannot be simply fixed by using (accountId, id) as the primary key for the user table, since that would create the problem of having very large (thrift) rows in the user table. I did look through the code and JIRA and I couldn't really find a solution. The closest I got was to have a custom partitioner, but you can't have a partitioner per keyspace, and that's not even something that would be implemented in the future, based on the following JIRA: https://issues.apache.org/jira/browse/CASSANDRA-295

Any ideas are much appreciated.

Best,

Drew
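For comparison, the bucketing approach discussed above, expressed as a CQL3 composite partition key (roughly the "High Throughput Timelines" pattern from the DataStax post; table and column names here are purely illustrative), would look something like the sketch below. Each (sensorId, bucket) pair still hashes to its own token, so the buckets belonging to one sensor are generally spread across nodes rather than colocated, which is exactly the limitation raised in this thread:

CREATE TABLE sensor_events (
  sensorId timeuuid,
  bucket text,            -- the day/week/month the event falls in, e.g. '2014-08-29'
  eventTime timestamp,    -- actual time of the event, used as the clustering column
  value blob,
  PRIMARY KEY ((sensorId, bucket), eventTime)
);

-- Reading one bucket is a single-partition query:
SELECT eventTime, value FROM sensor_events WHERE sensorId = ? AND bucket = '2014-08-29';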