Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
talking about? One of the secrets of Cassandra is to use more, smaller requests in parallel, rather than massive requests to a single coordinator node. -- Jack Krupansky From: Drew Kutcharian Sent: Friday, August 29, 2014 8:28 PM To: user@cassandra.apache.org Subject: Re: Data partitioning

Re: Data partitioning and composite partition key

2014-08-29 Thread Drew Kutcharian
of Cassandra is scalability and distributed > processing, right? > > -- Jack Krupansky > > From: Drew Kutcharian > Sent: Friday, August 29, 2014 7:31 PM > To: user@cassandra.apache.org > Subject: Re: Data partitioning and composite partition key > > Hi Jack,

Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
separate nodes? I mean, the whole point of Cassandra is scalability and distributed processing, right? -- Jack Krupansky From: Drew Kutcharian Sent: Friday, August 29, 2014 7:31 PM To: user@cassandra.apache.org Subject: Re: Data partitioning and composite partition key Hi Jack, I think you

Re: Data partitioning and composite partition key

2014-08-29 Thread Drew Kutcharian
Hi Rob, I agree that one should not mess around with the default partitioner. But there might be value in improving the Murmur3 partitioner to be “Composite Aware”. Since we can have composites in row keys now, why not be able to use only a part of the row key for partitioning? Makes sense? I

Re: Data partitioning and composite partition key

2014-08-29 Thread Robert Coli
On Fri, Aug 29, 2014 at 3:48 PM, Drew Kutcharian wrote: > AFAIK, currently Cassandra partitions (thrift) rows using the row key, > basically uses the hash(row_key) to decide what node that row needs to be > stored on. Now there are times when there is a need to shard a wide row, > say storing eve

Re: Data partitioning and composite partition key

2014-08-29 Thread Drew Kutcharian
> > From: Drew Kutcharian > Sent: Friday, August 29, 2014 6:48 PM > To: user@cassandra.apache.org > Subject: Data partitioning and composite partition key > > Hey Guys, > > AFAIK, currently Cassandra partitions (thrift) rows using the row key, > basically uses th

Re: Data partitioning and composite partition key

2014-08-29 Thread Jack Krupansky
@cassandra.apache.org Subject: Data partitioning and composite partition key Hey Guys, AFAIK, currently Cassandra partitions (thrift) rows using the row key, basically uses the hash(row_key) to decide what node that row needs to be stored on. Now there are times when there is a need to shard a

Data partitioning and composite partition key

2014-08-29 Thread Drew Kutcharian
Hey Guys, AFAIK, currently Cassandra partitions (thrift) rows using the row key, basically uses the hash(row_key) to decide what node that row needs to be stored on. Now there are times when there is a need to shard a wide row, say storing events per sensor, so you’d have sensorId-datetime row
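The sharding idea in the post above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the day-sized bucket, the `:` separator, and the function name `shard_row_key` are hypothetical, since the snippet says only that the row key combines sensorId and datetime.

```python
from datetime import datetime


def shard_row_key(sensor_id: str, event_time: datetime) -> str:
    """Build a hypothetical 'sensorId:YYYYMMDD' row key.

    Assumption: one row per sensor per day. The bucket size and the
    separator are illustrative choices, not taken from the thread.
    """
    return f"{sensor_id}:{event_time.strftime('%Y%m%d')}"


# Two events from the same sensor on different days get different row keys.
key_a = shard_row_key("sensor-42", datetime(2014, 8, 29, 15, 48))
key_b = shard_row_key("sensor-42", datetime(2014, 8, 30, 9, 0))
print(key_a, key_b)
```

Because the node is chosen from hash(row_key), as the post describes, the two keys above are hashed independently, so the shards of one sensor are not guaranteed to be stored on the same node.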

Re: Could table partitioning be implemented using a customer compaction strategy?

2014-08-15 Thread DuyHai Doan
Check that: https://issues.apache.org/jira/browse/CASSANDRA-6602 There is a patch for a compaction strategy dedicated to time series data. The discussion is also interesting in the comments. On Fri, Aug 15, 2014 at 6:28 AM, Kevin Burton wrote: > We use log structured tables to hold logs for

Could table partitioning be implemented using a customer compaction strategy?

2014-08-14 Thread Kevin Burton
We use log structured tables to hold logs for analysis. It's basically append only, and immutable. Every record has a timestamp for each record inserted. Having this in ONE big monolithic table can be problematic. 1. compactions have to compact old data that might not even be used often. 2.

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-19 Thread Bryan Talbot
I think there are several issues in your schema and queries. First, the schema can't efficiently return the single newest post for every author. It can efficiently return the newest N posts for a particular author. On Fri, May 16, 2014 at 11:53 PM, 後藤 泰陽 wrote: > > But I consider LIMIT to be a

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-17 Thread Matope Ono
>> >> On 2014/05/16 23:54, Jonathan Lacefield wrote: >> >> Hello, >> >> Have you looked at using the CLUSTERING ORDER BY and LIMIT features of >> CQL3? >> >> These may help you achieve your goals. >> >> >> http://www.data

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-17 Thread DuyHai Doan
LIMIT features of > CQL3? > > These may help you achieve your goals. > > > http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html > > http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html > > Jonathan Lacefield > So

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-17 Thread 後藤 泰陽
documentation/cql/3.1/cql/cql_reference/select_r.html > > Jonathan Lacefield > Solutions Architect, DataStax > (404) 822 3487 > > > > > > > On Fri, May 16, 2014 at 12:23 AM, Matope Ono wrote: > Hi, I'm modeling some queries in CQL3. > > I

Query first 1 columns for each partitioning keys in CQL?

2014-05-16 Thread Matope Ono
Hi, I'm modeling some queries in CQL3. I'd like to query first 1 columns for each partitioning keys in CQL3. For example: create table posts( > author ascii, > created_at timeuuid, > entry text, > primary key(author,created_at) > ); > insert into posts(author,creat

Re: Query first 1 columns for each partitioning keys in CQL?

2014-05-16 Thread Jonathan Lacefield
Jonathan Lacefield Solutions Architect, DataStax (404) 822 3487 <http://www.linkedin.com/in/jlacefield> <http://www.datastax.com/cassandrasummit14> On Fri, May 16, 2014 at 12:23 AM, Matope Ono wrote: > Hi, I'm modeling some queries in CQL3. > > I'd like to query first

Re: Cassandra partitioning and limit

2014-01-24 Thread Robert Coli
On Fri, Jan 24, 2014 at 5:39 AM, Jean Paul Adant wrote: > I'm using cassandra 1.1.9 > I have this columnFamily, created with hector API. Here is its cql2 > description. > You should not use CQL2; it will be removed from future versions of Cassandra. =Rob

Cassandra partitioning and limit

2014-01-24 Thread Jean Paul Adant
per second) and I'm using composite keys as column names. - Almost all data is written to columns under the same row key Question: I understand row partitioning, and understood that all columns will be on the same row, so on the same partitioner (so on the same machine) As column families have l

Re: Mixed random & ordered partitioning?

2012-01-26 Thread aaron morton
> What would be the effective difference between hashing the keys myself and > letting the random partitioner do it? This is what the RandomPartitioner calls https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java#L209 > Is this advisable? I would try to a
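For readers who cannot follow the link (it points at trunk, so the line number has probably drifted), here is a rough Python sketch of what that helper is generally understood to have done at the time: MD5-hash the key bytes, interpret the digest as a signed big-endian integer, and take the absolute value. The function name `random_partitioner_token` is hypothetical, and the behaviour is stated from memory rather than from the snippet above.

```python
import hashlib


def random_partitioner_token(key: bytes) -> int:
    """Approximate the RandomPartitioner key -> token mapping.

    Assumption: token = abs(signed big-endian integer of MD5(key)),
    which mirrors Java's new BigInteger(md5Bytes).abs().
    """
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


token = random_partitioner_token(b"vertex-123")
print(token)
```

Under that assumption, hashing keys yourself before storing them under an order-preserving partitioner gives a similar spread of data, but the ordering you get back is the ordering of the hashes, not of the original keys.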

Mixed random & ordered partitioning?

2012-01-25 Thread Todd Fast
I want to do ranged row queries for a few of my column families, but best practice seems to be to use the random partitioner. Splitting my column families between two clusters (one random, one ordered) seems like a pretty expensive compromise. Instead, I'm thinking of using the order-preservin

Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread aaron morton
> One question on nodetool ring, the "owns" refers to how many of the possible > keys each node owns, not the actual node size correct? yes > So you could technically have a load of 15gb, 60gb, and 15gb on a three node > cluster, but if you have the tokens set correctly each would own 33.33%. Ye

Re: Partitioning, tokens, and sequential keys

2011-08-17 Thread David McNelis
Well, I think what happened was that we had three tokens generated, 0, 567x, and 1134x... but the way that we read the comments in the yaml file, we just set the second two nodes with the initial token and left the token for the seed node blank. Then we started the seed node, started the other

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread Jonathan Ellis
Yes, that looks about right. Totally baffled how the wiki script could spit out those tokens for a 3-node cluster. On Tue, Aug 16, 2011 at 2:04 PM, David McNelis wrote: > Currently we have the initial_token for the seed node blank, and then the > three tokens we ended up with are: > 56713727820

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread David McNelis
Currently we have the initial_token for the seed node blank, and then the three tokens we ended up with are: 56713727820156410577229101238628035242 61396109050359754194262152792166260437 113427455640312821154458202477256070485 I would assume that we'd want to take the node that is 613961090503597
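As a cross-check on the numbers in this thread, evenly spaced RandomPartitioner tokens are conventionally computed as i * 2^127 / N for node i of N. The sketch below assumes that convention; `balanced_tokens` is a hypothetical name, not the wiki script itself.

```python
from typing import List


def balanced_tokens(node_count: int) -> List[int]:
    """Return evenly spaced tokens over the assumed 0..2**127 token range."""
    return [i * 2**127 // node_count for i in range(node_count)]


for token in balanced_tokens(3):
    print(token)
```

For three nodes this yields 0, 56713727820156410577229101238628035242, and 113427455640312821154458202477256070485. The last two match the first and third tokens listed above, while 61396109050359754194262152792166260437 is not one of the evenly spaced values.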

Re: Partitioning, tokens, and sequential keys

2011-08-16 Thread Jonathan Ellis
what tokens did you end up using? are you sure it's actually due to different amounts of rows? have you run cleanup and compact to make sure it's not unused data / obsolete replicas taking up the space? On Tue, Aug 16, 2011 at 1:41 PM, David McNelis wrote: > We are currently running a three nod

Partitioning, tokens, and sequential keys

2011-08-16 Thread David McNelis
We are currently running a three node cluster where we assigned the initial tokens using the Python script that is in the Wiki, and we're currently using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM; however, we're seeing one node taking on over 60% of the data as we load data.

Re: Partitioning

2011-02-16 Thread A J
d this shortcoming? > > > > Thanks in advance. > > > > Peter > > > > *From:* Aaron Morton [mailto:aa...@thelastpickle.com] > *Sent:* February 16, 2011 3:56 > *To:* user@cassandra.apache.org > *Subject:* Re: Partitioning > > > > You can use the Network Topolog

Re: Partitioning

2011-02-16 Thread Wangpei (Peter)
shortcoming? Thanks in advance. Peter From: Aaron Morton [mailto:aa...@thelastpickle.com] Sent: February 16, 2011 3:56 To: user@cassandra.apache.org Subject: Re: Partitioning You can use the Network Topology Strategy see http://wiki.apache.org/cassandra/Operations?highlight=(topology)|(network

Re: Partitioning

2011-02-15 Thread Aaron Morton
osting of the same. I am unable to subscribe for some reason. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Partitioning-tp6028132p6028132.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.

Partitioning

2011-02-15 Thread RW>N
-apache-org.3065146.n2.nabble.com/Partitioning-tp6028132p6028132.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.

Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread Aaron Morton
meant to say that I do see announcements about streaming in the output, but these are stuck at 0%. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851.html Sent from the

Re: Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem
Correction -- what I meant to say that I do see announcements about streaming in the output, but these are stuck at 0%. -- View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851

Re-partitioning the cluster with nodetool: what's happening?

2011-01-25 Thread buddhasystem
message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960843.html Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.

Re: OPP and controlling partitioning

2010-11-18 Thread Claudio Martella
elps. > Aaron > > On 18 Nov, 2010,at 02:22 AM, Claudio Martella > wrote: > >> @Adi: >> >> Yes, that's exactly the reason for the OPP in the Subject :) >> >> @Aaron: >> >> Thanks for the complete answer. >> >> 1) In my case

Re: OPP and controlling partitioning

2010-11-17 Thread Aaron Morton
nks for the complete answer. 1) In my case "vertexid_" is a uuid. Could you send me some reference on how to achieve this partitioning based on this prefix and orderpreservingpartitioning? I can't find docs about it. 2) Ah, that's a pity, I guess I'll have to extend cassandra

Re: OPP and controlling partitioning

2010-11-17 Thread Claudio Martella
@Adi: Yes, that's exactly the reason for the OPP in the Subject :) @Aaron: Thanks for the complete answer. 1) In my case "vertexid_" is a uuid. Could you send me some reference on how to achieve this partitioning based on this prefix and orderpreservingpartitioning? I can'

Re: OPP and controlling partitioning

2010-11-15 Thread Adi
does have a concept of "ascending ordering", so i thought > about OPP, but to my understanding OPP does not grant that all the data > starting with the same prefix will end up in the same cassandra node, > but only some of it. My set of data about a vertex could still be split > bet

Re: OPP and controlling partitioning

2010-11-15 Thread Aaron Morton
node, but only some of it. My set of data about a vertex could still be split between two cassandra nodes in case the token ends up being a key in the middle of the set, right? What i require exactly is: (1) to have all the rows belonging to the same vertexid (which is a uuid) on the same cassandra no

OPP and controlling partitioning

2010-11-15 Thread Claudio Martella
e same cassandra node, but only some of it. My set of data about a vertex could still be split between two cassandra nodes in case the token ends up being a key in the middle of the set, right? What i require exactly is: (1) to have all the rows belonging to the same vertexid (which is a uuid) on

Re: about key sorting and token partitioning

2010-11-10 Thread Peter Schuller
> I am using cassandra to store a message stream, and want to use timestamps > (like mmddhhMIss or something alike) as the keys. > So if I use RandomPartitioner, I will lose the order when using > get_range_slices(). > If I use OrderPreservingPartitioner, how should I configure cassandra to > m

about key sorting and token partitioning

2010-11-10 Thread zangds
Hi, I am using cassandra to store a message stream, and want to use timestamps (like mmddhhMIss or something alike) as the keys. So if I use RandomPartitioner, I will lose the order when using get_range_slices(). If I use OrderPreservingPartitioner, how should I configure cassandra to make l

Re: Several CFs and partitioning : which key rabge is used

2010-05-17 Thread Jonathan Ellis
There is only one partitioner, and that alone is what determines key -> token mapping. CF has nothing to do with it. On Mon, May 17, 2010 at 4:55 AM, Miriam Allalouf wrote: > Hi, > I have a basic question regarding key ranges and partitions. > Assuming we have two CF column families, each is asso

Several CFs and partitioning : which key rabge is used

2010-05-17 Thread Miriam Allalouf
Hi, I have a basic question regarding key ranges and partitions. Assuming we have two CF column families, each is associated with different KEY range and compare order. Now, Cassandra supports only one "range" of token values and key node assignment --- so, how each key range (that belong to a dif