talking about?
One of the secrets of Cassandra is to use more, smaller requests in parallel,
rather than massive requests to a single coordinator node.
-- Jack Krupansky
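A minimal sketch of the "more, smaller requests in parallel" idea, assuming a hypothetical `fetch_partition` helper that stands in for one small single-partition read (here it only reads from an in-memory dict, not from a real cluster):

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a table: one small partition per (sensor, day).
FAKE_STORE = {
    ("sensor-1", day): [f"event-{day}-{i}" for i in range(3)]
    for day in range(5)
}

def fetch_partition(key):
    """Stand-in for one small read that targets a single partition."""
    return FAKE_STORE.get(key, [])

def fetch_all(keys, max_workers=4):
    """Issue one small request per partition key, several at a time."""
    keys = list(keys)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the input keys.
        return dict(zip(keys, pool.map(fetch_partition, keys)))

wanted = [("sensor-1", day) for day in range(5)]
results = fetch_all(wanted)
```

With a real driver, each call would be routed to a replica for that partition instead of funnelling one large read through a single coordinator.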
From: Drew Kutcharian
Sent: Friday, August 29, 2014 8:28 PM
To: user@cassandra.apache.org
Subject: Re: Data partitioning
of Cassandra is scalability and distributed
> processing, right?
>
> -- Jack Krupansky
>
> From: Drew Kutcharian
> Sent: Friday, August 29, 2014 7:31 PM
> To: user@cassandra.apache.org
> Subject: Re: Data partitioning and composite partition key
>
> Hi Jack,
>
separate nodes?
I mean, the whole point of Cassandra is scalability and distributed processing,
right?
-- Jack Krupansky
From: Drew Kutcharian
Sent: Friday, August 29, 2014 7:31 PM
To: user@cassandra.apache.org
Subject: Re: Data partitioning and composite partition key
Hi Jack,
I think you
Hi Rob,
I agree that one should not mess around with the default partitioner. But there
might be value in improving the Murmur3 partitioner to be “Composite Aware”.
Since we can have composites in row keys now, why not be able to use only a
part of the row key for partitioning? Makes sense?
I
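A rough sketch of the difference being discussed, under loud assumptions: MD5 is used as a stand-in hash (not Murmur3), the four-node ring is hypothetical, and `place_by_prefix` illustrates the proposed "composite aware" behaviour, which is not something Cassandra is confirmed to offer in this thread.

```python
import hashlib
from bisect import bisect_left

RING = 2 ** 127
# Hypothetical evenly spaced tokens for a 4-node ring.
NODE_TOKENS = [i * RING // 4 for i in range(4)]

def token(key: str) -> int:
    """Stand-in hash: MD5 of the key folded onto the ring (not Murmur3)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big") % RING

def owner(tok: int) -> int:
    """Index of the first node whose token is >= tok, wrapping around."""
    return bisect_left(NODE_TOKENS, tok) % len(NODE_TOKENS)

def place_by_full_key(sensor_id: str, bucket: str) -> int:
    # Current behaviour described in the thread: hash the whole row key.
    return owner(token(f"{sensor_id}-{bucket}"))

def place_by_prefix(sensor_id: str, bucket: str) -> int:
    # Proposed behaviour: hash only the first component of the composite.
    return owner(token(sensor_id))

buckets = [f"2014-08-{day:02d}" for day in range(1, 31)]
full = {place_by_full_key("sensor-42", b) for b in buckets}
prefix = {place_by_prefix("sensor-42", b) for b in buckets}
```

Hashing the full key lets the shards of one sensor land on any node, while hashing only the prefix keeps them together, at the cost of concentrating that sensor's load on one replica set.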
On Fri, Aug 29, 2014 at 3:48 PM, Drew Kutcharian wrote:
> AFAIK, currently Cassandra partitions (thrift) rows using the row key,
> basically uses the hash(row_key) to decide what node that row needs to be
> stored on. Now there are times when there is a need to shard a wide row,
> say storing eve
>
> From: Drew Kutcharian
> Sent: Friday, August 29, 2014 6:48 PM
> To: user@cassandra.apache.org
> Subject: Data partitioning and composite partition key
>
> Hey Guys,
>
> AFAIK, currently Cassandra partitions (thrift) rows using the row key,
> basically uses th
@cassandra.apache.org
Subject: Data partitioning and composite partition key
Hey Guys,
AFAIK, currently Cassandra partitions (thrift) rows using the row key,
basically uses the hash(row_key) to decide what node that row needs to be
stored on. Now there are times when there is a need to shard a
Hey Guys,
AFAIK, currently Cassandra partitions (thrift) rows using the row key,
basically uses the hash(row_key) to decide what node that row needs to be
stored on. Now there are times when there is a need to shard a wide row, say
storing events per sensor, so you’d have sensorId-datetime row
Check that: https://issues.apache.org/jira/browse/CASSANDRA-6602
There is a patch for a compaction strategy dedicated to time series data.
The discussion is also interesting in the comments.
On Fri, Aug 15, 2014 at 6:28 AM, Kevin Burton wrote:
> We use log structured tables to hold logs for
We use log structured tables to hold logs for analysis.
It's basically append only, and immutable. Every record inserted has a
timestamp.
Having this in ONE big monolithic table can be problematic.
1. compactions have to compact old data that might not even be used often.
2.
I think there are several issues in your schema and queries.
First, the schema can't efficiently return the single newest post for every
author. It can efficiently return the newest N posts for a particular
author.
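To illustrate why, here is a toy in-memory model of such a table, assuming rows are grouped by `author` and kept newest-first inside each group (integers stand in for `timeuuid` values; the class and method names are hypothetical):

```python
from collections import defaultdict

class PostsTable:
    """Toy model: one partition per author, rows ordered newest first."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def insert(self, author, created_at, entry):
        rows = self._partitions[author]
        rows.append((created_at, entry))
        rows.sort(key=lambda row: row[0], reverse=True)

    def newest(self, author, n):
        # Cheap: one partition, read the first n rows in stored order.
        return self._partitions.get(author, [])[:n]

    def newest_per_author(self):
        # Expensive: has to visit every partition in the table.
        return {a: rows[0] for a, rows in self._partitions.items() if rows}

table = PostsTable()
table.insert("alice", 1, "first")
table.insert("alice", 3, "third")
table.insert("alice", 2, "second")
table.insert("bob", 5, "hello")
```

The per-author query touches a single partition, whereas the "newest post for every author" query scales with the number of authors.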
On Fri, May 16, 2014 at 11:53 PM, 後藤 泰陽 wrote:
>
> But I consider LIMIT to be a
>
>>
>> On 2014/05/16 23:54, Jonathan Lacefield wrote:
>>
>> Hello,
>>
>> Have you looked at using the CLUSTERING ORDER BY and LIMIT features of
>> CQL3?
>>
>> These may help you achieve your goals.
>>
>>
>> http://www.data
LIMIT features of
> CQL3?
>
> These may help you achieve your goals.
>
>
> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/refClstrOrdr.html
>
> http://www.datastax.com/documentation/cql/3.1/cql/cql_reference/select_r.html
>
> Jonathan Lacefield
> So
documentation/cql/3.1/cql/cql_reference/select_r.html
>
> Jonathan Lacefield
> Solutions Architect, DataStax
> (404) 822 3487
>
>
>
>
>
>
> On Fri, May 16, 2014 at 12:23 AM, Matope Ono wrote:
> Hi, I'm modeling some queries in CQL3.
>
> I
Hi, I'm modeling some queries in CQL3.
I'd like to query the first column for each partition key in CQL3.
For example:
create table posts(
> author ascii,
> created_at timeuuid,
> entry text,
> primary key(author,created_at)
> );
> insert into posts(author,creat
Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487
<http://www.linkedin.com/in/jlacefield>
<http://www.datastax.com/cassandrasummit14>
On Fri, May 16, 2014 at 12:23 AM, Matope Ono wrote:
> Hi, I'm modeling some queries in CQL3.
>
> I'd like to query first
On Fri, Jan 24, 2014 at 5:39 AM, Jean Paul Adant
wrote:
> I'm using cassandra 1.1.9
> I have this columnFamily, created with hector API. Here is its cql2
> description.
>
You should not use CQL2, it will be removed from future versions of
Cassandra.
=Rob
per second) and i'm using
composite keys as column name.
- Almost all data is written to columns under the same row key
Question?
I understand row partitioning, and understand that all columns will be in
the same row, so in the same partition (and so on the same machine)
As columnsFamilies have l
> What is the effective difference between hashing the keys myself and
> letting the random partitioner do it?
This is what the RandomPartitioner calls
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/utils/FBUtilities.java#L209
> Is this advisable?
I would try to a
I want to do ranged row queries for a few of my column families, but
best practice seems to be to use the random partitioner. Splitting my
column families between two clusters (one random, one ordered) seems
like a pretty expensive compromise.
Instead, I'm thinking of using the order-preservin
> One question on nodetool ring, the "owns" refers to how many of the possible
> keys each node owns, not the actual node size, correct?
yes
> So you could technically have a load of 15gb, 60gb, and 15gb on a three node
> cluster, but if you have the tokens set correctly each would own 33.33%.
Ye
Well, I think what happened was that we had three tokens generated, 0,
567x, and 1134x... but the way that we read the comments in the yaml file,
we just set the second two nodes with the initial token and left the token
for the seed node blank. Then we started the seed node, started the other
Yes, that looks about right.
Totally baffled how the wiki script could spit out those tokens for a
3-node cluster.
On Tue, Aug 16, 2011 at 2:04 PM, David McNelis
wrote:
> Currently we have the initial_token for the seed node blank, and then the
> three tokens we ended up with are:
> 56713727820
Currently we have the initial_token for the seed node blank, and then the
three tokens we ended up with are:
56713727820156410577229101238628035242
61396109050359754194262152792166260437
113427455640312821154458202477256070485
I would assume that we'd want to take the node that
is 613961090503597
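A quick way to sanity-check these tokens, assuming the RandomPartitioner ring spans 0 to 2**127 and each node owns the range from the previous token (exclusive) up to its own token (inclusive); the helper names are illustrative:

```python
RING = 2 ** 127  # assumed RandomPartitioner token space

# The three tokens reported in the thread.
TOKENS = [
    56713727820156410577229101238628035242,
    61396109050359754194262152792166260437,
    113427455640312821154458202477256070485,
]

def ownership(tokens):
    """Fraction of the ring owned by each token: (previous, token]."""
    ordered = sorted(tokens)
    shares = {}
    for i, tok in enumerate(ordered):
        prev = ordered[i - 1]  # index -1 wraps to the last token
        span = (tok - prev) % RING or RING  # a lone node owns everything
        shares[tok] = span / RING
    return shares

def balanced_tokens(n):
    """Evenly spaced tokens for n nodes."""
    return [i * RING // n for i in range(n)]

shares = ownership(TOKENS)
for tok, share in shares.items():
    print(f"{tok}: {share:.2%}")
```

Under these assumptions the node at 567... owns roughly two thirds of the ring, the node at 613... under 3%, and the node at 1134... about 31%, which is consistent with one node taking on over 60% of the data. `balanced_tokens(3)` gives 0, 567... and 1134..., so the 613... token looks like the odd one out.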
what tokens did you end up using?
are you sure it's actually due to different amounts of rows? have you
run cleanup and compact to make sure it's not unused data / obsolete
replicas taking up the space?
On Tue, Aug 16, 2011 at 1:41 PM, David McNelis
wrote:
> We are currently running a three nod
We are currently running a three node cluster where we assigned the initial
tokens using the Python script that is in the Wiki, and we're currently
using the Random Partitioner, RF=1, Cassandra 0.8 from the Riptano RPM;
however, we're seeing one node take on over 60% of the data as we load
data.
d this shortcoming?
>
>
>
> Thanks in advance.
>
>
>
> Peter
>
>
>
> *From:* Aaron Morton [mailto:aa...@thelastpickle.com]
> *Sent:* February 16, 2011 3:56
> *To:* user@cassandra.apache.org
> *Subject:* Re: Partitioning
>
>
>
> You can use the Network Topolog
shortcoming?
Thanks in advance.
Peter
From: Aaron Morton [mailto:aa...@thelastpickle.com]
Sent: February 16, 2011 3:56
To: user@cassandra.apache.org
Subject: Re: Partitioning
You can use the Network Topology Strategy; see
http://wiki.apache.org/cassandra/Operations?highlight=(topology)|(network
osting of the same. I am unable to subscribe for
some reason.
--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Partitioning-tp6028132p6028132.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at Nabble.com.
-apache-org.3065146.n2.nabble.com/Partitioning-tp6028132p6028132.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
meant to say that I do see announcements about streaming
in the output, but these are stuck at 0%.
--
View this message in context: http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851.html
Sent from the
Correction -- what I meant to say is that I do see announcements about streaming
in the output, but these are stuck at 0%.
--
View this message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960851
message in context:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Re-partitioning-the-cluster-with-nodetool-what-s-happening-tp5960843p5960843.html
Sent from the cassandra-u...@incubator.apache.org mailing list archive at
Nabble.com.
elps.
> Aaron
>
> On 18 Nov, 2010,at 02:22 AM, Claudio Martella
> wrote:
>
>> @Adi:
>>
>> Yes, that's exactly the reason for the OPP in the Subject :)
>>
>> @Aaron:
>>
>> Thanks for the complete answer.
>>
>> 1) In my case
nks for the complete answer.
1) In my case "vertexid_" is a uuid. Could you send me some reference on
how to achieve this partitioning based on this prefix and
orderpreservingpartitioning? I can't find docs about it.
2) Ah, that's a pity, I guess I'll have to extend cassandra
@Adi:
Yes, that's exactly the reason for the OPP in the Subject :)
@Aaron:
Thanks for the complete answer.
1) In my case "vertexid_" is a uuid. Could you send me some reference on
how to achieve this partitioning based on this prefix and
orderpreservingpartitioning? I can'
does have a concept of "ascending ordering", so i thought
> about OPP, but to my understanding OPP does not guarantee that all the data
> starting with the same prefix will end up in the same cassandra node,
> but only some of it. My set of data about a vertex could still be split
> bet
node,
but only some of it. My set of data about a vertex could still be split
between two cassandra nodes in case the token ends up being a key in the
middle of the set, right?
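The concern can be sketched with a toy order-preserving ring, assuming node tokens are plain strings compared lexicographically and each node owns keys up to and including its token (the token values and the `v1_` prefix are made up for illustration):

```python
from bisect import bisect_left

def owner(node_tokens, key):
    """Index of the first node whose token is >= key, wrapping around."""
    return bisect_left(node_tokens, key) % len(node_tokens)

def owners_for_prefix(node_tokens, keys):
    """Set of nodes that hold at least one of the given keys."""
    return {owner(node_tokens, k) for k in keys}

vertex_keys = ["v1_a", "v1_k", "v1_p", "v1_z"]

# A token that falls inside the "v1_" key range splits the vertex.
split_ring = ["g", "v1_m", "z"]
# Tokens that fall outside the range keep the vertex on one node.
aligned_ring = ["g", "v2", "z"]

split = owners_for_prefix(split_ring, vertex_keys)
aligned = owners_for_prefix(aligned_ring, vertex_keys)
```

In this toy model co-location depends entirely on where the tokens happen to fall relative to the prefix, which is the situation described above.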
What I require exactly is:
(1) to have all the rows belonging to the same vertexid (which is a
uuid) on the same cassandra no
e same cassandra node,
but only some of it. My set of data about a vertex could still be split
between two cassandra nodes in case the token ends up being a key in the
middle of the set, right?
What I require exactly is:
(1) to have all the rows belonging to the same vertexid (which is a
uuid) on
> I am using cassandra to store a message stream, and want to use timestamps
> (like mmddhhMIss or something similar) as the keys.
> So if I use RandomPartitioner, I will lose the order when using
> get_range_slices().
> If I use OrderPreservingPartitioner, how should I configure cassandra to
> m
Hi,
I am using cassandra to store a message stream, and want to use timestamps (like
mmddhhMIss or something similar) as the keys.
So if I use RandomPartitioner, I will lose the order when using
get_range_slices().
If I use OrderPreservingPartitioner, how should I configure cassandra to make
l
There is only one partitioner, and that alone is what determines key
-> token mapping. CF has nothing to do with it.
On Mon, May 17, 2010 at 4:55 AM, Miriam Allalouf
wrote:
> Hi,
> I have a basic question regarding key ranges and partitions.
> Assuming we have two CF column families, each is asso
Hi,
I have a basic question regarding key ranges and partitions.
Assuming we have two CF column families, each associated with a
different key range and compare order.
Now, Cassandra supports only one "range" of token values and key-to-node
assignment --- so how does each key range (that belongs to a dif
43 matches