Re: Doubt in Row key range scan

2012-05-28 Thread Luís Ferreira
wal | Developer - Big Data(I&D)| 9731648376 | www.mu-sigma.com > > From: Pierre Chalamet [mailto:pie...@chalamet.net] > Sent: Monday, May 28, 2012 3:31 PM > To: user@cassandra.apache.org > Subject: Re: Doubt in Row key range scan > > Hi, > > It's normal.

RE: Doubt in Row key range scan

2012-05-28 Thread Prakrati Agrawal
oubt in Row key range scan Hi, It's normal. Keys are mapped to replicas with a hash (md5) when using the random partitioner (which you are using, I guess). You probably want to switch to the order-preserving partitioner or tweak your data model in order to rely on a 2nd index for such

Re: Doubt in Row key range scan

2012-05-28 Thread Alain RODRIGUEZ
You are using the Random Partitioner. Using the RP is a good thing because you avoid hot spots, but it has its drawbacks too. You can't scan a slice of rows; they won't be ordered, because all your keys are stored by their md5 values. You should review your data model to use columns to order your
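
To illustrate the point made in this reply, here is a minimal Python sketch of why key order and token order disagree under a hash-based partitioner. It assumes the token is roughly the md5 digest of the key read as a signed integer and made non-negative; the helper name `md5_token` is hypothetical and this is not Cassandra's own code.

```python
import hashlib

def md5_token(key: str) -> int:
    """Approximate a hash-partitioner token (assumption): the md5 digest of
    the key, read as a signed big-endian integer, made non-negative."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))

# Row keys in the "tickerID_date" format from the original question.
keys = ["1_2012/05/24", "1_2012/05/25", "1_2012/05/26", "1_2012/05/27"]

by_key = sorted(keys)                    # the order the poster expected
by_token = sorted(keys, key=md5_token)   # the order a hash partitioner sees

print("key order:  ", by_key)
print("token order:", by_token)
```

The two printed lists contain the same keys, but the token order is effectively arbitrary with respect to the dates, so a start/end pair that is ordered as strings need not be ordered as tokens.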

Re: Doubt in Row key range scan

2012-05-28 Thread Pierre Chalamet
ierre -Original Message- From: Prakrati Agrawal Date: Mon, 28 May 2012 04:39:46 To: user@cassandra.apache.org Reply-To: user@cassandra.apache.org Subject: Doubt in Row key range scan Dear all I have stored my data into Cassandra database in the format "tickerID_date". Now when I spec

Doubt in Row key range scan

2012-05-28 Thread Prakrati Agrawal
Dear all, I have stored my data in a Cassandra database with row keys in the format "tickerID_date". Now when I specify a row key range like 1_2012/05/24 (start) to 1_2012/05/27 (end), it says that the end key's md5 value is less than the start key's md5 value. So I changed my start key to 1_2012/05/27 and end key

Range scan

2011-08-26 Thread Bill Hastings
How does range scan work in Cassandra? Does the read of a key perform the read across all the SSTables that contain the key and return the row or are SSTables processed sequentially? If I have a key k and its columns are spread across N SSTables then does the read of key k return the row with all

Re: Range scan returns more results then expected (Using ByteOrderedPartitioner)

2011-01-24 Thread Maxim Veksler
I see, now it makes perfect sense. Thank you. On Mon, Jan 24, 2011 at 11:05 PM, Aaron Morton wrote: > It's not pattern matching, it's comparing / ordering the byte values. You > are asking to return 100 keys in ascending order where the value of the key > (after the partitioner has been applied)

Re: Range scan returns more results then expected (Using ByteOrderedPartitioner)

2011-01-24 Thread Aaron Morton
It's not pattern matching, it's comparing / ordering the byte values. You are asking to return 100 keys in ascending order where the value of the key (after the partitioner has been applied) is greater than "1_265_8_12" If you want to do a seek and partial scan, you could use an end value in the li
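
The behaviour described in this reply can be sketched in a few lines of Python. This is only a model of a byte-ordered range scan over an in-memory sorted list; the function `range_scan`, the sample keys, and the `prefix + b"\xff"` end bound are illustrative assumptions, not the Cassandra API.

```python
from bisect import bisect_left, bisect_right

def range_scan(sorted_keys, start, end=None, limit=100):
    """Return up to `limit` keys k with start <= k (and k <= end, if given),
    compared byte by byte. No prefix or pattern matching is involved."""
    lo = bisect_left(sorted_keys, start)
    hi = len(sorted_keys) if end is None else bisect_right(sorted_keys, end)
    return sorted_keys[lo:hi][:limit]

# Hypothetical keys, sorted by raw byte value as a byte-ordered partitioner would.
keys = sorted([b"0_9", b"1_265_8_12_a", b"1_265_8_12_b",
               b"1_265_8_13", b"1_266", b"2_1"])

# Open-ended scan: everything at or after the start key is returned.
open_ended = range_scan(keys, b"1_265_8_12")

# Bounded scan: an end key just past the prefix limits the scan to that prefix.
bounded = range_scan(keys, b"1_265_8_12", end=b"1_265_8_12" + b"\xff")

print(open_ended)
print(bounded)
```

With only a start key, the scan keeps going past the keys that share the prefix, which is why more rows come back than expected; supplying an end key is what turns it into a seek plus partial scan.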

Range scan returns more results then expected (Using ByteOrderedPartitioner)

2011-01-24 Thread Maxim Veksler
Hello, Cassandra is configured as follows: conf/cassandra.yaml | grep 'partitioner:' partitioner: org.apache.cassandra.dht.ByteOrderedPartitioner Why does a range query on part of the key return more results than expected (column, CF and keyspace names masked): [default@KEYSPACE] list CF1

Re: HowTo: Range scan on secondary indexes?

2010-11-14 Thread André Fiedler
Ah ok, so I have to build a hash index to get all relevant data sets first, then Cassandra performs the range scan. Good to know, thx a lot! :o) 2010/11/14 Jonathan Ellis > Because 0.7.0 indexes are more like a Hash index than a B-tree. > > On Sun, Nov 14, 2010 at 12:39 PM, André Fiedler

Re: HowTo: Range scan on secondary indexes?

2010-11-14 Thread Jonathan Ellis
Because 0.7.0 indexes are more like a Hash index than a B-tree. On Sun, Nov 14, 2010 at 12:39 PM, André Fiedler wrote: > Ok, i read this before. Could you explain (short) why i have to do this? In > my opinion it isn't necessary, i want to understand why it is. ;o) > thx André! > > 2010/11/14 Nat
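
A rough Python model of the hash-index point made in this reply: a hash-like index can only answer "which rows have exactly this value", so an equality clause has to narrow the candidates before any range condition is checked. The class `HashIndex`, the function `indexed_query`, and the sample rows are hypothetical and only illustrate the idea; they are not Cassandra's implementation.

```python
from collections import defaultdict

class HashIndex:
    """Maps an indexed column value to the set of row keys holding it."""
    def __init__(self):
        self._rows_by_value = defaultdict(set)

    def add(self, row_key, value):
        self._rows_by_value[value].add(row_key)

    def lookup_eq(self, value):
        # Only exact-match lookups are cheap; there is no ordered traversal.
        return set(self._rows_by_value.get(value, ()))

def indexed_query(rows, index, eq_value, range_column, low, high):
    """Step 1: EQ lookup through the index. Step 2: filter candidates by range."""
    candidates = index.lookup_eq(eq_value)
    return sorted(k for k in candidates
                  if low <= rows[k][range_column] <= high)

rows = {
    "u1": {"state": "UT", "age": 30},
    "u2": {"state": "UT", "age": 45},
    "u3": {"state": "TX", "age": 33},
}
state_index = HashIndex()
for key, cols in rows.items():
    state_index.add(key, cols["state"])

result = indexed_query(rows, state_index, "UT", "age", 25, 40)
print(result)
```

Without the equality step there would be no candidate set to start from, and the range condition would have to be evaluated against every row.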

Re: HowTo: Range scan on secondary indexes?

2010-11-14 Thread André Fiedler
Ok, i read this before. Could you explain (short) why i have to do this? In my opinion it isn't necessary, i want to understand why it is. ;o) thx André! 2010/11/14 Nate McCall > You must have a clause with an EQ operator on an indexed column present. > See: > http://wiki.apache.org/cassandra/A

Re: HowTo: Range scan on secondary indexes?

2010-11-14 Thread Nate McCall
You must have a clause with an EQ operator on an indexed column present. See: http://wiki.apache.org/cassandra/API07#IndexClause and: http://wiki.apache.org/cassandra/API07#get_indexed_slices For more details. On Sun, Nov 14, 2010 at 12:09 PM, André Fiedler wrote: > Hi, i wrote a question on t

HowTo: Range scan on secondary indexes?

2010-11-14 Thread André Fiedler
Hi, i wrote a question on the phpcassa group, but i think its more cassandra related. Would be nice, if you get some time and take a look: http://groups.google.com/group/phpcassa/browse_thread/thread/1b6acb5f7dccb94f greetings André

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Jonathan Ellis
I see what you mean -- you have understood correctly. On Mon, Mar 29, 2010 at 8:13 AM, Henrik Schröder wrote: > On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: >> >> On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder >> wrote: >> > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: >> >>

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Mike Malone
On Mon, Mar 29, 2010 at 7:13 AM, Henrik Schröder wrote: > On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: > >> On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder >> wrote: >> > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis >> wrote: >> >> It's a unique index then? And you're trying to read

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Henrik Schröder
On Mon, Mar 29, 2010 at 14:15, Jonathan Ellis wrote: > On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder > wrote: > > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: > >> It's a unique index then? And you're trying to read things ordered by > >> the index, not just "give me keys with that

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Jonathan Ellis
On Mon, Mar 29, 2010 at 4:06 AM, Henrik Schröder wrote: > On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: >> It's a unique index then?  And you're trying to read things ordered by >> the index, not just "give me keys with that have a column with this >> value?" > > Yes, because if we have mo

Re: Range scan performance in 0.6.0 beta2

2010-03-29 Thread Henrik Schröder
On Fri, Mar 26, 2010 at 14:47, Jonathan Ellis wrote: > On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder > wrote: > > For each indexvalue we insert a row where the key is indexid + ":" + > > indexvalue encoded as hex string, and the row contains only one column, > > where the name is the object k

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Jonathan Ellis
On Fri, Mar 26, 2010 at 7:40 AM, Henrik Schröder wrote: > For each indexvalue we insert a row where the key is indexid + ":" + > indexvalue encoded as hex string, and the row contains only one column, > where the name is the object key encoded as a bytearray, and the value is > empty. It's a uniq
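
The index layout quoted here (row key = indexid + ":" + index value as a hex string) can be sketched as follows. The helper name `index_row_key` and the sample values are assumptions for illustration; the sketch only shows that lowercase hex encoding keeps the byte order of the values, which matters when rows are scanned in key order.

```python
def index_row_key(index_id: str, index_value: bytes) -> str:
    """Build an index row key as indexid + ":" + hex(indexvalue)."""
    return f"{index_id}:{index_value.hex()}"

# Hypothetical index values, as raw bytes.
values = [b"ab", b"\xff", b"\x00", b"a", b"\x0a"]

# Sorting by raw bytes and sorting by the hex-encoded row key agree,
# because each byte maps to two fixed-width lowercase hex digits.
by_bytes = sorted(values)
by_row_key = sorted(values, key=lambda v: index_row_key("7", v))

print([index_row_key("7", v) for v in by_row_key])
```

In the scheme described in the thread, each such row would then hold a single column whose name is the object key and whose value is empty.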

Re: Range scan performance in 0.6.0 beta2

2010-03-26 Thread Henrik Schröder
> > So all the values for an entire index will be in one row? That > doesn't sound good. > > You really want to put each index [and each table] in its own CF, but > until we can do that dynamically (0.7) you could at least make the > index row keys a tuple of (indexid, indexvalue) and the column n

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Jonathan Ellis
On Thu, Mar 25, 2010 at 8:33 AM, Henrik Schröder wrote: > Hi everyone, > > We're trying to implement a virtual datastore for our users where they can > set up "tables" and "indexes" to store objects and have them indexed on > arbitrary properties. And we did a test implementation for Cassandra in

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Sylvain Lebresne
On Thu, Mar 25, 2010 at 5:31 PM, Henrik Schröder wrote: > On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote: >> >> I don't know If that could play any role, but if ever you have >> disabled the assertions >> when running cassandra (that is, you removed the -ea line in >> cassandra.in.sh), the

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Nathan McCall
I noticed you turned key caching off in your ColumnFamily declaration. Have you tried experimenting with this on and playing with the key caching configuration? Also, have you looked at the JMX output for what commands are pending execution? That is always helpful to me in hunting down bottlenecks. -Nate

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Henrik Schröder
On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote: > I don't know If that could play any role, but if ever you have > disabled the assertions > when running cassandra (that is, you removed the -ea line in > cassandra.in.sh), there > was a bug in 0.6beta2 that will make read in row with lots o

Re: Range scan performance in 0.6.0 beta2

2010-03-25 Thread Sylvain Lebresne
I don't know if that could play any role, but if you have disabled the assertions when running Cassandra (that is, you removed the -ea line in cassandra.in.sh), there was a bug in 0.6beta2 that makes reads in rows with lots of columns quite slow. Another problem you may have is if you have

Range scan performance in 0.6.0 beta2

2010-03-25 Thread Henrik Schröder
Hi everyone, We're trying to implement a virtual datastore for our users where they can set up "tables" and "indexes" to store objects and have them indexed on arbitrary properties. And we did a test implementation for Cassandra in the following way: Objects are stored in one columnfamily, each k