This doesn’t seem like a reasonable use case for Cassandra. I mean, it’s not a 
typical “database” use case.

-- Jack Krupansky

From: Philo Yang 
Sent: Thursday, July 31, 2014 1:44 PM
To: user@cassandra.apache.org 
Subject: select many rows one time or select many times?

Hi all, 

I have a cluster running Cassandra 2.0.6, and one of my tables looks like this:
CREATE TABLE word (
  user text,
  word text,
  flag double,
  PRIMARY KEY (user, word)
)

Each "user" has about 10000 "word" rows per node. I need to select all rows 
where user='someuser' and word is in a large set of about 1000 values. 

The C* documentation does not recommend using "select ... in", as in:

select * from word where user='someuser' and word in ('a','b','aa','ab',...);

So for now I select all rows where user='someuser' and filter them on the 
client rather than in C*. I use the Datastax Java Driver to page the result set 
with setFetchSize(1000). Is this the best way? I found that the system's load is 
high because of the large range query. Should I instead select only one row at a 
time and run 1000 selects?

For example:
select * from word where user='someuser' and word = 'a';
select * from word where user='someuser' and word = 'b';
select * from word where user='someuser' and word = 'c';

.....

Which method will put less pressure on the Cassandra cluster?
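
To make the first approach concrete, here is a minimal sketch of the client-side filtering step in plain Java, without the driver. The names WordFilter, WordRow and filterByWords are hypothetical and only for illustration; in practice the rows would come from the paged result set rather than an in-memory list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class WordFilter {
    /** Hypothetical stand-in for one row of the word table. */
    static final class WordRow {
        final String user;
        final String word;
        final double flag;

        WordRow(String user, String word, double flag) {
            this.user = user;
            this.word = word;
            this.flag = flag;
        }
    }

    /** Keeps only the rows whose word is in the wanted set. */
    static List<WordRow> filterByWords(Iterable<WordRow> rows, Set<String> wanted) {
        List<WordRow> kept = new ArrayList<>();
        for (WordRow row : rows) {
            // HashSet lookup, so the cost per row does not grow with the set size
            if (wanted.contains(row.word)) {
                kept.add(row);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Sample data standing in for one user's partition
        List<WordRow> partition = Arrays.asList(
                new WordRow("someuser", "a", 0.1),
                new WordRow("someuser", "aa", 0.2),
                new WordRow("someuser", "ab", 0.3),
                new WordRow("someuser", "b", 0.4));
        Set<String> wanted = new HashSet<>(Arrays.asList("a", "ab"));

        for (WordRow row : filterByWords(partition, wanted)) {
            System.out.println(row.word + " " + row.flag);
        }
    }
}
```

With this approach every row of the partition still crosses the network, and only the filtering happens on the client.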

Thanks, 
Philo Yang
