I have about a million rows (each row with 100 columns) whose keys have the
form domain/!date/!id (e.g. gwm.com/!20100430/!CFRA4500). I am interested in
getting all the ids (with all their columns) for a particular domain/date
(e.g. the range "gwm.ml.com/!20100430/!A" to "gwm.ml.com/!20100430/!D"). I am
looping in chunks of 6000 rows / 500 columns at a time. However, on my 5-node
cluster (each machine has 32 GB of RAM; RF=3, OPP, v0.6.1) it takes 36 seconds
to get all the required rows (stats below), which I think is a bit high. I am
wondering whether a possible cause is the way my string keys are constructed
(suggestions are welcome), making Cassandra work 'harder' when doing range
slices. Does Cassandra examine all row keys to search for matches? Are there
any settings I can tweak to make the retrieval faster?
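
To make the access pattern concrete, here is a minimal sketch of paging through
one domain/date prefix. It runs against an in-memory sorted list, not the real
Thrift client. The helper names (prefix_bounds, range_slice, iter_prefix) and
the "\uffff" upper-bound sentinel are illustrative assumptions, and Python's
string ordering only stands in for whatever ordering the partitioner applies.

```python
import bisect

SEP = "/!"  # separator used in the row keys: domain/!date/!id

def prefix_bounds(domain, date):
    """Return (start, end) keys that bracket every key for one domain/date."""
    prefix = domain + SEP + date + SEP
    # Assumption: "\uffff" sorts after any character that appears in an id.
    return prefix, prefix + "\uffff"

def range_slice(sorted_keys, start, end, count):
    """Stand-in for a range slice: up to `count` keys with start <= key <= end."""
    lo = bisect.bisect_left(sorted_keys, start)
    hi = bisect.bisect_right(sorted_keys, end)
    return sorted_keys[lo:min(hi, lo + count)]

def iter_prefix(sorted_keys, domain, date, chunk=6000):
    """Yield every key under the prefix, fetching `chunk` keys per request."""
    if chunk < 2:
        raise ValueError("chunk must be at least 2 to make progress")
    start, end = prefix_bounds(domain, date)
    first_page = True
    while True:
        page = range_slice(sorted_keys, start, end, chunk)
        # The start key is inclusive, so every page after the first begins
        # with the last key of the previous page; skip that duplicate.
        for key in (page if first_page else page[1:]):
            yield key
        if len(page) < chunk:
            return
        start = page[-1]
        first_page = False
```

With sorted keys, only the contiguous block between the two bounds is touched
in this sketch; whether the server behaves the same way is exactly what I am
asking.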

Thanks

Carlos

row(s) found 6000 in 35086ms
total cols(s) found 593502
row bytes 228000
col bytes 38422670
total bytes 38650670  (36.86015 MB)
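
As a rough sanity check, the per-second rates implied by the stats above can be
derived as follows. This only re-uses the printed numbers; the variable names
are mine, and it assumes the byte counts are payload sizes as measured on the
client side.

```python
# Figures copied from the stats above.
rows = 6000
elapsed_ms = 35086
cols = 593502
row_bytes = 228000
col_bytes = 38422670

elapsed_s = elapsed_ms / 1000.0
total_bytes = row_bytes + col_bytes
total_mb = total_bytes / (1024.0 * 1024.0)

rows_per_sec = rows / elapsed_s          # rows returned per second
cols_per_sec = cols / elapsed_s          # columns returned per second
mb_per_sec = total_mb / elapsed_s        # payload throughput
avg_cols_per_row = cols / float(rows)    # average row width in columns
avg_bytes_per_col = col_bytes / float(cols)

print("rows/s      %.1f" % rows_per_sec)
print("cols/s      %.1f" % cols_per_sec)
print("MB/s        %.2f" % mb_per_sec)
print("cols/row    %.1f" % avg_cols_per_row)
print("bytes/col   %.1f" % avg_bytes_per_col)
```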



