Hi all,

I am using C* 1.2.4 with CQL3 and Astyanax to consume a large amount of 
user-based data (around 50-100K / sec).  Requests come in keyed by user cookie, 
which I then need to link to a user (as users can change their cookies).  This 
is done using a link table:

CREATE TABLE cookie_user_lookup (
    cookie TEXT PRIMARY KEY,
    user_id BIGINT,
    creation_time TIMESTAMP
) WITH
    compression = {'crc_check_chance': 0.1, 'sstable_compression': 'LZ4Compressor'} AND
    compaction = {'class': 'LeveledCompactionStrategy'} AND
    gc_grace_seconds = 86400;

As I said, I am handling a large number of these lookups per second and wanted 
to get your take on how best to do them.  I see three options:

 *   Serially fetch 1 by 1.  The latency is very low at 0.1 ms per lookup, but 
multiplied by thousands of lookups per second it becomes substantial.  This is 
too slow.
 *   Fetch 1 by 1, but on separate threads.  This would require a very large 
number of threads as well as concurrent connections (unless I change to 
DataStax's binary protocol).  Seems heavy.
 *   Batch fetch.  This is what I'm doing now: I build a very large 
SELECT * FROM cookie_user_lookup WHERE cookie IN (a, b, c, ...) query.  I am 
actually doing around 10K cookies at a time and getting a response time in my 
cluster of around 100 ms.  This is very acceptable, but I wanted to get 
everyone's take as I have seen messages about this "starving" the request pool.  
Note that I'm running with the HSHA RPC server and am rarely seeing any reads 
waiting.
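
One middle ground between the second and third options might be to split the 
big cookie list into several smaller IN queries, so that no single request ties 
up a read for long.  Below is a minimal Java sketch of just the splitting and 
query-building part.  CookieBatches, chunk and buildSelect are hypothetical 
names of my own (not Astyanax or Cassandra API), and the chunk size is an 
arbitrary parameter, not a tuned value.  The sketch only produces CQL strings 
and leaves execution to whatever client call is already in use.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of Astyanax or Cassandra.
final class CookieBatches {
    private CookieBatches() {}

    // Split the cookie list into consecutive chunks of at most chunkSize entries.
    // An empty input yields no chunks, so no empty IN () query is ever built.
    static List<List<String>> chunk(List<String> cookies, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < cookies.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, cookies.size());
            chunks.add(new ArrayList<>(cookies.subList(start, end)));
        }
        return chunks;
    }

    // Build one SELECT ... IN (...) statement for a single chunk.
    // Single quotes inside a cookie are doubled, as CQL string literals require.
    static String buildSelect(List<String> cookies) {
        StringBuilder cql = new StringBuilder(
            "SELECT cookie, user_id FROM cookie_user_lookup WHERE cookie IN (");
        for (int i = 0; i < cookies.size(); i++) {
            if (i > 0) {
                cql.append(", ");
            }
            cql.append('\'')
               .append(cookies.get(i).replace("'", "''"))
               .append('\'');
        }
        return cql.append(")").toString();
    }
}
```

Each string from buildSelect(chunk) could then be sent over a small, bounded 
number of connections.  Whether this actually beats one 10K-key IN query on a 
given cluster is something I would have to measure.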

I appreciate your input!
