Hi all, I am using C* 1.2.4 with CQL3 and Astyanax to consume a large amount of user-based data (around 50-100K / sec). Requests come in keyed by user cookie, which I then need to link to a user (users can change their cookies). This is done using a link table:

CREATE TABLE cookie_user_lookup (
    cookie TEXT PRIMARY KEY,
    user_id BIGINT,
    creation_time TIMESTAMP
) WITH compression = {'crc_check_chance': 0.1, 'sstable_compression': 'LZ4Compressor'}
  AND compaction = {'class': 'LeveledCompactionStrategy'}
  AND gc_grace_seconds = 86400;

As I said, I am handling a large number of these lookups per second and wanted to get your take on how best to do them. I see 3 ways:

* Serially fetch 1 by 1. The latency per lookup is very low at 0.1 ms, but at 50-100K lookups per second that adds up to 5-10 seconds of serial work for every second of traffic. This is too slow.
* Fetch 1 by 1 but on separate threads. This would require a very large number of concurrent connections (unless I change to DataStax's binary protocol) as well as threads. Seems heavy.
* Batch fetch. This is what I'm doing now: I build a very large SELECT * FROM cookie_user_lookup WHERE cookie IN (a, b, c, ...). I am doing around 10K of these at a time and getting a response time of around 100 ms in my cluster.

The batch approach is very acceptable to me, but I wanted to get everyone's take because I have seen messages about this "starving" the request pool. Note that I'm running in HSHA mode and am rarely seeing any reads waiting.

I appreciate your input!
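
A possible middle ground between the threaded approach and the single huge IN query is to split the cookies into smaller IN queries and run a bounded number of them in parallel. The sketch below is in Python for brevity rather than the Java client used here. `fetch_chunk`, `chunk_size`, and `max_workers` are hypothetical names with untested default values, not Astyanax APIs, and `fake_fetch` only stands in for the real query so the example runs without a cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def lookup_users(cookies, fetch_chunk, chunk_size=100, max_workers=8):
    """Resolve cookies to user ids using several small IN queries.

    `fetch_chunk` is a hypothetical callable: it takes a list of cookies,
    runs SELECT ... WHERE cookie IN (...) for just that list, and returns
    a {cookie: user_id} dict. chunk_size and max_workers are guesses that
    would need tuning against a real cluster.
    """
    unique = list(dict.fromkeys(cookies))  # drop duplicates, keep order
    result = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # At most max_workers chunk queries run at the same time.
        for partial in pool.map(fetch_chunk, chunked(unique, chunk_size)):
            result.update(partial)
    return result


# Stand-in for the real query (assumption: missing cookies return no row).
FAKE_TABLE = {"cookie-%d" % i: i for i in range(1000)}


def fake_fetch(chunk):
    return {c: FAKE_TABLE[c] for c in chunk if c in FAKE_TABLE}


users = lookup_users(
    ["cookie-1", "cookie-2", "cookie-1", "missing"], fake_fetch, chunk_size=2
)
```

Smaller chunks would mean no single request ties up a coordinator for the full 10K keys, while the worker cap keeps the connection and thread count far below the one-thread-per-lookup option. Whether this beats the current 100 ms would have to be measured.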