loading all rows from cassandra using multiple (python) clients in parallel

John R. Frank Mon, 22 Apr 2013 05:16:27 -0700

Cassandra Experts,

I understand that when using Cassandra's recommended RandomPartitioner (or Murmur3Partitioner), it is not possible to do meaningful range queries on keys, because the rows are distributed around the cluster using a hash of the key (MD5 in the case of RandomPartitioner). These hashes are called "tokens."

Nonetheless, it would be very useful to split up a large table amongst many compute workers by assigning each a range of tokens. Using CQL3, it appears possible to issue queries directly against the tokens, however the following Python does not work:


http://stackoverflow.com/questions/16137944/loading-all-rows-from-cassandra-using-multiple-python-clients-in-parallel
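
To make the intended split concrete, here is a minimal sketch of the idea (it is not the code from the link above). It assumes Murmur3Partitioner, whose tokens are signed 64-bit integers; RandomPartitioner would need a different range. The helper names split_token_range and build_query, and the table and column names, are hypothetical. The sketch only builds CQL3 query strings that use the token() function and does not talk to a cluster.

```python
# Assumption: Murmur3Partitioner, whose tokens are signed 64-bit integers.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def split_token_range(num_workers, min_token=MIN_TOKEN, max_token=MAX_TOKEN):
    """Return num_workers contiguous, non-overlapping (start, end) pairs.

    Both ends of each pair are inclusive, and together the pairs cover
    every token from min_token to max_token.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    total = max_token - min_token + 1
    ranges = []
    start = min_token
    for i in range(num_workers):
        # Integer arithmetic avoids float rounding on 64-bit values.
        end = min_token + total * (i + 1) // num_workers - 1
        ranges.append((start, end))
        start = end + 1
    return ranges


def build_query(table, key_column, start, end):
    """Build a CQL3 query selecting rows whose token lies in [start, end]."""
    return "SELECT * FROM %s WHERE token(%s) >= %d AND token(%s) <= %d" % (
        table, key_column, start, key_column, end)


if __name__ == "__main__":
    # Hypothetical table and key column names, one query per worker.
    for start, end in split_token_range(4):
        print(build_query("my_keyspace.my_table", "key", start, end))
```

Each worker would then run its own query through whichever client it uses. Whether a given client library accepts this CQL3 syntax is a separate matter and is not shown here.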

I would ideally like to make this work with pycassa, because I prefer its more pythonic interface.


Am I just not invoking CQL3 correctly through the cql package?

Is there a better way to do this?


Thanks for any pointers!

John
