If you want to process 1 million rows, use Hadoop with Hive or Pig. Note, though, that if you use Hadoop you are not doing things in real time.
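If you do stay in Cassandra, the usual way to read every row quickly is to split the token ring into contiguous ranges and scan them in parallel, one worker per range. A minimal sketch of the range math, assuming the RandomPartitioner's 0 .. 2**127 - 1 token space (the actual per-range query, e.g. get_range_slices, is left out):

```python
# Split the RandomPartitioner token space (0 .. 2**127 - 1) into
# contiguous ranges so several workers can scan rows in parallel.
RING_MIN = 0
RING_MAX = 2 ** 127 - 1  # RandomPartitioner token space

def split_token_ring(num_splits):
    """Return (start, end) token ranges that together cover the whole ring."""
    step = (RING_MAX - RING_MIN + 1) // num_splits
    ranges = []
    start = RING_MIN
    for i in range(num_splits):
        # Last range absorbs any remainder so the full ring is covered.
        end = RING_MAX if i == num_splits - 1 else start + step - 1
        ranges.append((start, end))
        start = end + 1
    return ranges
```

Each worker then pages through its own (start, end) range, so the scan time drops roughly with the number of workers (and with how well the ranges line up with node ownership).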
You may need to rephrase the problem.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/02/2012, at 11:00 AM, Martin Arrowsmith wrote:

> Hi Experts,
>
> My program is such that it queries all keys on Cassandra. I want to do this
> as quickly as possible, in order to get as close to real-time as possible.
>
> One solution I heard was to use the sstable2json tool, and read the data in
> as JSON. I understand that reading each row from Cassandra might take
> longer.
>
> Are there any other ideas for doing this? Or can you confirm that
> sstable2json is the way to go?
>
> Querying 100 rows in Cassandra the normal way is fast enough. I'd like to
> query a million rows, do some calculations on them, and spit out the result
> like it's real time.
>
> Thanks for any help you can give,
>
> Martin