Hi,

We have Hadoop jobs that read data from our Cassandra column families and write 
some data back to other column families.
The input column families are pretty simple CQL3 tables without wide rows.
In the Hadoop jobs we set up a corresponding WHERE clause via 
ConfigHelper.setInputWhereClauses(...), so we don't process the whole table at 
once. 
Nevertheless, sometimes the amount of data returned by the input query is big 
enough to cause TimedOutExceptions.
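
For reference, here is roughly how the input side of the job is configured today. 
The keyspace, table, address and WHERE clause are placeholder values, not our real 
ones, and I'm calling setInputWhereClauses through CqlConfigHelper, which is where 
I believe it lives in 1.2.x:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlPagingInputFormat;

public class InputSetupSketch {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "cassandra-input-job");
        Configuration conf = job.getConfiguration();

        // point the input format at the cluster and the input column family
        ConfigHelper.setInputInitialAddress(conf, "127.0.0.1");
        ConfigHelper.setInputPartitioner(conf, "Murmur3Partitioner");
        ConfigHelper.setInputColumnFamily(conf, "my_keyspace", "input_cf");

        // restrict what the job reads, so we don't process the whole table at once
        // (placeholder clause; the real jobs filter on a date column)
        CqlConfigHelper.setInputWhereClauses(conf, "event_day = 20130901");

        job.setInputFormatClass(CqlPagingInputFormat.class);
    }
}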

To mitigate this, I'd like to configure the Hadoop job so that it fetches the 
input rows sequentially, in smaller portions.

I'm looking at the ConfigHelper.setRangeBatchSize() and 
CqlConfigHelper.setInputCQLPageRowSize() methods, but I'm a bit confused about 
whether that's what I need and, if so, which of the two I should use for this 
purpose.
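
To make the question concrete, this is roughly what I'm considering. The values 
are just examples, and I'm not sure which of the two calls actually affects the 
CQL3 input format:

import org.apache.hadoop.conf.Configuration;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.hadoop.cql3.CqlConfigHelper;

public class BatchSizeSketch {
    public static void main(String[] args) {
        Configuration conf = new Configuration();

        // Option 1: Thrift-side batching -- number of rows fetched per
        // range slice call, if I understand it correctly.
        ConfigHelper.setRangeBatchSize(conf, 1000);

        // Option 2: CQL3 paging -- limit applied to each page of the
        // generated SELECT (the value seems to be passed as a String
        // in 1.2.x, as far as I can tell from the code).
        CqlConfigHelper.setInputCQLPageRowSize(conf, "1000");
    }
}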

Any help is appreciated.

Hadoop version is 1.1.2, Cassandra version is 1.2.8.
