Re: Cassandra input paging for Hadoop

2013-09-16 Thread Aaron Morton
> Or CqlPagingRecordReader supports paging through the entire result set?

It supports paging through the entire result set.

Cheers - Aaron Morton, New Zealand, @aaronmorton, Co-Founder & Principal Consultant, Apache Cassandra Consulting, http://www.thelastpickle.com On 12/09/2013, at 5
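To make the distinction concrete: the page size bounds how many rows are fetched per request, not how many rows the job sees in total. Below is a minimal, self-contained sketch of that behaviour. It is not the CqlPagingRecordReader code; PagingSketch, PagedReader, and the in-memory list standing in for a table are all hypothetical names used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class PagingSketch {

    /** Hypothetical stand-in for a paged reader; the list plays the role of the table. */
    static final class PagedReader implements Iterator<Integer> {
        private final List<Integer> source;
        private final int pageSize;
        private int nextOffset = 0;      // index of the first row of the next page
        private int pagesFetched = 0;    // number of simulated round trips so far
        private Iterator<Integer> currentPage = Collections.emptyIterator();

        PagedReader(List<Integer> source, int pageSize) {
            if (pageSize <= 0) {
                throw new IllegalArgumentException("pageSize must be positive");
            }
            this.source = source;
            this.pageSize = pageSize;
        }

        /** Simulates one request: copies at most pageSize rows into memory. */
        private void fetchNextPage() {
            int end = Math.min(nextOffset + pageSize, source.size());
            currentPage = new ArrayList<>(source.subList(nextOffset, end)).iterator();
            nextOffset = end;
            pagesFetched++;
        }

        @Override
        public boolean hasNext() {
            // When the current page is exhausted, fetch the next one instead of stopping.
            if (!currentPage.hasNext() && nextOffset < source.size()) {
                fetchNextPage();
            }
            return currentPage.hasNext();
        }

        @Override
        public Integer next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            return currentPage.next();
        }

        int pagesFetched() {
            return pagesFetched;
        }
    }

    /** Reads every row and returns {rowsSeen, pagesFetched}. */
    static int[] readAll(int totalRows, int pageSize) {
        List<Integer> table = new ArrayList<>();
        for (int i = 0; i < totalRows; i++) {
            table.add(i);
        }
        PagedReader reader = new PagedReader(table, pageSize);
        int rowsSeen = 0;
        while (reader.hasNext()) {
            reader.next();
            rowsSeen++;
        }
        return new int[] {rowsSeen, reader.pagesFetched()};
    }

    public static void main(String[] args) {
        int[] result = readAll(2500, 1000);
        System.out.println("rows=" + result[0] + " pages=" + result[1]); // rows=2500 pages=3
    }
}
```

With 2500 rows and a page size of 1000, the reader makes three fetches and still returns all 2500 rows; a smaller page size only means more, smaller fetches.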

Re[2]: Cassandra input paging for Hadoop

2013-09-11 Thread Renat Gilfanov
Hello, So does this mean that the job will process only the first "cassandra.input.page.row.size" rows and ignore the rest? Or does CqlPagingRecordReader support paging through the entire result set? Aaron Morton: >>> I'm looking at the ConfigHelper.setRangeBatchSize() and CqlConfigHelper.setInputCQL

Re: Cassandra input paging for Hadoop

2013-09-11 Thread Aaron Morton
>> I'm looking at the ConfigHelper.setRangeBatchSize() and CqlConfigHelper.setInputCQLPageRowSize() methods, but a bit confused if that's what I need and if yes, which one should I use for those purposes.

If you are using CQL 3 via Hadoop, CqlConfigHelper.setInputCQLPageRowSize is the one

Re: Cassandra input paging for Hadoop

2013-09-11 Thread Jiaan Zeng
Speaking of the Thrift client, i.e. ColumnFamilyInputFormat: yes, ConfigHelper.setRangeBatchSize() can reduce the number of rows requested from Cassandra in each batch. Depending on how big your columns are, you may also want to increase the Thrift message length through setThriftMaxMessageLengthInMb(). Hope that helps. On Tue,

Cassandra input paging for Hadoop

2013-09-10 Thread Renat Gilfanov
Hi, We have Hadoop jobs that read data from our Cassandra column families and write some data back to other column families. The input column families are pretty simple CQL3 tables without wide rows. In the Hadoop jobs we set up the corresponding WHERE clause in ConfigHelper.setInputWhereClauses(...)