I was getting client timeouts in ColumnFamilyRecordReader.maybeInit() when
running MapReduce jobs.  Reducing the range batch size from 4096 to 256
seems to have fixed the problem, although it has slowed things down a
bit, presumably because there are now 16x more calls to get_range_slices.
While I was in that code I noticed that a new client is created for
each batch get, so by decreasing the batch size I've also increased this
overhead.  I'm thinking of rewriting ColumnFamilyRecordReader to do some
connection pooling.  Does anyone have any thoughts on that?
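
To make the idea concrete, here is a rough sketch of the kind of pool I
have in mind.  SimplePool, borrow() and giveBack() are made-up names, not
anything that exists in Cassandra, and it assumes Java 8+ for Supplier.
The factory would be whatever currently opens the client for a batch.  A
real version would also need to detect dead connections and close idle
ones on shutdown.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Supplier;

// Hypothetical minimal pool: hands back an idle client if one exists,
// otherwise asks the factory to create a new one.
final class SimplePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final Supplier<T> factory;
    private int created = 0;

    SimplePool(Supplier<T> factory) {
        this.factory = factory;
    }

    // Reuse an idle client when available; create one only when the pool is empty.
    synchronized T borrow() {
        T client = idle.pollFirst();
        if (client == null) {
            created++;
            client = factory.get();
        }
        return client;
    }

    // Return a client to the pool so the next batch can reuse it.
    synchronized void giveBack(T client) {
        idle.addFirst(client);
    }

    // Number of clients the factory has been asked to create so far.
    synchronized int createdCount() {
        return created;
    }
}
```

The record reader would then call borrow() before each batch and
giveBack() afterwards, so the number of connections opened no longer
grows with the number of batches.
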
joost.
