I have written some code to avoid the Thrift reconnection; it simply keeps
the connection open between get_range_slices calls.
I can extract that and post it, but not until early next week.
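
In the meantime, the general idea could be sketched roughly as below. This is
only an illustration, not the actual code: the class ReusableConnection and its
methods are made-up names, and the "opener" stands in for whatever creates the
Thrift socket and client today. The connection is opened on first use, handed
back for every later batch, and dropped only after a failure or when the
reader is closed.

```java
import java.util.function.Supplier;

/** Hypothetical holder that opens a connection once and reuses it across batches. */
final class ReusableConnection<C extends AutoCloseable> implements AutoCloseable {
    // Assumption: the opener wraps the existing "create socket + client" code.
    private final Supplier<C> opener;
    private C current;   // null until the first get(), and again after close()
    private int opens;   // how many times a real connection was opened

    ReusableConnection(Supplier<C> opener) {
        this.opener = opener;
    }

    /** Returns the open connection, creating it only if none is currently held. */
    C get() {
        if (current == null) {
            current = opener.get();
            opens++;
        }
        return current;
    }

    /** Number of connections actually opened so far. */
    int openCount() {
        return opens;
    }

    /** Drops the connection (e.g. after a timeout) so the next get() reconnects. */
    void invalidate() {
        close();
    }

    @Override
    public void close() {
        if (current != null) {
            try {
                current.close();
            } catch (Exception e) {
                // Nothing useful to do if closing fails; the reference is dropped anyway.
            }
            current = null;
        }
    }
}
```

Each batch would then call get() instead of building a new client, so a
smaller batch size no longer multiplies the connection setup cost.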

/Johan

On 23 apr 2010, at 05.09, Jonathan Ellis wrote:

> That would be an easy win, sure.
> 
> On Thu, Apr 22, 2010 at 9:27 PM, Joost Ouwerkerk <jo...@openplaces.org> wrote:
>> I was getting client timeouts in ColumnFamilyRecordReader.maybeInit() when
>> MapReducing.  So I've reduced the Range Batch Size to 256 (from 4096) and
>> this seems to have fixed my problem, although it has slowed things down a
>> bit -- presumably because there are 16x more calls to get_range_slices.
>> While I was in that code I noticed that a new client was being created for
>> each batch get.  By decreasing the batch size, I've increased this
>> overhead.  I'm thinking of re-writing ColumnFamilyRecordReader to do some
>> connection pooling.  Anyone have any thoughts on that?
>> joost.
>> 
