Making it configurable seems like a good idea. There is a JIRA (owned by Sanjay) noting that some of these client-side configuration variables might become "undocumented"; this means that their semantics might change from one release to the next.

thanks,
dhruba

On Wed, Sep 2, 2009 at 7:45 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> Hey Guys,
>
> I am interested in increasing the throughput of an HDFS read while
> transferring data between datacenters that are geographically far
> apart and hence have a network latency of around 60ms. I see in the
> HDFS code that the DFSClient and DataNode seem to hardcode their
> socket buffer sizes to 128KB (DFSClient.createBlockOutputStream and
> DataNode.startDataNode). Is there a reason for this?
>
> I want to expose this value as a configurable property so that when i
> read over the high-latency link I can set the ideal buffer size for
> this particular application (around 800KB for our desired bandwidth).
> Is there a reason this is not done currently? Would you take a patch
> that added this property? Am I looking at totally the wrong code?
>
> -Jay
>
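
For reference, the buffer size being discussed is usually estimated from the bandwidth-delay product (bandwidth times round-trip time). The sketch below is not HDFS code and not the proposed patch. It assumes the 60ms figure is a round-trip time, uses a hypothetical 100 Mbit/s target, and reads the size from a made-up system property named `example.socket.buffer.bytes`, with 128KB as the fallback to mirror the hardcoded value mentioned in the thread.

```java
import java.io.IOException;
import java.net.Socket;

public class SocketBufferSizing {
    // Fallback mirroring the 128KB value the thread says is hardcoded.
    static final int DEFAULT_BUFFER_BYTES = 128 * 1024;

    // Hypothetical property name, used only for this illustration.
    static final String BUFFER_PROPERTY = "example.socket.buffer.bytes";

    // Bandwidth-delay product in bytes: (bytes per second) * (round trip in seconds).
    static long bandwidthDelayProductBytes(long bytesPerSecond, double rttMillis) {
        return Math.round(bytesPerSecond * rttMillis / 1000.0);
    }

    // Reads the buffer size from a system property, falling back to the default.
    static int configuredBufferBytes() {
        return Integer.getInteger(BUFFER_PROPERTY, DEFAULT_BUFFER_BYTES);
    }

    // Creates an unconnected socket and requests the given buffer sizes.
    // The sizes are hints; the OS may clamp them to its own limits.
    static Socket newTunedSocket(int bufferBytes) throws IOException {
        Socket socket = new Socket();
        socket.setReceiveBufferSize(bufferBytes); // set before connect() for large windows
        socket.setSendBufferSize(bufferBytes);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        // Assumed inputs: 100 Mbit/s = 12,500,000 bytes/s, 60ms round trip.
        long bdp = bandwidthDelayProductBytes(12_500_000L, 60.0);
        System.out.println("Estimated buffer (bytes): " + bdp);

        try (Socket socket = newTunedSocket(configuredBufferBytes())) {
            System.out.println("Receive buffer granted: " + socket.getReceiveBufferSize());
        }
    }
}
```

With these assumed inputs the estimate is 750,000 bytes, the same order of magnitude as the "around 800KB" figure above. Running with `-Dexample.socket.buffer.bytes=819200` would request an 800KB buffer instead of the default.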