Hi Colin,

Do we have a JIRA already for this? Is it
https://issues.apache.org/jira/browse/HDFS-4307?
On Mon, Jun 10, 2013 at 11:05 PM, Todd Lipcon <t...@cloudera.com> wrote:
> +1 for dropping the client side expiry down to something like 1-2 seconds.
> I'd rather do that than up the server side, since the server side resource
> (DN threads) is likely to be more contended.
>
> -Todd
>
> On Fri, Jun 7, 2013 at 4:29 PM, Colin McCabe <cmcc...@alumni.cmu.edu> wrote:
>
>> Hi all,
>>
>> HDFS-941 added dfs.datanode.socket.reuse.keepalive. This allows
>> DataXceiver worker threads in the DataNode to linger for a second or
>> two after finishing a request, in case the client wants to send
>> another request. On the client side, HDFS-941 added a SocketCache, so
>> that subsequent client requests could reuse the same socket. Sockets
>> were closed purely by an LRU eviction policy.
>>
>> Later, HDFS-3373 added a minimum expiration time to the SocketCache,
>> and added a thread which periodically closed old sockets.
>>
>> However, the default timeout for SocketCache (which is now called
>> PeerCache) is much longer than the DN would possibly keep the socket
>> open. Specifically, dfs.client.socketcache.expiryMsec defaults to 2 *
>> 60 * 1000 (2 minutes), whereas dfs.datanode.socket.reuse.keepalive
>> defaults to 1000 (1 second).
>>
>> I'm not sure why we have such a big disparity here. It seems like
>> this will inevitably lead to clients trying to use sockets which have
>> gone stale, because the server closes them way before the client
>> expires them. Unless I'm missing something, we should probably either
>> lengthen the keepalive, or shorten the socket cache expiry, or both.
>>
>> thoughts?
>> Colin
>>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera

--
Harsh J
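
P.S. To make the mismatch described in the quoted thread concrete, here is a rough Java sketch. It is a simplified model only, not the actual PeerCache or DataXceiver code: the class and helper names (KeepaliveMismatch, clientWouldReuse, serverStillListening, isStaleReuse, staleWindowMs) are made up for illustration, and the only values taken from the thread are the two defaults. It assumes a socket is "stale" when the client cache would still hand it out but the DN has already stopped waiting on it.

```java
public class KeepaliveMismatch {
    // Defaults as quoted in the thread.
    static final long CLIENT_EXPIRY_MS = 2 * 60 * 1000; // dfs.client.socketcache.expiryMsec
    static final long DN_KEEPALIVE_MS = 1000;           // dfs.datanode.socket.reuse.keepalive

    // Hypothetical helper: would the client cache still hand out a socket
    // that has been idle for idleMs?
    static boolean clientWouldReuse(long idleMs, long clientExpiryMs) {
        return idleMs < clientExpiryMs;
    }

    // Hypothetical helper: is the DN worker thread still lingering on a
    // socket that has been idle for idleMs?
    static boolean serverStillListening(long idleMs, long keepaliveMs) {
        return idleMs < keepaliveMs;
    }

    // Stale reuse: the client thinks the socket is good, but the DN has
    // already given up on it.
    static boolean isStaleReuse(long idleMs, long clientExpiryMs, long keepaliveMs) {
        return clientWouldReuse(idleMs, clientExpiryMs)
                && !serverStillListening(idleMs, keepaliveMs);
    }

    // Length of the idle-time range in which a reuse attempt hits a stale socket.
    static long staleWindowMs(long clientExpiryMs, long keepaliveMs) {
        return Math.max(0L, clientExpiryMs - keepaliveMs);
    }

    public static void main(String[] args) {
        long[] idleTimesMs = {500, 1500, 60_000, 130_000};
        for (long idle : idleTimesMs) {
            System.out.printf("idle=%dms staleReuse=%b%n",
                    idle, isStaleReuse(idle, CLIENT_EXPIRY_MS, DN_KEEPALIVE_MS));
        }
        System.out.println("stale window with defaults (ms): "
                + staleWindowMs(CLIENT_EXPIRY_MS, DN_KEEPALIVE_MS));
        // Client expiry lowered to 2 seconds, the upper end of Todd's suggestion.
        System.out.println("stale window with 2s client expiry (ms): "
                + staleWindowMs(2000, DN_KEEPALIVE_MS));
    }
}
```

Under this model, lowering the client expiry toward the DN keepalive shrinks the stale window, which is the direction Todd suggests; raising the keepalive would shrink it as well, but at the cost of holding DN threads longer.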