Hi all,

HDFS-941 added dfs.datanode.socket.reuse.keepalive. This allows DataXceiver worker threads in the DataNode to linger for a second or two after finishing a request, in case the client wants to send another request. On the client side, HDFS-941 added a SocketCache, so that subsequent client requests could reuse the same socket. Sockets were closed purely by an LRU eviction policy.
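(For anyone skimming: the client-side cache was essentially an LRU map from DataNode address to open socket. Here's a minimal sketch of that eviction idea using LinkedHashMap — illustrative only, not the actual SocketCache/PeerCache code, and the class and value names are made up:)

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of LRU eviction, as SocketCache used it. The real class also
// synchronizes access and closes the evicted socket.
public class LruCacheSketch {
    static <K, V> Map<K, V> lruCache(final int capacity) {
        // accessOrder=true: iteration order is least-recently-used first
        return new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // evict (and, in the real cache, close) the oldest socket
                return size() > capacity;
            }
        };
    }

    public static void main(String[] args) {
        Map<String, String> cache = lruCache(2);
        cache.put("dn1:50010", "socket1");
        cache.put("dn2:50010", "socket2");
        cache.get("dn1:50010");            // touch dn1, so dn2 is now eldest
        cache.put("dn3:50010", "socket3"); // capacity exceeded: dn2 evicted
        System.out.println(cache.keySet());
    }
}
```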
Later, HDFS-3373 added a minimum expiration time to the SocketCache and a thread that periodically closes old sockets. However, the default timeout for the SocketCache (now called PeerCache) is much longer than the DN will possibly keep the socket open. Specifically, dfs.client.socketcache.expiryMsec defaults to 2 * 60 * 1000 (2 minutes), whereas dfs.datanode.socket.reuse.keepalive defaults to 1000 (1 second).

I'm not sure why we have such a big disparity here. It seems like this will inevitably lead to clients trying to use sockets which have gone stale, because the server closes them well before the client expires them. Unless I'm missing something, we should probably either lengthen the keepalive, shorten the socket cache expiry, or both.

Thoughts?

Colin
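P.S. For concreteness, here's one way the two properties could be brought into line in hdfs-site.xml. The values below are purely illustrative, not a tested recommendation — the point is just that the client-side expiry should not exceed the DN-side keepalive:

```xml
<!-- hdfs-site.xml: example values only -->
<property>
  <!-- how long (ms) the DN keeps an idle DataXceiver thread/socket open -->
  <name>dfs.datanode.socket.reuse.keepalive</name>
  <value>4000</value>
</property>
<property>
  <!-- client-side PeerCache expiry (ms); kept below the DN keepalive so
       the client stops reusing a socket before the DN closes it -->
  <name>dfs.client.socketcache.expiryMsec</name>
  <value>3000</value>
</property>
```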