jiwan created HDFS-3729:
---------------------------
Summary: When a datanode is blocked in
BlockReceiver.receiveBlock(...) by disk error/pressure, DFSClient is blocked
and has no timeout mechanism
Key: HDFS-3729
URL: https://issues.apache.org/jira/browse/HDFS-3729
Project: Hadoop HDFS
Issue Type: Bug
Components: hdfs client
Reporter: jiwan
Our Hadoop/HBase cluster at taobao.com is sometimes blocked by DFSClient. The
cause is a disk error or excessive load on a datanode, but HEART_BEAT in
PacketResponder still behaves normally, so DFSClient waits forever until the
disk read/write method returns. I searched JIRA and found no existing issue
for this, so we plan to do some work to fix this bug. The initial idea is to
add a timeout mechanism to the DFSClient write function. Does anyone have
comments about this?