Re: [PR] HDFS-17397. Choose another DN as soon as possible, when encountering network issues [hadoop]

via GitHub Mon, 25 Mar 2024 02:50:29 -0700


Hexiaoqiao commented on PR #6591:
URL: https://github.com/apache/hadoop/pull/6591#issuecomment-2017607444


   @xleoken Thanks for your proposal. I am not sure this is the proper solution 
for your case, as @ZanderXu mentioned. IIUC, you expect the client to fail fast 
when it hits a network issue between the client and the first DataNode while 
writing data to the pipeline, right? IMO, it is difficult to decide when to do 
that, because:
   a. Sometimes we cannot determine that the problem lies in the network between 
the client and the first DataNode using only the time cost of the ACK. A timeout 
between DataNodes in the pipeline could also cause the client to time out while 
waiting, IMO.
   b. If we fail fast and recover the pipeline, the time cost could be even 
higher: re-creating the pipeline and transferring the data again takes more 
time when more than 10MB has already been written out.
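
   To illustrate point (a), here is a minimal sketch, not Hadoop code. It assumes a 
simplified model in which the ACK latency seen by the client is just the sum of 
per-hop delays; the class and method names are hypothetical. Two different 
faults then produce the same observation on the client side:

```java
/** Hypothetical, simplified model of a 3-DataNode write pipeline (not Hadoop code). */
final class AckLatencyModel {

    /**
     * Assumption: the client only receives the ACK after every hop has been
     * traversed, so it observes the sum of the per-hop delays.
     * hopDelaysMs[0] is client <-> DN1, hopDelaysMs[1] is DN1 <-> DN2, and so on.
     */
    static long clientObservedAckMs(long[] hopDelaysMs) {
        long total = 0;
        for (long delay : hopDelaysMs) {
            total += delay;
        }
        return total;
    }

    public static void main(String[] args) {
        long[] slowFirstHop = {5000, 10, 10}; // fault between client and DN1
        long[] slowLastHop  = {10, 10, 5000}; // fault between DN2 and DN3

        // Both cases look identical from the client's point of view.
        System.out.println(clientObservedAckMs(slowFirstHop)); // prints 5020
        System.out.println(clientObservedAckMs(slowLastHop));  // prints 5020
    }
}
```

   In this model the total alone carries no information about which hop is slow, 
so replacing the first DataNode on an ACK timeout may not address the real fault.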
   
   For this case, which we have discussed several times, I think we need to split 
the work into two steps: first report metrics back to the client, then improve 
the strategy (fail fast, switch DN, or some other approach, depending on the 
metrics). FYI.
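
   A rough sketch of how the two steps could fit together, purely as an 
illustration. The per-hop delay array, the threshold, and the `Action` values 
are all assumptions made for this example; none of them are existing HDFS APIs 
or an agreed design:

```java
/** Hypothetical sketch of "report metrics, then pick a strategy" (not Hadoop code). */
final class PipelineStrategySketch {

    /** Illustrative actions only; the real set of strategies is still open. */
    enum Action { KEEP_WAITING, SWITCH_FIRST_DN, RECOVER_PIPELINE }

    /**
     * Step 1 (assumed): per-hop delays are reported back to the client, where
     * hopDelaysMs[0] is client <-> DN1, hopDelaysMs[1] is DN1 <-> DN2, etc.
     * Step 2: choose an action based on which hop, if any, is slow.
     */
    static Action choose(long[] hopDelaysMs, long slowThresholdMs) {
        if (hopDelaysMs.length == 0) {
            return Action.KEEP_WAITING; // no metrics yet, nothing to decide
        }
        int slowest = 0;
        for (int i = 1; i < hopDelaysMs.length; i++) {
            if (hopDelaysMs[i] > hopDelaysMs[slowest]) {
                slowest = i;
            }
        }
        if (hopDelaysMs[slowest] < slowThresholdMs) {
            return Action.KEEP_WAITING; // no hop is slow enough to act on
        }
        // Only a slow first hop points at the client-to-DN1 network.
        return slowest == 0 ? Action.SWITCH_FIRST_DN : Action.RECOVER_PIPELINE;
    }
}
```

   For example, `choose(new long[]{5000, 10, 10}, 1000)` returns 
`SWITCH_FIRST_DN`, while `choose(new long[]{10, 10, 5000}, 1000)` returns 
`RECOVER_PIPELINE`. A real policy would likely also weigh how much data has 
already been written, as noted in point (b).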
   Thanks again.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


