[ https://issues.apache.org/jira/browse/HDFS-17808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18016690#comment-18016690 ]
ASF GitHub Bot commented on HDFS-17808:
---------------------------------------

Hexiaoqiao commented on PR #7810:
URL: https://github.com/apache/hadoop/pull/7810#issuecomment-3231836694

   Thanks @hfutatzhanghb for involving me here. After a quick review, I am
   not sure that closing the stream and creating a new one will solve this
   issue. For example, if some DataNodes take a long time to restart, or
   cannot restart at all, this solution will not be available, right?
   In the replica/pipeline case there is a step that copies the data and
   then rebuilds the pipeline, but for EC that first step may not be
   feasible. In a word, we need to define the issue clearly. Thanks again.


> EC: End block group in advance to prevent write failure for long-time running OutputStream
> ------------------------------------------------------------------------------------------
>
>                 Key: HDFS-17808
>                 URL: https://issues.apache.org/jira/browse/HDFS-17808
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: ec, erasure-coding
>            Reporter: farmmamba
>            Assignee: farmmamba
>            Priority: Major
>              Labels: pull-request-available
>
> Recently, we hit an EC problem in production.
> A user creates an output stream to write EC files. The output stream writes
> some bytes and then sits idle for a long time until more data is ready. If we
> restart the cluster's DataNodes for a version upgrade, those applications
> eventually fail because there are not enough healthy streamers.
>
> This Jira tries to solve the above problem by ending the block group in
> advance when some streamers have already failed but the number of failed
> streamers is still less than the parity number.
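
The trigger condition described in the issue can be sketched roughly as below. This is only an illustration of the wording "failed streamers but less than parity number", not the code in PR #7810. The class and method names are hypothetical and are not HDFS APIs, and reading the bounds as strictly greater than zero and strictly less than the parity count is an assumption.

```java
public class EarlyEndBlockGroupSketch {

    /**
     * Hypothetical helper (not part of HDFS): returns true when at least one
     * streamer has failed but the failure count is still below the number of
     * parity units, which is the situation the issue describes as the point
     * to end the current block group in advance.
     */
    public static boolean shouldEndBlockGroupEarly(int failedStreamers, int parityUnits) {
        return failedStreamers > 0 && failedStreamers < parityUnits;
    }

    public static void main(String[] args) {
        // Assumed example: a policy with 3 parity units, such as RS-6-3.
        int parityUnits = 3;
        for (int failed = 0; failed <= 4; failed++) {
            System.out.println(failed + " failed streamer(s): end early = "
                + shouldEndBlockGroupEarly(failed, parityUnits));
        }
    }
}
```

Under this reading, a stream with no failures keeps writing to the current block group, and a stream whose failures have already reached the parity count is outside the range this sketch covers. What happens when DataNodes never come back, as raised in the review comment, is not addressed by the condition itself.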