Re: [VOTE] Merge HDFS-5535 Rolling Upgrade Improvement
+1 for the merge.

I just got caught up on the current state of the branch, and it looks good. End user documentation is in place. I deployed a cluster built from the branch, and then I used the documentation to test various scenarios of rolling upgrade, downgrade and rollback. Everything worked as expected. This looks ready to merge.

Nice work, everyone!

Chris Nauroth
Hortonworks
http://hortonworks.com/

On Thu, Feb 27, 2014 at 7:02 PM, Kihwal Lee wrote:
> +1
>
> On Feb 25, 2014, at 3:42 PM, "Tsz Wo Sze" wrote:
> > Hi hdfs-dev,
> >
> > We propose merging the HDFS-5535 branch to trunk.
> >
> > HDFS Rolling Upgrade is a feature that allows upgrading individual HDFS daemons. In Hadoop v2, HDFS supports highly-available (HA) namenode services and wire compatibility. These two capabilities make it feasible to upgrade HDFS without incurring HDFS downtime. We make this improvement in the HDFS-5535 branch.
> >
> > The HDFS-5535 branch is ready to be merged to trunk. As of this writing, there are 48 subtasks in HDFS-5535; 44 subtasks are already completed. The core development, including feature development, unit tests and user documentation, is already done. The merge patch posted a few ago already passed Jenkins. I will post an updated patch to trigger Jenkins again for the latest code base.
> >
> > The remaining JIRAs are:
> >
> > HDFS-3225: Revist upgrade snapshots, roll back, finalize to enable rolling upgrades (assigned to Sanjay)
> > HDFS-6000: Avoid saving namespace when starting rolling upgrade (assigned to Jing)
> > HDFS-6013: add rollingUpgrade information to latest UI (assigned to Vinay)
> > HDFS-6016: Update datanode replacement policy to make writes more robust (assigned to Kihwal)
> >
> > HDFS-6000 will be committed soon. All other issues are further improvements which can be done after the merge.
> >
> > The other remaining work items are:
> > - Revise the design doc
> > - Post a test plan (Haohui is working on it.)
> > - Execute the manual tests (Haohui and Fengdong will work on it.)
> >
> > The work was a collective effort of Nathan Roberts, Sanjay Radia, Suresh Srinivas, Kihwal Lee, Jing Zhao, Arpit Agarwal, Brandon Li, Haohui Mai, Vinayakumar B, Fengdong Yu, Chris Nauroth and Tsz-Wo Nicholas Sze, who have proposed the design, worked on the code, reviewed patches, tested the features and authored documentation. We thank everyone who has given us valuable comments and feedback on the feature.
> >
> > The vote runs for 7 days. Here is my +1 on the merge.
> >
> > Thanks.
> > Tsz-Wo

--
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to which it is addressed and may contain information that is confidential, privileged and exempt from disclosure under applicable law. If the reader of this message is not the intended recipient, you are hereby notified that any printing, copying, dissemination, distribution, disclosure or forwarding of this communication is strictly prohibited. If you have received this communication in error, please contact the sender immediately and delete it from your system. Thank You.
Re: [VOTE] Merge HDFS-5535 Rolling Upgrade Improvement
+1 for the merging.

On Fri, Feb 28, 2014 at 10:58 AM, Chris Nauroth wrote:
> +1 for the merge.
[jira] [Resolved] (HDFS-6031) NN running newer software rejects loading the fsimage during rolling upgrade.
[ https://issues.apache.org/jira/browse/HDFS-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE resolved HDFS-6031.

Resolution: Fixed
Fix Version/s: HDFS-5535 (Rolling upgrades)
Hadoop Flags: Reviewed

Thanks, Haohui, for testing and reviewing the patch. I have committed this.

> NN running newer software rejects loading the fsimage during rolling upgrade.
>
> Key: HDFS-6031
> URL: https://issues.apache.org/jira/browse/HDFS-6031
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Haohui Mai
> Assignee: Tsz Wo (Nicholas), SZE
> Fix For: HDFS-5535 (Rolling upgrades)
>
> Attachments: h6031_20140227.patch, h6031_20140227b.patch, h6031_20140227c.patch, h6031_20140227d.patch, h6031_20140227e.patch
>
> During rolling upgrade, the standby NN that has a newer layout version will complain that the fsimage is too old and require starting the NN with the {{-upgrade}} option:
> {noformat}
> File system image contains an old layout version -55. An upgrade to version -56 is required.
> {noformat}
> However, {{-upgrade}} is disabled during rolling upgrade, so it is impossible to restart the standby NN:
> {noformat}
> org.apache.hadoop.hdfs.protocol.RollingUpgradeException: Failed to upgrade namenode since a rolling upgrade is already in progress. Existing rolling upgrade info:
> ...
> {noformat}

--
This message was sent by Atlassian JIRA (v6.1.5#6160)
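The deadlock described in HDFS-6031 can be pictured with a small decision function. This is only a sketch under stated assumptions: the class `LayoutVersionCheck`, the `Decision` enum and the `rollingUpgradeInProgress` flag are hypothetical names for illustration, not the committed patch or any HDFS API. It relies on the fact that HDFS layout versions are negative integers that decrease as the layout gets newer (the report shows -55 as old and -56 as required).

```java
// Hypothetical illustration; none of these names exist in HDFS.
final class LayoutVersionCheck {
    enum Decision { LOAD, REQUIRE_UPGRADE, REJECT_NEWER_IMAGE }

    // Layout versions are negative; a more negative value is a newer layout.
    static Decision decide(int imageLayoutVersion,
                           int softwareLayoutVersion,
                           boolean rollingUpgradeInProgress) {
        if (imageLayoutVersion == softwareLayoutVersion) {
            return Decision.LOAD;
        }
        if (imageLayoutVersion < softwareLayoutVersion) {
            // The image was written by newer software than the one running.
            return Decision.REJECT_NEWER_IMAGE;
        }
        // The image is older than the software. Outside a rolling upgrade the
        // operator must pass -upgrade; during one, -upgrade is disabled, so
        // the only way to avoid a deadlock is to accept the older image.
        return rollingUpgradeInProgress ? Decision.LOAD : Decision.REQUIRE_UPGRADE;
    }

    public static void main(String[] args) {
        System.out.println(decide(-55, -56, false));
        System.out.println(decide(-55, -56, true));
    }
}
```

With the values from the report, the first call models the original complaint (an upgrade is demanded) and the second models a namenode that tolerates the older image while a rolling upgrade is under way.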
[jira] [Resolved] (HDFS-4200) Reduce the size of synchronized sections in PacketResponder
[ https://issues.apache.org/jira/browse/HDFS-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jing Zhao resolved HDFS-4200.

Resolution: Fixed
Fix Version/s: 2.4.0

I've merged the backported patch to branch-2 and branch-2.4.0. Thanks for the backporting, Andrew!

> Reduce the size of synchronized sections in PacketResponder
>
> Key: HDFS-4200
> URL: https://issues.apache.org/jira/browse/HDFS-4200
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.0.0-alpha
> Reporter: Suresh Srinivas
> Assignee: Andrew Wang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-4200.patch, HDFS-4200.patch, hdfs-4200-branch-2.patch
>
> The size of synchronized sections can be reduced in PacketResponder. Also the methods in the PacketResponder are long and need refactoring.
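As a general illustration of what "reducing the size of synchronized sections" means, the sketch below holds the lock only while touching shared state and performs the slow step outside it. It is not the HDFS-4200 patch: `NarrowLockResponder`, its queue of sequence numbers and `sendAck` are hypothetical stand-ins, and the slow network send is simulated by appending to a list.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical illustration; not the real PacketResponder.
final class NarrowLockResponder {
    private final Object lock = new Object();
    private final Queue<Long> pendingSeqnos = new ArrayDeque<>();
    // Stands in for acks written to the network. Assumed to be touched
    // only by the single responder thread, so it needs no lock here.
    private final List<Long> sent = new ArrayList<>();

    void enqueue(long seqno) {
        synchronized (lock) {
            pendingSeqnos.add(seqno);
        }
    }

    // Processes at most one pending packet; returns false if none was queued.
    boolean processOne() {
        Long seqno;
        synchronized (lock) {
            // Hold the lock only long enough to take the head of the queue.
            seqno = pendingSeqnos.poll();
        }
        if (seqno == null) {
            return false;
        }
        // The slow step runs without the lock, so writers calling enqueue()
        // are not blocked while the ack is being sent.
        sendAck(seqno);
        return true;
    }

    private void sendAck(long seqno) {
        sent.add(seqno);
    }

    List<Long> sentAcks() {
        return new ArrayList<>(sent);
    }

    public static void main(String[] args) {
        NarrowLockResponder responder = new NarrowLockResponder();
        responder.enqueue(1L);
        responder.enqueue(2L);
        while (responder.processOne()) {
            // drain the queue
        }
        System.out.println(responder.sentAcks());
    }
}
```

The design choice is the usual one: copy what is needed out of shared state under the lock, then release it before doing I/O.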
[jira] [Created] (HDFS-6035) TestCacheDirectives#testCacheManagerRestart is failing on branch-2
Mit Desai created HDFS-6035:

Summary: TestCacheDirectives#testCacheManagerRestart is failing on branch-2
Key: HDFS-6035
URL: https://issues.apache.org/jira/browse/HDFS-6035
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 2.4.0
Reporter: Mit Desai

{noformat}
java.io.IOException: Inconsistent checkpoint fields.
LV = -51 namespaceID = 1641397469 cTime = 0 ; clusterId = testClusterID ; blockpoolId = BP-423574854-x.x.x.x-1393478669835.
Expecting respectively: -51; 2; 0; testClusterID; BP-2051361571-x.x.x.x-1393478572877.
	at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:133)
	at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:526)
	at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testCacheManagerRestart(TestCacheDirectives.java:582)
{noformat}
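The exception in HDFS-6035 reports five fields and the values that were expected for each. A minimal sketch of that kind of field-by-field comparison follows, assuming a hypothetical `CheckpointFieldsSketch` class; it is not the code of `CheckpointSignature.validateStorageInfo`, only an illustration fed with the values from the log above.

```java
import java.io.IOException;

// Hypothetical illustration; not the real CheckpointSignature.
final class CheckpointFieldsSketch {
    final int layoutVersion;
    final int namespaceId;
    final long cTime;
    final String clusterId;
    final String blockpoolId;

    CheckpointFieldsSketch(int layoutVersion, int namespaceId, long cTime,
                           String clusterId, String blockpoolId) {
        this.layoutVersion = layoutVersion;
        this.namespaceId = namespaceId;
        this.cTime = cTime;
        this.clusterId = clusterId;
        this.blockpoolId = blockpoolId;
    }

    // True only if every identifying field agrees.
    boolean matches(CheckpointFieldsSketch other) {
        return layoutVersion == other.layoutVersion
            && namespaceId == other.namespaceId
            && cTime == other.cTime
            && clusterId.equals(other.clusterId)
            && blockpoolId.equals(other.blockpoolId);
    }

    // Throws if the signature and the local storage disagree on any field.
    void validate(CheckpointFieldsSketch expected) throws IOException {
        if (!matches(expected)) {
            throw new IOException("Inconsistent checkpoint fields: got "
                + this + ", expecting " + expected);
        }
    }

    @Override
    public String toString() {
        return layoutVersion + "; " + namespaceId + "; " + cTime + "; "
            + clusterId + "; " + blockpoolId;
    }

    public static void main(String[] args) {
        CheckpointFieldsSketch signature = new CheckpointFieldsSketch(
            -51, 1641397469, 0L, "testClusterID",
            "BP-423574854-x.x.x.x-1393478669835");
        CheckpointFieldsSketch expected = new CheckpointFieldsSketch(
            -51, 2, 0L, "testClusterID",
            "BP-2051361571-x.x.x.x-1393478572877");
        System.out.println(signature.matches(expected));
    }
}
```

In the reported failure the layout version, cTime and cluster ID agree, while the namespace ID and block pool ID differ, which is enough for the comparison to fail.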
[jira] [Resolved] (HDFS-5333) Improvement of current HDFS Web UI
[ https://issues.apache.org/jira/browse/HDFS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Haohui Mai resolved HDFS-5333.

Resolution: Fixed
Fix Version/s: 2.3.0

The new web UI has been merged into 2.3. Thanks everybody for the work!

> Improvement of current HDFS Web UI
>
> Key: HDFS-5333
> URL: https://issues.apache.org/jira/browse/HDFS-5333
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Jing Zhao
> Assignee: Haohui Mai
> Fix For: 2.3.0
>
> This is an umbrella jira for improving the current JSP-based HDFS Web UI.
[jira] [Created] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
Colin Patrick McCabe created HDFS-6036:

Summary: Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
Key: HDFS-6036
URL: https://issues.apache.org/jira/browse/HDFS-6036
Project: Hadoop HDFS
Issue Type: Sub-task
Components: caching, datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe

We should forcibly time out misbehaving DFSClients that try to do no-checksum reads that extend too long.
[jira] [Resolved] (HDFS-6034) DN registration should use DataNodeLayoutVersion instead of NameNodeLayoutVersion
[ https://issues.apache.org/jira/browse/HDFS-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE resolved HDFS-6034.

Resolution: Fixed
Fix Version/s: HDFS-5535 (Rolling upgrades)
Hadoop Flags: Reviewed

Haohui, thanks a lot for reviewing and testing the patch. I have committed this.

> DN registration should use DataNodeLayoutVersion instead of NameNodeLayoutVersion
>
> Key: HDFS-6034
> URL: https://issues.apache.org/jira/browse/HDFS-6034
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, ha, hdfs-client, namenode
> Affects Versions: HDFS-5535 (Rolling upgrades)
> Reporter: Haohui Mai
> Assignee: Tsz Wo (Nicholas), SZE
> Fix For: HDFS-5535 (Rolling upgrades)
>
> Attachments: h6034_20140228.patch
>
> Currently the registrationID is in the form of {{BlockPoolID-NNLayoutVersion}}. When the NN bumps the layout version during rolling upgrade, the DNs will fail to register with the new NN and exit, regardless of whether the new software version is used.
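To see why embedding the namenode layout version in the registration ID breaks registration, consider the toy model below. It takes the `BlockPoolID-NNLayoutVersion` form from the issue description at face value; the class `RegistrationIdSketch`, the block pool ID `BP-example` and the layout version numbers are all hypothetical and chosen only for illustration. The real registration logic in HDFS is more involved.

```java
// Hypothetical illustration; not the real DN registration code.
final class RegistrationIdSketch {
    // Builds an ID of the form "<blockPoolId>-<layoutVersion>".
    static String registrationId(String blockPoolId, int layoutVersion) {
        return blockPoolId + "-" + layoutVersion;
    }

    // Registration succeeds only if both sides derive the same ID.
    static boolean canRegister(String presentedByDatanode, String expectedByNamenode) {
        return presentedByDatanode.equals(expectedByNamenode);
    }

    public static void main(String[] args) {
        String blockPool = "BP-example";   // hypothetical block pool ID
        int oldNnLayout = -55;             // NN layout before the upgrade (illustrative)
        int newNnLayout = -56;             // NN layout after the bump (illustrative)
        int dnLayout = -55;                // DN layout, unchanged here (illustrative)

        // ID tied to the NN layout version: the bump changes the expected ID.
        System.out.println(canRegister(
            registrationId(blockPool, oldNnLayout),
            registrationId(blockPool, newNnLayout)));

        // ID tied to the DN layout version: the NN bump has no effect on it.
        System.out.println(canRegister(
            registrationId(blockPool, dnLayout),
            registrationId(blockPool, dnLayout)));
    }
}
```

Under these assumptions, an ID derived from the datanode's own layout version stays stable across a namenode layout bump, which is the direction the issue title points to.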