Re: [VOTE] Merge HDFS-5535 Rolling Upgrade Improvement

2014-02-28 Thread Chris Nauroth
+1 for the merge.

I just got caught up on the current state of the branch, and it looks good.
End user documentation is in place.  I deployed a cluster built from the
branch, and then I used the documentation to test various scenarios of
rolling upgrade, downgrade and rollback.  Everything worked as expected.
This looks ready to merge.

Nice work, everyone!

Chris Nauroth
Hortonworks
http://hortonworks.com/



On Thu, Feb 27, 2014 at 7:02 PM, Kihwal Lee  wrote:

> +1
>
> > On Feb 25, 2014, at 3:42 PM, "Tsz Wo Sze"  wrote:
> >
> > Hi hdfs-dev,
> >
> > We propose merging the HDFS-5535 branch to trunk.
> >
> > HDFS Rolling Upgrade is a feature that allows individual HDFS daemons
> > to be upgraded.  In Hadoop v2, HDFS supports highly-available (HA)
> > namenode services and wire compatibility.  These two capabilities make
> > it feasible to upgrade HDFS without incurring HDFS downtime.  We made
> > this improvement in the HDFS-5535 branch.
> >
> > The HDFS-5535 branch is ready to be merged to trunk.  As of this
> > writing, there are 48 subtasks in HDFS-5535; 44 subtasks are already
> > completed.  The core development, including feature development, unit
> > tests and user docs, is already done.  The merge patch posted a few
> > days ago already passed Jenkins.  I will post an updated patch to
> > trigger Jenkins again for the latest code base.
> >
> > The remaining JIRAs are:
> >
> > HDFS-3225: Revist upgrade snapshots, roll back, finalize to enable
> > rolling upgrades (assigned to Sanjay)
> > HDFS-6000: Avoid saving namespace when starting rolling upgrade
> > (assigned to Jing)
> > HDFS-6013: add rollingUpgrade information to latest UI (assigned to
> > Vinay)
> > HDFS-6016: Update datanode replacement policy to make writes more robust
> > (assigned to Kihwal)
> >
> > HDFS-6000 will be committed soon.  All other issues are further
> > improvements that can be done after the merge.
> >
> > The other remaining work items are:
> > - Revise the design doc
> > - Post a test plan (Haohui is working on it.)
> > - Execute the manual tests (Haohui and Fengdong will work on it.)
> >
> > The work was a collective effort of Nathan Roberts, Sanjay Radia,
> > Suresh Srinivas, Kihwal Lee, Jing Zhao, Arpit Agarwal, Brandon Li,
> > Haohui Mai, Vinayakumar B, Fengdong Yu, Chris Nauroth and Tsz-Wo
> > Nicholas Sze, who have proposed the design, worked on the code,
> > reviewed patches, tested the features and authored documentation.  We
> > thank everyone who has given us valuable comments and feedback on the
> > feature.
> >
> > The vote runs for 7 days.  Here is my +1 on the merge.
> >
> > Thanks.
> > Tsz-Wo
>

-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.


Re: [VOTE] Merge HDFS-5535 Rolling Upgrade Improvement

2014-02-28 Thread Jing Zhao
+1 for the merge.




[jira] [Resolved] (HDFS-6031) NN running newer software rejects loading the fsimage during rolling upgrade.

2014-02-28 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6031?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-6031.
--

   Resolution: Fixed
Fix Version/s: HDFS-5535 (Rolling upgrades)
 Hadoop Flags: Reviewed

Thanks Haohui, for testing and reviewing the patch.

I have committed this.

> NN running newer software rejects loading the fsimage during rolling upgrade.
> -
>
> Key: HDFS-6031
> URL: https://issues.apache.org/jira/browse/HDFS-6031
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Haohui Mai
> Assignee: Tsz Wo (Nicholas), SZE
> Fix For: HDFS-5535 (Rolling upgrades)
>
> Attachments: h6031_20140227.patch, h6031_20140227b.patch, 
> h6031_20140227c.patch, h6031_20140227d.patch, h6031_20140227e.patch
>
>
> During rolling upgrade, the standby NN that has a newer layout version will 
> complain that the fsimage is too old, and will require starting the NN with 
> the {{-upgrade}} option:
> {noformat}
> File system image contains an old layout version -55. An upgrade to version 
> -56 is required.
> {noformat}
> However, {{-upgrade}} is disabled during rolling upgrade, so it is 
> impossible to restart the standby NN:
> {noformat}
> org.apache.hadoop.hdfs.protocol.RollingUpgradeException: Failed to upgrade 
> namenode since a rolling upgrade is already in progress. Existing rolling 
> upgrade info:
> ...
> {noformat}
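
Editor's illustration: the report above amounts to a startup check that treats an older fsimage as loadable only with -upgrade, even while a rolling upgrade is in progress. The following is a minimal sketch of a relaxed check, not the committed HDFS-6031 patch. The class name, method name and parameters are all hypothetical; only the layout versions -55 and -56 come from the report.

```java
/** Hypothetical sketch of the startup check; not the actual HDFS-6031 patch. */
final class LayoutVersionCheck {

    private LayoutVersionCheck() {}

    /**
     * Decides whether a namenode may load an fsimage.
     * Layout versions are negative and decrease with each change,
     * so -55 is older than -56 (as in the error message above).
     * Rejecting -upgrade during a rolling upgrade is a separate check
     * that this sketch does not model.
     */
    static boolean canLoadImage(int imageLayoutVersion,
                                int softwareLayoutVersion,
                                boolean upgradeOptionGiven,
                                boolean rollingUpgradeInProgress) {
        if (imageLayoutVersion == softwareLayoutVersion) {
            return true; // same layout, nothing to convert
        }
        if (imageLayoutVersion < softwareLayoutVersion) {
            return false; // image written by newer software; out of scope here
        }
        // The image is older than the software. Outside a rolling upgrade
        // this needs -upgrade; during one, the image is accepted as-is.
        return rollingUpgradeInProgress || upgradeOptionGiven;
    }

    public static void main(String[] args) {
        // Standby NN at -56 reading a -55 image, no rolling upgrade, no -upgrade.
        System.out.println(canLoadImage(-55, -56, false, false)); // false
        // Same image while a rolling upgrade is in progress.
        System.out.println(canLoadImage(-55, -56, false, true));  // true
    }
}
```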



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)


[jira] [Resolved] (HDFS-4200) Reduce the size of synchronized sections in PacketResponder

2014-02-28 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-4200.
-

   Resolution: Fixed
Fix Version/s: 2.4.0

I've merged the backported patch to branch-2 and branch-2.4.0. Thanks for the 
backport, Andrew!

> Reduce the size of synchronized sections in PacketResponder 
> 
>
> Key: HDFS-4200
> URL: https://issues.apache.org/jira/browse/HDFS-4200
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.0.0-alpha
> Reporter: Suresh Srinivas
> Assignee: Andrew Wang
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-4200.patch, HDFS-4200.patch, 
> hdfs-4200-branch-2.patch
>
>
> The size of synchronized sections can be reduced in PacketResponder. Also the 
> methods in the PacketResponder are long and need refactoring.
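
Editor's illustration: the general technique named in this issue, shrinking a synchronized section, might look like the sketch below. It is not the real PacketResponder and not the HDFS-4200 patch. The class, its fields and the StringBuilder standing in for the upstream socket are all assumptions made for the example. The lock is held only while the shared queue is touched, and the slow write happens outside it.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Illustrative only; a stand-in for a responder that acks queued packets. */
final class NarrowLockResponder {
    // Shared between the receiver thread (enqueue) and the responder thread.
    private final Deque<Long> ackQueue = new ArrayDeque<>();
    // Touched only by the responder thread; stands in for the upstream socket.
    private final StringBuilder sent = new StringBuilder();

    /** Called by the receiver thread when a packet needs an ack. */
    void enqueue(long seqno) {
        synchronized (ackQueue) {
            ackQueue.addLast(seqno);
        }
    }

    /** Called by the responder thread; returns true if an ack was sent. */
    boolean sendNextAck() {
        Long seqno;
        synchronized (ackQueue) {
            // The lock covers only the shared-queue access.
            seqno = ackQueue.pollFirst();
        }
        if (seqno == null) {
            return false; // nothing queued
        }
        writeAck(seqno); // slow I/O runs without holding the lock
        return true;
    }

    private void writeAck(long seqno) {
        sent.append(seqno).append(';');
    }

    String sentSoFar() {
        return sent.toString();
    }
}
```

With this shape the receiver thread is never blocked behind a slow write, because enqueue only waits for the brief queue operation.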





[jira] [Created] (HDFS-6035) TestCacheDirectives#testCacheManagerRestart is failing on branch-2

2014-02-28 Thread Mit Desai (JIRA)
Mit Desai created HDFS-6035:
---

Summary: TestCacheDirectives#testCacheManagerRestart is failing on branch-2
Key: HDFS-6035
URL: https://issues.apache.org/jira/browse/HDFS-6035
Project: Hadoop HDFS
Issue Type: Bug
Components: test
Affects Versions: 2.4.0
Reporter: Mit Desai


{noformat}
java.io.IOException: Inconsistent checkpoint fields.
LV = -51 namespaceID = 1641397469 cTime = 0 ; clusterId = testClusterID ; blockpoolId = BP-423574854-x.x.x.x-1393478669835.
Expecting respectively: -51; 2; 0; testClusterID; BP-2051361571-x.x.x.x-1393478572877.
    at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:133)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:526)
    at org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives.testCacheManagerRestart(TestCacheDirectives.java:582)
{noformat}





[jira] [Resolved] (HDFS-5333) Improvement of current HDFS Web UI

2014-02-28 Thread Haohui Mai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Haohui Mai resolved HDFS-5333.
--

   Resolution: Fixed
Fix Version/s: 2.3.0

The new web UI has been merged into 2.3. Thanks everybody for the work!

> Improvement of current HDFS Web UI
> --
>
> Key: HDFS-5333
> URL: https://issues.apache.org/jira/browse/HDFS-5333
> Project: Hadoop HDFS
> Issue Type: Improvement
> Affects Versions: 3.0.0
> Reporter: Jing Zhao
> Assignee: Haohui Mai
> Fix For: 2.3.0
>
>
> This is an umbrella jira for improving the current JSP-based HDFS Web UI. 





[jira] [Created] (HDFS-6036) Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long

2014-02-28 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-6036:
--

Summary: Forcibly timeout misbehaving DFSClients that try to do no-checksum reads that extend too long
Key: HDFS-6036
URL: https://issues.apache.org/jira/browse/HDFS-6036
Project: Hadoop HDFS
Issue Type: Sub-task
Components: caching, datanode
Affects Versions: 2.4.0
Reporter: Colin Patrick McCabe
Assignee: Colin Patrick McCabe


We should forcibly time out misbehaving DFSClients whose no-checksum reads 
run for too long.
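
Editor's illustration: the issue does not say how the timeout would be enforced. One possible shape is sketched below, under the assumption that the datanode records when each no-checksum read starts and periodically sweeps for reads that have run past a limit. The class name, the client ID strings and the millisecond limit are hypothetical and do not describe any actual DataNode API.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical tracker for long-running no-checksum reads. */
final class NoChecksumReadTracker {
    private final long maxReadMillis;
    // Client ID -> time (ms) at which its no-checksum read started.
    private final Map<String, Long> startTimes = new HashMap<>();

    NoChecksumReadTracker(long maxReadMillis) {
        this.maxReadMillis = maxReadMillis;
    }

    synchronized void readStarted(String clientId, long nowMillis) {
        startTimes.put(clientId, nowMillis);
    }

    synchronized void readFinished(String clientId) {
        startTimes.remove(clientId);
    }

    /**
     * Removes and returns the clients whose reads have run longer than
     * the limit. The caller would then revoke their access; that step is
     * outside this sketch.
     */
    synchronized List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = startTimes.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > maxReadMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

Passing the current time in as a parameter, rather than reading the system clock inside the tracker, keeps the sweep easy to test.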





[jira] [Resolved] (HDFS-6034) DN registration should use DataNodeLayoutVersion instead of NameNodeLayoutVersion

2014-02-28 Thread Tsz Wo (Nicholas), SZE (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tsz Wo (Nicholas), SZE resolved HDFS-6034.
--

   Resolution: Fixed
Fix Version/s: HDFS-5535 (Rolling upgrades)
 Hadoop Flags: Reviewed

Haohui, thanks a lot for reviewing and testing the patch.

I have committed this.

> DN registration should use DataNodeLayoutVersion instead of 
> NameNodeLayoutVersion
> -
>
> Key: HDFS-6034
> URL: https://issues.apache.org/jira/browse/HDFS-6034
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, ha, hdfs-client, namenode
> Affects Versions: HDFS-5535 (Rolling upgrades)
> Reporter: Haohui Mai
> Assignee: Tsz Wo (Nicholas), SZE
> Fix For: HDFS-5535 (Rolling upgrades)
>
> Attachments: h6034_20140228.patch
>
>
> Currently the registrationID is in the form of 
> {{BlockPoolID-NNLayoutVersion}}. When the NN bumps the layout version during 
> rolling upgrade, the DNs will fail to register with the new NN and exit, 
> regardless of whether the new software version is used.
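
Editor's illustration: a small sketch of why an ID built from the NN layout version stops matching after the NN is upgraded, while one built from the DN layout version does not. Only the BlockPoolID-LayoutVersion format comes from the description above. The class, the block pool ID and the version numbers are hypothetical, and this is not the committed patch.

```java
/** Hypothetical illustration of the registration ID mismatch. */
final class RegistrationIdSketch {

    private RegistrationIdSketch() {}

    /** Builds an ID in the BlockPoolID-LayoutVersion form described above. */
    static String registrationId(String blockPoolId, int layoutVersion) {
        return blockPoolId + "-" + layoutVersion;
    }

    public static void main(String[] args) {
        String blockPool = "BP-example"; // hypothetical block pool ID
        int oldNnLayout = -55;           // hypothetical NN layout before the upgrade
        int newNnLayout = -56;           // hypothetical NN layout after the upgrade
        int dnLayout = -55;              // hypothetical DN layout, unchanged by the NN upgrade

        // NN-based IDs: the DN still holds the old ID, the upgraded NN expects the new one.
        String heldByDn = registrationId(blockPool, oldNnLayout);
        String expectedByNn = registrationId(blockPool, newNnLayout);
        System.out.println(heldByDn.equals(expectedByNn)); // false

        // DN-based IDs: both sides derive the ID from the same DN layout version.
        String dnSide = registrationId(blockPool, dnLayout);
        String nnSide = registrationId(blockPool, dnLayout);
        System.out.println(dnSide.equals(nnSide)); // true
    }
}
```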


