[jira] [Created] (HDFS-6154) Improve the speed of saveNameSpace,making HDFS restart and checkPoint faster

2014-03-25 Thread guodongdong (JIRA)
guodongdong created HDFS-6154:
-

 Summary: Improve the speed of saveNameSpace,making HDFS restart 
and checkPoint faster
 Key: HDFS-6154
 URL: https://issues.apache.org/jira/browse/HDFS-6154
 Project: Hadoop HDFS
  Issue Type: Improvement
Affects Versions: 2.3.0
Reporter: guodongdong


The namenode's saveNamespace has two stages: serializing the INodes, and 
calculating the MD5 while writing to disk. At present the two stages run 
serially. With this improvement they run in parallel: one thread serializes 
the INodes while another calculates the MD5 and writes to disk. This makes 
saveNamespace up to twice as fast. Details are shown in the table:

Testing environment:
  Only namenode saveNamespace was tested, using dfsadmin -saveNamespace.
  Machine: 144GB, Intel(R) Xeon(R) CPU E5645 @ 2.40GHz, 12 CPUs, RAID 5 SAS 
  disk, JDK 1.7.0
 
||image size||before optimizing||after optimizing ||
|1.2GB|22sec|11sec|
|4.3GB|66sec|36sec|
|22GB|406sec|250sec|
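
The two-thread split described above can be sketched as a small producer/consumer pipeline. This is only an illustration under stated assumptions, not the attached patch: the class name PipelinedSaver, the string records standing in for INodes, and the queue size of 64 are all hypothetical, and the lambda syntax needs Java 8 or later.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicReference;

public class PipelinedSaver {
    // Sentinel compared by identity; an empty record is a different array.
    private static final byte[] END = new byte[0];

    /** Serializes records on the calling thread; digests and writes on a second thread. */
    public static byte[] save(Iterable<String> records, OutputStream out) throws Exception {
        BlockingQueue<byte[]> queue = new ArrayBlockingQueue<>(64); // hypothetical bound
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        AtomicReference<Exception> failure = new AtomicReference<>();

        Thread writer = new Thread(() -> {
            try {
                while (true) {
                    byte[] chunk = queue.take();
                    if (chunk == END) {
                        break;
                    }
                    if (failure.get() != null) {
                        continue; // keep draining so the producer never blocks
                    }
                    try {
                        md5.update(chunk);
                        out.write(chunk);
                    } catch (IOException e) {
                        failure.set(e);
                    }
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                failure.compareAndSet(null, e);
            }
        });
        writer.start();

        // Stage 1: "serialize" each record (a stand-in for INode serialization).
        for (String record : records) {
            queue.put(record.getBytes(StandardCharsets.UTF_8));
        }
        queue.put(END);
        writer.join();

        if (failure.get() != null) {
            throw failure.get();
        }
        return md5.digest(); // safe to read here: join() orders it after the writer
    }
}
```

The bounded queue keeps memory use flat when the disk is slower than serialization. Any speedup depends on how evenly the work is split between the two stages.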



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (HDFS-6155) Fix Boxing/unboxing to parse a primitive findbugs warnings

2014-03-25 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HDFS-6155:
-

 Summary: Fix Boxing/unboxing to parse a primitive findbugs warnings
 Key: HDFS-6155
 URL: https://issues.apache.org/jira/browse/HDFS-6155
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


There are many instances of this findbugs warning, which relates to 
performance. For details, see:
http://findbugs.sourceforge.net/bugDescriptions.html#DM_BOXED_PRIMITIVE_FOR_PARSING
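
As an illustration of the pattern this warning targets (not code taken from the HDFS tree), the sketch below contrasts boxing a value only to unbox it with parsing straight to a primitive. The class and method names are hypothetical.

```java
public class ParsePrimitive {
    // Flagged pattern: creates (or looks up) an Integer only to unbox it again.
    static int parseBoxed(String s) {
        return Integer.valueOf(s).intValue();
    }

    // Preferred: parse straight to the primitive, with no intermediate object.
    static int parseDirect(String s) {
        return Integer.parseInt(s);
    }

    public static void main(String[] args) {
        // Both calls return the same value; only the intermediate boxing differs.
        System.out.println(parseBoxed("4096") == parseDirect("4096"));
    }
}
```

The same substitution applies to the other wrapper types, for example Long.parseLong and Double.parseDouble.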





[jira] [Resolved] (HDFS-8) Interrupting the namenode thread triggers System.exit()

2014-03-25 Thread Steve Loughran (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-8?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Steve Loughran resolved HDFS-8.
---

Resolution: Cannot Reproduce

Closing: this is so out of date that it probably no longer happens, and if it 
does, the stack trace is obsolete.

> Interrupting the namenode thread triggers System.exit()
> ---
>
> Key: HDFS-8
> URL: https://issues.apache.org/jira/browse/HDFS-8
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Steve Loughran
>Priority: Minor
>
> My service setup/teardown tests are managing to trigger system exits in the 
> namenode, which seems overkill.
> 1. Interrupting the thread that is starting the namesystem up raises a 
> java.nio.channels.ClosedByInterruptException.
> 2. This is caught in FSImage.rollFSImage, and handed off to processIOError
> 3. This triggers a call to Runtime.getRuntime().exit(-1); "All storage 
> directories are inaccessible.".
> Stack trace to follow. Exiting the JVM is somewhat overkill; if someone has
> interrupted the thread, it is (presumably) because they want to stop the
> namenode, which may not imply they want to kill the JVM at the same time.
> Certainly JUnit does not expect it.
> Some possibilities:
>  - ClosedByInterruptException gets handled differently, as some form of
> shutdown request.
>  - Calls to System.exit() are factored out into something whose behaviour
> can be changed by policy options to throw a RuntimeException instead.
> Hosting a Namenode in a security manager that blocks off System.exit() is the
> simplest workaround; this is fairly simple, but it means that what would be a
> straight exit now gets turned into an exception, so callers may be
> surprised by what happens.
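
The second possibility in the quoted report, factoring exit calls behind a switchable policy, could look roughly like the sketch below. All names here (ExitPolicy, ExitException, disableSystemExit, terminate) are hypothetical and chosen for illustration; this is not presented as the namenode's actual code.

```java
public class ExitPolicy {
    /** Thrown instead of exiting when exits are disabled, e.g. under JUnit. */
    public static class ExitException extends RuntimeException {
        public final int status;

        public ExitException(int status, String message) {
            super(message);
            this.status = status;
        }
    }

    private static volatile boolean systemExitDisabled = false;

    /** Test harnesses call this once so the JVM survives a "fatal" error. */
    public static void disableSystemExit() {
        systemExitDisabled = true;
    }

    /** Single choke point that replaces direct Runtime.getRuntime().exit() calls. */
    public static void terminate(int status, String message) {
        if (systemExitDisabled) {
            throw new ExitException(status, message);
        }
        System.err.println(message);
        System.exit(status);
    }
}
```

A test would call ExitPolicy.disableSystemExit() during setup and then assert on the ExitException it catches. Production code keeps the default and exits as before.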





Build failed in Jenkins: Hadoop-Hdfs-trunk #1712

2014-03-25 Thread Apache Jenkins Server
See 

Changes:

[jing9] HDFS-5840. Follow-up to HDFS-5138 to improve error handling during 
partial upgrade failures. Contributed by Aaron T. Myers, Suresh Srinivas, and 
Jing Zhao.

[suresh] HDFS-6125. Cleanup unnecessary cast in HDFS code base. Contributed by 
Suresh Srinivas.

[vinodkv] YARN-1850. Introduced the ability to optionally disable sending out 
timeline-events in the TimelineClient. Contributed by Zhijie Shen.

[szetszwo] HADOOP-10425. LocalFileSystem.getContentSummary should not count crc 
files.

[vinodkv] MAPREDUCE-5795. Fixed MRAppMaster to record the correct job-state 
after it recovers from a commit during a previous attempt. Contributed by Xuan 
Gong.

[arp] HDFS-6124. Add final modifier to class members. (Contributed by Suresh 
Srinivas)

[kasha] HADOOP-10423. Clarify compatibility policy document for combination of 
new client and old server. (Chris Nauroth via kasha)

[cnauroth] HADOOP-10422. Remove redundant logging of RPC retry attempts. 
Contributed by Chris Nauroth.

[cnauroth] HDFS-5846. Shuffle phase is slow in Windows - 
FadviseFileRegion::transferTo does not read disks efficiently. Contributed by 
Nikola Vujic.

[jing9] Move HDFS-5138 to 2.4.0 section in CHANGES.txt

[jing9] HDFS-6135. In HDFS upgrade with HA setup, JournalNode cannot handle 
layout version bump when rolling back. Contributed by Jing Zhao.

[brandonli] HDFS-6050. NFS does not handle exceptions correctly in a few 
places. Contributed by Brandon Li

[jianhe] YARN-1852. Fixed RMAppAttempt to not resend 
AttemptFailed/AttemptKilled events to already recovered Failed/Killed RMApps. 
Contributed by Rohith Sharmaks

[cnauroth] MAPREDUCE-5791. Shuffle phase is slow in Windows - 
FadviseFileRegion::transferTo does not read disks efficiently. Contributed by 
Nikola Vujic.

[szetszwo] HADOOP-10015. UserGroupInformation prints out excessive warnings.  
Contributed by Nicolas Liochon

[zjshen] YARN-1838. Enhanced timeline service getEntities API to get entities 
from a given entity ID or insertion timestamp. Contributed by Billie Rinaldi.

[jeagles] YARN-1670. aggregated log writer can write more log data then it says 
is the log length (Mit Desai via jeagles)

[kihwal] HDFS-3087. Decomissioning on NN restart can complete without blocks 
being replicated. Contributed by Rushabh S Shah.

--
[...truncated 12806 lines...]
Running org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.701 sec - in 
org.apache.hadoop.hdfs.server.namenode.TestFSNamesystem
Running org.apache.hadoop.hdfs.server.namenode.TestStreamFile
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.046 sec - in 
org.apache.hadoop.hdfs.server.namenode.TestStreamFile
Running org.apache.hadoop.hdfs.server.namenode.TestNNStorageRetentionManager
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.029 sec - in 
org.apache.hadoop.hdfs.server.namenode.TestNNStorageRetentionManager
Running org.apache.hadoop.hdfs.server.namenode.TestCorruptFilesJsp
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.017 sec - in 
org.apache.hadoop.hdfs.server.namenode.TestCorruptFilesJsp
Running org.apache.hadoop.hdfs.server.namenode.TestBlockUnderConstruction
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.173 sec - in 
org.apache.hadoop.hdfs.server.namenode.TestBlockUnderConstruction
Running org.apache.hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.524 sec - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestFailureOfSharedDir
Running org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.246 sec - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyIsHot
Running org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.825 sec - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
Running org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 71.548 sec - 
in org.apache.hadoop.hdfs.server.namenode.ha.TestHAStateTransitions
Running org.apache.hadoop.hdfs.server.namenode.ha.TestStateTransitionFailure
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.322 sec - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestStateTransitionFailure
Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
Tests run: 20, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 87.207 sec - 
in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
Running org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints
Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 56.52 sec - in 
org.apache.hadoop.hdfs.server.namenode.ha.TestSt

Hadoop-Hdfs-trunk - Build # 1712 - Failure

2014-03-25 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/1712/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 12999 lines...]
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [2:06:36.545s]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [2.755s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 2:06:40.796s
[INFO] Finished at: Tue Mar 25 13:42:48 UTC 2014
[INFO] Final Memory: 34M/456M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.16:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/trunk/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating YARN-1838
Updating HDFS-5846
Updating HADOOP-10015
Updating HDFS-3087
Updating HADOOP-10423
Updating HDFS-5840
Updating HADOOP-10422
Updating YARN-1670
Updating HDFS-6050
Updating HDFS-5138
Updating MAPREDUCE-5795
Updating HADOOP-10425
Updating HDFS-6125
Updating HDFS-6135
Updating MAPREDUCE-5791
Updating YARN-1850
Updating HDFS-6124
Updating YARN-1852
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testSafeBlockTracking

Error Message:
Bad safemode status: 'Safe mode is ON. The reported blocks 12 needs additional 
3 blocks to reach the threshold 0.9990 of total blocks 15.
The number of live datanodes 3 has reached the minimum number 0. In safe mode 
extension. Safe mode will be turned off automatically once the thresholds have 
been reached.'

Stack Trace:
java.lang.AssertionError: Bad safemode status: 'Safe mode is ON. The reported 
blocks 12 needs additional 3 blocks to reach the threshold 0.9990 of total 
blocks 15.
The number of live datanodes 3 has reached the minimum number 0. In safe mode 
extension. Safe mode will be turned off automatically once the thresholds have 
been reached.'
at org.junit.Assert.fail(Assert.java:93)
at org.junit.Assert.assertTrue(Assert.java:43)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.assertSafeMode(TestHASafeMode.java:493)
at 
org.apache.hadoop.hdfs.server.namenode.ha.TestHASafeMode.testSafeBlockTracking(TestHASafeMode.java:633)
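
The figures in the safe mode message can be checked by hand. The sketch below rests on an assumption: the block threshold is the total multiplied by the threshold fraction and truncated to an integer, and the message reports one more than the shortfall. That formula was chosen because it reproduces the numbers in this message (12 of 15 reported, 3 needed) and in the HDFS-5672 report (3 of 10 reported, 7 needed). It is not a copy of the namenode code, and the class and method names are hypothetical.

```java
public class SafeModeArithmetic {
    /** Assumed formula: truncate total * threshold, then report shortfall + 1. */
    static int blocksNeeded(int reported, int total, double threshold) {
        int blockThreshold = (int) (total * threshold); // 15 * 0.999 truncates to 14
        return (blockThreshold - reported) + 1;
    }

    public static void main(String[] args) {
        System.out.println(blocksNeeded(12, 15, 0.999)); // 3, as in the Jenkins failure
        System.out.println(blocksNeeded(3, 10, 0.999));  // 7, as in the HDFS-5672 report
    }
}
```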




[jira] [Reopened] (HDFS-5672) TestHASafeMode#testSafeBlockTracking fails in trunk

2014-03-25 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HDFS-5672:
--


> TestHASafeMode#testSafeBlockTracking fails in trunk
> ---
>
> Key: HDFS-5672
> URL: https://issues.apache.org/jira/browse/HDFS-5672
> Project: Hadoop HDFS
>  Issue Type: Test
>Reporter: Ted Yu
>
> From build #1614:
> {code}
>  TestHASafeMode.testSafeBlockTracking:623->assertSafeMode:488 Bad safemode 
> status: 'Safe mode is ON. The reported blocks 3 needs additional 7 blocks to 
> reach the threshold 0.9990 of total blocks 10.
> Safe mode will be turned off automatically'
> {code}





[jira] [Resolved] (HDFS-5807) TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails intermittently on Branch-2

2014-03-25 Thread Chen He (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen He resolved HDFS-5807.
---

Resolution: Fixed

> TestBalancerWithNodeGroup.testBalancerWithNodeGroup fails intermittently on 
> Branch-2
> 
>
> Key: HDFS-5807
> URL: https://issues.apache.org/jira/browse/HDFS-5807
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.3.0
>Reporter: Mit Desai
>Assignee: Chen He
> Fix For: 3.0.0, 2.4.0
>
> Attachments: HDFS-5807.patch
>
>
> The test times out after some time.
> {noformat}
> java.util.concurrent.TimeoutException: Rebalancing expected avg utilization 
> to become 0.16, but on datanode 127.0.0.1:42451 it remains at 0.3 after more 
> than 2 msec.
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.waitForBalancer(TestBalancerWithNodeGroup.java:151)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.runBalancer(TestBalancerWithNodeGroup.java:178)
>   at 
> org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithNodeGroup(TestBalancerWithNodeGroup.java:302)
> {noformat}





[jira] [Resolved] (HDFS-6148) LeaseManager crashes while initiating block recovery

2014-03-25 Thread Kihwal Lee (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-6148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kihwal Lee resolved HDFS-6148.
--

Resolution: Duplicate

> LeaseManager crashes while initiating block recovery
> 
>
> Key: HDFS-6148
> URL: https://issues.apache.org/jira/browse/HDFS-6148
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.3.0
>Reporter: Kihwal Lee
>Assignee: Kihwal Lee
>Priority: Blocker
>
> While running branch-2.4, the LeaseManager crashed with an NPE. This does not 
> always happen on block recovery.
> {panel}
> Exception in thread
> "org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor@5d66b728"
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction$ReplicaUnderConstruction.isAlive(BlockInfoUnderConstruction.java:121)
> at 
> org.apache.hadoop.hdfs.server.blockmanagement.BlockInfoUnderConstruction.initializeBlockRecovery(BlockInfoUnderConstruction.java:286)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.internalReleaseLease(FSNamesystem.java:3746)
> at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.checkLeases(LeaseManager.java:474)
> at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager.access$900(LeaseManager.java:68)
> at 
> org.apache.hadoop.hdfs.server.namenode.LeaseManager$Monitor.run(LeaseManager.java:411)
> at java.lang.Thread.run(Thread.java:722)
> {panel}





[jira] [Created] (HDFS-6156) Simplify the JMX API that provides snapshot information

2014-03-25 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-6156:


 Summary: Simplify the JMX API that provides snapshot information
 Key: HDFS-6156
 URL: https://issues.apache.org/jira/browse/HDFS-6156
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai


HDFS-5196 introduces a set of new APIs that provide snapshot information 
through JMX. Currently, the API nests {{SnapshotDirectoryMXBean}} into 
{{SnapshotStatsMXBean}}, creating another layer of composition.

This jira proposes to inline {{SnapshotDirectoryMXBean}} into 
{{SnapshotStatsMXBean}} and to simplify the API.
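
One way to picture the inlined form is a single MXBean whose getter returns plain data objects, which JMX maps to CompositeData, with no second MXBean layer. The sketch below is a guess at the general shape only. The interface FlatSnapshotStatsMXBean, the DirectoryInfo holder, its fields, and the ObjectName are hypothetical and are not the API from HDFS-5196 or from this jira's patch.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

public class SnapshotJmxSketch {
    /** Plain data holder; JMX exposes it as CompositeData through its getters. */
    public static class DirectoryInfo {
        private final String path;
        private final int snapshotNumber;

        public DirectoryInfo(String path, int snapshotNumber) {
            this.path = path;
            this.snapshotNumber = snapshotNumber;
        }

        public String getPath() { return path; }
        public int getSnapshotNumber() { return snapshotNumber; }
    }

    /** Hypothetical flattened interface: one MXBean, directory data inlined. */
    public interface FlatSnapshotStatsMXBean {
        DirectoryInfo[] getSnapshottableDirectories();
    }

    public static class FlatSnapshotStats implements FlatSnapshotStatsMXBean {
        @Override
        public DirectoryInfo[] getSnapshottableDirectories() {
            return new DirectoryInfo[] { new DirectoryInfo("/data", 2) };
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        ObjectName name = new ObjectName("sketch:type=SnapshotStats");
        mbs.registerMBean(new FlatSnapshotStats(), name);

        // A JMX client sees an array of composite values, not references to other beans.
        CompositeData[] dirs =
            (CompositeData[]) mbs.getAttribute(name, "SnapshottableDirectories");
        System.out.println(dirs[0].get("path") + " has "
            + dirs[0].get("snapshotNumber") + " snapshots");
    }
}
```

With this shape a client reads everything from one attribute rather than following references from one bean to another.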





[jira] [Created] (HDFS-6157) Fix the entry point of OfflineImageViewer for hdfs.cmd

2014-03-25 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-6157:


 Summary: Fix the entry point of OfflineImageViewer for hdfs.cmd
 Key: HDFS-6157
 URL: https://issues.apache.org/jira/browse/HDFS-6157
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6157.000.patch

After HDFS-5797, the entry point of the OfflineImageViewer has changed from 
{{org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewer}} to 
{{org.apache.hadoop.hdfs.tools.offlineImageViewer.OfflineImageViewerPB}}.

The reference in {{hdfs.cmd}} is out-of-date.





[jira] [Created] (HDFS-6158) Clean up dead code for OfflineImageViewer

2014-03-25 Thread Haohui Mai (JIRA)
Haohui Mai created HDFS-6158:


 Summary: Clean up dead code for OfflineImageViewer
 Key: HDFS-6158
 URL: https://issues.apache.org/jira/browse/HDFS-6158
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Haohui Mai
Assignee: Haohui Mai
 Attachments: HDFS-6158.000.patch

After HDFS-5698, the {{OfflineImageViewer}} and related classes have become 
dead code. This jira removes them.


