Jenkins build is back to normal : Hadoop-Hdfs-trunk #978

2012-03-08 Thread Apache Jenkins Server
See 




Jenkins build became unstable: Hadoop-Hdfs-0.23-PB-Build #14

2012-03-08 Thread Apache Jenkins Server
See 




Hadoop-Hdfs-0.23-PB-Build - Build # 14 - Unstable

2012-03-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-PB-Build/14/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 15126 lines...]
[INFO] Deleting 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-PB-Build/trunk/hadoop-hdfs-project/target
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-PB-Build/trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-install-plugin:2.3.1:install (default-install) @ 
hadoop-hdfs-project ---
[INFO] Installing 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-0.23-PB-Build/trunk/hadoop-hdfs-project/pom.xml
 to 
/home/jenkins/.m2/repository/org/apache/hadoop/hadoop-hdfs-project/0.23.3-SNAPSHOT/hadoop-hdfs-project-0.23.3-SNAPSHOT.pom
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS [8:14.180s]
[INFO] Apache Hadoop HttpFS .. SUCCESS [53.423s]
[INFO] Apache Hadoop HDFS Project  SUCCESS [0.057s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 9:08.319s
[INFO] Finished at: Thu Mar 08 11:43:23 UTC 2012
[INFO] Final Memory: 77M/743M
[INFO] 
+ /home/jenkins/tools/maven/latest/bin/mvn test 
-Dmaven.test.failure.ignore=true -Pclover 
-DcloverLicenseLocation=/home/jenkins/tools/clover/latest/lib/clover.license
Archiving artifacts
Recording test results
Build step 'Publish JUnit test result report' changed build result to UNSTABLE
Publishing Javadoc
Recording fingerprints
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Unstable
Sending email for trigger: Unstable



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade.testDistributedUpgrade

Error Message:
test timed out after 12 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 12 milliseconds
at java.lang.Thread.sleep(Native Method)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:684)
at 
org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:540)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:255)
at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:79)
at 
org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:241)
at 
org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade.__CLR3_0_2lxfdoy18mr(TestDistributedUpgrade.java:143)
at 
org.apache.hadoop.hdfs.server.common.TestDistributedUpgrade.testDistributedUpgrade(TestDistributedUpgrade.java:100)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallabl

[jira] [Created] (HDFS-3064) Allow datanodes to start with non-privileged ports for testing.

2012-03-08 Thread Jitendra Nath Pandey (Created) (JIRA)
Allow datanodes to start with non-privileged ports for testing.
---

 Key: HDFS-3064
 URL: https://issues.apache.org/jira/browse/HDFS-3064
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Jitendra Nath Pandey
Assignee: Jitendra Nath Pandey


HADOOP-8078 allows enabling security in unit tests. However, datanodes still 
can't be started because they require privileged ports. We should allow 
datanodes to come up on non-privileged ports ONLY for testing. This part of the 
code will be removed anyway, when HDFS-2856 is committed.
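
A rough sketch of the kind of test-only escape hatch described above; the configuration key, class, and method below are hypothetical placeholders for illustration, not the actual patch.

{code}
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;

// Illustrative only: the property name and this helper are hypothetical.
public class UnprivilegedPortSketch {
  static void checkSecureDataNodePort(Configuration conf, int port)
      throws IOException {
    // Hypothetical test-only override of the privileged-port requirement.
    boolean allowForTesting =
        conf.getBoolean("dfs.datanode.testing.allow-unprivileged-ports", false);
    if (!allowForTesting && port >= 1024) {
      throw new IOException("Secure DataNode must bind to a privileged port"
          + " (< 1024), but was configured with port " + port);
    }
  }
}
{code}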

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Hadoop-Hdfs-trunk-Commit - Build # 1927 - Failure

2012-03-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/1927/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 7778 lines...]
WARN: The method class 
org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html for an explanation.
[INFO] Compiled completed in 0:00:00.014
[INFO] 
[INFO] --- jspc-maven-plugin:2.0-alpha-3:compile (datanode) @ hadoop-hdfs ---
[WARNING] Compiled JSPs will not be added to the project and web.xml will not 
be modified, either because includeInProject is set to false or because the 
project's packaging is not 'war'.
[INFO] Compiling 3 JSP source files to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-src/main/jsp
WARN: The method class 
org.apache.commons.logging.impl.SLF4JLogFactory#release() was invoked.
WARN: Please see http://www.slf4j.org/codes.html for an explanation.
[INFO] Compiled completed in 0:00:00.014
[INFO] 
[INFO] --- build-helper-maven-plugin:1.5:add-source (add-source) @ hadoop-hdfs 
---
[INFO] Source directory: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-sources/java
 added.
[INFO] Source directory: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/target/generated-src/main/jsp
 added.
[INFO] 
[INFO] --- maven-antrun-plugin:1.6:run (compile-proto) @ hadoop-hdfs ---
[INFO] Executing tasks

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-resources-plugin:2.2:resources (default-resources) @ 
hadoop-hdfs ---
[INFO] Using default encoding to copy filtered resources.
[INFO] 
[INFO] --- maven-compiler-plugin:2.3.2:compile (default-compile) @ hadoop-hdfs 
---
[INFO] Compiling 410 source files to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/target/classes
[INFO] -
[ERROR] COMPILATION ERROR : 
[INFO] -
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java:[682,25]
 tokenRefetchNeeded(java.io.IOException,java.net.InetSocketAddress) is already 
defined in org.apache.hadoop.hdfs.DFSClient
[INFO] 1 error
[INFO] -
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [16.146s]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS Project  SKIPPED
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 16.564s
[INFO] Finished at: Thu Mar 08 18:42:53 UTC 2012
[INFO] Final Memory: 25M/331M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) 
on project hadoop-hdfs: Compilation failure
[ERROR] 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Commit/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSClient.java:[682,25]
 tokenRefetchNeeded(java.io.IOException,java.net.InetSocketAddress) is already 
defined in org.apache.hadoop.hdfs.DFSClient
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Updating HDFS-2976
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
No tests ran.

[jira] [Created] (HDFS-3065) HA: Newly active NameNode does not recognize decommissioning DataNode

2012-03-08 Thread Stephen Chu (Created) (JIRA)
HA: Newly active NameNode does not recognize decommissioning DataNode
-

 Key: HDFS-3065
 URL: https://issues.apache.org/jira/browse/HDFS-3065
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: ha
Affects Versions: HA branch (HDFS-1623)
Reporter: Stephen Chu


I'm working on a cluster where, originally, styx01 hosts the active NameNode 
and styx02 hosts the standby NameNode. 

In both styx01's and styx02's exclude file, I added the DataNode on styx03. I 
then ran _hdfs dfsadmin -refreshNodes_ and verified on styx01 NN web UI that 
the DN on styx03 was decommissioning. After waiting a few minutes, I checked 
the standby NN web UI (while the DN was decommissioning) and didn't see that 
the DN was marked as decommissioning.

I executed manual failover, making styx02 NN active and styx01 NN standby. I 
checked the newly active NN web UI, and the DN was still not marked as 
decommissioning, even after a few minutes. However, the newly standby NN's web 
UI still showed the DN as decommissioning.

I added another DN to the exclude file, and executed _hdfs dfsadmin 
-refreshNodes_, but the styx02 NN web UI still did not update with the 
decommissioning nodes.

I failed back over to make styx01 NN active and styx02 NN standby. I checked 
the styx01 NN web UI and saw that it correctly marked 2 DNs as decommissioning.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3066) cap space usage of default log4j rolling policy (hdfs specific changes)

2012-03-08 Thread Patrick Hunt (Created) (JIRA)
cap space usage of default log4j rolling policy (hdfs specific changes)
---

 Key: HDFS-3066
 URL: https://issues.apache.org/jira/browse/HDFS-3066
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: scripts
Reporter: Patrick Hunt
Assignee: Patrick Hunt


see HADOOP-8149 for background on this.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3067) Null pointer in DFSInputStream.readBuffer if read is repeated on singly-replicated corrupted block

2012-03-08 Thread Henry Robinson (Created) (JIRA)
Null pointer in DFSInputStream.readBuffer if read is repeated on 
singly-replicated corrupted block
--

 Key: HDFS-3067
 URL: https://issues.apache.org/jira/browse/HDFS-3067
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Henry Robinson
Assignee: Henry Robinson


With a singly-replicated block that's corrupted, issuing a read against it 
twice in succession (e.g. if ChecksumException is caught by the client) gives a 
NullPointerException.

Here's the body of a test that reproduces the problem:

{code}

final short REPL_FACTOR = 1;
final long FILE_LENGTH = 512L;
cluster.waitActive();
FileSystem fs = cluster.getFileSystem();

Path path = new Path("/corrupted");

DFSTestUtil.createFile(fs, path, FILE_LENGTH, REPL_FACTOR, 12345L);
DFSTestUtil.waitReplication(fs, path, REPL_FACTOR);

ExtendedBlock block = DFSTestUtil.getFirstBlock(fs, path);
int blockFilesCorrupted = cluster.corruptBlockOnDataNodes(block);
assertEquals("All replicas not corrupted", REPL_FACTOR, 
blockFilesCorrupted);

InetSocketAddress nnAddr =
new InetSocketAddress("localhost", cluster.getNameNodePort());
DFSClient client = new DFSClient(nnAddr, conf);
DFSInputStream dis = client.open(path.toString());
byte[] arr = new byte[(int)FILE_LENGTH];
boolean sawException = false;
try {
  dis.read(arr, 0, (int)FILE_LENGTH);
} catch (ChecksumException ex) { 
  sawException = true;
}

assertTrue(sawException);
sawException = false;
try {
  dis.read(arr, 0, (int)FILE_LENGTH); // <-- NPE thrown here
} catch (ChecksumException ex) { 
  sawException = true;
} 
{code}

The stack:

{code}
java.lang.NullPointerException
at 
org.apache.hadoop.hdfs.DFSInputStream.readBuffer(DFSInputStream.java:492)
at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:545)
[snip test stack]
{code}

The problem is that currentNode is null. It is left null after the first 
read fails, and is never refreshed because the condition in read that guards 
blockSeekTo only triggers when the current position is outside the block's 
range. 
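
To make the failure mode concrete, here is a tiny self-contained model of that retry path. The names mirror the description above (currentNode, blockEnd, blockSeekTo, readBuffer), but the code is a simplification for illustration, not the actual DFSInputStream source.

{code}
// Toy model of the retry flaw described above; not the real DFSInputStream.
public class RetryGuardSketch {
  private Object currentNode;   // datanode chosen for the current block
  private long pos = 0;         // current read position
  private long blockEnd = 511;  // last byte of the current block

  private Object blockSeekTo(long target) {
    return new Object();        // pretend a healthy datanode was selected
  }

  private int readBuffer() {
    // Stands in for readBuffer() dereferencing the current datanode.
    return currentNode.hashCode();
  }

  public int read() {
    // The guard only fires once pos moves past the current block, so an
    // immediate retry of the same range never re-resolves currentNode.
    if (pos > blockEnd) {
      currentNode = blockSeekTo(pos);
    }
    return readBuffer();
  }

  public static void main(String[] args) {
    RetryGuardSketch in = new RetryGuardSketch();
    in.currentNode = null;      // state left behind by the failed first read
    try {
      in.read();                // pos is still inside the block, guard skipped
    } catch (NullPointerException e) {
      System.out.println("NPE on the retried read, as in the stack above");
    }
  }
}
{code}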


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Merge Namenode HA feature to 0.23

2012-03-08 Thread Aaron T. Myers
+1

I applied the patch to branch-0.23. It compiles just fine. I built a
distribution tar, deployed it to a 4-node cluster, and ran some smoke tests
with HA enabled. All seemed good.

I also ran the following unit tests, which should exercise the relevant HA
code:

TestOfflineEditsViewer,TestHDFSConcat,TestEditLogRace,TestNameEditsConfigs,TestSaveNamespace,TestEditLogFileOutputStream,TestFileJournalManager,TestCheckpoint,TestEditLog,TestFSEditLogLoader,TestFsLimits,TestSecurityTokenEditLog,TestStorageRestore,TestBackupNode,TestEditLogJournalFailures,TestEditLogTailer,TestEditLogsDuringFailover,TestFailureToReadEdits,TestHASafeMode,TestHAStateTransitions,TestFailureOfSharedDir,TestDNFencing,TestStandbyIsHot,TestGenericJournalConf,TestCheckPointForSecurityTokens,TestNNStorageRetentionManager,TestPersistBlocks,TestPBHelper,TestNNLeaseRecovery

All of these passed except TestOfflineEditsViewer and TestPersistBlocks. 
These failed because the patch obviously doesn't include changes to a few 
binary files that the tests rely on. Assuming that when you merge to 
branch-0.23 you do an actual svn merge, rather than just applying the patch, 
these won't be a problem.

--
Aaron T. Myers
Software Engineer, Cloudera



On Wed, Mar 7, 2012 at 9:26 PM, Suresh Srinivas wrote:

> I have merged the change required for merging Namenode HA. I have also
> attached a release 23 patch in the jira HDFS-1623. Please take a look at the
> attached patch and let me know if that looks good.
>
> Regards,
> Suresh
>


[jira] [Created] (HDFS-3068) RemoteBlockReader2 fails when using SocksSocketFactory

2012-03-08 Thread Tom White (Created) (JIRA)
RemoteBlockReader2 fails when using SocksSocketFactory 
---

 Key: HDFS-3068
 URL: https://issues.apache.org/jira/browse/HDFS-3068
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs client
Affects Versions: 0.23.1
Reporter: Tom White


When hadoop.rpc.socket.factory.class.default is set to 
org.apache.hadoop.net.SocksSocketFactory, HDFS file reads fail with errors like

{noformat}
Socket Socket[addr=/10.12.185.132,port=50010,localport=55216] does not have an 
associated Channel.
{noformat}

The workaround is to set dfs.client.use.legacy.blockreader=true to use the old 
implementation of RemoteBlockReader. RemoteBlockReader should not be removed 
until this bug is fixed.
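
For reference, a minimal client-side sketch of that workaround. The two configuration keys are the ones named in this report; the class name and file path are placeholders.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class LegacyBlockReaderWorkaround {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    conf.set("hadoop.rpc.socket.factory.class.default",
        "org.apache.hadoop.net.SocksSocketFactory");
    // Workaround from the report: fall back to the old RemoteBlockReader
    // until RemoteBlockReader2 works through a SOCKS socket factory.
    conf.setBoolean("dfs.client.use.legacy.blockreader", true);

    FileSystem fs = FileSystem.get(conf);
    FSDataInputStream in = fs.open(new Path("/some/file")); // placeholder path
    try {
      System.out.println(in.read()); // this read uses the legacy reader
    } finally {
      in.close();
    }
  }
}
{code}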

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3069) If an edits file has more edits in it than expected by its name, should trigger an error

2012-03-08 Thread Todd Lipcon (Created) (JIRA)
If an edits file has more edits in it than expected by its name, should trigger 
an error


 Key: HDFS-3069
 URL: https://issues.apache.org/jira/browse/HDFS-3069
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0, 0.24.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


In testing what happens in HA split brain scenarios, I ended up with an edits 
log that was named edits_47-47 but actually had two edits in it (#47 and #48). 
The edits loading process should detect this situation and barf. Otherwise, the 
problem shows up later during loading or even on the next restart, and is tough 
to fix.
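
As a hedged illustration of the sort of sanity check being asked for (the names here are hypothetical, not the actual loader code): after loading a finalized segment named edits_N-M, the number of edits read should match M - N + 1.

{code}
import java.io.IOException;

// Illustrative only; the class, method, and variable names are hypothetical.
public class EditsRangeCheckSketch {
  static void checkLoadedEdits(String fileName, long firstTxId, long lastTxId,
      long editsLoaded) throws IOException {
    long expected = lastTxId - firstTxId + 1;
    if (editsLoaded != expected) {
      throw new IOException("Edits file " + fileName + " is named for txids "
          + firstTxId + "-" + lastTxId + " (" + expected
          + " edits) but actually contained " + editsLoaded + " edits");
    }
  }

  public static void main(String[] args) {
    try {
      // The scenario from this report: edits_47-47 holding two edits.
      checkLoadedEdits("edits_47-47", 47, 47, 2);
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
{code}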

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes

2012-03-08 Thread Stephen Chu (Created) (JIRA)
hdfs balancer doesn't balance blocks between datanodes
--

 Key: HDFS-3070
 URL: https://issues.apache.org/jira/browse/HDFS-3070
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: balancer
Affects Versions: 0.24.0
Reporter: Stephen Chu
 Attachments: unbalanced_nodes.png

I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, 
both have over 3% disk usage.
Attached is a screenshot of the Live Nodes web UI.

On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see 
the blocks being balanced across all 4 datanodes (all blocks on styx01 and 
styx02 stay put).

HA is currently enabled.

[schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
active
[schu@styx01 ~]$ hdfs balancer -threshold 1
12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
12/03/08 10:10:32 INFO balancer.Balancer: p = 
Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
Time Stamp   Iteration#  Bytes Already Moved  Bytes Left To Move  
Bytes Being Moved
Balancing took 95.0 milliseconds
[schu@styx01 ~]$ 

I believe with a threshold of 1% the balancer should trigger blocks being moved 
across DataNodes, right? I am curious about the "namenodes = []" from the above 
output.

[schu@styx01 ~]$ hadoop version
Hadoop 0.24.0-SNAPSHOT
Subversion 
git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common
 -r f6a577d697bbcd04ffbc568167c97b79479ff319
Compiled by schu on Thu Mar  8 15:32:50 PST 2012
From source with checksum ec971a6e7316f7fbf471b617905856b8

From 
http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
The threshold parameter is a fraction in the range of (0%, 100%) with a default 
value of 10%. The threshold sets a target for whether the cluster is balanced. 
A cluster is balanced if for each datanode, the utilization of the node (ratio 
of used space at the node to total capacity of the node) differs from the 
utilization of the cluster (ratio of used space in the cluster to total capacity of the 
cluster) by no more than the threshold value. The smaller the threshold, the 
more balanced a cluster will become. It takes more time to run the balancer for 
small threshold values. Also for a very small threshold the cluster may not be 
able to reach the balanced state when applications write and delete files 
concurrently.
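
To make the quoted criterion concrete, here is a small illustrative calculation. The utilization numbers are hypothetical, roughly in the spirit of two loaded nodes and two empty ones; with a 1% threshold both kinds of node fall outside the band, so the balancer would be expected to move blocks.

{code}
// Illustrative only: encodes the balance criterion quoted above.
public class BalancerThresholdSketch {
  // A node is balanced when its utilization (%) is within `threshold`
  // percentage points of the cluster-wide utilization (%).
  static boolean isBalanced(double nodeUtilPct, double clusterUtilPct,
      double thresholdPct) {
    return Math.abs(nodeUtilPct - clusterUtilPct) <= thresholdPct;
  }

  public static void main(String[] args) {
    double threshold = 1.0;    // the -threshold 1 used in the report
    double clusterUtil = 1.5;  // hypothetical: ~3% on two nodes, ~0% on two
    System.out.println(isBalanced(3.0, clusterUtil, threshold)); // false
    System.out.println(isBalanced(0.0, clusterUtil, threshold)); // false
  }
}
{code}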

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review Request: ByteBuffer-based read API for DFSInputStream (review 2)

2012-03-08 Thread Henry Robinson

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4212/
---

(Updated 2012-03-09 00:47:24.765130)


Review request for hadoop-hdfs and Todd Lipcon.


Summary
---

New patch for HDFS-2834 (I can't update the old review request).


This addresses bug HDFS-2834.
http://issues.apache.org/jira/browse/HDFS-2834


Diffs
-

  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReader.java
 dfab730 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/BlockReaderLocal.java
 cc61697 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java
 4187f1c 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSInputStream.java
 2b817ff 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader.java
 b7da8d4 
  
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/RemoteBlockReader2.java
 ea24777 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/BlockReaderTestUtil.java
 9d4f4a2 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestBlockReaderLocal.java
 PRE-CREATION 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestParallelRead.java
 bbd0012 
  
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestShortCircuitLocalRead.java
 eb2a1d8 

Diff: https://reviews.apache.org/r/4212/diff


Testing
---


Thanks,

Henry



[jira] [Created] (HDFS-3071) haadmin failover command does not provide enough detail for when target NN is not ready to be active

2012-03-08 Thread Philip Zeyliger (Created) (JIRA)
haadmin failover command does not provide enough detail for when target NN is 
not ready to be active


 Key: HDFS-3071
 URL: https://issues.apache.org/jira/browse/HDFS-3071
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger


When running the failover command, you can get an error message like the 
following:
{quote}
$ hdfs --config $(pwd) haadmin -failover namenode2 namenode1
Failover failed: xxx.yyy/1.2.3.4:8020 is not ready to become active
{quote}
Unfortunately, the error message doesn't describe why that node isn't ready to 
be active.  In my case, the target namenode's logs don't indicate anything 
either. It turned out that the issue was "Safe mode is ON.Resources are low on 
NN. Safe mode must be turned off manually.", but ideally the user would be told 
that at the time of the failover.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-3072) haadmin should have configurable timeouts for failover commands

2012-03-08 Thread Philip Zeyliger (Created) (JIRA)
haadmin should have configurable timeouts for failover commands
---

 Key: HDFS-3072
 URL: https://issues.apache.org/jira/browse/HDFS-3072
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha
Affects Versions: 0.24.0
Reporter: Philip Zeyliger


The HAAdmin failover code should time out reasonably aggressively and go on to 
the fencing strategies if it's dealing with a mostly dead active namenode. 
Currently it uses what's probably the default, which is to say no timeout 
whatsoever.

{code}
  /**
   * Return a proxy to the specified target service.
   */
  protected HAServiceProtocol getProtocol(String serviceId)
  throws IOException {
String serviceAddr = getServiceAddr(serviceId);
InetSocketAddress addr = NetUtils.createSocketAddr(serviceAddr);
return (HAServiceProtocol)RPC.getProxy(
  HAServiceProtocol.class, HAServiceProtocol.versionID,
  addr, getConf());
  }
{code}
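
One possible shape of the change, as a sketch rather than a proposal: it assumes the RPC.getProxy overload that accepts a SocketFactory and an rpcTimeout argument, and the configuration key name below is hypothetical.

{code}
  /**
   * Sketch only: return a proxy to the target service with a bounded RPC
   * timeout instead of the default of no timeout at all.
   */
  protected HAServiceProtocol getProtocol(String serviceId)
      throws IOException {
    String serviceAddr = getServiceAddr(serviceId);
    InetSocketAddress addr = NetUtils.createSocketAddr(serviceAddr);
    // Hypothetical key; the point is that the timeout becomes configurable.
    int rpcTimeoutMs = getConf().getInt("ha.failover.rpc-timeout.ms", 20000);
    return (HAServiceProtocol) RPC.getProxy(
        HAServiceProtocol.class, HAServiceProtocol.versionID, addr,
        UserGroupInformation.getCurrentUser(), getConf(),
        NetUtils.getDefaultSocketFactory(getConf()), rpcTimeoutMs);
  }
{code}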

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira