[jira] [Created] (HDFS-7934) During Rolling upgrade rollback ,standby namenode startup fails.

2015-03-16 Thread J.Andreina (JIRA)
J.Andreina created HDFS-7934:


 Summary: During Rolling upgrade rollback ,standby namenode startup 
fails.
 Key: HDFS-7934
 URL: https://issues.apache.org/jira/browse/HDFS-7934
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: J.Andreina
Assignee: J.Andreina
Priority: Critical


During rolling upgrade rollback, standby namenode startup fails while 
loading edits, when there is no local copy of the edits created after the upgrade 
(those edits have already been removed by the Active Namenode from the journal 
manager and from the Active's local storage). 




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7935) Support multi-homed networks when Kerberos security is enabled

2015-03-16 Thread Arun Suresh (JIRA)
Arun Suresh created HDFS-7935:
-

 Summary: Support multi-homed networks when Kerberos security is 
enabled
 Key: HDFS-7935
 URL: https://issues.apache.org/jira/browse/HDFS-7935
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Arun Suresh
Assignee: Arun Suresh


Currently, during the SASL negotiation stage between the IPC Client and Server, the 
server sends only a single serviceId (corresponding to a single principal) to 
the client. This is the principal that the server process logged in as during 
startup.

In a multi-homed network, the server might be associated with more than one 
principal, and thus the server must provide the client with all of the 
principals it can use to connect.
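
As a purely illustrative, hedged sketch of the client-side idea (not the actual Hadoop IPC code; the candidate principal list, its service/host@REALM format, and the class name are assumptions), the client could try each advertised principal until a GSSAPI SaslClient can be created:

{code}
import java.util.Map;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

// Illustrative sketch only: iterate over the principals a multi-homed server
// advertises and use the first one that yields a SASL client.
public class MultiPrincipalSaslSketch {
  static SaslClient chooseClient(String[] candidatePrincipals,
                                 Map<String, ?> saslProps) throws SaslException {
    for (String principal : candidatePrincipals) {
      // Assumed format "service/host@REALM", e.g. "hdfs/nn1.example.com@EXAMPLE.COM"
      String[] parts = principal.split("[/@]");
      if (parts.length < 3) {
        continue;                         // skip malformed entries
      }
      SaslClient client = Sasl.createSaslClient(
          new String[] {"GSSAPI"},        // Kerberos mechanism
          null,                           // no separate authorization id
          parts[0],                       // protocol (service name)
          parts[1],                       // server name (host)
          saslProps,
          null);                          // GSSAPI takes credentials from the JAAS Subject
      if (client != null) {
        return client;                    // first usable principal wins
      }
    }
    throw new SaslException("None of the advertised principals could be used");
  }
}
{code}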



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Hadoop-Hdfs-trunk - Build # 2066 - Still Failing

2015-03-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk/2066/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 6506 lines...]
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-hdfs-project 
---
[INFO] Deleting 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/target
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Skipping javadoc generation
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [  03:02 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  2.157 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 03:02 h
[INFO] Finished at: 2015-03-16T14:36:35+00:00
[INFO] Final Memory: 55M/622M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There are test failures.
[ERROR] 
[ERROR] Please refer to 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk/hadoop-hdfs-project/hadoop-hdfs/target/surefire-reports
 for the individual test results.
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating YARN-3171
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
FAILED:  
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect

Error Message:
The map of version counts returned by DatanodeManager was not what it was 
expected to be on iteration 341 expected:<0> but was:<1>

Stack Trace:
java.lang.AssertionError: The map of version counts returned by DatanodeManager 
was not what it was expected to be on iteration 341 expected:<0> but was:<1>
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:743)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:555)
at 
org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect(TestDatanodeManager.java:157)




Build failed in Jenkins: Hadoop-Hdfs-trunk #2066

2015-03-16 Thread Apache Jenkins Server
See 

Changes:

[xgong] YARN-3171. Sort by Application id, AppAttempt and ContainerID doesn't

--
[...truncated 6313 lines...]
Running org.apache.hadoop.hdfs.TestDisableConnCache
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.14 sec - in 
org.apache.hadoop.hdfs.TestDisableConnCache
Running org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 38.348 sec - in 
org.apache.hadoop.hdfs.qjournal.TestSecureNNWithQJM
Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNode
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 50.15 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournalNode
Running org.apache.hadoop.hdfs.qjournal.server.TestJournal
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.785 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournal
Running org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.645 sec - in 
org.apache.hadoop.hdfs.qjournal.server.TestJournalNodeMXBean
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.457 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManagerUnit
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Tests run: 21, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 20.92 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager
Running org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.299 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestSegmentRecoveryComparator
Running org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.165 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChannel
Running org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.638 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestEpochsAreUnique
Running org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 151.262 sec - 
in org.apache.hadoop.hdfs.qjournal.client.TestQJMWithFaults
Running org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.264 sec - in 
org.apache.hadoop.hdfs.qjournal.client.TestQuorumCall
Running org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.842 sec - in 
org.apache.hadoop.hdfs.qjournal.TestMiniJournalCluster
Running org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 7.546 sec - in 
org.apache.hadoop.hdfs.qjournal.TestNNWithQJM
Running org.apache.hadoop.hdfs.TestConnCache
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.991 sec - in 
org.apache.hadoop.hdfs.TestConnCache
Running org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.832 sec - in 
org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
Running org.apache.hadoop.hdfs.TestFileAppend
Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 22.605 sec - 
in org.apache.hadoop.hdfs.TestFileAppend
Running org.apache.hadoop.hdfs.TestFileAppend3
Tests run: 15, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 42.336 sec - 
in org.apache.hadoop.hdfs.TestFileAppend3
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.078 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Running org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.798 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitReadNoChecksum
Running org.apache.hadoop.hdfs.TestFileCreation
Tests run: 23, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 381.435 sec - 
in org.apache.hadoop.hdfs.TestFileCreation
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.408 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Running org.apache.hadoop.hdfs.TestHdfsAdmin
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.922 sec - in 
org.apache.hadoop.hdfs.TestHdfsAdmin
Running org.apache.hadoop.hdfs.TestDFSUtil
Tests run: 30, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 58.433 sec - 
in org.apache.hadoop.hdfs.TestDFSUtil
Running org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.08 sec - in 
org.apache.hadoop.hdfs.TestWriteBlockGetsBlockLengthHint
Run

Hadoop-Hdfs-trunk-Java8 - Build # 125 - Still Failing

2015-03-16 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/125/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 7989 lines...]
[INFO] 
[WARNING] The POM for org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 is missing, 
no dependency information available
[WARNING] Failed to retrieve plugin descriptor for 
org.eclipse.m2e:lifecycle-mapping:1.0.0: Plugin 
org.eclipse.m2e:lifecycle-mapping:1.0.0 or one of its dependencies could not be 
resolved: Failure to find org.eclipse.m2e:lifecycle-mapping:jar:1.0.0 in 
http://repo.maven.apache.org/maven2 was cached in the local repository, 
resolution will not be reattempted until the update interval of central has 
elapsed or updates are forced
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-hdfs-project 
---
[INFO] Deleting 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/target
[INFO] 
[INFO] --- maven-antrun-plugin:1.7:run (create-testdirs) @ hadoop-hdfs-project 
---
[INFO] Executing tasks

main:
[mkdir] Created dir: 
/home/jenkins/jenkins-slave/workspace/Hadoop-Hdfs-trunk-Java8/hadoop-hdfs-project/target/test-dir
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-source-plugin:2.3:jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-source-plugin:2.3:test-jar-no-fork (hadoop-java-sources) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (dist-enforce) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.4:attach-descriptor (attach-descriptor) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ 
hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable 
package
[INFO] 
[INFO] --- maven-enforcer-plugin:1.3.1:enforce (depcheck) @ hadoop-hdfs-project 
---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.12.1:checkstyle (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:3.0.0:findbugs (default-cli) @ 
hadoop-hdfs-project ---
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  FAILURE [  03:09 h]
[INFO] Apache Hadoop HttpFS .. SKIPPED
[INFO] Apache Hadoop HDFS BookKeeper Journal . SKIPPED
[INFO] Apache Hadoop HDFS-NFS  SKIPPED
[INFO] Apache Hadoop HDFS Project  SUCCESS [  1.757 s]
[INFO] 
[INFO] BUILD FAILURE
[INFO] 
[INFO] Total time: 03:09 h
[INFO] Finished at: 2015-03-16T14:44:21+00:00
[INFO] Final Memory: 49M/242M
[INFO] 
[ERROR] Failed to execute goal 
org.apache.maven.plugins:maven-surefire-plugin:2.17:test (default-test) on 
project hadoop-hdfs: There was a timeout or other error in the fork -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e 
switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please 
read the following articles:
[ERROR] [Help 1] 
http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Updating YARN-3171
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Failure
Sending email for trigger: Failure



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  
org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover.testFailoverRightBeforeCommitSynchronization

Error Message:
test timed out after 3 milliseconds

Stack Trace:
java.lang.Exception: test timed out after 3 milliseconds
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
at 
java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:231)
at 
org.apache.hadoop.test.GenericTestUtils$DelayAnswer.waitForCall(GenericTestUtils.java:226

Build failed in Jenkins: Hadoop-Hdfs-trunk-Java8 #125

2015-03-16 Thread Apache Jenkins Server
See 

Changes:

[xgong] YARN-3171. Sort by Application id, AppAttempt and ContainerID doesn't

--
[...truncated 7796 lines...]
Running org.apache.hadoop.hdfs.TestRollingUpgradeRollback
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.236 sec - in 
org.apache.hadoop.hdfs.TestRollingUpgradeRollback
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDFSStartupVersions
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.296 sec - in 
org.apache.hadoop.hdfs.TestDFSStartupVersions
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDFSShellGenericOptions
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.737 sec - in 
org.apache.hadoop.hdfs.TestDFSShellGenericOptions
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.protocolPB.TestPBHelper
Tests run: 28, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 0.816 sec - in 
org.apache.hadoop.hdfs.protocolPB.TestPBHelper
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestLeaseRenewer
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.23 sec - in 
org.apache.hadoop.hdfs.TestLeaseRenewer
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDFSRemove
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 14.647 sec - in 
org.apache.hadoop.hdfs.TestDFSRemove
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestFileAppend4
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 31.372 sec - in 
org.apache.hadoop.hdfs.TestFileAppend4
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestParallelRead
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 36.545 sec - in 
org.apache.hadoop.hdfs.TestParallelRead
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestClose
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.238 sec - in 
org.apache.hadoop.hdfs.TestClose
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDFSAddressConfig
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.678 sec - in 
org.apache.hadoop.hdfs.TestDFSAddressConfig
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.864 sec - in 
org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestLargeBlock
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.35 sec - in 
org.apache.hadoop.hdfs.TestLargeBlock
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestHDFSTrash
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.382 sec - in 
org.apache.hadoop.hdfs.TestHDFSTrash
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestClientReportBadBlock
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 8.542 sec - in 
org.apache.hadoop.hdfs.TestClientReportBadBlock
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestWriteRead
Tests run: 3, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 21.216 sec - in 
org.apache.hadoop.hdfs.TestWriteRead
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 34.342 sec - in 
org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
support was removed in 8.0
Running org.apache.hadoop.hdfs.TestBalancerBandwidth
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.652 sec - in 
org.apache.hadoop.hdfs.TestBalancerBandwidth
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; 
s

[jira] [Resolved] (HDFS-2360) Ugly stacktrace when quota exceeds

2015-03-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-2360.
---
Resolution: Not a Problem

The last line of the command output (excluding the WARN log and its stack 
trace) does today print the underlying reason in a message that should catch 
the eye clearly:

{code}
put: The DiskSpace quota of /testDir is exceeded: quota = 1024 B = 1 KB but 
diskspace consumed = 402653184 B = 384 MB
{code}

Resolving this, as that message should be clear enough. To get rid of the WARN, 
the client logger can be nullified, but the catch layer is rather generic today, 
so specifically turning it off would risk impacting other use cases and failure 
modes, I think.
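
One way to do that, as a minimal sketch assuming log4j 1.x (as bundled with Hadoop at the time) and assuming the noisy logger is named {{org.apache.hadoop.hdfs.DFSClient}}, matching the "WARN hdfs.DFSClient" prefix in the original report:

{code}
import org.apache.log4j.Level;
import org.apache.log4j.Logger;

public class SilenceDfsClientWarn {
  public static void main(String[] args) {
    // Raise the DFSClient logger threshold so the DataStreamer WARN and its
    // stack trace are suppressed; the final "quota ... exceeded" line remains.
    Logger.getLogger("org.apache.hadoop.hdfs.DFSClient").setLevel(Level.ERROR);
    // ...run the client operation here...
  }
}
{code}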

As always though, feel free to reopen with any counter-point.

> Ugly stacktrace when quota exceeds
> --
>
> Key: HDFS-2360
> URL: https://issues.apache.org/jira/browse/HDFS-2360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 0.23.0
>Reporter: Rajit Saha
>Priority: Minor
>
> Will it be better to catch the exception and throw a small, reasonable message 
> to the user when they exceed the quota?
> $hdfs  dfs -mkdir testDir
> $hdfs  dfsadmin -setSpaceQuota 191M  testDir
> $hdfs dfs -count -q testDir
> none inf   200278016   2002780161 
>0  0
> hdfs://:/user/hdfsqa/testDir
> $hdfs dfs -put /etc/passwd /user/hadoopqa/testDir 
> 11/09/19 08:08:15 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /user/hdfsqa/testDir is exceeded:
> quota=191.0m diskspace consumed=768.0m
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:365)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
> at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1100)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:972)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
> Caused by: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The 
> DiskSpace quota of /user/hdfsqa/testDir is
> exceeded: quota=191.0m diskspace consumed=768.0m
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.jav

[jira] [Resolved] (HDFS-3349) DFSAdmin fetchImage command should initialize security credentials

2015-03-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-3349.
---
  Resolution: Cannot Reproduce
Target Version/s:   (was: 2.0.0-alpha)

Trying it without credentials throws the proper response back (No tgt). I 
think this is stale given Aaron's comment as well; marking as resolved.

> DFSAdmin fetchImage command should initialize security credentials
> --
>
> Key: HDFS-3349
> URL: https://issues.apache.org/jira/browse/HDFS-3349
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha
>Reporter: Aaron T. Myers
>Priority: Minor
>
> The `hdfs dfsadmin -fetchImage' command should fetch the fsimage using the 
> appropriate credentials if security is enabled.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-5740) getmerge file system shell command needs error message for user error

2015-03-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-5740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-5740.
---
Resolution: Not a Problem

This is no longer an issue on branch-2 and trunk today. The command accepts a 
collection of files now, and prepares the output accordingly.
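
For reference, a rough sketch of such a merge using only the public FileSystem API (a hedged illustration under the assumption that every entry under the source directory is a plain file to concatenate; this is not the FsShell implementation):

{code}
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

// Rough sketch: concatenate the files under srcDir into one local file,
// roughly what "hadoop fs -getmerge <srcDir> <localDst>" does.
public class GetMergeSketch {
  public static void merge(Configuration conf, Path srcDir, Path localDst)
      throws Exception {
    FileSystem srcFs = srcDir.getFileSystem(conf);
    FileSystem localFs = FileSystem.getLocal(conf);
    try (OutputStream out = localFs.create(localDst, true)) {
      for (FileStatus stat : srcFs.listStatus(srcDir)) {
        if (!stat.isFile()) {
          continue;                                // skip subdirectories
        }
        try (InputStream in = srcFs.open(stat.getPath())) {
          IOUtils.copyBytes(in, out, 4096, false); // keep the output stream open
        }
      }
    }
  }
}
{code}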

> getmerge file system shell command needs error message for user error
> -
>
> Key: HDFS-5740
> URL: https://issues.apache.org/jira/browse/HDFS-5740
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs-client
>Affects Versions: 1.1.2
> Environment: {noformat}[jpfuntner@h58 tmp]$ cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.0 (Santiago)
> [jpfuntner@h58 tmp]$ hadoop version
> Hadoop 1.1.2.21
> Subversion  -r 
> Compiled by jenkins on Thu Jan 10 03:38:39 PST 2013
> From source with checksum ce0aa0de785f572347f1afee69c73861{noformat}
>Reporter: John Pfuntner
>Priority: Minor
>
> I naively tried a {{getmerge}} operation but it didn't seem to do anything 
> and there was no error message:
> {noformat}[jpfuntner@h58 tmp]$ hadoop fs -mkdir /user/jpfuntner/tmp
> [jpfuntner@h58 tmp]$ num=0; while [ $num -lt 5 ]; do echo file$num | hadoop 
> fs -put - /user/jpfuntner/tmp/file$num; let num=num+1; done
> [jpfuntner@h58 tmp]$ ls -A
> [jpfuntner@h58 tmp]$ hadoop fs -getmerge /user/jpfuntner/tmp/file* files.txt
> [jpfuntner@h58 tmp]$ ls -A
> [jpfuntner@h58 tmp]$ hadoop fs -ls /user/jpfuntner/tmp
> Found 5 items
> -rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
> /user/jpfuntner/tmp/file0
> -rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
> /user/jpfuntner/tmp/file1
> -rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
> /user/jpfuntner/tmp/file2
> -rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
> /user/jpfuntner/tmp/file3
> -rw---   3 jpfuntner hdfs  6 2014-01-08 17:37 
> /user/jpfuntner/tmp/file4
> [jpfuntner@h58 tmp]$ {noformat}
> It was pointed out to me that I made a mistake and my source should have been 
> a directory not a set of regular files.  It works if I use the directory:
> {noformat}[jpfuntner@h58 tmp]$ hadoop fs -getmerge /user/jpfuntner/tmp/ 
> files.txt
> [jpfuntner@h58 tmp]$ ls -A
> files.txt  .files.txt.crc
> [jpfuntner@h58 tmp]$ cat files.txt
> file0
> file1
> file2
> file3
> file4
> [jpfuntner@h58 tmp]$ {noformat}
> I think the {{getmerge}} command should issue an error message to let the 
> user know they made a mistake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-147) Stack trace on spaceQuota excced .

2015-03-16 Thread Xiaoyu Yao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xiaoyu Yao resolved HDFS-147.
-
Resolution: Duplicate

> Stack trace on spaceQuota excced .
> --
>
> Key: HDFS-147
> URL: https://issues.apache.org/jira/browse/HDFS-147
> Project: Hadoop HDFS
>  Issue Type: Bug
> Environment: All
>Reporter: Ravi Phulari
>Assignee: Xiaoyu Yao
>  Labels: newbie
>
> Currently the disk space quota exceeded exception spits out a stack trace. It 
> would be better to show an error message instead of a stack trace.
> {code}
> somehost:Hadoop guesti$ bin/hdfs dfsadmin -setSpaceQuota 2 2344
> somehost:Hadoop guest$ bin/hadoop fs -put conf 2344
> 09/06/19 16:44:30 WARN hdfs.DFSClient: DataStreamer Exception: 
> org.apache.hadoop.hdfs.protocol.QuotaExceededException: 
> org.apache.hadoop.hdfs.protocol.QuotaExceededException: The quota of 
> /user/guest/2344 is exceeded: namespace quota=-1 file count=4, diskspace 
> quota=2 diskspace=67108864
>   at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>   at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
>   at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> ..
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-4494) Confusing exception for unresolvable hdfs host with security enabled

2015-03-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-4494.
---
  Resolution: Done
Target Version/s: 2.1.0-beta, 3.0.0  (was: 3.0.0, 2.1.0-beta)

This seems resolved now (as of 2.6.0):

{code}
[root@host ~]# hdfs getconf -confKey hadoop.security.authentication
kerberos
[root@host ~]# hadoop fs -ls hdfs://asdfsdfsdf/
-ls: java.net.UnknownHostException: asdfsdfsdf
Usage: hadoop fs [generic options] -ls [-d] [-h] [-R] [ ...]
{code}

Marking as Done.

> Confusing exception for unresolvable hdfs host with security enabled
> 
>
> Key: HDFS-4494
> URL: https://issues.apache.org/jira/browse/HDFS-4494
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 0.23.0, 2.0.0-alpha, 3.0.0
>Reporter: Daryn Sharp
>Priority: Minor
>
> {noformat}
> $ hadoop fs -ls hdfs://unresolvable-host
> ls: Can't replace _HOST pattern since client address is null
> {noformat}
> It's misleading because it's not even related to the client's address.  It'd 
> be a bit more informative to see something like "{{UnknownHostException: 
> unresolvable-host}}".



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Reopened] (HDFS-2360) Ugly stacktrace when quota exceeds

2015-03-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer reopened HDFS-2360:


OK, then let me re-open it.

Having oodles of useless stack trace here is *incredibly* user-unfriendly.  
Users do miss this message very very often because, believe it or not, they 
aren't Java programmers who are used to reading these things.

> Ugly stacktrace when quota exceeds
> --
>
> Key: HDFS-2360
> URL: https://issues.apache.org/jira/browse/HDFS-2360
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 0.23.0
>Reporter: Rajit Saha
>Priority: Minor
>
> Will it be better to catch the exception and throw a small, reasonable message 
> to the user when they exceed the quota?
> $hdfs  dfs -mkdir testDir
> $hdfs  dfsadmin -setSpaceQuota 191M  testDir
> $hdfs dfs -count -q testDir
> none inf   200278016   2002780161 
>0  0
> hdfs://:/user/hdfsqa/testDir
> $hdfs dfs -put /etc/passwd /user/hadoopqa/testDir 
> 11/09/19 08:08:15 WARN hdfs.DFSClient: DataStreamer Exception
> org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The DiskSpace quota 
> of /user/hdfsqa/testDir is exceeded:
> quota=191.0m diskspace consumed=768.0m
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at 
> org.apache.hadoop.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:365)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1496)
> at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1492)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:396)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1490)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native 
> Method)
> at 
> sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
> at 
> sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
> at 
> org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:90)
> at 
> org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:57)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.locateFollowingBlock(DFSOutputStream.java:1100)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:972)
> at 
> org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
> Caused by: org.apache.hadoop.hdfs.protocol.DSQuotaExceededException: The 
> DiskSpace quota of /user/hdfsqa/testDir is
> exceeded: quota=191.0m diskspace consumed=768.0m
> at 
> org.apache.hadoop.hdfs.server.namenode.INodeDirectoryWithQuota.verifyQuota(INodeDirectoryWithQuota.java:159)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.verifyQuota(FSDirectory.java:1609)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.updateCount(FSDirectory.java:1383)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSDirectory.addBlock(FSDirectory.java:370)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:1681)
> at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1476)
> at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:389)
> at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMe

[jira] [Resolved] (HDFS-4290) Expose an event listener interface in DFSOutputStreams for block write pipeline status changes

2015-03-16 Thread Harsh J (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4290?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J resolved HDFS-4290.
---
Resolution: Later

Specific problems/use-cases driving this need haven't been brought up in the 
past years. Resolving as Later for now.

> Expose an event listener interface in DFSOutputStreams for block write 
> pipeline status changes
> --
>
> Key: HDFS-4290
> URL: https://issues.apache.org/jira/browse/HDFS-4290
> Project: Hadoop HDFS
>  Issue Type: New Feature
>  Components: hdfs-client
>Affects Versions: 2.0.0-alpha
>Reporter: Harsh J
>Priority: Minor
>
> I've noticed HBase periodically polls the current status of block replicas 
> for its HLog files via the API presented by HDFS-826.
> It would perhaps be better for such clients if they could register a listener 
> instead. The listener(s) could be sent an event when things change in the 
> last open block (e.g., a DN drops out but no replacement is found). This 
> would avoid a periodic, parallel polling loop in such clients and be 
> more efficient.
> Just a thought :)
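
Purely as a hypothetical illustration of the proposal (resolved as Later, so no such API exists and every name below is invented), such a listener might look like:

{code}
// Hypothetical sketch only -- not part of DFSOutputStream.
public interface PipelineStatusListener {
  /** Called when the set of datanodes in the write pipeline changes. */
  void onPipelineChange(String blockId, int liveReplicas, int expectedReplicas);

  /** Called when a failed datanode could not be replaced in the pipeline. */
  void onReplacementFailure(String blockId, String failedDatanode);
}
{code}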



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: upstream jenkins build broken?

2015-03-16 Thread Colin P. McCabe
If all it takes is someone creating a test that makes a directory
without -x, this is going to happen over and over.

Let's just fix the problem at the root by running "git clean -fqdx" in
our jenkins scripts.  If there's no objections I will add this in and
un-break the builds.

best,
Colin

On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu  wrote:
> I filed HDFS-7917 to change the way to simulate disk failures.
>
> But I think we still need infrastructure folks to help with jenkins
> scripts to clean the dirs left today.
>
> On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui  wrote:
>> Any updates on this issues? It seems that all HDFS jenkins builds are
>> still failing.
>>
>> Regards,
>> Haohui
>>
>> On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B  
>> wrote:
>>> I think the problem started from here.
>>>
>>> https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/
>>>
>>> As Chris mentioned, TestDataNodeVolumeFailure changes the permissions.
>>> But in this patch, ReplicationMonitor got an NPE and then a terminate signal,
>>> due to which MiniDFSCluster.shutdown() threw an exception.
>>>
>>> TestDataNodeVolumeFailure#tearDown() only restores those permissions
>>> after shutting down the cluster. So in this case, IMO, the permissions were never
>>> restored.
>>>
>>>
>>>   @After
>>>   public void tearDown() throws Exception {
>>> if(data_fail != null) {
>>>   FileUtil.setWritable(data_fail, true);
>>> }
>>> if(failedDir != null) {
>>>   FileUtil.setWritable(failedDir, true);
>>> }
>>> if(cluster != null) {
>>>   cluster.shutdown();
>>> }
>>> for (int i = 0; i < 3; i++) {
>>>   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+1)), true);
>>>   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+2)), true);
>>> }
>>>   }
>>>
>>>
>>> Regards,
>>> Vinay
>>>
>>> On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B 
>>> wrote:
>>>
 When I see the history of these kind of builds, All these are failed on
 node H9.

 I think some or the other uncommitted patch would have created the problem
 and left it there.


 Regards,
 Vinay

 On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey  wrote:

> You could rely on a destructive git clean call instead of maven to do the
> directory removal.
>
> --
> Sean
> On Mar 11, 2015 4:11 PM, "Colin McCabe"  wrote:
>
> > Is there a maven plugin or setting we can use to simply remove
> > directories that have no executable permissions on them?  Clearly we
> > have the permission to do this from a technical point of view (since
> > we created the directories as the jenkins user), it's simply that the
> > code refuses to do it.
> >
> > Otherwise I guess we can just fix those tests...
> >
> > Colin
> >
> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
> > > Thanks a lot for looking into HDFS-7722, Chris.
> > >
> > > In HDFS-7722:
> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
> > TearDown().
> > > TestDataNodeHotSwapVolumes reset permissions in a finally clause.
> > >
> > > Also I ran mvn test several times on my machine and all tests passed.
> > >
> > > However, since in DiskChecker#checkDirAccess():
> > >
> > > private static void checkDirAccess(File dir) throws
> DiskErrorException {
> > >   if (!dir.isDirectory()) {
> > > throw new DiskErrorException("Not a directory: "
> > >  + dir.toString());
> > >   }
> > >
> > >   checkAccessByFileMethods(dir);
> > > }
> > >
> > > One potentially safer alternative is replacing the data dir with a regular
> > > file to simulate disk failures.
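
A rough sketch of that alternative (assuming commons-io is available; the names are illustrative and this is not the HDFS-7917 patch): replace the directory with a plain file so checkDirAccess() fails its isDirectory() check, then recreate the directory during teardown.

import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;

public class SimulatedVolumeFailure {
  // Replace the data directory with a regular file; checkDirAccess() then
  // throws "Not a directory" without removing any execute permissions.
  static void failVolume(File dataDir) throws IOException {
    FileUtils.deleteDirectory(dataDir);
    if (!dataDir.createNewFile()) {
      throw new IOException("could not create placeholder " + dataDir);
    }
  }

  // Undo the simulated failure so later builds see a clean workspace.
  static void restoreVolume(File dataDir) throws IOException {
    if (!dataDir.delete() || !dataDir.mkdirs()) {
      throw new IOException("could not restore " + dataDir);
    }
  }
}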
> > >
> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
> cnaur...@hortonworks.com>
> > wrote:
> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
> > >> TestDataNodeVolumeFailureReporting, and
> > >> TestDataNodeVolumeFailureToleration all remove executable permissions
> > from
> > >> directories like the one Colin mentioned to simulate disk failures at
> > data
> > >> nodes.  I reviewed the code for all of those, and they all appear to
> be
> > >> doing the necessary work to restore executable permissions at the
> end of
> > >> the test.  The only recent uncommitted patch I've seen that makes
> > changes
> > >> in these test suites is HDFS-7722.  That patch still looks fine
> > though.  I
> > >> don't know if there are other uncommitted patches that changed these
> > test
> > >> suites.
> > >>
> > >> I suppose it's also possible that the JUnit process unexpectedly died
> > >> after removing executable permissions but before restoring them.
> That
> > >> always would have been a wea

Re: upstream jenkins build broken?

2015-03-16 Thread Chris Nauroth
+1 for the git clean command.

HDFS-7917 still might be valuable for enabling us to run a few unit tests
on Windows that are currently skipped.  Let's please keep it open, but
it's less urgent.

Thanks!

Chris Nauroth
Hortonworks
http://hortonworks.com/






On 3/16/15, 11:54 AM, "Colin P. McCabe"  wrote:

>If all it takes is someone creating a test that makes a directory
>without -x, this is going to happen over and over.
>
>Let's just fix the problem at the root by running "git clean -fqdx" in
>our jenkins scripts.  If there's no objections I will add this in and
>un-break the builds.
>
>best,
>Colin
>
>On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu  wrote:
>> I filed HDFS-7917 to change the way to simulate disk failures.
>>
>> But I think we still need infrastructure folks to help with jenkins
>> scripts to clean the dirs left today.
>>
>> On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui  wrote:
>>> Any updates on this issues? It seems that all HDFS jenkins builds are
>>> still failing.
>>>
>>> Regards,
>>> Haohui
>>>
>>> On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B
>>> wrote:
 I think the problem started from here.

 
https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/

 As Chris mentioned TestDataNodeVolumeFailure is changing the
permission.
 But in this patch, ReplicationMonitor got NPE and it got terminate
signal,
 due to which MiniDFSCluster.shutdown() throwing Exception.

 But, TestDataNodeVolumeFailure#teardown() is restoring those
permission
 after shutting down cluster. So in this case IMO, permissions were
never
 restored.


   @After
   public void tearDown() throws Exception {
 if(data_fail != null) {
   FileUtil.setWritable(data_fail, true);
 }
 if(failedDir != null) {
   FileUtil.setWritable(failedDir, true);
 }
 if(cluster != null) {
   cluster.shutdown();
 }
 for (int i = 0; i < 3; i++) {
   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+1)), true);
   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+2)), true);
 }
   }


 Regards,
 Vinay

 On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B

 wrote:

> When I see the history of these kind of builds, All these are failed
>on
> node H9.
>
> I think some or the other uncommitted patch would have created the
>problem
> and left it there.
>
>
> Regards,
> Vinay
>
> On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey 
>wrote:
>
>> You could rely on a destructive git clean call instead of maven to
>>do the
>> directory removal.
>>
>> --
>> Sean
>> On Mar 11, 2015 4:11 PM, "Colin McCabe" 
>>wrote:
>>
>> > Is there a maven plugin or setting we can use to simply remove
>> > directories that have no executable permissions on them?  Clearly
>>we
>> > have the permission to do this from a technical point of view
>>(since
>> > we created the directories as the jenkins user), it's simply that
>>the
>> > code refuses to do it.
>> >
>> > Otherwise I guess we can just fix those tests...
>> >
>> > Colin
>> >
>> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
>> > > Thanks a lot for looking into HDFS-7722, Chris.
>> > >
>> > > In HDFS-7722:
>> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
>> > TearDown().
>> > > TestDataNodeHotSwapVolumes reset permissions in a finally
>>clause.
>> > >
>> > > Also I ran mvn test several times on my machine and all tests
>>passed.
>> > >
>> > > However, since in DiskChecker#checkDirAccess():
>> > >
>> > > private static void checkDirAccess(File dir) throws
>> DiskErrorException {
>> > >   if (!dir.isDirectory()) {
>> > > throw new DiskErrorException("Not a directory: "
>> > >  + dir.toString());
>> > >   }
>> > >
>> > >   checkAccessByFileMethods(dir);
>> > > }
>> > >
>> > > One potentially safer alternative is replacing data dir with a
>>regular
>> > > file to stimulate disk failures.
>> > >
>> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
>> cnaur...@hortonworks.com>
>> > wrote:
>> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>> > >> TestDataNodeVolumeFailureReporting, and
>> > >> TestDataNodeVolumeFailureToleration all remove executable
>>permissions
>> > from
>> > >> directories like the one Colin mentioned to simulate disk
>>failures at
>> > data
>> > >> nodes.  I reviewed the code for all of those, and they all
>>appear to
>> be
>> > >> doing the necessary work to restore executabl

[jira] [Resolved] (HDFS-7886) TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes

2015-03-16 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HDFS-7886.
---
  Resolution: Fixed
Hadoop Flags: Reviewed

I just committed this. Thanks everybody.

> TestFileTruncate#testTruncateWithDataNodesRestart runs timeout sometimes
> 
>
> Key: HDFS-7886
> URL: https://issues.apache.org/jira/browse/HDFS-7886
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.7.0
>Reporter: Yi Liu
>Assignee: Plamen Jeliazkov
>Priority: Minor
> Attachments: HDFS-7886-01.patch, HDFS-7886-02.patch, 
> HDFS-7886-branch2.patch, HDFS-7886.patch
>
>
> https://builds.apache.org/job/PreCommit-HDFS-Build/9730//testReport/



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-2771) Move Federation and WebHDFS documentation into HDFS project

2015-03-16 Thread Allen Wittenauer (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Allen Wittenauer resolved HDFS-2771.

  Resolution: Implemented
Target Version/s:   (was: )

fixed eons ago

> Move Federation and WebHDFS documentation into HDFS project
> ---
>
> Key: HDFS-2771
> URL: https://issues.apache.org/jira/browse/HDFS-2771
> Project: Hadoop HDFS
>  Issue Type: Task
>  Components: webhdfs
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>  Labels: newbie
>
> For some strange reason, the WebHDFS and Federation documentation is 
> currently in the hadoop-yarn site. This is counter-intuitive. We should move 
> these documents to an hdfs site, or if we think that all documentation should 
> go on one site, it should go into the hadoop-common project somewhere.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: upstream jenkins build broken?

2015-03-16 Thread Haohui Mai
+1 for git clean.

Colin, can you please get it in ASAP? Currently due to the jenkins
issues, we cannot close the 2.7 blockers.

Thanks,
Haohui



On Mon, Mar 16, 2015 at 11:54 AM, Colin P. McCabe  wrote:
> If all it takes is someone creating a test that makes a directory
> without -x, this is going to happen over and over.
>
> Let's just fix the problem at the root by running "git clean -fqdx" in
> our jenkins scripts.  If there's no objections I will add this in and
> un-break the builds.
>
> best,
> Colin
>
> On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu  wrote:
>> I filed HDFS-7917 to change the way to simulate disk failures.
>>
>> But I think we still need infrastructure folks to help with jenkins
>> scripts to clean the dirs left today.
>>
>> On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui  wrote:
>>> Any updates on this issues? It seems that all HDFS jenkins builds are
>>> still failing.
>>>
>>> Regards,
>>> Haohui
>>>
>>> On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B  
>>> wrote:
 I think the problem started from here.

 https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/

 As Chris mentioned TestDataNodeVolumeFailure is changing the permission.
 But in this patch, ReplicationMonitor got NPE and it got terminate signal,
 due to which MiniDFSCluster.shutdown() throwing Exception.

 But, TestDataNodeVolumeFailure#teardown() is restoring those permission
 after shutting down cluster. So in this case IMO, permissions were never
 restored.


   @After
   public void tearDown() throws Exception {
 if(data_fail != null) {
   FileUtil.setWritable(data_fail, true);
 }
 if(failedDir != null) {
   FileUtil.setWritable(failedDir, true);
 }
 if(cluster != null) {
   cluster.shutdown();
 }
 for (int i = 0; i < 3; i++) {
   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+1)), true);
   FileUtil.setExecutable(new File(dataDir, "data"+(2*i+2)), true);
 }
   }


 Regards,
 Vinay

 On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B 
 wrote:

> When I see the history of these kind of builds, All these are failed on
> node H9.
>
> I think some or the other uncommitted patch would have created the problem
> and left it there.
>
>
> Regards,
> Vinay
>
> On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey  wrote:
>
>> You could rely on a destructive git clean call instead of maven to do the
>> directory removal.
>>
>> --
>> Sean
>> On Mar 11, 2015 4:11 PM, "Colin McCabe"  wrote:
>>
>> > Is there a maven plugin or setting we can use to simply remove
>> > directories that have no executable permissions on them?  Clearly we
>> > have the permission to do this from a technical point of view (since
>> > we created the directories as the jenkins user), it's simply that the
>> > code refuses to do it.
>> >
>> > Otherwise I guess we can just fix those tests...
>> >
>> > Colin
>> >
>> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
>> > > Thanks a lot for looking into HDFS-7722, Chris.
>> > >
>> > > In HDFS-7722:
>> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
>> > TearDown().
>> > > TestDataNodeHotSwapVolumes reset permissions in a finally clause.
>> > >
>> > > Also I ran mvn test several times on my machine and all tests passed.
>> > >
>> > > However, since in DiskChecker#checkDirAccess():
>> > >
>> > > private static void checkDirAccess(File dir) throws
>> DiskErrorException {
>> > >   if (!dir.isDirectory()) {
>> > > throw new DiskErrorException("Not a directory: "
>> > >  + dir.toString());
>> > >   }
>> > >
>> > >   checkAccessByFileMethods(dir);
>> > > }
>> > >
>> > > One potentially safer alternative is replacing data dir with a 
>> > > regular
>> > > file to stimulate disk failures.
>> > >
>> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
>> cnaur...@hortonworks.com>
>> > wrote:
>> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
>> > >> TestDataNodeVolumeFailureReporting, and
>> > >> TestDataNodeVolumeFailureToleration all remove executable 
>> > >> permissions
>> > from
>> > >> directories like the one Colin mentioned to simulate disk failures 
>> > >> at
>> > data
>> > >> nodes.  I reviewed the code for all of those, and they all appear to
>> be
>> > >> doing the necessary work to restore executable permissions at the
>> end of
>> > >> the test.  The only recent uncommitted patch I've seen that makes
>> > changes
>> > >> in these test suites is HDFS-7

Re: upstream jenkins build broken?

2015-03-16 Thread Sean Busbey
I'm on it. HADOOP-11721

On Mon, Mar 16, 2015 at 3:44 PM, Haohui Mai  wrote:

> +1 for git clean.
>
> Colin, can you please get it in ASAP? Currently due to the jenkins
> issues, we cannot close the 2.7 blockers.
>
> Thanks,
> Haohui
>
>
>
> On Mon, Mar 16, 2015 at 11:54 AM, Colin P. McCabe 
> wrote:
> > If all it takes is someone creating a test that makes a directory
> > without -x, this is going to happen over and over.
> >
> > Let's just fix the problem at the root by running "git clean -fqdx" in
> > our jenkins scripts.  If there's no objections I will add this in and
> > un-break the builds.
> >
> > best,
> > Colin
> >
> > On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu  wrote:
> >> I filed HDFS-7917 to change the way to simulate disk failures.
> >>
> >> But I think we still need infrastructure folks to help with jenkins
> >> scripts to clean the dirs left today.
> >>
> >> On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui  wrote:
> >>> Any updates on this issues? It seems that all HDFS jenkins builds are
> >>> still failing.
> >>>
> >>> Regards,
> >>> Haohui
> >>>
> >>> On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B <
> vinayakum...@apache.org> wrote:
>  I think the problem started from here.
> 
> 
> https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/
> 
>  As Chris mentioned TestDataNodeVolumeFailure is changing the
> permission.
>  But in this patch, ReplicationMonitor got NPE and it got terminate
> signal,
>  due to which MiniDFSCluster.shutdown() throwing Exception.
> 
>  But, TestDataNodeVolumeFailure#teardown() is restoring those
> permission
>  after shutting down cluster. So in this case IMO, permissions were
> never
>  restored.
> 
> 
>    @After
>    public void tearDown() throws Exception {
>  if(data_fail != null) {
>    FileUtil.setWritable(data_fail, true);
>  }
>  if(failedDir != null) {
>    FileUtil.setWritable(failedDir, true);
>  }
>  if(cluster != null) {
>    cluster.shutdown();
>  }
>  for (int i = 0; i < 3; i++) {
>    FileUtil.setExecutable(new File(dataDir, "data"+(2*i+1)), true);
>    FileUtil.setExecutable(new File(dataDir, "data"+(2*i+2)), true);
>  }
>    }
> 
> 
>  Regards,
>  Vinay
> 
>  On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B <
> vinayakum...@apache.org>
>  wrote:
> 
> > When I see the history of these kind of builds, All these are failed
> on
> > node H9.
> >
> > I think some or the other uncommitted patch would have created the
> problem
> > and left it there.
> >
> >
> > Regards,
> > Vinay
> >
> > On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey 
> wrote:
> >
> >> You could rely on a destructive git clean call instead of maven to
> do the
> >> directory removal.
> >>
> >> --
> >> Sean
> >> On Mar 11, 2015 4:11 PM, "Colin McCabe" 
> wrote:
> >>
> >> > Is there a maven plugin or setting we can use to simply remove
> >> > directories that have no executable permissions on them?  Clearly
> we
> >> > have the permission to do this from a technical point of view
> (since
> >> > we created the directories as the jenkins user), it's simply that
> the
> >> > code refuses to do it.
> >> >
> >> > Otherwise I guess we can just fix those tests...
> >> >
> >> > Colin
> >> >
> >> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu  wrote:
> >> > > Thanks a lot for looking into HDFS-7722, Chris.
> >> > >
> >> > > In HDFS-7722:
> >> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions in
> >> > TearDown().
> >> > > TestDataNodeHotSwapVolumes reset permissions in a finally
> clause.
> >> > >
> >> > > Also I ran mvn test several times on my machine and all tests
> passed.
> >> > >
> >> > > However, since in DiskChecker#checkDirAccess():
> >> > >
> >> > > private static void checkDirAccess(File dir) throws
> >> DiskErrorException {
> >> > >   if (!dir.isDirectory()) {
> >> > > throw new DiskErrorException("Not a directory: "
> >> > >  + dir.toString());
> >> > >   }
> >> > >
> >> > >   checkAccessByFileMethods(dir);
> >> > > }
> >> > >
> >> > > One potentially safer alternative is replacing data dir with a
> regular
> >> > > file to stimulate disk failures.
> >> > >
> >> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
> >> cnaur...@hortonworks.com>
> >> > wrote:
> >> > >> TestDataNodeHotSwapVolumes, TestDataNodeVolumeFailure,
> >> > >> TestDataNodeVolumeFailureReporting, and
> >> > >> TestDataNodeVolumeFailureToleration all remove executable
> permissions
> >> > from
> >> > >> directories li

Re: upstream jenkins build broken?

2015-03-16 Thread Sean Busbey
Can someone point me to an example build that is broken?

On Mon, Mar 16, 2015 at 3:52 PM, Sean Busbey  wrote:

> I'm on it. HADOOP-11721
>
> On Mon, Mar 16, 2015 at 3:44 PM, Haohui Mai  wrote:
>
>> +1 for git clean.
>>
>> Colin, can you please get it in ASAP? Currently due to the jenkins
>> issues, we cannot close the 2.7 blockers.
>>
>> Thanks,
>> Haohui
>>
>>
>>
>> On Mon, Mar 16, 2015 at 11:54 AM, Colin P. McCabe 
>> wrote:
>> > If all it takes is someone creating a test that makes a directory
>> > without -x, this is going to happen over and over.
>> >
>> > Let's just fix the problem at the root by running "git clean -fqdx" in
>> > our jenkins scripts.  If there are no objections I will add this in and
>> > un-break the builds.
>> >
>> > best,
>> > Colin
>> >
>> > On Fri, Mar 13, 2015 at 1:48 PM, Lei Xu  wrote:
>> >> I filed HDFS-7917 to change the way to simulate disk failures.
>> >>
>> >> But I think we still need infrastructure folks to help with jenkins
>> >> scripts to clean the dirs left today.
>> >>
>> >> On Fri, Mar 13, 2015 at 1:38 PM, Mai Haohui 
>> wrote:
>> >>> Any updates on this issues? It seems that all HDFS jenkins builds are
>> >>> still failing.
>> >>>
>> >>> Regards,
>> >>> Haohui
>> >>>
>> >>> On Thu, Mar 12, 2015 at 12:53 AM, Vinayakumar B <
>> vinayakum...@apache.org> wrote:
>>  I think the problem started from here.
>> 
>> 
>> https://builds.apache.org/job/PreCommit-HDFS-Build/9828/testReport/junit/org.apache.hadoop.hdfs.server.datanode/TestDataNodeVolumeFailure/testUnderReplicationAfterVolFailure/
>> 
>>  As Chris mentioned, TestDataNodeVolumeFailure is changing the
>>  permissions. But in this patch, ReplicationMonitor got an NPE and
>>  received a terminate signal, due to which MiniDFSCluster.shutdown()
>>  threw an exception.
>> 
>>  But TestDataNodeVolumeFailure#tearDown() restores those permissions
>>  only after shutting down the cluster. So in this case, IMO, the
>>  permissions were never restored.
>> 
>> 
>>    @After
>>    public void tearDown() throws Exception {
>>  if(data_fail != null) {
>>    FileUtil.setWritable(data_fail, true);
>>  }
>>  if(failedDir != null) {
>>    FileUtil.setWritable(failedDir, true);
>>  }
>>  if(cluster != null) {
>>    cluster.shutdown();
>>  }
>>  for (int i = 0; i < 3; i++) {
>>    FileUtil.setExecutable(new File(dataDir, "data"+(2*i+1)), true);
>>    FileUtil.setExecutable(new File(dataDir, "data"+(2*i+2)), true);
>>  }
>>    }
>> 
>> 
>>  Regards,
>>  Vinay
>> 
>>  On Thu, Mar 12, 2015 at 12:35 PM, Vinayakumar B <
>> vinayakum...@apache.org>
>>  wrote:
>> 
>> > When I look at the history of these kinds of builds, all of them
>> > failed on node H9.
>> >
>> > I think some uncommitted patch or other would have created the
>> > problem and left it there.
>> >
>> >
>> > Regards,
>> > Vinay
>> >
>> > On Thu, Mar 12, 2015 at 6:16 AM, Sean Busbey 
>> wrote:
>> >
>> >> You could rely on a destructive git clean call instead of maven to
>> do the
>> >> directory removal.
>> >>
>> >> --
>> >> Sean
>> >> On Mar 11, 2015 4:11 PM, "Colin McCabe" 
>> wrote:
>> >>
>> >> > Is there a maven plugin or setting we can use to simply remove
>> >> > directories that have no executable permissions on them?
>> Clearly we
>> >> > have the permission to do this from a technical point of view
>> (since
>> >> > we created the directories as the jenkins user), it's simply
>> that the
>> >> > code refuses to do it.
>> >> >
>> >> > Otherwise I guess we can just fix those tests...
>> >> >
>> >> > Colin
>> >> >
>> >> > On Tue, Mar 10, 2015 at 2:43 PM, Lei Xu 
>> wrote:
>> >> > > Thanks a lot for looking into HDFS-7722, Chris.
>> >> > >
>> >> > > In HDFS-7722:
>> >> > > TestDataNodeVolumeFailureXXX tests reset data dir permissions
>> in
>> >> > TearDown().
>> >> > > TestDataNodeHotSwapVolumes reset permissions in a finally
>> clause.
>> >> > >
>> >> > > Also I ran mvn test several times on my machine and all tests
>> passed.
>> >> > >
>> >> > > However, since in DiskChecker#checkDirAccess():
>> >> > >
>> >> > > private static void checkDirAccess(File dir) throws DiskErrorException {
>> >> > >   if (!dir.isDirectory()) {
>> >> > > throw new DiskErrorException("Not a directory: "
>> >> > >  + dir.toString());
>> >> > >   }
>> >> > >
>> >> > >   checkAccessByFileMethods(dir);
>> >> > > }
>> >> > >
>> >> > > One potentially safer alternative is replacing the data dir with a
>> >> > > regular file to simulate disk failures.
>> >> > >
>> >> > > On Tue, Mar 10, 2015 at 2:19 PM, Chris Nauroth <
>> >> cnaur...@hortonwor

[jira] [Created] (HDFS-7936) Erasure coding: resolving conflicts when merging with HDFS-7903 and HDFS-7435

2015-03-16 Thread Zhe Zhang (JIRA)
Zhe Zhang created HDFS-7936:
---

 Summary: Erasure coding: resolving conflicts when merging with 
HDFS-7903 and HDFS-7435
 Key: HDFS-7936
 URL: https://issues.apache.org/jira/browse/HDFS-7936
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Zhe Zhang
Assignee: Zhe Zhang


A few non-trivial conflicts were found when merging trunk changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7936) Erasure coding: resolving conflicts when merging with HDFS-7903 and HDFS-7435

2015-03-16 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7936?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-7936.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

I've committed this.

> Erasure coding: resolving conflicts when merging with HDFS-7903 and HDFS-7435
> -
>
> Key: HDFS-7936
> URL: https://issues.apache.org/jira/browse/HDFS-7936
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Zhe Zhang
>Assignee: Zhe Zhang
> Fix For: HDFS-7285
>
> Attachments: HDFS-7936-001.patch, HDFS-7936-002.patch
>
>
> A few non-trivial conflicts were found when merging trunk changes. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Hadoop - Major releases

2015-03-16 Thread Andrew Wang
I took the liberty of adding line breaks to Joep's mail.

Thanks for the great feedback Joep. The goal with 3.x is to maintain API
and wire compatibility with 2.x, which I think addresses most of your
concerns. A 2.x client running on JDK7 would then still be able to talk to
a 3.x server running on JDK8. Classpath isolation is also proposed as a
banner feature, which directly addresses g). This might require new
(major?) releases for some downstreams, but the feedback I've heard related
to this has been very positive.

Best,
Andrew

==

It depends on the "Return on Pain". While it is hard to quantify the
returns in the abstract, I can try to sketch out which kinds of changes are
the most painful and therefore cause the most friction for us. In rough
order of increasing pain to deal with:

a) There is a new upstream (3.x)
release, but it is so backwards incompatible, that we won't be able to
adopt it for the foreseeable future. Even though we don’t adopt it, it
still causes pain. Now development becomes that much harder because we'd
have to get a patch for trunk, a patch for 3.x and a patch for the 2.x
branch. Conversely if patches go into 2.x only, now the releases start
drifting apart. We already have (several dozen) patches in production that
have not yet made it upstream, but are striving to keep this list as short
as possible to reduce the rebase pain and risk.

b) Central daemons (the RM, or
pairs of HA NNs) have to be restarted, causing a cluster-wide outage. The
work towards work-preserving restart that is in progress in various areas
makes these kinds of upgrades less painful.

c) The server side requires a different
runtime from the client side. We'd have to produce multiple artifacts, but we
could make that work. For example, NN code uses Java 8 features, but
clients can still use Java 7 to submit jobs and read/write HDFS.

Now for the more painful backwards incompatibilities:

d) All clients have to recompile
(a token uses protobuf instead of thrift, an interface becomes an abstract
class or vice versa). Not only do these kinds of changes make a rolling
upgrade impossible; more importantly, they require all our clients to
recompile their code and redeploy their production pipelines in a
coordinated fashion. On top of this, we have multiple large production
clusters, and clients would have to keep multiple incompatible pipelines
running, because we simply cannot upgrade all clusters in all datacenters
at the same time.

e) Customers are forced to restart and can no longer run
with JDK 7 clients because job submission client code or HDFS has started
using JDK 8-only features. Eventually this group will shrink, but for at
least another year, if not more, this will be very painful.

f) Even more painful is
when Yarn/MapReduce APIs change so that customers not only have to
recompile, but also have to change hundreds of scripts / flows in order to
deal with the API change. This problem is compounded by other tools in the
Hadoop ecosystem that would have to deal with these changes. There would be
two different versions of Cascading, HBase, Hive, Pig, Spark, Tez, you name
it.

g) Without proper classpath isolation, third party dependency changes
(guava, protobuf version, etc) are probably as painful as API changes.

h) The HDFS client API gets changed in a backwards-incompatible way, requiring
all clients to change their code, recompile, and restart their services in a
coordinated way. We have tens of thousands of production servers reading
from / writing to Hadoop and cannot have all of these long-running clients
restart at the same time.

To put these in perspective, despite us being one
of the early adopters of Hadoop 2 in production at the scale of many
thousands of nodes, we are still wrapping up the migration from our last
Hadoop 1 clusters. We have many war stories about many of the above
incompatibilities. As I've tweeted about publicly, the gains have been
significant with this migration to Hadoop 2, but the friction has also been
considerable.

To get specific about JDK 8, we are intending to move to Java
8. Right now we're letting clients choose to run tasks with JDK 8
optionally; then we'll make it the default. After that we'll switch to
running the daemons with JDK 8. What we're concerned with is whether it
would then be feasible to use JDK 8 features on the server side (see c)
above).

I'm suggesting that if we do
allow backwards-incompatible changes, we introduce an upgrade path through
an agreed-upon stepping-stone release. For example, a protocol changing from
thrift to protobuf can be done in steps. In the stepping-stone release both
would be accepted; in the following release (or two releases later) the
thrift version support is dropped. This would allow for a rolling upgrade,
or, even if a cluster-wide restart is needed, at least customers can adapt
to the change at a pace of weeks or months. Once no more (important)
customers are running the thrift client, we could then roll to the next
release. It would be useful to coordinate the backward

[jira] [Created] (HDFS-7937) Erasure Coding: INodeFile quota computation unit tests

2015-03-16 Thread Kai Sasaki (JIRA)
Kai Sasaki created HDFS-7937:


 Summary: Erasure Coding: INodeFile quota computation unit tests
 Key: HDFS-7937
 URL: https://issues.apache.org/jira/browse/HDFS-7937
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Kai Sasaki
Assignee: Kai Sasaki
Priority: Minor


Unit test for [HDFS-7826|https://issues.apache.org/jira/browse/HDFS-7826]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7826) Erasure Coding: Update INodeFile quota computation for striped blocks

2015-03-16 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-7826.
-
   Resolution: Fixed
Fix Version/s: HDFS-7285
 Hadoop Flags: Reviewed

I've committed this. Thanks for the contribution, Kai!

> Erasure Coding: Update INodeFile quota computation for striped blocks
> -
>
> Key: HDFS-7826
> URL: https://issues.apache.org/jira/browse/HDFS-7826
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Kai Sasaki
> Fix For: HDFS-7285
>
> Attachments: HDFS-7826.1.patch, HDFS-7826.2.patch, HDFS-7826.3.patch, 
> HDFS-7826.4.patch, HDFS-7826.5.patch
>
>
> Currently INodeFile's quota computation only considers contiguous blocks 
> (i.e., {{INodeFile#blocks}}). We need to update it to support striped blocks.
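(For context, an illustrative back-of-the-envelope comparison, not taken
from the patch: it assumes an RS(6,3) striping schema purely as an example
and ignores partial stripes. The point is that striped raw usage is not a
fixed multiple of the logical size the way replication is, which is why the
quota computation has to account for striped blocks separately.)

   // Illustrative only: raw disk usage for a file of fileBytes bytes.
   static long replicatedRawUsage(long fileBytes, int replication) {
     return fileBytes * replication;            // e.g. 600 MB * 3 = 1800 MB
   }

   static long stripedRawUsage(long fileBytes, int dataUnits, int parityUnits) {
     // Data is spread across dataUnits blocks; parityUnits parity blocks are added.
     return fileBytes + (fileBytes / dataUnits) * parityUnits;
     // e.g. 600 MB + (600 MB / 6) * 3 = 900 MB for RS(6,3)
   }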



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (HDFS-7938) OpensslSecureRandom.c pthread_threadid_np usage signature is wrong on 32-bit Mac

2015-03-16 Thread Colin Patrick McCabe (JIRA)
Colin Patrick McCabe created HDFS-7938:
--

 Summary: OpensslSecureRandom.c pthread_threadid_np usage signature 
is wrong on 32-bit Mac
 Key: HDFS-7938
 URL: https://issues.apache.org/jira/browse/HDFS-7938
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 2.7.0
Reporter: Colin Patrick McCabe
Priority: Critical


In OpensslSecureRandom.c, pthread_threadid_np is being used with an unsigned 
long, but the type signature requires a uint64_t.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (HDFS-7620) Change disk quota calculation for EC files

2015-03-16 Thread Jing Zhao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jing Zhao resolved HDFS-7620.
-
Resolution: Duplicate

Hi [~tasanuma0829], I just found that this issue has been solved by HDFS-7826. I 
will resolve this as a duplicate, but please feel free to create new jiras if 
you still see issues there.

> Change disk quota calculation for EC files
> --
>
> Key: HDFS-7620
> URL: https://issues.apache.org/jira/browse/HDFS-7620
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: namenode
>Reporter: Tsz Wo Nicholas Sze
>Assignee: Tsz Wo Nicholas Sze
>
> EC files have different disk space usage than replication.  We need to change 
> the quota calculation to support it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)