[jira] [Created] (HDFS-2915) HA: TestFailureOfSharedDir.testFailureOfSharedDir() has race condition

2012-02-08 Thread Bikas Saha (Created) (JIRA)
HA: TestFailureOfSharedDir.testFailureOfSharedDir() has race condition
--

 Key: HDFS-2915
 URL: https://issues.apache.org/jira/browse/HDFS-2915
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: name-node
Affects Versions: HA branch (HDFS-1623)
Reporter: Bikas Saha
Assignee: Bikas Saha
Priority: Minor


The test deletes the shared edits dir to simulate a failure, then calls 
rollEditLogs() to trigger the deleted dir to be used and fail with an 
IOException. Unfortunately, deleting the shared dir can put the NN in safe mode 
due to lack of space. This causes a SafeModeException to be thrown when 
rollEditLogs() is called. That exception is caught as an IOException in the 
test, but the associated assert in the catch block fails.

This always happens in the debugger, because the delay introduced by stepping 
through causes the safe mode change to happen before rollEditLogs() gets called.
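
A minimal sketch of how the test could tolerate this race (hedged: {{RollOp}} 
stands in for the test's rollEditLogs() call, and the helper is illustrative, 
not the committed fix; SafeModeException is the NN's real exception class):

{noformat}
import java.io.IOException;

import org.apache.hadoop.hdfs.server.namenode.SafeModeException;

class SharedDirFailureCheck {
  interface RollOp {
    void roll() throws IOException;
  }

  /** Returns true when the roll failed as the test expects; false when the
   *  NN raced into safe mode first and the run is inconclusive. */
  static boolean rollFailedOnSharedDir(RollOp op) throws IOException {
    try {
      op.roll();
    } catch (SafeModeException sme) {
      // The race described above: deleting the shared dir dropped free
      // space below the NN's threshold, so it entered safe mode before
      // the roll ever exercised the missing directory.
      return false;
    } catch (IOException expected) {
      return true; // the failure the test is actually probing for
    }
    throw new AssertionError("roll should fail once the shared dir is gone");
  }
}
{noformat}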

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Jenkins build became unstable: Hadoop-Hdfs-0.23-Build #163

2012-02-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/163/




Hadoop-Hdfs-0.23-Build - Build # 163 - Unstable

2012-02-08 Thread Apache Jenkins Server
See https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/163/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 14544 lines...]

main:
[INFO] Executed tasks
[INFO] 
[INFO] --- maven-javadoc-plugin:2.8.1:jar (module-javadocs) @ hadoop-hdfs-project ---
[INFO] Not executing Javadoc as the project is not a Java classpath-capable package
[INFO] 
[INFO] --- maven-source-plugin:2.1.2:jar-no-fork (hadoop-java-sources) @ hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-site-plugin:3.0:attach-descriptor (attach-descriptor) @ hadoop-hdfs-project ---
[INFO] 
[INFO] --- maven-checkstyle-plugin:2.6:checkstyle (default-cli) @ hadoop-hdfs-project ---
[INFO] 
[INFO] --- findbugs-maven-plugin:2.3.2:findbugs (default-cli) @ hadoop-hdfs-project ---
[INFO] ** FindBugsMojo execute ***
[INFO] canGenerate is false
[INFO] 
[INFO] Reactor Summary:
[INFO] 
[INFO] Apache Hadoop HDFS  SUCCESS [5:39.356s]
[INFO] Apache Hadoop HttpFS .. SUCCESS [39.278s]
[INFO] Apache Hadoop HDFS Project  SUCCESS [0.057s]
[INFO] 
[INFO] BUILD SUCCESS
[INFO] 
[INFO] Total time: 6:19.252s
[INFO] Finished at: Wed Feb 08 11:41:42 UTC 2012
[INFO] Final Memory: 74M/748M
[INFO] 
+ /home/jenkins/tools/maven/latest/bin/mvn test -Dmaven.test.failure.ignore=true -Pclover -DcloverLicenseLocation=/home/jenkins/tools/clover/latest/lib/clover.license
Archiving artifacts
Publishing Clover coverage report...
Publishing Clover HTML report...
Publishing Clover XML report...
Publishing Clover coverage results...
Recording test results
Build step 'Publish JUnit test result report' changed build result to UNSTABLE
Publishing Javadoc
Recording fingerprints
Updating MAPREDUCE-3415
Updating MAPREDUCE-3770
Updating HDFS-2572
Updating MAPREDUCE-3823
Updating MAPREDUCE-3822
Updating MAPREDUCE-3834
Updating MAPREDUCE-3815
Updating MAPREDUCE-3833
Updating MAPREDUCE-3828
Updating MAPREDUCE-3827
Updating MAPREDUCE-3826
Updating HDFS-2786
Updating HADOOP-8013
Updating MAPREDUCE-3436
Updating HADOOP-7813
Updating HADOOP-7841
Updating HADOOP-7851
Sending e-mails to: hdfs-dev@hadoop.apache.org
Email was triggered for: Unstable
Sending email for trigger: Unstable



###
## FAILED TESTS (if any) 
##
1 tests failed.
REGRESSION:  org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks.testListCorruptFileBlocksInSafeMode

Error Message:
Namenode has 2 bad files. Expecting 1.

Stack Trace:
java.lang.AssertionError: Namenode has 2 bad files. Expecting 1.
at org.junit.Assert.fail(Assert.java:91)
at org.junit.Assert.assertTrue(Assert.java:43)
at org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks.__CLR3_0_2mvj3yz1cqt(TestListCorruptFileBlocks.java:239)
at org.apache.hadoop.hdfs.server.namenode.TestListCorruptFileBlocks.testListCorruptFileBlocksInSafeMode(TestListCorruptFileBlocks.java:133)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
at org.apache.maven.surefire.junit4.JUnit4TestSet.execute(JUnit4TestSet.java:53)
at org

[PROPOSAL] Hadoop OSGi compliant and Apache Karaf features

2012-02-08 Thread Jean-Baptiste Onofré

Hi folks,

I'm working right now to turn Hadoop into an OSGi-compliant set of modules.

I've more or less achieved the first step:
- turn all Hadoop modules (common, annotations, hdfs, mapreduce, etc.) into 
OSGi bundles
- provide a Karaf features descriptor to easily deploy them into the Apache 
Karaf OSGi container


I will upload the patches to the different JIRAs related to this.

The second step that I propose is to introduce a blueprint descriptor in 
order to expose some Hadoop features as OSGi services.
It won't affect "non-OSGi" users but will give a lot of fun and interest 
to OSGi users ;)


WDYT ?

Regards
JB

PS: the JIRA issues are HADOOP-6484, HADOOP-7977, MAPREDUCE-243. It 
would be great if someone could assign them to me (easier to track).

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


Re: [PROPOSAL] Hadoop OSGi compliant and Apache Karaf features

2012-02-08 Thread Steve Loughran

On 08/02/12 14:25, Jean-Baptiste Onofré wrote:

Hi folks,

I'm working right now to turn Hadoop into an OSGi-compliant set of modules.

I've more or less achieved the first step:
- turn all Hadoop modules (common, annotations, hdfs, mapreduce, etc.) into
OSGi bundles
- provide a Karaf features descriptor to easily deploy them into the Apache
Karaf OSGi container

I will upload the patches to the different JIRAs related to this.

The second step that I propose is to introduce a blueprint descriptor in
order to expose some Hadoop features as OSGi services.
It won't affect "non-OSGi" users but will give a lot of fun and interest
to OSGi users ;)



Zookeeper would be nice too, as you could bring up a very small cluster.

As I mentioned in one of the JIRA comments:

- there are a lot of calls to System.exit() in Hadoop when it isn't 
happy; you need a security manager to catch them and turn them into 
exceptions (a minimal sketch follows these two points). And no, the code 
doesn't expect exceptions everywhere.


- There are a lot of assumptions that every service (namenode, datanode, 
etc.) is running in its own VM, with its own singletons. They will all 
need their own classloaders, which implies separate OSGi bundles for 
each public service.
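
The sketch mentioned in the first point (hedged: ExitTrappedException and 
NoExitSecurityManager are illustrative names; the hooks are standard JDK 
SecurityManager APIs):

{noformat}
// Install with: System.setSecurityManager(new NoExitSecurityManager());
// then catch ExitTrappedException around the Hadoop entry point,
// remembering the caveat that the code doesn't expect exceptions everywhere.
class ExitTrappedException extends SecurityException {
  final int status;

  ExitTrappedException(int status) {
    super("System.exit(" + status + ") trapped");
    this.status = status;
  }
}

class NoExitSecurityManager extends SecurityManager {
  @Override
  public void checkExit(int status) {
    // Turn the JVM-killing call into a catchable exception.
    throw new ExitTrappedException(status);
  }

  @Override
  public void checkPermission(java.security.Permission perm) {
    // Allow everything else; only exits are intercepted.
  }
}
{noformat}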


YARN is even more interesting, as it works by deploying the application 
master (such as the MR engine) on request, picking a suitable node and 
executing the entry point with a classpath (somehow) set up. If you are 
going to work with trunk you will need to address this; the simplest 
tactic is "don't try to run YARN-based services under OSGi, just the 
YARN Resource Manager and Node Managers themselves".


A more advanced option, "support OSGi-based YARN services specially", 
would also be good: it could start both Application Masters and their 
container applications (Task Trackers &c) under OSGi, and aid the 
execution of things like actual tasks within the OSGi container (for 
speed).


If you are looking at production use of this stuff, you'll need to worry 
about loading the native libraries too. Otherwise this stays restricted 
to experimental small-machine setups.




[jira] [Resolved] (HDFS-2854) SecurityUtil.buildTokenService returns java.net.UnknownHostException when using paths like viewfs://default/some/path

2012-02-08 Thread Daryn Sharp (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daryn Sharp resolved HDFS-2854.
---

Resolution: Duplicate

Will be indirectly fixed by the comprehensive changes in the linked JIRAs.

> SecurityUtil.buildTokenService returns java.net.UnknownHostException when 
> using paths like viewfs://default/some/path
> -
>
> Key: HDFS-2854
> URL: https://issues.apache.org/jira/browse/HDFS-2854
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 0.24.0, 0.23.1
>Reporter: Arpit Gupta
>Assignee: Daryn Sharp
>


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: [PROPOSAL] Hadoop OSGi compliant and Apache Karaf features

2012-02-08 Thread Jean-Baptiste Onofré

Hi Steve

My other comments inline:

Zookeeper would be nice too, as you could bring up a very small cluster.


+1, I will tackle that too ;)


- there are a lot of calls to System.exit() in Hadoop when it isn't
happy; you need a security manager to catch them and turn them into
exceptions. And no, the code doesn't expect exceptions everywhere.


I will check if we can trap this. Maybe a modification in the core code 
could do that.




- There are a lot of assumptions that every service (namenode, datanode,
etc.) is running in its own VM, with its own singletons. They will all
need their own classloaders, which implies separate OSGi bundles for
each public service.


We can imagine a kind of "fork" in the OSGi container. On the other 
hand, singletons are per classloader, so we can handle that.




YARN is even more interesting, as it works by deploying the application
master (such as the MR engine) on request, picking a suitable node and
executing the entry point with a classpath (somehow) set up. If you are
going to work with trunk you will need to address this; the simplest
tactic is "don't try to run YARN-based services under OSGi, just the
YARN Resource Manager and Node Managers themselves".

A more advanced option, "support OSGi-based YARN services specially",
would also be good: it could start both Application Masters and their
container applications (Task Trackers &c) under OSGi, and aid the
execution of things like actual tasks within the OSGi container (for
speed).

If you are looking at production use of this stuff, you'll need to worry
about loading the native libraries too. Otherwise this stays restricted
to experimental small-machine setups.



Thanks for these comments!! I will take care of that in the following 
patches ;)


Thanks again,
Regards
JB

--
Jean-Baptiste Onofré
jbono...@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com


[jira] [Created] (HDFS-2916) HA: allow dfsadmin to refer to a particular namenode

2012-02-08 Thread Eli Collins (Created) (JIRA)
HA: allow dfsadmin to refer to a particular namenode


 Key: HDFS-2916
 URL: https://issues.apache.org/jira/browse/HDFS-2916
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: HA branch (HDFS-1623)
Reporter: Eli Collins
Assignee: Eli Collins


dfsadmin currently fails over like other clients, so if you want to put a 
particular NN in safemode you need to use the "fs" option and specify a 
host:ipcport target. Like HDFS-2808, it would be useful to be able to specify 
a logical namenode ID instead of an RPC address. Since fs is part of the 
generic options this could potentially apply to all tools; however, most tools 
want to refer to the default logical namenode URI and fail over like other 
clients.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2917) HA: haadmin should not work if run by regular user

2012-02-08 Thread Aaron T. Myers (Created) (JIRA)
HA: haadmin should not work if run by regular user
--

 Key: HDFS-2917
 URL: https://issues.apache.org/jira/browse/HDFS-2917
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node
Reporter: Aaron T. Myers
 Fix For: HA branch (HDFS-1623)


Like dfsadmin, haadmin should require HDFS superuser privileges to work. 
Currently any user can use haadmin.
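
A hedged sketch of the kind of guard this implies, assuming haadmin adopts 
the superuser check that protected NN operations already use (the standalone 
class and method below are illustrative, not a patch):

{noformat}
import org.apache.hadoop.security.AccessControlException;
import org.apache.hadoop.security.UserGroupInformation;

class HaAdminGuard {
  static void checkSuperuser(UserGroupInformation caller,
      String superUser, String superGroup) throws AccessControlException {
    if (caller.getShortUserName().equals(superUser)) {
      return; // the user the NN runs as is always a superuser
    }
    for (String group : caller.getGroupNames()) {
      if (superGroup.equals(group)) {
        return; // members of dfs.permissions.superusergroup pass
      }
    }
    throw new AccessControlException("Superuser privilege is required, "
        + "but the command was run by " + caller.getShortUserName());
  }
}
{noformat}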

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2918) HA: dfsadmin should failover like other clients

2012-02-08 Thread Eli Collins (Created) (JIRA)
HA: dfsadmin should failover like other clients
---

 Key: HDFS-2918
 URL: https://issues.apache.org/jira/browse/HDFS-2918
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: HA branch (HDFS-1623)
Reporter: Eli Collins


dfsadmin currently always uses the first namenode rather than failing over. It 
should fail over like other clients, unless fs specifies a particular namenode.

{noformat}
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs haadmin -failover nn1 nn2
Failover from nn1 to nn2 successful
# nn2 is 8022
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfsadmin -fs localhost:8022 -safemode enter
Safe mode is ON
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfsadmin -safemode get 
Safe mode is OFF
hadoop-0.24.0-SNAPSHOT $ ./bin/hdfs dfsadmin -fs localhost:8022 -safemode get
Safe mode is ON
{noformat}


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2919) Add method(s) to FSDatasetInterface for checking if the scanners are supported

2012-02-08 Thread Tsz Wo (Nicholas), SZE (Created) (JIRA)
Add method(s) to FSDatasetInterface for checking if the scanners are supported
--

 Key: HDFS-2919
 URL: https://issues.apache.org/jira/browse/HDFS-2919
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: Tsz Wo (Nicholas), SZE
Assignee: Tsz Wo (Nicholas), SZE




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2920) HA: fix remaining TODO items

2012-02-08 Thread Eli Collins (Created) (JIRA)
HA: fix remaining TODO items


 Key: HDFS-2920
 URL: https://issues.apache.org/jira/browse/HDFS-2920
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Reporter: Eli Collins


There are a number of "TODO(HA)" and "TODO:HA" comments we need to fix or 
remove.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2921) HA: HA docs need to cover decommissioning

2012-02-08 Thread Eli Collins (Created) (JIRA)
HA: HA docs need to cover decommissioning


 Key: HDFS-2921
 URL: https://issues.apache.org/jira/browse/HDFS-2921
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: documentation, ha
Affects Versions: HA branch (HDFS-1623)
Reporter: Eli Collins


We need to cover decommissioning in the HA docs as is done in the [federation 
decommissioning 
docs|http://hadoop.apache.org/common/docs/r0.23.0/hadoop-yarn/hadoop-yarn-site/Federation.html#Decommissioning].
 The same process should apply; we need to refresh all the namenodes (the same 
commands should work).

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2922) HA: close out operation categories

2012-02-08 Thread Eli Collins (Created) (JIRA)
HA: close out operation categories
--

 Key: HDFS-2922
 URL: https://issues.apache.org/jira/browse/HDFS-2922
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha
Affects Versions: HA branch (HDFS-1623)
Reporter: Eli Collins
Assignee: Eli Collins


We need to close out the NN operation categories. (A sketch of the gating 
pattern these categories drive follows the lists below.)

The following operations should be left as is, i.e. not fail over, as it's 
reasonable to call these on a standby, and we just need to update the TODO 
with a comment:
- {{setSafeMode}} (might want to force the standby out of safemode)
- {{restoreFailedStorage}} (might want to tell the standby to restore the 
shared edits dir)
- {{saveNamespace}}, {{metaSave}} (could imagine calling these on a standby, 
e.g. in a recovery scenario)
- {{refreshNodes}} (decommissioning needs to refresh the standby)

The following operations should be checked for READ, as neither should need to 
be called on a standby; they will fail over unless stale reads are enabled:
- {{getTransactionID}}, {{getEditLogManifest}} (we don't checkpoint the standby)

The following operations should be checked for WRITE, as they should not be 
called on a standby, i.e. should always fail over:
- {{finalizeUpgrade}}, {{distributedUpgradeProgress}} (should not be able to 
upgrade the standby)
- {{setBalancerBandwidth}} (the balancer should fail over)
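
The sketch referenced above (hedged: a standalone illustration of the HA 
branch's NameNode.OperationCategory pattern, not the branch code; the real 
check throws StandbyException, which makes the client's retry proxy fail 
over):

{noformat}
import java.io.IOException;

class OperationGate {
  enum OperationCategory { READ, WRITE, UNCHECKED }

  private final boolean active;
  private final boolean staleReadsEnabled;

  OperationGate(boolean active, boolean staleReadsEnabled) {
    this.active = active;
    this.staleReadsEnabled = staleReadsEnabled;
  }

  void checkOperation(OperationCategory op) throws IOException {
    if (active || op == OperationCategory.UNCHECKED) {
      return; // the active NN serves everything; unchecked ops never fail over
    }
    if (op == OperationCategory.READ && staleReadsEnabled) {
      return; // a standby may serve (possibly stale) reads
    }
    // Stand-in for StandbyException: rejecting the call here is what
    // drives the client to fail over to the other NN.
    throw new IOException("Operation category " + op
        + " is not supported in state standby");
  }
}
{noformat}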

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




Re: Review request: trunk->HDFS-1623 merge

2012-02-08 Thread Todd Lipcon
The branch developed some new conflicts due to recent changes in trunk
affecting the RPC between the DN and the NN (the "StorageReport"
stuff). I've done a new merge to address these conflicts here:

https://github.com/toddlipcon/hadoop-common/tree/ha-merge-20120208

I've also addressed Aaron's comments in the thread above.
I ran the unit tests on the branch and they passed.

Thanks
-Todd

On Fri, Feb 3, 2012 at 4:44 PM, Aaron T. Myers  wrote:
> Hey Todd,
>
> The merge largely looks good. I agree with the general approach you took. A
> few small comments:
>
> 1. There's a comment in the OP_ADD case block about handling OP_CLOSE. This
> makes sense in 0.22/0.23/0.24, but in the HA branch the OP_ADD and OP_CLOSE
> cases are completely separate case blocks. I actually find this whole
> comment a little confusing, since it numbers the cases we have to handle,
> but those numbers aren't referenced anywhere else.
>
> 2. You mentioned in your message that you don't handle the (invalid) case
> of OP_ADD on a new file containing updated blocks, but it looks like the
> code actually does, though the code also mentions that we should add a
> sanity check that this actually can't occur. Seems like we should clean up
> this inconsistency. I agree that asserting this case doesn't occur is the
> right way to go.
>
> 3. If we go with my suggestion in (2), we can also move the call to
> FSEditLogLoader#updateBlocks to only the case of OP_ADD for an existing
> file, and then get rid of the "INodeFile newFile = oldFile" assignment,
> which I found kind of confusing at first. (Though I do see why it's correct
> as-implemented.) If you don't go with my suggestion in (2), please add a
> comment explaining the assignment.
>
> Otherwise looks good. Merge away.
>
> --
> Aaron T. Myers
> Software Engineer, Cloudera
>
>
>
> On Fri, Feb 3, 2012 at 2:10 PM, Todd Lipcon  wrote:
>
>> I've got a merge pending of trunk into HDFS-1623 -- it was a bit
>> complicated so wanted to ask for another set of eyes:
>> https://github.com/toddlipcon/hadoop-common/tree/ha-merge-20120203
>> (using github since it's hard to review a merge patch via JIRA)
>>
>> The interesting bit of the merge was to deal with conflicts with
>> HDFS-2718. To summarize the changes I had to make:
>> - in the HDFS-1623 branch, we don't deal with the case where OP_ADD
>> contains blocks on a new file -- this is a case that doesn't happen on
>> real clusters, but currently happens with synthetic logs generated
>> from the CreateEditLogs tool. I added a TODO to add a sanity check
>> here and will address as a follow-up. Given the difference between
>> trunk and branch, there were a couple of small changes that propagated
>> into unprotectedAddFile
>> - In the HDFS-1623 branch we had already implemented the
>> "updateBlocks" call inside FSEditLogLoader. I used that existing
>> implementation rather than adding the new one in FSDirectory, since
>> this function had some other changes related to HA in the branch
>> version.
>>
>> I'll wait for a +1 before committing. I ran all of the unit tests and
>> they passed.
>>
>> -Todd
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>



-- 
Todd Lipcon
Software Engineer, Cloudera


[jira] [Created] (HDFS-2923) Namenode IPC handler count uses the wrong configuration key

2012-02-08 Thread Todd Lipcon (Created) (JIRA)
Namenode IPC handler count uses the wrong configuration key
---

 Key: HDFS-2923
 URL: https://issues.apache.org/jira/browse/HDFS-2923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.23.0, 0.23.1
Reporter: Todd Lipcon
Assignee: Todd Lipcon
Priority: Critical


In HDFS-1763, a typo was introduced that causes the namenode to use 
dfs.datanode.handler.count to set the number of IPC threads instead of the 
correct dfs.namenode.handler.count. This results in bad performance under high 
load, since there are not nearly enough handlers.
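
The shape of the fix, hedged (the actual patch is attached as hdfs-2923.txt; 
the constants below are the existing DFSConfigKeys entries for the two keys 
named above, and the helper class is illustrative):

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

class HandlerCountFix {
  static int namenodeHandlerCount(Configuration conf) {
    // The typo read DFS_DATANODE_HANDLER_COUNT_KEY here, starving the NN
    // of IPC handlers under load; the NN must read its own key,
    // dfs.namenode.handler.count.
    return conf.getInt(DFSConfigKeys.DFS_NAMENODE_HANDLER_COUNT_KEY,
        DFSConfigKeys.DFS_NAMENODE_HANDLER_COUNT_DEFAULT);
  }
}
{noformat}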

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HDFS-2923) Namenode IPC handler count uses the wrong configuration key

2012-02-08 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-2923.
---

Resolution: Fixed
Fix Version/s: 0.23.2, 0.24.0
Hadoop Flags: Reviewed

Committed to branch, thanks for reviewing Eli.

> Namenode IPC handler count uses the wrong configuration key
> ---
>
> Key: HDFS-2923
> URL: https://issues.apache.org/jira/browse/HDFS-2923
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: name-node
>Affects Versions: 0.23.0, 0.23.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.24.0, 0.23.2
>
> Attachments: hdfs-2923.txt
>
>
> In HDFS-1763, a typo was introduced that causes the namenode to use 
> dfs.datanode.handler.count to set the number of IPC threads instead of the 
> correct dfs.namenode.handler.count. This results in bad performance under 
> high load, since there are not nearly enough handlers.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HDFS-2924) Standby checkpointing fails to authenticate in secure cluster

2012-02-08 Thread Todd Lipcon (Created) (JIRA)
Standby checkpointing fails to authenticate in secure cluster
-

 Key: HDFS-2924
 URL: https://issues.apache.org/jira/browse/HDFS-2924
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: ha, name-node, security
Affects Versions: HA branch (HDFS-1623)
Reporter: Todd Lipcon
Priority: Critical


When running HA on a secure cluster, the SBN checkpointing process doesn't seem 
to pick up the keytab-based credentials for its RPC connection to the active. I 
think we're just missing a doAs() in the right spot.
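
The general pattern, as a hedged sketch (UserGroupInformation.getLoginUser() 
and doAs() are the real security APIs; the wrapper class and the Runnable 
standing in for the checkpoint upload are illustrative, not the eventual fix):

{noformat}
import java.io.IOException;
import java.security.PrivilegedExceptionAction;

import org.apache.hadoop.security.UserGroupInformation;

class CheckpointAsLoginUser {
  static void uploadCheckpoint(final Runnable upload)
      throws IOException, InterruptedException {
    // getLoginUser() holds the keytab-based Kerberos credentials the
    // daemon logged in with; doAs() makes them current for the RPC below.
    UserGroupInformation.getLoginUser().doAs(
        new PrivilegedExceptionAction<Void>() {
          @Override
          public Void run() {
            upload.run(); // the SBN's transfer / RPC to the active NN
            return null;
          }
        });
  }
}
{noformat}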

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HDFS-2579) Starting delegation token manager during safemode fails

2012-02-08 Thread Todd Lipcon (Resolved) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon resolved HDFS-2579.
---

Resolution: Fixed
Fix Version/s: HA branch (HDFS-1623)
Hadoop Flags: Reviewed

Committed to HA branch, thanks for the reviews.

> Starting delegation token manager during safemode fails
> ---
>
> Key: HDFS-2579
> URL: https://issues.apache.org/jira/browse/HDFS-2579
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: ha, name-node, security
>Affects Versions: HA branch (HDFS-1623)
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: HA branch (HDFS-1623)
>
> Attachments: hdfs-2579.txt, hdfs-2579.txt, hdfs-2579.txt
>
>
> I noticed this on the HA branch, but it seems to actually affect non-HA 
> branch 0.23 if security is enabled. When the NN starts up, if security is 
> enabled, we start the delegation token secret manager, which then tries to 
> call {{logUpdateMasterKey}}. This fails because the edit logs may not be 
> written while in safe-mode.
> It seems to me that there's no necessary reason to make a new master key at 
> startup, since you've loaded the old key when you load the FSImage. You'd 
> only be lacking a DT master key on a fresh cluster, in which case we could 
> have it generate one at format time.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira