Hadoop-Hdfs-trunk - Build # 614 - Still Failing

2011-03-22 Thread Apache Hudson Server
See https://hudson.apache.org/hudson/job/Hadoop-Hdfs-trunk/614/

###
## LAST 60 LINES OF THE CONSOLE 
###
[...truncated 729780 lines...]
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] 
[junit] 2011-03-22 12:24:42,983 INFO  datanode.DataNode 
(DataNode.java:shutdown(788)) - Waiting for threadgroup to exit, active threads 
is 0
[junit] 2011-03-22 12:24:42,983 INFO  datanode.DataBlockScanner 
(DataBlockScanner.java:run(624)) - Exiting DataBlockScanner thread.
[junit] 2011-03-22 12:24:42,984 INFO  datanode.DataNode 
(DataNode.java:run(1464)) - DatanodeRegistration(127.0.0.1:57260, 
storageID=DS-91605065-127.0.1.1-57260-1300796672283, infoPort=35711, 
ipcPort=34865):Finishing DataNode in: 
FSDataset{dirpath='/grid/0/hudson/hudson-slave/workspace/Hadoop-Hdfs-trunk/trunk/build-fi/test/data/dfs/data/data3/current/finalized,/grid/0/hudson/hudson-slave/workspace/Hadoop-Hdfs-trunk/trunk/build-fi/test/data/dfs/data/data4/current/finalized'}
[junit] 2011-03-22 12:24:42,984 INFO  ipc.Server (Server.java:stop(1626)) - 
Stopping server on 34865
[junit] 2011-03-22 12:24:42,984 INFO  datanode.DataNode 
(DataNode.java:shutdown(788)) - Waiting for threadgroup to exit, active threads 
is 0
[junit] 2011-03-22 12:24:42,984 INFO  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(133)) - Shutting down all async disk 
service threads...
[junit] 2011-03-22 12:24:42,985 INFO  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(142)) - All async disk service threads 
have been shut down.
[junit] 2011-03-22 12:24:42,985 WARN  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(130)) - AsyncDiskService has already 
shut down.
[junit] 2011-03-22 12:24:42,985 INFO  hdfs.MiniDFSCluster 
(MiniDFSCluster.java:shutdownDataNodes(835)) - Shutting down DataNode 0
[junit] 2011-03-22 12:24:42,990 INFO  ipc.Server (Server.java:stop(1626)) - 
Stopping server on 38522
[junit] 2011-03-22 12:24:42,990 INFO  ipc.Server (Server.java:run(691)) - 
Stopping IPC Server Responder
[junit] 2011-03-22 12:24:42,990 INFO  datanode.DataNode 
(DataNode.java:shutdown(788)) - Waiting for threadgroup to exit, active threads 
is 1
[junit] 2011-03-22 12:24:42,991 INFO  ipc.Server (Server.java:run(487)) - 
Stopping IPC Server listener on 38522
[junit] 2011-03-22 12:24:42,991 WARN  datanode.DataNode 
(DataXceiverServer.java:run(142)) - DatanodeRegistration(127.0.0.1:49868, 
storageID=DS-1067466969-127.0.1.1-49868-1300796672121, infoPort=48806, 
ipcPort=38522):DataXceiveServer: java.nio.channels.AsynchronousCloseException
[junit] at 
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)
[junit] at 
sun.nio.ch.ServerSocketChannelImpl.accept(ServerSocketChannelImpl.java:159)
[junit] at 
sun.nio.ch.ServerSocketAdaptor.accept(ServerSocketAdaptor.java:84)
[junit] at 
org.apache.hadoop.hdfs.server.datanode.DataXceiverServer.run(DataXceiverServer.java:135)
[junit] at java.lang.Thread.run(Thread.java:662)
[junit] 
[junit] 2011-03-22 12:24:42,993 INFO  datanode.DataNode 
(DataNode.java:shutdown(788)) - Waiting for threadgroup to exit, active threads 
is 0
[junit] 2011-03-22 12:24:42,995 INFO  ipc.Server (Server.java:run(1459)) - 
IPC Server handler 0 on 38522: exiting
[junit] 2011-03-22 12:24:43,093 INFO  datanode.DataBlockScanner 
(DataBlockScanner.java:run(624)) - Exiting DataBlockScanner thread.
[junit] 2011-03-22 12:24:43,093 INFO  datanode.DataNode 
(DataNode.java:run(1464)) - DatanodeRegistration(127.0.0.1:49868, 
storageID=DS-1067466969-127.0.1.1-49868-1300796672121, infoPort=48806, 
ipcPort=38522):Finishing DataNode in: 
FSDataset{dirpath='/grid/0/hudson/hudson-slave/workspace/Hadoop-Hdfs-trunk/trunk/build-fi/test/data/dfs/data/data1/current/finalized,/grid/0/hudson/hudson-slave/workspace/Hadoop-Hdfs-trunk/trunk/build-fi/test/data/dfs/data/data2/current/finalized'}
[junit] 2011-03-22 12:24:43,094 INFO  ipc.Server (Server.java:stop(1626)) - 
Stopping server on 38522
[junit] 2011-03-22 12:24:43,094 INFO  datanode.DataNode 
(DataNode.java:shutdown(788)) - Waiting for threadgroup to exit, active threads 
is 0
[junit] 2011-03-22 12:24:43,094 INFO  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(133)) - Shutting down all async disk 
service threads...
[junit] 2011-03-22 12:24:43,094 INFO  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(142)) - All async disk service threads 
have been shut down.
[junit] 2011-03-22 12:24:43,095 WARN  datanode.FSDatasetAsyncDiskService 
(FSDatasetAsyncDiskService.java:shutdown(130)) - AsyncDiskService has already 
shut down.
[junit] 2011-03-22 12:24:43,196 WARN  namenode.DecommissionManager 
(

[jira] [Created] (HDFS-1774) Optimization in org.apache.hadoop.hdfs.server.datanode.FSDataset class.

2011-03-22 Thread Uma Maheswara Rao G (JIRA)
Optimization in org.apache.hadoop.hdfs.server.datanode.FSDataset class.
---

 Key: HDFS-1774
 URL: https://issues.apache.org/jira/browse/HDFS-1774
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: data-node
Reporter: Uma Maheswara Rao G
Assignee: Uma Maheswara Rao G


The constructor of the inner class FSDir iterates twice over the files listed 
in the passed directory. We can reduce this to a single loop and also avoid the 
repeated isDirectory check, which performs native invocations. 

Consider a case: one directory has only one child directory and 10000 files. 

1) The first loop counts the child directories.

2) The condition if (numChildren > 0) is satisfied, so the constructor iterates 
over all 10001 entries again and calls isDirectory on each one.
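
A minimal sketch of the single-pass idea, assuming a simplified stand-in for 
the FSDir constructor. The class name FsDirScanSketch and its fields are 
hypothetical and are not the actual FSDataset code; the point is only that each 
listed entry is classified once, so isDirectory is called once per entry.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the scan done in the FSDir constructor. */
class FsDirScanSketch {
    final List<File> childDirs = new ArrayList<>();
    int fileCount = 0;

    /** Classifies every listed entry exactly once. */
    FsDirScanSketch(File dir) {
        File[] entries = dir.listFiles();
        if (entries == null) {
            return; // not a directory, or an I/O error occurred
        }
        for (File entry : entries) {
            if (entry.isDirectory()) {   // one isDirectory call per entry
                childDirs.add(entry);    // collected now, no second loop needed
            } else {
                fileCount++;
            }
        }
    }
}
```

Collecting the child directories in a list during the first pass removes the 
need to know their count before allocating, which is what forced the second loop.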


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-1758) Web UI JSP pages thread safety issue

2011-03-22 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1758?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-1758.
---

Resolution: Fixed

I committed the patch. Thank you Tanping.

> Web UI JSP pages thread safety issue
> 
>
> Key: HDFS-1758
> URL: https://issues.apache.org/jira/browse/HDFS-1758
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.20.203.1
> Environment: branch-20-security
>Reporter: Tanping Wang
>Assignee: Tanping Wang
>Priority: Minor
> Fix For: 0.20.4
>
> Attachments: HDFS-1758.patch
>
>
> The set of JSP pages that the web UI uses is not thread safe.  We have 
> observed problems when requesting the Live/Dead/Decommissioning pages from 
> the web UI: an incorrect page is displayed.  To be more specific, requesting 
> the Dead node list page sometimes returns the Live node page, and requesting 
> the Decommissioning page sometimes returns the Dead page.
> The root cause of this problem is that a JSP page is not thread safe by 
> default.  When multiple requests come in, each request is assigned to a 
> different thread, and multiple threads access the same instance of the 
> servlet class generated from the JSP page.  A class variable is shared by 
> multiple threads.  The JSP code in the 20 branch, for example dfsnodelist.jsp, has
> {code}
> <%!
>   int rowNum = 0;
>   int colNum = 0;
>   String sorterField = null;
>   String sorterOrder = null;
>   String whatNodes = "LIVE";
>   ...
> %>
> {code}
> declared as class variables.  (This set of variables is declared within 
> <%! code %> declarations, which makes them class members.)  Multiple threads 
> share the same set of class member variables, so one request can step on 
> another's toes. 
> However, due to the JSP code refactoring in HADOOP-5857, all of these class 
> member variables were moved to become function-local variables, so this bug 
> does not appear in Apache trunk.  Hence, we have proposed a simple fix for 
> this bug on the 20 branch alone, to be more specific, branch-0.20-security.
> The simple fix is to add the JSP page directive isThreadSafe="false" to the 
> related JSP pages, dfshealth.jsp and dfsnodelist.jsp, to make them thread 
> safe, i.e. only one request is processed at a time. 
> We did evaluate the thread safety issue for other JSP pages on trunk.  We 
> noticed a potential problem: when we retrieve some statistics from the 
> namenode, for example, we make the call to 
> {code}
> NamenodeJspHelper.getInodeLimitText(fsn);
> {code}
> in dfshealth.jsp, which eventually is 
> {code}
> static String getInodeLimitText(FSNamesystem fsn) {
>   long inodes = fsn.dir.totalInodes();
>   long blocks = fsn.getBlocksTotal();
>   long maxobjects = fsn.getMaxObjects();
> 
> {code}
> Some of the function calls are already guarded by a read-write lock, e.g. 
> dir.totalInodes, but others are not.  As a result, the web UI results 
> are not 100% thread safe.  But after evaluating the pros and cons of adding 
> a giant lock into the JSP pages, we decided not to take FSNamesystem 
> ReadWrite locks in the JSPs.
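
The shared-state problem described above can be illustrated without a servlet 
container. The sketch below is an assumption-laden model, not the real 
dfsnodelist.jsp: NodeListPageSketch and its methods are hypothetical. It plays 
one possible interleaving of two requests by hand, so the outcome is 
deterministic rather than dependent on thread timing.

```java
/** Hypothetical model of a servlet generated from a JSP page; one instance serves all requests. */
class NodeListPageSketch {
    // Mirrors a <%! ... %> declaration: an instance field shared by every request.
    private String whatNodes = "LIVE";

    // Unsafe variant, split into two steps so the interleaving can be shown.
    void parseUnsafe(String requested) { whatNodes = requested; }
    String renderUnsafe() { return whatNodes + " node list"; }

    // Safe variant: the request state lives in a local variable.
    String handleSafe(String requested) {
        String localWhatNodes = requested;
        return localWhatNodes + " node list";
    }

    public static void main(String[] args) {
        NodeListPageSketch page = new NodeListPageSketch();
        page.parseUnsafe("DEAD");   // request A asks for the dead list
        page.parseUnsafe("LIVE");   // request B arrives before A renders
        System.out.println(page.renderUnsafe());       // A's response
        System.out.println(page.handleSafe("DEAD"));   // local state is unaffected
    }
}
```

Running main prints "LIVE node list" for request A, which is the symptom 
reported above, and then "DEAD node list" for the variant that keeps its state 
in a local variable.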



[jira] [Resolved] (HDFS-1386) unregister namenode datanode info MXBean

2011-03-22 Thread Tanping Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanping Wang resolved HDFS-1386.


  Resolution: Duplicate
Release Note:   (was: Making it a blocker for 0.22 as this causes tests to 
fail.)

This bug fix is included in HADOOP-6728.

> unregister namenode datanode info MXBean
> 
>
> Key: HDFS-1386
> URL: https://issues.apache.org/jira/browse/HDFS-1386
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: data-node, name-node, test
>Affects Versions: 0.22.0
>Reporter: Tanping Wang
>Assignee: Tanping Wang
>Priority: Blocker
> Fix For: 0.22.0
>
>




[jira] [Resolved] (HDFS-1695) HDFS federation: Fix testOIV and TestDatanodeUtils

2011-03-22 Thread Tanping Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanping Wang resolved HDFS-1695.


Resolution: Fixed

> HDFS federation: Fix testOIV and TestDatanodeUtils
> --
>
> Key: HDFS-1695
> URL: https://issues.apache.org/jira/browse/HDFS-1695
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: Federation Branch
>Reporter: Tanping Wang
>Assignee: Tanping Wang
>Priority: Minor
> Fix For: Federation Branch
>
> Attachments: HDFS-1695.patch
>
>
> # TestOfflineImageViewer fails due to the newest fsimage versions not being 
> listed as supported.  We're missing HDFS-1500, which fixed this same problem 
> for a previous commit from Hairong.  
> # Fix TestOIV.  The supported version got bumped from -25 to -27 during a  
> merge (19bbb922bc8e08a3a546b578b476be3ea79d45ea), which didn't update 
> TestOIV.  
> # Fix TestDatanodeUtils.  TestDataNodeUtils was named as if it were a test, 
> which it wasn't, and so it was failing.
> Contributed by Jakob Homan.



[jira] [Created] (HDFS-1775) getContentSummary should hold the FSNamesystem readlock

2011-03-22 Thread Dmytro Molkov (JIRA)
getContentSummary should hold the FSNamesystem readlock
---

 Key: HDFS-1775
 URL: https://issues.apache.org/jira/browse/HDFS-1775
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Dmytro Molkov
Assignee: Dmytro Molkov
Priority: Minor


Right now the getContentSummary call on the namenode only holds the FSDirectory 
lock, but not the FSNamesystem lock. What we are seeing because of that is: 
1) getContentSummary takes the read lock on FSDirectory 
2) the write operation comes and takes a write lock on FSNamesystem and waits 
for getContentSummary to finish to get a write lock on FSDirectory

As a result, other read operations can't be executed. Since getContentSummary 
can take a while to execute on large directories, performance would improve 
if it also held the FSNamesystem read lock for the duration of the call.
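
A minimal sketch of the proposed lock order, assuming two plain 
ReentrantReadWriteLock instances stand in for the FSNamesystem and FSDirectory 
locks. ContentSummaryLockSketch and its field names are hypothetical; this is 
not the namenode code, only an illustration of taking the outer read lock 
before the inner one.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongSupplier;

/** Hypothetical model of the two locks; not the real FSNamesystem or FSDirectory. */
class ContentSummaryLockSketch {
    final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    final ReentrantReadWriteLock directoryLock = new ReentrantReadWriteLock();

    /** Runs the traversal while holding both read locks, outer lock first. */
    long getContentSummary(LongSupplier traversal) {
        namesystemLock.readLock().lock();
        try {
            directoryLock.readLock().lock();
            try {
                return traversal.getAsLong();
            } finally {
                directoryLock.readLock().unlock();
            }
        } finally {
            namesystemLock.readLock().unlock();
        }
    }
}
```

With this ordering a writer has to wait at the outer lock, instead of holding 
the outer write lock while it waits for the inner one.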



Re: Merging Namenode Federation feature (HDFS-1052) to trunk

2011-03-22 Thread Allen Wittenauer

On Mar 21, 2011, at 4:08 PM, Sanjay Radia wrote:
> 
> Allen, not sure if I explained the difference above.
> Based on the discussion we had at the HUG, I want to clarify a few things

Thanks for taking the time at HUG.  (I've since figured out that I lost 
your messages as part of my email list transition.)

> A DN stores block for only ONE cluster.


But this does make things easier.  I'm still fairly confident that it adds 
too much complexity for little gain, though.  So put this in the 
'agree to disagree' column.  It would still be nice if you guys could lay off 
the camelCase options though.  Admins hate the shift key.

BTW, Robert C. asked what I thought you guys should have been working 
on instead of Federation.  I told him (and you) high availability of the 
namenode (which I still believe is necessary for HDFS in more and more cases), 
but I've had more time to think about it.  So expect my list (which I'll post 
here) soon.  :p




[jira] [Created] (HDFS-1776) Bug in Concat code

2011-03-22 Thread Dmytro Molkov (JIRA)
Bug in Concat code
--

 Key: HDFS-1776
 URL: https://issues.apache.org/jira/browse/HDFS-1776
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Dmytro Molkov


There is a bug in the concat code. Specifically: in INodeFile.appendBlocks() we 
need to first reassign the blocks list and then go through it and update the 
INode pointer. Otherwise we are not updating the inode pointer on all of the 
new blocks in the file.
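
The ordering issue can be sketched with simplified stand-ins. Block and 
FileNode below are hypothetical replacements for the real block and inode 
classes, and the method body is an assumption about the shape of the fix rather 
than the committed patch: the merged array is assigned first, and only then is 
the owner pointer updated on every block in it.

```java
import java.util.Arrays;

/** Hypothetical, simplified stand-ins for the block and inode classes. */
class ConcatSketch {
    static class Block {
        FileNode owner;
        Block(FileNode owner) { this.owner = owner; }
    }

    static class FileNode {
        Block[] blocks = new Block[0];

        /** Builds a file that owns n fresh blocks. */
        static FileNode withBlocks(int n) {
            FileNode f = new FileNode();
            f.blocks = new Block[n];
            for (int i = 0; i < n; i++) {
                f.blocks[i] = new Block(f);
            }
            return f;
        }

        /** Appends the blocks of all sources to this file. */
        void appendBlocks(FileNode[] sources) {
            int total = blocks.length;
            for (FileNode src : sources) {
                total += src.blocks.length;
            }
            Block[] merged = Arrays.copyOf(blocks, total);
            int offset = blocks.length;
            for (FileNode src : sources) {
                System.arraycopy(src.blocks, 0, merged, offset, src.blocks.length);
                offset += src.blocks.length;
            }
            blocks = merged;             // reassign the list first
            for (Block b : blocks) {     // then walk the merged list
                b.owner = this;
            }
        }
    }
}
```

If the loop ran before the assignment, it would visit only the blocks the file 
already had, and the appended blocks would keep pointing at their old files.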
