[jira] [Created] (HDFS-4130) The reading for editlog at NN starting using bkjm is not efficient

2012-10-30 Thread Han Xiao (JIRA)
Han Xiao created HDFS-4130:
--

 Summary: The reading for editlog at NN starting using bkjm  is not 
efficient
 Key: HDFS-4130
 URL: https://issues.apache.org/jira/browse/HDFS-4130
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: ha, performance
Affects Versions: 2.0.2-alpha
Reporter: Han Xiao


Currently, BookKeeperJournalManager.selectInputStreams is written like this:

while (true) {
  EditLogInputStream elis;
  try {
    elis = getInputStream(fromTxId, inProgressOk);
  } catch (IOException e) {
    LOG.error(e);
    return;
  }
  if (elis == null) {
    return;
  }
  streams.add(elis);
  if (elis.getLastTxId() == HdfsConstants.INVALID_TXID) {
    return;
  }
  fromTxId = elis.getLastTxId() + 1;
}
 
Each EditLogInputStream is obtained from getInputStream(), which reads the ledger 
metadata from ZooKeeper on every call.
This becomes very costly when the number of ledgers grows large; re-reading the 
ledgers from ZooKeeper on every call to getInputStream() is unnecessary.
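
One possible direction (a rough sketch only, not a patch: the exact BKJM signatures are assumptions, and openStreamForLedger() is a hypothetical helper) is to fetch the ledger list from ZooKeeper once and build all streams from it:

{code}
// Sketch: one ZooKeeper read of the ledger list, reused for every stream,
// instead of a full re-read inside each getInputStream() call.
// getLedgerList() appears in the stack trace below; its signature here is assumed.
void selectInputStreams(Collection<EditLogInputStream> streams,
                        long fromTxId, boolean inProgressOk) throws IOException {
  List<EditLogLedgerMetadata> ledgers = getLedgerList(inProgressOk);  // single ZK round trip
  for (EditLogLedgerMetadata ledger : ledgers) {
    long lastTxId = ledger.getLastTxId();
    if (lastTxId != HdfsConstants.INVALID_TXID && lastTxId < fromTxId) {
      continue;  // ledger ends before the range we need
    }
    streams.add(openStreamForLedger(ledger, fromTxId));  // hypothetical helper
    if (lastTxId == HdfsConstants.INVALID_TXID) {
      return;    // in-progress ledger; nothing newer exists yet
    }
    fromTxId = lastTxId + 1;
  }
}
{code}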

The log showing the time wasted here is as follows:
2012-10-30 16:44:52,995 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: 
Caching file names occuring more than 10 times
2012-10-30 16:49:24,643 INFO 
hidden.bkjournal.org.apache.bookkeeper.proto.PerChannelBookieClient: 
Successfully connected to bookie: /167.52.1.121:318

The stack trace of the process while it is blocked between those two log lines is:
"main" prio=10 tid=0x4011f000 nid=0x39ba in Object.wait() 
[0x7fca020fe000]
   java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1253)
- locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1129)
at 
org.apache.hadoop.contrib.bkjournal.utils.RetryableZookeeper.getData(RetryableZookeeper.java:501)
at 
hidden.bkjournal.org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160)
at 
org.apache.hadoop.contrib.bkjournal.EditLogLedgerMetadata.read(EditLogLedgerMetadata.java:113)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getLedgerList(BookKeeperJournalManager.java:725)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.getInputStream(BookKeeperJournalManager.java:442)
at 
org.apache.hadoop.contrib.bkjournal.BookKeeperJournalManager.selectInputStreams(BookKeeperJournalManager.java:480)

Between two points in time, the diff of the stack traces is:
diff stack stack2
1c1
< 2012-10-30 16:44:53
---
> 2012-10-30 16:46:17
106c106
<   - locked <0x0006fb8495a8> (a 
hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)
---
>   - locked <0x0006fae58468> (a 
> hidden.bkjournal.org.apache.zookeeper.ClientCnxn$Packet)

In our environment, the waiting time can even reach tens of minutes.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Reopened] (HDFS-3809) Make BKJM use protobufs for all serialization with ZK

2012-10-30 Thread Robert Joseph Evans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Joseph Evans reopened HDFS-3809:
---


Branch-2 is failing with 
{noformat}
main:
 [exec] bkjournal.proto:30:12: "NamespaceInfoProto" is not defined.
{noformat}

after this was merged in.  Please either fix it or revert the change.

> Make BKJM use protobufs for all serialization with ZK
> -
>
> Key: HDFS-3809
> URL: https://issues.apache.org/jira/browse/HDFS-3809
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: 2.0.0-alpha, 3.0.0
>Reporter: Ivan Kelly
>Assignee: Ivan Kelly
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: 0004-HDFS-3809-for-branch-2.patch, HDFS-3809.diff, 
> HDFS-3809.diff, HDFS-3809.diff
>
>
> HDFS uses protobufs for serialization in many places. Protobufs allow fields 
> to be added without breaking backwards compatibility or requiring new parsing 
> code to be written. For this reason, we should use them in BKJM also.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-3822) TestWebHDFS fails intermittently with NullPointerException

2012-10-30 Thread Trevor Robinson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Trevor Robinson resolved HDFS-3822.
---

  Resolution: Duplicate
Release Note: Yes, this probably is a duplicate of HDFS-3664. The stack 
traces look the same, and I haven't seen it occur since that issue was fixed.

> TestWebHDFS fails intermittently with NullPointerException
> --
>
> Key: HDFS-3822
> URL: https://issues.apache.org/jira/browse/HDFS-3822
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: test
>Affects Versions: 2.0.2-alpha
> Environment: Apache Maven 3.0.4
> Maven home: /usr/share/maven
> Java version: 1.6.0_24, vendor: Sun Microsystems Inc.
> Java home: /usr/lib/jvm/java-6-openjdk-amd64/jre
> Default locale: en_US, platform encoding: ISO-8859-1
> OS name: "linux", version: "3.2.0-25-generic", arch: "amd64", family: "unix"
>Reporter: Trevor Robinson
>  Labels: test-fail
> Attachments: org.apache.hadoop.hdfs.web.TestWebHDFS-output.txt, 
> org.apache.hadoop.hdfs.web.TestWebHDFS.txt
>
>
> I've hit this test failure a few times in trunk:
> {noformat}
> Tests run: 2, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 58.835 sec <<< FAILURE!
> testNamenodeRestart(org.apache.hadoop.hdfs.web.TestWebHDFS)  Time elapsed: 52.105 sec  <<< FAILURE!
> java.lang.AssertionError: There are 1 exception(s):
>   Exception 0: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.RemoteException): java.lang.NullPointerException
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlocksMap.getBlockCollection(BlocksMap.java:101)
>         at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.getBlockCollection(BlockManager.java:2926)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.isValidBlock(FSNamesystem.java:4474)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.allocateBlock(FSNamesystem.java:2439)
>         at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2200)
>         at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:471)
>         at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:295)
>         at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java:43388)
>         at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:473)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:898)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1693)
>         at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1689)
> {noformat}
> It appears that {{close}} has been called on the {{BlocksMap}} before 
> {{getBlockCollection}}.
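
For context, a minimal sketch of the suspected interaction (field and method names are simplified; this is not the real BlocksMap code): close() discards the underlying map during NameNode shutdown/restart while an in-flight addBlock RPC is still resolving a block, so the later lookup dereferences null.

{code}
import java.util.HashMap;
import java.util.Map;

// Simplified stand-in for BlocksMap; illustrates only the null dereference,
// not the actual data structures used by the real class.
class BlocksMapSketch {
  private Map<Long, Object> blocks = new HashMap<>();  // blockId -> block collection

  void close() {
    blocks = null;                // shutdown path discards the map
  }

  Object getBlockCollection(long blockId) {
    return blocks.get(blockId);   // NullPointerException if close() already ran
  }
}
{code}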

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4131) Add a tool to print the diff between two snapshots and diff of a snapshot from the current tree

2012-10-30 Thread Suresh Srinivas (JIRA)
Suresh Srinivas created HDFS-4131:
-

 Summary: Add a tool to print the diff between two snapshots and 
diff of a snapshot from the current tree
 Key: HDFS-4131
 URL: https://issues.apache.org/jira/browse/HDFS-4131
 Project: Hadoop HDFS
  Issue Type: Sub-task
Affects Versions: Snapshot (HDFS-2802)
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas


This jira tracks a tool to print the diff between two snapshots at a given path. 
The tool will also print the difference between the current directory and a 
given snapshot.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-3923) libwebhdfs testing code cleanup

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-3923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-3923.
---

   Resolution: Fixed
Fix Version/s: 2.0.3-alpha
   3.0.0
 Hadoop Flags: Reviewed

I committed the patch to branch-2 as well.

Thank you Jing for the patch. Thank you Colin and Andy for the review.

> libwebhdfs testing code cleanup
> ---
>
> Key: HDFS-3923
> URL: https://issues.apache.org/jira/browse/HDFS-3923
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 3.0.0, 2.0.3-alpha
>
> Attachments: HDFS-3923.001.patch, HDFS-3923.002.patch
>
>
> 1. Testing code cleanup for libwebhdfs
> 1.1 Tests should generate a test-specific filename and should use TMPDIR 
> appropriately.
> 2. Enabling automated testing

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (HDFS-4118) Change INodeDirectory.getExistingPathINodes(..) to work with snapshots

2012-10-30 Thread Suresh Srinivas (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-4118.
---

   Resolution: Fixed
Fix Version/s: Snapshot (HDFS-2802)
 Hadoop Flags: Reviewed

I committed the patch to HDFS-2802 branch. Thank you Jing.

> Change INodeDirectory.getExistingPathINodes(..) to work with snapshots
> --
>
> Key: HDFS-4118
> URL: https://issues.apache.org/jira/browse/HDFS-4118
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>  Components: name-node
>Affects Versions: Snapshot (HDFS-2802)
>Reporter: Tsz Wo (Nicholas), SZE
>Assignee: Jing Zhao
> Fix For: Snapshot (HDFS-2802)
>
> Attachments: HDFS-4118.001.patch
>
>
> {code}
> int getExistingPathINodes(byte[][] components, INode[] existing, boolean 
> resolveLink)
> {code}
> The INodeDirectory method above retrieves existing INodes from the given path 
> components.  It needs to be updated in order to understand snapshot paths.
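
A minimal sketch of what the method does today (simplified types, not the actual INodeDirectory implementation); a snapshot-aware version additionally has to recognize the ".snapshot" path component and resolve the remaining components against that snapshot's captured state rather than the current tree.

{code}
import java.util.HashMap;
import java.util.Map;

// Simplified illustration: walk the path components from the root, record each
// node reached in 'existing', and stop at the first missing component.
class PathResolutionSketch {
  static class Node {
    final Map<String, Node> children = new HashMap<>();
  }

  static int getExistingPathNodes(Node root, String[] components, Node[] existing) {
    Node cur = root;
    int count = 0;
    for (String component : components) {
      cur = cur.children.get(component);
      if (cur == null) {
        break;                 // component does not exist under the current node
      }
      existing[count++] = cur; // node corresponding to this path component
      // A snapshot-aware walk would branch here on the ".snapshot" component
      // and continue resolution inside the snapshot's state.
    }
    return count;              // number of components actually resolved
  }
}
{code}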

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira