NullPointerException when NameNode lease recovery runs because the client has 
not responded to the NN for longer than the hard limit and the size of the 
block currently being written is larger than the block size recorded in the NN 
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

                 Key: HDFS-1970
                 URL: https://issues.apache.org/jira/browse/HDFS-1970
             Project: Hadoop HDFS
          Issue Type: Bug
          Components: name-node
    Affects Versions: 0.20-append
            Reporter: ramkrishna.s.vasudevan
             Fix For: 0.20-append


A NullPointerException occurs when NameNode lease recovery runs because the 
client has not responded to the NN for longer than the hard limit, and the 
size of the block currently being written is larger than the block size the 
NN has recorded.
1. Write to 2 datanodes using a client.
2. Kill one datanode and allow pipeline recovery.
3. Write some more data to the same block.
4. In parallel, allow NameNode lease recovery to happen.
A NullPointerException is then thrown in the addStoredBlock API; a sketch of 
the client-side write pattern behind steps 1-3 is given below.
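For reference, here is a minimal sketch of the client-side write pattern behind 
steps 1-3, assuming a 0.20-append client. The path /synctest0, the 1 KB block 
size and replication factor 2 are taken from the log below; the buffer size, the 
payload, and the need to shorten the lease hard limit for the test are 
assumptions, not values from the actual test.

// Sketch only: assumes a 0.20-append client. /synctest0, the 1 KB block size and
// replication factor 2 come from the log below; buffer size, payload and the
// shortened lease hard limit are illustrative assumptions.
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SyncTestSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Step 1: create /synctest0 with replication 2 and a 1 KB block size,
    // then write and sync so both datanodes in the pipeline hold the data.
    FSDataOutputStream out = fs.create(new Path("/synctest0"),
        true, 4096, (short) 2, 1024L);
    byte[] data = new byte[1024];
    out.write(data);
    out.sync();

    // Step 2 (manual): kill one of the two datanodes now and let the
    // client's pipeline recovery complete.

    // Step 3: write some more data into the current block and sync again.
    out.write(data, 0, 64);
    out.sync();

    // Step 4 (manual): without closing the stream, stop this client from
    // renewing its lease (e.g. suspend or kill the client JVM). Once the
    // lease hard limit expires, the NameNode starts lease recovery in
    // parallel, and the NPE appears in addStoredBlock on the NameNode.
  }
}

The lease hard limit defaults to one hour in 0.20, while the log below shows it 
expiring roughly two and a half minutes after the last block allocation, so the 
limit was presumably shortened for this test.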

Please find the NameNode logs below:

Debugging in MachineName.. 
Listening for transport dt_socket at address: 8007
11/05/20 21:38:33 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = linux76/10.18.52.76
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 0.20.3-SNAPSHOT
STARTUP_MSG:   build =  -r ; compiled by 'G00900856' on Tue Feb  1 11:40:14 IST 
2011
************************************************************/
11/05/20 21:38:33 INFO metrics.RpcMetrics: Initializing RPC Metrics with 
hostName=NameNode, port=9000
11/05/20 21:38:33 INFO namenode.NameNode: Namenode up at: 
linux76/10.18.52.76:9000
11/05/20 21:38:33 INFO jvm.JvmMetrics: Initializing JVM Metrics with 
processName=NameNode, sessionId=null
11/05/20 21:38:33 INFO metrics.NameNodeMetrics: Initializing NameNodeMeterics 
using context object:org.apache.hadoop.metrics.spi.NullContext
11/05/20 21:38:33 INFO namenode.FSNamesystem: fsOwner=root,root
11/05/20 21:38:33 INFO namenode.FSNamesystem: supergroup=supergroup
11/05/20 21:38:33 INFO namenode.FSNamesystem: isPermissionEnabled=false
11/05/20 21:38:33 INFO metrics.FSNamesystemMetrics: Initializing 
FSNamesystemMetrics using context 
object:org.apache.hadoop.metrics.spi.NullContext
11/05/20 21:38:33 INFO namenode.FSNamesystem: Registered FSNamesystemStatusMBean
11/05/20 21:38:33 INFO common.Storage: Number of files = 1
11/05/20 21:38:33 INFO common.Storage: Number of files under construction = 0
11/05/20 21:38:33 INFO common.Storage: Image file of size 94 loaded in 0 
seconds.
11/05/20 21:38:33 INFO common.Storage: Edits file 
/home/ramkrishna/opensrchadoop/appendbranch/hadoop-0.20.3-SNAPSHOT/bin/../hadoop-root/dfs/name/current/edits
 of size 4 edits # 0 loaded in 0 seconds.
11/05/20 21:38:33 INFO common.Storage: Image file of size 94 saved in 0 seconds.
11/05/20 21:38:34 INFO common.Storage: Image file of size 94 saved in 0 seconds.
11/05/20 21:38:34 INFO namenode.FSNamesystem: Finished loading FSImage in 482 
msecs
11/05/20 21:38:34 INFO namenode.FSNamesystem: Total number of blocks = 0
11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of invalid blocks = 0
11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of under-replicated blocks 
= 0
11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of  over-replicated blocks 
= 0
11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Leaving safe mode after 0 secs.
11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Network topology has 0 racks 
and 0 datanodes
11/05/20 21:38:34 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 
blocks
11/05/20 21:38:34 INFO mortbay.log: Logging to 
org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
11/05/20 21:38:35 INFO http.HttpServer: Port returned by 
webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
listener on 50070
11/05/20 21:38:35 INFO http.HttpServer: listener.getLocalPort() returned 50070 
webServer.getConnectors()[0].getLocalPort() returned 50070
11/05/20 21:38:35 INFO http.HttpServer: Jetty bound to port 50070
11/05/20 21:38:35 INFO mortbay.log: jetty-6.1.14
11/05/20 21:38:37 INFO mortbay.log: Started SelectChannelConnector@linux76:50070
11/05/20 21:38:37 INFO namenode.NameNode: Web-server up at: linux76:50070
11/05/20 21:38:37 INFO ipc.Server: IPC Server Responder: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server listener on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 0 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 1 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 2 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 3 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 4 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 5 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 6 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 7 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 8 on 9000: starting
11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 9 on 9000: starting
11/05/20 21:38:38 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: 
node registration from 10.18.52.76:50010 storage 
DS-1868335495-10.18.52.76-50010-1305907718956
11/05/20 21:38:38 INFO net.NetworkTopology: Adding a new node: 
/default-rack/10.18.52.76:50010
11/05/20 21:38:41 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: 
node registration from 10.18.52.5:50010 storage 
DS-721814011-10.18.52.5-50010-1305928120116
11/05/20 21:38:41 INFO net.NetworkTopology: Adding a new node: 
/default-rack/10.18.52.5:50010
11/05/20 21:38:54 INFO FSNamesystem.audit: ugi=R00902313,Tardis 
ip=/10.18.47.133        cmd=create      src=/synctest0  dst=null        
perm=R00902313:supergroup:rw-r--r--
11/05/20 21:38:54 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
/synctest0. blk_-8455334393673385140_1001
11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
blockMap updated: 10.18.52.5:50010 is added to blk_-8455334393673385140_1001 
size 1024
11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
blockMap updated: 10.18.52.76:50010 is added to blk_-8455334393673385140_1001 
size 1024
11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
/synctest0. blk_8955752156241965528_1001
11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
blockMap updated: 10.18.52.76:50010 is added to blk_8955752156241965528_1001 
size 1024
11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
blockMap updated: 10.18.52.5:50010 is added to blk_8955752156241965528_1001 
size 1024
11/05/20 21:39:11 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
/synctest0. blk_-5249191180607448701_1001
11/05/20 21:40:18 INFO namenode.FSNamesystem: 
commitBlockSynchronization(lastblock=blk_-5249191180607448701_1001, 
newgenerationstamp=1002, newlength=833, newtargets=[10.18.52.76:50010], 
closeFile=false, deleteBlock=false)
11/05/20 21:40:18 INFO namenode.FSNamesystem: Number of transactions: 4 Total 
time for transactions(ms): 2Number of transactions batched in Syncs: 0 Number 
of syncs: 1 SyncTimes(ms): 28 
11/05/20 21:40:18 INFO namenode.FSNamesystem: 
commitBlockSynchronization(blk_-5249191180607448701_1002) successful
11/05/20 21:41:41 INFO namenode.LeaseManager: Lease [Lease.  Holder: 
DFSClient_-676883980, pendingcreates: 1] has expired hard limit
11/05/20 21:41:56 INFO namenode.FSNamesystem: Recovering lease=[Lease.  Holder: 
DFSClient_-676883980, pendingcreates: 1], src=/synctest0
11/05/20 21:41:56 INFO hdfs.StateChange: BLOCK* blk_-5249191180607448701_1002 
recovery started, primary=10.18.52.76:50010
11/05/20 21:42:09 INFO namenode.FSNamesystem: 
commitBlockSynchronization(lastblock=blk_-5249191180607448701_1002, 
newgenerationstamp=1003, newlength=867, newtargets=[10.18.52.76:50010], 
closeFile=true, deleteBlock=false)
11/05/20 21:42:26 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
addStoredBlock request received for blk_-5249191180607448701_1003 on 
10.18.52.76:50010 size 867 and it belongs to a file under construction. 
11/05/20 21:42:52 WARN namenode.FSNamesystem: Inconsistent size for block 
blk_-5249191180607448701_1003 reported from 10.18.52.76:50010 current size is 
833 reported size is 867
11/05/20 21:43:11 WARN namenode.FSNamesystem: Block 
blk_-5249191180607448701_1003 reported from 10.18.52.76:50010 does not exist in 
blockMap. Surprise! Surprise!
11/05/20 21:43:13 INFO hdfs.StateChange: Removing lease on  file /synctest0 
from client NN_Recovery
11/05/20 21:43:13 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010
11/05/20 21:43:13 INFO namenode.FSNamesystem: Number of transactions: 6 Total 
time for transactions(ms): 2Number of transactions batched in Syncs: 0 Number 
of syncs: 2 SyncTimes(ms): 30 
11/05/20 21:43:13 INFO namenode.FSNamesystem: 
commitBlockSynchronization(newblock=blk_-5249191180607448701_1003, 
file=/synctest0, newgenerationstamp=1003, newlength=867, 
newtargets=[10.18.52.76:50010]) successful
11/05/20 21:43:15 INFO ipc.Server: IPC Server handler 6 on 9000, call 
blockReceived(DatanodeRegistration(10.18.52.76:50010, 
storageID=DS-1868335495-10.18.52.76-50010-1305907718956, infoPort=50075, 
ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@9b774e, 
[Ljava.lang.String;@b5d05b) from 10.18.52.76:40343: error: java.io.IOException: 
java.lang.NullPointerException
java.io.IOException: java.lang.NullPointerException
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addStoredBlock(FSNamesystem.java:3173)
        at 
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:3592)
        at 
org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:756)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
11/05/20 21:43:19 WARN hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
Redundant addStoredBlock request received for blk_-5249191180607448701_1003 on 
10.18.52.76:50010 size 867
11/05/20 21:43:19 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010
Listening for transport dt_socket at address: 8007
11/05/20 21:43:22 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010