[ 
https://issues.apache.org/jira/browse/HDFS-1970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Suresh Srinivas resolved HDFS-1970.
-----------------------------------

    Resolution: Duplicate

Duplicate of HDFS-1951 as indicated in previous comment.

> Null pointer exception comes when Namenode recovery happens and there is no 
> response from client to NN more than the hardlimit for NN recovery and the 
> current block is more than the prev block size in NN 
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-1970
>                 URL: https://issues.apache.org/jira/browse/HDFS-1970
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.20-append
>            Reporter: ramkrishna.s.vasudevan
>             Fix For: 0.20-append
>
>
> Null pointer exception comes when Namenode recovery happens and there is no 
> response from client to NN more than the hardlimit for NN recovery and the 
> current block is more than the prev block size in NN 
> 1. Write using a client to 2 datanodes
> 2. Kill one data node and allow pipeline recovery.
> 3. write somemore data to the same block
> 4. Parallely allow the namenode recovery to happen
> Null pointer exception will come in addStoreBlock api.
> Pls find the logs
> Debugging in MachineName.. 
> Listening for transport dt_socket at address: 8007
> 11/05/20 21:38:33 INFO namenode.NameNode: STARTUP_MSG: 
> /************************************************************
> STARTUP_MSG: Starting NameNode
> STARTUP_MSG:   host = linux76/10.18.52.76
> STARTUP_MSG:   args = []
> STARTUP_MSG:   version = 0.20.3-SNAPSHOT
> STARTUP_MSG:   build =  -r ; compiled by 'G00900856' on Tue Feb  1 11:40:14 
> IST 2011
> ************************************************************/
> 11/05/20 21:38:33 INFO metrics.RpcMetrics: Initializing RPC Metrics with 
> hostName=NameNode, port=9000
> 11/05/20 21:38:33 INFO namenode.NameNode: Namenode up at: 
> linux76/10.18.52.76:9000
> 11/05/20 21:38:33 INFO jvm.JvmMetrics: Initializing JVM Metrics with 
> processName=NameNode, sessionId=null
> 11/05/20 21:38:33 INFO metrics.NameNodeMetrics: Initializing NameNodeMeterics 
> using context object:org.apache.hadoop.metrics.spi.NullContext
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: fsOwner=root,root
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: supergroup=supergroup
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: isPermissionEnabled=false
> 11/05/20 21:38:33 INFO metrics.FSNamesystemMetrics: Initializing 
> FSNamesystemMetrics using context 
> object:org.apache.hadoop.metrics.spi.NullContext
> 11/05/20 21:38:33 INFO namenode.FSNamesystem: Registered 
> FSNamesystemStatusMBean
> 11/05/20 21:38:33 INFO common.Storage: Number of files = 1
> 11/05/20 21:38:33 INFO common.Storage: Number of files under construction = 0
> 11/05/20 21:38:33 INFO common.Storage: Image file of size 94 loaded in 0 
> seconds.
> 11/05/20 21:38:33 INFO common.Storage: Edits file 
> /home/ramkrishna/opensrchadoop/appendbranch/hadoop-0.20.3-SNAPSHOT/bin/../hadoop-root/dfs/name/current/edits
>  of size 4 edits # 0 loaded in 0 seconds.
> 11/05/20 21:38:33 INFO common.Storage: Image file of size 94 saved in 0 
> seconds.
> 11/05/20 21:38:34 INFO common.Storage: Image file of size 94 saved in 0 
> seconds.
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Finished loading FSImage in 482 
> msecs
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Total number of blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of invalid blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of under-replicated 
> blocks = 0
> 11/05/20 21:38:34 INFO namenode.FSNamesystem: Number of  over-replicated 
> blocks = 0
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Leaving safe mode after 0 
> secs.
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* Network topology has 0 racks 
> and 0 datanodes
> 11/05/20 21:38:34 INFO hdfs.StateChange: STATE* UnderReplicatedBlocks has 0 
> blocks
> 11/05/20 21:38:34 INFO mortbay.log: Logging to 
> org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via 
> org.mortbay.log.Slf4jLog
> 11/05/20 21:38:35 INFO http.HttpServer: Port returned by 
> webServer.getConnectors()[0].getLocalPort() before open() is -1. Opening the 
> listener on 50070
> 11/05/20 21:38:35 INFO http.HttpServer: listener.getLocalPort() returned 
> 50070 webServer.getConnectors()[0].getLocalPort() returned 50070
> 11/05/20 21:38:35 INFO http.HttpServer: Jetty bound to port 50070
> 11/05/20 21:38:35 INFO mortbay.log: jetty-6.1.14
> 11/05/20 21:38:37 INFO mortbay.log: Started 
> SelectChannelConnector@linux76:50070
> 11/05/20 21:38:37 INFO namenode.NameNode: Web-server up at: linux76:50070
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server Responder: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server listener on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 0 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 1 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 2 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 3 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 4 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 5 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 6 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 7 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 8 on 9000: starting
> 11/05/20 21:38:37 INFO ipc.Server: IPC Server handler 9 on 9000: starting
> 11/05/20 21:38:38 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: 
> node registration from 10.18.52.76:50010 storage 
> DS-1868335495-10.18.52.76-50010-1305907718956
> 11/05/20 21:38:38 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/10.18.52.76:50010
> 11/05/20 21:38:41 INFO hdfs.StateChange: BLOCK* NameSystem.registerDatanode: 
> node registration from 10.18.52.5:50010 storage 
> DS-721814011-10.18.52.5-50010-1305928120116
> 11/05/20 21:38:41 INFO net.NetworkTopology: Adding a new node: 
> /default-rack/10.18.52.5:50010
> 11/05/20 21:38:54 INFO FSNamesystem.audit: ugi=R00902313,Tardis       
> ip=/10.18.47.133        cmd=create      src=/synctest0  dst=null        
> perm=R00902313:supergroup:rw-r--r--
> 11/05/20 21:38:54 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
> /synctest0. blk_-8455334393673385140_1001
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: 10.18.52.5:50010 is added to blk_-8455334393673385140_1001 
> size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: 10.18.52.76:50010 is added to blk_-8455334393673385140_1001 
> size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
> /synctest0. blk_8955752156241965528_1001
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: 10.18.52.76:50010 is added to blk_8955752156241965528_1001 
> size 1024
> 11/05/20 21:38:55 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> blockMap updated: 10.18.52.5:50010 is added to blk_8955752156241965528_1001 
> size 1024
> 11/05/20 21:39:11 INFO hdfs.StateChange: BLOCK* NameSystem.allocateBlock: 
> /synctest0. blk_-5249191180607448701_1001
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=blk_-5249191180607448701_1001, 
> newgenerationstamp=1002, newlength=833, newtargets=[10.18.52.76:50010], 
> closeFile=false, deleteBlock=false)
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: Number of transactions: 4 Total 
> time for transactions(ms): 2Number of transactions batched in Syncs: 0 Number 
> of syncs: 1 SyncTimes(ms): 28 
> 11/05/20 21:40:18 INFO namenode.FSNamesystem: 
> commitBlockSynchronization(blk_-5249191180607448701_1002) successful
> 11/05/20 21:41:41 INFO namenode.LeaseManager: Lease [Lease.  Holder: 
> DFSClient_-676883980, pendingcreates: 1] has expired hard limit
> 11/05/20 21:41:56 INFO namenode.FSNamesystem: Recovering lease=[Lease.  
> Holder: DFSClient_-676883980, pendingcreates: 1], src=/synctest0
> 11/05/20 21:41:56 INFO hdfs.StateChange: BLOCK* blk_-5249191180607448701_1002 
> recovery started, primary=10.18.52.76:50010
> 11/05/20 21:42:09 INFO namenode.FSNamesystem: 
> commitBlockSynchronization(lastblock=blk_-5249191180607448701_1002, 
> newgenerationstamp=1003, newlength=867, newtargets=[10.18.52.76:50010], 
> closeFile=true, deleteBlock=false)
> 11/05/20 21:42:26 INFO hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> addStoredBlock request received for blk_-5249191180607448701_1003 on 
> 10.18.52.76:50010 size 867 and it belongs to a file under construction. 
> 11/05/20 21:42:52 WARN namenode.FSNamesystem: Inconsistent size for block 
> blk_-5249191180607448701_1003 reported from 10.18.52.76:50010 current size is 
> 833 reported size is 867
> 11/05/20 21:43:11 WARN namenode.FSNamesystem: Block 
> blk_-5249191180607448701_1003 reported from 10.18.52.76:50010 does not exist 
> in blockMap. Surprise! Surprise!
> 11/05/20 21:43:13 INFO hdfs.StateChange: Removing lease on  file /synctest0 
> from client NN_Recovery
> 11/05/20 21:43:13 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
> replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010
> 11/05/20 21:43:13 INFO namenode.FSNamesystem: Number of transactions: 6 Total 
> time for transactions(ms): 2Number of transactions batched in Syncs: 0 Number 
> of syncs: 2 SyncTimes(ms): 30 
> 11/05/20 21:43:13 INFO namenode.FSNamesystem: 
> commitBlockSynchronization(newblock=blk_-5249191180607448701_1003, 
> file=/synctest0, newgenerationstamp=1003, newlength=867, 
> newtargets=[10.18.52.76:50010]) successful
> 11/05/20 21:43:15 INFO ipc.Server: IPC Server handler 6 on 9000, call 
> blockReceived(DatanodeRegistration(10.18.52.76:50010, 
> storageID=DS-1868335495-10.18.52.76-50010-1305907718956, infoPort=50075, 
> ipcPort=50020), [Lorg.apache.hadoop.hdfs.protocol.Block;@9b774e, 
> [Ljava.lang.String;@b5d05b) from 10.18.52.76:40343: error: 
> java.io.IOException: java.lang.NullPointerException
> java.io.IOException: java.lang.NullPointerException
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.addStoredBlock(FSNamesystem.java:3173)
>       at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.blockReceived(FSNamesystem.java:3592)
>       at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.blockReceived(NameNode.java:756)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:961)
>       at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:957)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:955)
> 11/05/20 21:43:19 WARN hdfs.StateChange: BLOCK* NameSystem.addStoredBlock: 
> Redundant addStoredBlock request received for blk_-5249191180607448701_1003 
> on 10.18.52.76:50010 size 867
> 11/05/20 21:43:19 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
> replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010
> Listening for transport dt_socket at address: 8007
> 11/05/20 21:43:22 INFO hdfs.StateChange: BLOCK* ask 10.18.52.76:50010 to 
> replicate blk_-5249191180607448701_1003 to datanode(s) 10.18.52.5:50010
>  

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to