Exception while syncing from Flume to HDFS
I'm running Hadoop 1.2.0 and Flume 1.3. Everything works fine when each component is run independently. When I start my Tomcat instance, I get the exception below after some time:

    2013-07-17 12:40:35,640 (ResponseProcessor for block blk_5249456272858461891_436734) [WARN - org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3015)] DFSOutputStream ResponseProcessor exception for block blk_5249456272858461891_436734
    java.net.SocketTimeoutException: 63000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/127.0.0.1:24433 remote=/127.0.0.1:50010]
        at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
        at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
        at java.io.DataInputStream.readFully(DataInputStream.java:195)
        at java.io.DataInputStream.readLong(DataInputStream.java:416)
        at org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2967)

    2013-07-17 12:40:35,800 (hdfs-hdfs-write-roll-timer-0) [WARN - org.apache.flume.sink.hdfs.BucketWriter.doClose(BucketWriter.java:277)] failed to close() HDFSWriter for file (hdfs://localhost:9000/flume/Broadsoft_App2/20130717/jboss/Broadsoft_App2.1374044838498.tmp). Exception follows.
    java.io.IOException: All datanodes 127.0.0.1:50010 are bad. Aborting...
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3096)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)

Java snippet for the Configuration:

    configuration.set("fs.default.name", "hdfs://localhost:9000");
    configuration.set("mapred.job.tracker", "hdfs://localhost:9000");

I'm using a single DataNode. My Java program just reads the files that were written to HDFS by Flume and displays them on the screen, nothing more.

Any help is highly appreciated.

Regards,
Divya
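P.S. If it turns out the DataNode is merely slow rather than actually down, would raising the DFS client socket timeouts help? Below is a rough, untested sketch of what I mean. I believe dfs.socket.timeout and dfs.datanode.socket.write.timeout are the Hadoop 1.x property names, but the class name and the 180000 ms values are just placeholders I picked for illustration; Flume's own writes would need the same properties in the hdfs-site.xml on Flume's classpath.

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    public class HdfsClientWithLongerTimeouts {
        public static FileSystem open() throws IOException {
            Configuration configuration = new Configuration();
            configuration.set("fs.default.name", "hdfs://localhost:9000");
            // Client-side socket read timeout; the default is about 60 s,
            // which matches the 63000 ms seen in the stack trace.
            configuration.set("dfs.socket.timeout", "180000");
            // Timeout for writes into the DataNode pipeline.
            configuration.set("dfs.datanode.socket.write.timeout", "180000");
            return FileSystem.get(configuration);
        }
    }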
Exception in the middle of reading files
Hi Guys,

I'm indexing data (~50-100 GB per day) from Hadoop. Hadoop is running in cluster mode (with 2 DataNodes currently). Every two or three hours I get the exception below, even though both DataNodes are up and running. Can anyone please guide me as to what I should do, or whether I'm doing something wrong?

Code snippet:

    public InitHadoop() {
        configuration = new Configuration();
        // Is it right to specify the NameNode IP here?
        configuration.set("fs.default.name", "hdfs://<>:54310");
        configuration.set("mapred.job.tracker", "hdfs://<>:54311");
        try {
            fileSystem = FileSystem.get(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void indexDocument(FSDataInputStream file) {
        Scanner scanner = new Scanner(file);
        while (scanner.hasNext()) {   // hasNext() returns a boolean; comparing it to null does not compile
            // Indexing code
        }
    }

Logs:

    2013-10-25 09:37:57 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:57 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:57 INFO DFSClient:2432 - Could not obtain block blk_-8795538519317154213_432897 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
    2013-10-25 09:37:58 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:58 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:58 INFO DFSClient:2432 - Could not obtain block blk_-5974673190155585497_432671 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
    2013-10-25 09:37:59 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:59 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:59 INFO DFSClient:2432 - Could not obtain block blk_-1662761320365439855_431653 from any node: java.io.IOException: No live nodes contain current block. Will get new block locations from namenode and retry...
    2013-10-25 09:37:59 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:59 WARN DFSClient:2266 - Failed to connect to /<>:50010, add to deadNodes and continue java.net.BindException: Cannot assign requested address
    2013-10-25 09:37:59 WARN DFSClient:2400 - DFS Read: java.io.IOException: Could not obtain block: blk_8826777676488299245_432528 file=/flume/<>.1382639351042
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2426)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2218)
        at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2381)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
        at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
        at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
        at java.io.InputStreamReader.read(InputStreamReader.java:184)

Regards,
-Divya
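P.S. One thing I suspect is that indexDocument() never closes the Scanner or the underlying FSDataInputStream, so every file read leaves a socket to the DataNode open; "Cannot assign requested address" usually means the client host has run out of ephemeral ports. Below is a rough, untested sketch of the change I have in mind (same names as in my snippet above; the token variable is just illustrative):

    private void indexDocument(FSDataInputStream file) {
        Scanner scanner = new Scanner(file);
        try {
            while (scanner.hasNext()) {
                String token = scanner.next();   // consume the input
                // Indexing code
            }
        } finally {
            // Closing the Scanner also closes the FSDataInputStream it wraps,
            // releasing the DataNode connection for this file.
            scanner.close();
        }
    }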
Re: Exception in the middle of reading files
Hi Chris,

Thanks a lot for the help. After a lot of investigation I found that the issue was with cached socket connections, which was raised as a bug by Nicholas:

    HDFS-3373 <https://issues.apache.org/jira/browse/HDFS-3373>
    FileContext HDFS implementation can leak socket caches

When I ran "netstat -a | grep 50010", the count was approximately 52,000. The issue is fixed in 0.20.3 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12314814> and 0.20.205.0 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12316392>, but the fix is not in hadoop-1.2.x. Could you please guide me as to what I can do?

-Divya

On Sat, Oct 26, 2013 at 12:38 AM, Chris Nauroth wrote:

> Hi Divya,
>
> The exceptions indicate that the HDFS client failed to establish a network
> connection to a datanode hosting a block that the client is trying to read.
> After too many of these failures (default 3, but configurable), the HDFS
> client aborts the read and this bubbles up to the caller with the "could
> not obtain block" error.
>
> I recommend troubleshooting this as a network connectivity issue. This
> wiki page includes a few tips as a starting point:
>
> http://wiki.apache.org/hadoop/TroubleShooting
>
> Hope this helps,
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/
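P.S. In the meantime, the workaround I'm thinking of trying is to recycle the client-side FileSystem handle every so often, so whatever sockets the client instance is holding get closed instead of piling up. This is only a sketch, not the actual HDFS-3373 fix: the class name RecyclingFileSystemHolder and the recycle threshold are made up, and closing a FileSystem obtained from FileSystem.get() affects anything else in the JVM that shares the cached instance.

    import java.io.IOException;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;

    /**
     * Hypothetical helper: hands out an HDFS FileSystem and recycles it every
     * recycleEvery uses, so sockets held by the client instance are released
     * instead of accumulating.
     */
    public class RecyclingFileSystemHolder {

        private final Configuration configuration;
        private final long recycleEvery;   // e.g. 10000 uses; tune as needed
        private FileSystem fileSystem;
        private long uses;

        public RecyclingFileSystemHolder(Configuration configuration, long recycleEvery) {
            this.configuration = configuration;
            this.recycleEvery = recycleEvery;
        }

        public synchronized FileSystem get() throws IOException {
            if (fileSystem != null && uses >= recycleEvery) {
                // close() also evicts this instance from FileSystem's internal cache,
                // so the next FileSystem.get() builds a fresh client.
                fileSystem.close();
                fileSystem = null;
                uses = 0;
            }
            if (fileSystem == null) {
                fileSystem = FileSystem.get(configuration);
            }
            uses++;
            return fileSystem;
        }
    }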