Exception while syncing from Flume to HDFS

2013-07-19 Thread Divya R
I'm running Hadoop 1.2.0 and Flume 1.3. Everything works fine when each is
run independently. When I start my Tomcat I get the exception below after
some time.

  2013-07-17 12:40:35,640 (ResponseProcessor for block
blk_5249456272858461891_436734) [WARN -
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:3015)]
DFSOutputStream ResponseProcessor exception  for block
blk_5249456272858461891_436734java.net.SocketTimeoutException: 63000
millis timeout while waiting for channel to be ready for read. ch :
java.nio.channels.SocketChannel[connected local=/127.0.0.1:24433
remote=/127.0.0.1:50010]
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:164)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readLong(DataInputStream.java:416)
at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2967)

 2013-07-17 12:40:35,800 (hdfs-hdfs-write-roll-timer-0) [WARN -
org.apache.flume.sink.hdfs.BucketWriter.doClose(BucketWriter.java:277)]
failed to close() HDFSWriter for file
(hdfs://localhost:9000/flume/Broadsoft_App2/20130717/jboss/Broadsoft_App2.1374044838498.tmp).
Exception follows.
java.io.IOException: All datanodes 127.0.0.1:50010 are bad. Aborting...
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3096)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2100(DFSClient.java:2589)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2793)


  Java snippet for configuration

configuration.set("fs.default.name", "hdfs://localhost:9000");
configuration.set("mapred.job.tracker", "hdfs://localhost:9000");

I'm using a single datanode. The files are written to HDFS by Flume, and my
Java program just reads them from HDFS to show them on the screen, nothing
more. Any sort of help is highly appreciated.
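
For reference, the reading side is roughly the following; this is only a
minimal sketch, and the file name under the Flume directory is a placeholder
rather than the actual one:

import java.io.InputStream;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsFilePrinter {
    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // NameNode URI; mapred.job.tracker is not needed for a plain HDFS read,
        // and when set it normally takes the JobTracker host:port, not an hdfs:// URI.
        configuration.set("fs.default.name", "hdfs://localhost:9000");

        FileSystem fileSystem = FileSystem.get(configuration);
        // Placeholder file under the Flume sink directory.
        Path path = new Path("/flume/Broadsoft_App2/20130717/jboss/some-file");
        InputStream in = null;
        try {
            in = fileSystem.open(path);
            IOUtils.copyBytes(in, System.out, 4096, false); // print the file contents to the screen
        } finally {
            IOUtils.closeStream(in); // always release the stream (and its datanode socket)
        }
    }
}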


Regards,
Divya


Exception in the middle of reading files.

2013-10-25 Thread Divya R
Hi Guys,

   I'm indexing data (~50-100 GB per day) from Hadoop. Hadoop is running in
cluster mode (currently with 2 datanodes). Every two or three hours I get
this exception, even though both datanodes are up and running. Can anyone
please guide me as to what I should do, or tell me if I'm doing something wrong?

Code Snippet:
import java.io.IOException;
import java.util.Scanner;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;

public class InitHadoop {

    private Configuration configuration;
    private FileSystem fileSystem;

    public InitHadoop() {
        configuration = new Configuration();
        configuration.set("fs.default.name", "hdfs://<>:54310"); // Is this the right way to specify the NameNode IP?
        configuration.set("mapred.job.tracker", "hdfs://<>:54311");

        try {
            fileSystem = FileSystem.get(configuration);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    private void indexDocument(FSDataInputStream file) {
        Scanner scanner = new Scanner(file);

        while (scanner.hasNext()) { // hasNext() returns a boolean, so no null comparison
            // Indexing code (must consume tokens, e.g. scanner.next(), or the loop never advances)
        }
    }
}
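
The code that opens and releases each FSDataInputStream isn't shown above; as
a minimal sketch (the directory and method name are illustrative, not the
actual ones), a caller inside the same class might look like this, closing
each stream in a finally block once indexing is done:

    // Additional imports: org.apache.hadoop.fs.FileStatus, org.apache.hadoop.fs.Path
    private void indexAll(String dir) throws IOException {
        // Iterate the Flume output directory and index each file in it.
        for (FileStatus status : fileSystem.listStatus(new Path(dir))) {
            if (status.isDir()) {
                continue; // skip subdirectories in this sketch
            }
            FSDataInputStream in = fileSystem.open(status.getPath());
            try {
                indexDocument(in);
            } finally {
                in.close(); // release the stream (and its datanode connection) after each file
            }
        }
    }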

Logs:

2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:57 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:57 INFO  DFSClient:2432 - Could not obtain block
blk_-8795538519317154213_432897 from any node: java.io.IOException: No live
nodes contain current block. Will get new block locations from namenode and
retry...
2013-10-25 09:37:58 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:58 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:58 INFO  DFSClient:2432 - Could not obtain block
blk_-5974673190155585497_432671 from any node: java.io.IOException: No live
nodes contain current block. Will get new block locations from namenode and
retry...
2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:59 INFO  DFSClient:2432 - Could not obtain block
blk_-1662761320365439855_431653 from any node: java.io.IOException: No live
nodes contain current block. Will get new block locations from namenode and
retry...
2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:59 WARN  DFSClient:2266 - Failed to connect to
/<>:50010, add to deadNodes and continuejava.net.BindException: Cannot
assign requested address
2013-10-25 09:37:59 WARN  DFSClient:2400 - DFS Read: java.io.IOException:
Could not obtain block: blk_8826777676488299245_432528
file=/flume/<>.1382639351042
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.chooseDataNode(DFSClient.java:2426)
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.blockSeekTo(DFSClient.java:2218)
at
org.apache.hadoop.hdfs.DFSClient$DFSInputStream.read(DFSClient.java:2381)
at java.io.DataInputStream.read(DataInputStream.java:149)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)

Regards,
-Divya


Re: Exception in the middle of reading files.

2013-11-07 Thread Divya R
Hi Chris,

  Thanks a lot for the help. After a lot of investigation I found that the
issue was with cached socket connections, which was raised as a bug by
Nicholas. The bug details are as follows:

HDFS-3373 <https://issues.apache.org/jira/browse/HDFS-3373> FileContext
HDFS implementation can leak socket caches

When I executed the command netstat -a | grep 50010, the count was
approximately 52000. This issue is fixed in
0.20.3 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12314814> and
0.20.205.0 <https://issues.apache.org/jira/browse/HDFS/fixforversion/12316392>,
but the fix is not present in hadoop-1.2.X. Could you please guide me as to
what I can do?

-Divya


On Sat, Oct 26, 2013 at 12:38 AM, Chris Nauroth wrote:

> Hi Divya,
>
> The exceptions indicate that the HDFS client failed to establish a network
> connection to a datanode hosting a block that the client is trying to read.
>  After too many of these failures (default 3, but configurable), the HDFS
> client aborts the read and this bubbles up to the caller with the "could
> not obtain block" error.
>
> I recommend troubleshooting this as a network connectivity issue.  This
> wiki page includes a few tips as a starting point:
>
> http://wiki.apache.org/hadoop/TroubleShooting
>
> Hope this helps,
>
> Chris Nauroth
> Hortonworks
> http://hortonworks.com/