In the past I ran into a similar problem, which was actually caused by a bug in
Hadoop. Someone was nice enough to come up with a workaround for it; perhaps
you are running into the same thing. I also had this problem when calling
lots of "LOAD FILE" commands. After adding this to hive-site.xml we never
had the problem again:

  <!-- workaround for connection leak problem fixed in HADOOP-5476 but only
committed to hadoop 0.21.0 -->
  <property>
    <name>hive.fileformat.check</name>
    <value>false</value>
  </property>
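
If editing hive-site.xml is not convenient, the same flag can normally also be
set per session from the Hive CLI before running the loads:

  set hive.fileformat.check=false;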

________________________________
From: Terje Marthinussen [mailto:tmarthinus...@gmail.com]
Sent: Friday, January 07, 2011 4:14 AM
To: user@hive.apache.org
Subject: Re: Too many open files

No, the problem is connections to datanodes on port 50010.

Terje
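
One rough way to confirm that on Linux, assuming you can read /proc on the
HiveServer host, is to count the sockets whose remote port is 50010. The class
below is only an illustrative sketch (the name is made up, and /proc/net/tcp6
would also need checking if IPv6 is in use):

  import java.io.BufferedReader;
  import java.io.FileReader;
  import java.io.IOException;

  public class DatanodeConnectionCount {
      public static void main(String[] args) throws IOException {
          BufferedReader in = new BufferedReader(new FileReader("/proc/net/tcp"));
          try {
              in.readLine();                                  // skip the header line
              int count = 0;
              String line;
              while ((line = in.readLine()) != null) {
                  String[] fields = line.trim().split("\\s+");
                  // fields[2] is "rem_address:port" with the port in hex; 0xC35A == 50010
                  String portHex = fields[2].substring(fields[2].indexOf(':') + 1);
                  if (Integer.parseInt(portHex, 16) == 50010) {
                      count++;
                  }
              }
              System.out.println("connections to datanode port 50010: " + count);
          } finally {
              in.close();
          }
      }
  }
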
On Fri, Jan 7, 2011 at 11:46 AM, Shrijeet Paliwal <shrij...@rocketfuel.com> wrote:
You mentioned that you got the code from trunk, so it's fair to assume you
are not hitting https://issues.apache.org/jira/browse/HIVE-1508.
Still worth checking, though. Are all the open files hive history files
(they look like hive_job_log*.txt)? As Viral suggested, you can
check that by monitoring the open files.
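
One quick way to do that on Linux, assuming you can read /proc for the
HiveServer process, is to walk /proc/<pid>/fd and print what each descriptor
points at (hive_job_log*.txt files versus sockets to datanodes). The class
below is only an illustrative sketch, not part of Hive or Hadoop:

  import java.io.IOException;
  import java.nio.file.DirectoryStream;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.nio.file.Paths;

  public class OpenFileMonitor {
      public static void main(String[] args) throws IOException {
          String pid = args.length > 0 ? args[0] : "self";   // pass the HiveServer pid
          int count = 0;
          DirectoryStream<Path> fds = Files.newDirectoryStream(Paths.get("/proc", pid, "fd"));
          try {
              for (Path fd : fds) {
                  // each /proc/<pid>/fd entry is a symlink to the open file, pipe or socket
                  System.out.println(fd.getFileName() + " -> " + Files.readSymbolicLink(fd));
                  count++;
              }
          } finally {
              fds.close();
          }
          System.out.println("open descriptors: " + count);
      }
  }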

-Shrijeet

On Thu, Jan 6, 2011 at 6:15 PM, Viral Bajaria <viral.baja...@gmail.com> wrote:
> Hi Terje,
>
> I have asked about this issue in an earlier thread but never got any
> response.
>
> I get this exception when I am using Hive over Thrift and submitting 1000s
> of LOAD FILE commands. If you actively monitor the open file count for the
> user under which I run the Hive instance, it keeps creeping up with every
> LOAD FILE command sent to it.
>
> I have a temporary fix: increasing the allowed number of open files to 60000+ and
> then periodically restarting my Thrift server (once every 2 days) to release
> the open file handles.
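
For reference, raising that limit persistently on Linux is usually done with
entries like the following in /etc/security/limits.conf; the user name is just
an example and should be whatever account runs the Thrift/Hive server:

  # /etc/security/limits.conf  (example user and values)
  hive   soft   nofile   65536
  hive   hard   nofile   65536
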
>
> I would appreciate some feedback. (trying to find my earlier email)
>
> Thanks,
> Viral
>
> On Thu, Jan 6, 2011 at 4:57 PM, Terje Marthinussen <tmarthinus...@gmail.com>
> wrote:
>>
>> Hi,
>> While loading some 10k+ .gz files through HiveServer with LOAD FILE etc.
>> etc.
>> 11/01/06 22:12:42 INFO exec.CopyTask: Copying data from file:XXX.gz to
>> hdfs://YYY
>> 11/01/06 22:12:42 INFO hdfs.DFSClient: Exception in
>> createBlockOutputStream java.net.SocketException: Too many open files
>> 11/01/06 22:12:42 INFO hdfs.DFSClient: Abandoning block
>> blk_8251287732961496983_1741138
>> 11/01/06 22:12:48 INFO hdfs.DFSClient: Exception in
>> createBlockOutputStream java.net.SocketException: Too many open files
>> 11/01/06 22:12:48 INFO hdfs.DFSClient: Abandoning block
>> blk_-2561354015640936272_1741138
>> 11/01/06 22:12:54 WARN hdfs.DFSClient: DataStreamer Exception:
>> java.io.IOException: Too many open files
>>         at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
>>         at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:69)
>>         at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
>>         at
>> sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
>>         at
>> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407)
>>         at
>> org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322)
>>         at
>> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
>>         at
>> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
>>         at
>> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
>>         at
>> java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
>>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>         at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2314)
>> 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block
>> blk_2907917521214666486_1741138 bad datanode[0] 172.27.1.34:50010
>> 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block
>> blk_2907917521214666486_1741138 in pipeline 172.27.1.34:50010,
>> 172.27.1.4:50010: bad datanode 172.27.1.34:50010
>> Exception in thread "DataStreamer for file YYY block
>> blk_2907917521214666486_1741138" java.lang.NullPointerException
>>         at
>> org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:351)
>>         at
>> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:313)
>>         at
>> org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
>>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
>>         at org.apache.hadoop.ipc.Client.call(Client.java:720)
>>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>         at $Proxy9.recoverBlock(Unknown Source)
>>         at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2581)
>>         at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2102)
>>         at
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2265)
>> After this, the HiveServer stops working, throwing various exceptions due
>> to too many open files.
>> This is from a trunk checkout from yesterday, January 6th.
>> Seems like we are leaking connections to datanodes on port 50010?
>> Regards,
>> Terje
>
