Seems like this works for me too! That probably saved me a bunch of hours tracing this down through hive and hadoop.
Do you know what the side effect of setting this to false would be? Thanks!

Terje

On Fri, Jan 7, 2011 at 4:39 PM, Bennie Schut <bsc...@ebuddy.com> wrote:
> In the past I ran into a similar problem which was actually caused by a
> bug in hadoop. Someone was nice enough to come up with a workaround for
> this. Perhaps you are running into a similar problem. I also had this
> problem when calling lots of “load file” commands. After adding this to the
> hive-site.xml we never had this problem again:
>
> <!-- workaround for connection leak problem fixed in HADOOP-5476 but only
> committed to hadoop 0.21.0 -->
> <property>
>   <name>hive.fileformat.check</name>
>   <value>false</value>
> </property>
>
> ------------------------------
> *From:* Terje Marthinussen [mailto:tmarthinus...@gmail.com]
> *Sent:* Friday, January 07, 2011 4:14 AM
> *To:* user@hive.apache.org
> *Subject:* Re: Too many open files
>
> No, the problem is connections to datanodes on port 50010.
>
> Terje
>
> On Fri, Jan 7, 2011 at 11:46 AM, Shrijeet Paliwal <shrij...@rocketfuel.com> wrote:
> You mentioned that you got the code from trunk so fair to assume you
> are not hitting https://issues.apache.org/jira/browse/HIVE-1508
> Worth checking still. Are all the open files hive history files
> (they look like hive_job_log*.txt)? Like Viral suggested you can
> check that by monitoring open files.
>
> -Shrijeet
>
> On Thu, Jan 6, 2011 at 6:15 PM, Viral Bajaria <viral.baja...@gmail.com> wrote:
> > Hi Terje,
> >
> > I have asked about this issue in an earlier thread but never got any
> > response.
> >
> > I get this exception when I am using Hive over Thrift and submitting 1000s
> > of LOAD FILE commands. If you actively monitor the open file count of the
> > user under which I run the hive instance, it keeps on creeping up for every
> > LOAD FILE command sent to it.
> >
> > I have a temporary fix: increasing the number of open files to 60000+ and
> > then periodically restarting my thrift server (once every 2 days) to release
> > the open file handles.
> >
> > I would appreciate some feedback. (trying to find my earlier email)
> >
> > Thanks,
> > Viral
> >
> > On Thu, Jan 6, 2011 at 4:57 PM, Terje Marthinussen <tmarthinus...@gmail.com> wrote:
> >> Hi,
> >> While loading some 10k+ .gz files through HiveServer with LOAD FILE etc. etc.
> >> 11/01/06 22:12:42 INFO exec.CopyTask: Copying data from file:XXX.gz to hdfs://YYY
> >> 11/01/06 22:12:42 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files
> >> 11/01/06 22:12:42 INFO hdfs.DFSClient: Abandoning block blk_8251287732961496983_1741138
> >> 11/01/06 22:12:48 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files
> >> 11/01/06 22:12:48 INFO hdfs.DFSClient: Abandoning block blk_-2561354015640936272_1741138
> >> 11/01/06 22:12:54 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Too many open files
> >>         at sun.nio.ch.EPollArrayWrapper.epollCreate(Native Method)
> >>         at sun.nio.ch.EPollArrayWrapper.<init>(EPollArrayWrapper.java:69)
> >>         at sun.nio.ch.EPollSelectorImpl.<init>(EPollSelectorImpl.java:52)
> >>         at sun.nio.ch.EPollSelectorProvider.openSelector(EPollSelectorProvider.java:18)
> >>         at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.get(SocketIOWithTimeout.java:407)
> >>         at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:322)
> >>         at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
> >>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:146)
> >>         at org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:107)
> >>         at java.io.BufferedOutputStream.write(BufferedOutputStream.java:105)
> >>         at java.io.DataOutputStream.write(DataOutputStream.java:90)
> >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2314)
> >> 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block blk_2907917521214666486_1741138 bad datanode[0] 172.27.1.34:50010
> >> 11/01/06 22:12:54 WARN hdfs.DFSClient: Error Recovery for block blk_2907917521214666486_1741138 in pipeline 172.27.1.34:50010, 172.27.1.4:50010: bad datanode 172.27.1.34:50010
> >> Exception in thread "DataStreamer for file YYY block blk_2907917521214666486_1741138" java.lang.NullPointerException
> >>         at org.apache.hadoop.ipc.Client$Connection.handleConnectionFailure(Client.java:351)
> >>         at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:313)
> >>         at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
> >>         at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
> >>         at org.apache.hadoop.ipc.Client.call(Client.java:720)
> >>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
> >>         at $Proxy9.recoverBlock(Unknown Source)
> >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2581)
> >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$1600(DFSClient.java:2102)
> >>         at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2265)
> >>
> >> After this, the HiveServer stops working, throwing various exceptions due
> >> to too many open files.
> >>
> >> This is from a trunk checkout from yesterday, January 6th.
> >>
> >> Seems like we are leaking connections to datanodes on port 50010?
> >>
> >> Regards,
> >> Terje
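A practical addendum for anyone tracing the same leak: Shrijeet and Viral both suggest watching the open file count of the HiveServer process. Below is a minimal sketch of one way to do that by polling /proc on Linux. The script is only an illustration (it is not from this thread); it assumes a Linux host, that it runs as the hive user or root, and that the HiveServer PID is passed as its first argument.

#!/usr/bin/env python
# Poll the descriptor count of a process (e.g. HiveServer) so a slow leak
# like the one described above is easy to spot. Illustration only.
import os
import sys
import time

def count_fds(pid):
    """Number of entries in /proc/<pid>/fd, i.e. open file descriptors."""
    return len(os.listdir("/proc/%d/fd" % pid))

def count_sockets(pid):
    """How many of those descriptors are sockets (links like 'socket:[123]')."""
    fd_dir = "/proc/%d/fd" % pid
    n = 0
    for name in os.listdir(fd_dir):
        try:
            if os.readlink(os.path.join(fd_dir, name)).startswith("socket:"):
                n += 1
        except OSError:
            pass  # descriptor was closed between listdir() and readlink()
    return n

if __name__ == "__main__":
    # Hypothetical usage: python fd_monitor.py <hiveserver-pid>
    pid = int(sys.argv[1])
    while True:
        print("fds=%d sockets=%d" % (count_fds(pid), count_sockets(pid)))
        time.sleep(10)

If the socket count climbs with every LOAD FILE and never comes back down, that points at leaked datanode connections (port 50010) rather than hive_job_log*.txt history files.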