Thomas, I had a similar problem a few weeks back. I changed my code so that each thread creates and uses only one Hector connection. It seems that client sockets are not being released properly, but I haven't had time to dig into it.
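A rough sketch of the one-connection-per-thread pattern, written in Python for brevity rather than Java. `FakeConnection`, `get_connection` and `release_connection` are hypothetical names made up for illustration; none of this is Hector's API, and it only shows the general idea of caching one connection per thread instead of opening a new one for every request.

```python
import threading

# Per-thread storage: each thread sees its own "conn" attribute.
_local = threading.local()


class FakeConnection:
    """Hypothetical stand-in for a client connection (not Hector's API)."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def get_connection(factory=FakeConnection):
    """Return the calling thread's connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = factory()
        _local.conn = conn
    return conn


def release_connection():
    """Close and forget the calling thread's connection, if it has one."""
    conn = getattr(_local, "conn", None)
    if conn is not None:
        conn.close()
        _local.conn = None
```

Each worker thread would call `get_connection()` wherever it needs the client and `release_connection()` once before it exits, so the number of open client sockets is bounded by the number of threads.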
Jorge

On Wed, Jul 14, 2010 at 8:28 AM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > [snip]
> > I'm not sure that is the case.
> >
> > When the server gets into the unrecoverable state, the repeating exceptions
> > are indeed "SocketException: Too many open files".
> [snip]
> > Although this is unquestionably a network error, I don't think it is actually a
> > network problem per se, as the maximum number of sockets open by the
> > Cassandra server is at this point is about 8. When I kill the client, sockets
> > held are just the listening sockets - no sockets in ESTABLISHED or
> > TIMED_WAIT.
>
> Is this based on netstat or lsof or similar? When the node is in the
> state of giving these errors, try inspecting /proc/<pid>/fd or use
> lsof. Presumably you'll see thousands of fds of some category; either
> sockets or files.
>
> (If you already did this, sorry!)
>
> --
> / Peter Schuller
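The /proc/<pid>/fd inspection suggested above can be sketched as a small Python script that groups a process's open descriptors by category. This is a Linux-only sketch; the function name `count_fds` and the four category labels are assumptions chosen for illustration. It relies on the fact that each entry in /proc/<pid>/fd is a symlink whose target looks like `socket:[...]`, `pipe:[...]`, `anon_inode:...`, or a file path.

```python
import os
from collections import Counter


def count_fds(pid="self"):
    """Count open file descriptors of a process, grouped by rough category."""
    fd_dir = f"/proc/{pid}/fd"
    counts = Counter()
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if target.startswith("socket:"):
            counts["socket"] += 1
        elif target.startswith("pipe:"):
            counts["pipe"] += 1
        elif target.startswith("anon_inode:"):
            counts["anon_inode"] += 1
        else:
            # Regular files, directories, devices, and so on.
            counts["file"] += 1
    return counts


if __name__ == "__main__":
    # Inspect the current process; pass a numeric pid to inspect another one.
    for category, n in count_fds().most_common():
        print(f"{category}: {n}")
```

Reading another process's fd directory (for example the Cassandra server's) requires running as the same user or as root. If one category is in the thousands while the node is throwing "Too many open files", that is the likely leak.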