On 7/14/2010 11:07 AM, Jonathan Ellis wrote:
SocketException means this is coming from the network, not the sstables.

knowing the full error message would be nice, but just about any
problem on that end should be fixed by adding connection pooling to
your client.

(moving to user@)

On Wed, Jul 14, 2010 at 5:09 AM, Thomas Downing
<tdown...@proteus-technologies.com>  wrote:
On 7/13/2010 9:20 AM, Jonathan Ellis wrote:
On Tue, Jul 13, 2010 at 4:19 AM, Thomas Downing
<tdown...@proteus-technologies.com>    wrote:

On a related note:  I am running some feasibility tests looking for
high ingest rate capabilities.  While testing Cassandra the problem
I've encountered is that it runs out of file handles during compaction.


[snip]
I'm not sure that is the case.

When the server gets into the unrecoverable state, the repeating exceptions
are indeed "SocketException: Too many open files".

WARN [main] 2010-07-14 06:08:46,772 TThreadPoolServer.java (line 190) Transport error occurred during acceptance of message.
org.apache.thrift.transport.TTransportException: java.net.SocketException: Too many open files
    at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:124)
    at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:35)
    at org.apache.thrift.transport.TServerTransport.accept(TServerTransport.java:31)
    at org.apache.thrift.server.TThreadPoolServer.serve(TThreadPoolServer.java:184)
    at org.apache.cassandra.thrift.CassandraDaemon.start(CassandraDaemon.java:149)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:190)
Caused by: java.net.SocketException: Too many open files
    at java.net.PlainSocketImpl.socketAccept(Native Method)
    at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:390)
    at java.net.ServerSocket.implAccept(ServerSocket.java:453)
    at java.net.ServerSocket.accept(ServerSocket.java:421)
    at org.apache.thrift.transport.TServerSocket.acceptImpl(TServerSocket.java:119)
    ... 5 more

Although this is unquestionably a network error, I don't think it is actually a
network problem per se, as the maximum number of sockets held open by the
Cassandra server at this point is about 8. When I kill the client, the only
sockets held are the listening sockets - none in ESTABLISHED or
TIME_WAIT.
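
To see which kind of descriptor is actually exhausting the limit, the open
descriptors of the server process can be grouped by type. Sockets and data
files count against the same per-process limit, so a "Too many open files"
raised from accept() does not by itself show that sockets are what is being
leaked. The following is only a minimal sketch, assuming a Linux host where
/proc/<pid>/fd is readable by the user running it; the class name FdSummary
and its methods are hypothetical and not part of Cassandra or Thrift.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: groups a process's open descriptors by kind.
class FdSummary {

    // Classifies one /proc/<pid>/fd link target, e.g. "socket:[12345]".
    static String classify(String target) {
        if (target.startsWith("socket:")) return "socket";
        if (target.startsWith("pipe:")) return "pipe";
        if (target.startsWith("anon_inode:")) return "anon_inode";
        if (target.startsWith("/")) return "file";
        return "other";
    }

    // Counts how many link targets fall into each kind.
    static Map<String, Integer> summarize(List<String> targets) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String target : targets) {
            counts.merge(classify(target), 1, Integer::sum);
        }
        return counts;
    }

    // Reads the link targets under /proc/<pid>/fd (Linux only).
    static List<String> readTargets(String pid) throws IOException {
        List<String> targets = new ArrayList<>();
        Path fdDir = Paths.get("/proc", pid, "fd");
        try (DirectoryStream<Path> fds = Files.newDirectoryStream(fdDir)) {
            for (Path fd : fds) {
                try {
                    targets.add(Files.readSymbolicLink(fd).toString());
                } catch (IOException e) {
                    // Descriptor was closed between listing and reading; skip it.
                }
            }
        }
        return targets;
    }

    public static void main(String[] args) throws IOException {
        // With no argument, inspect this JVM itself via /proc/self/fd.
        String pid = args.length > 0 ? args[0] : "self";
        System.out.println(summarize(readTargets(pid)));
    }
}
```

Run it with the Cassandra server's PID as the only argument while the server
is in the failing state; a large "file" count next to a small "socket" count
would point at data files rather than connections.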

I was originally using the client interface provided by Hector, but went to the
direct Thrift API to eliminate moving parts in the puzzle. When using Hector,
I was using the ClientConnectionPool. Either way, the behavior is the same.

Just a further note: my client test jig acquires a single connection, then uses
that connection for successive batch_mutate operations, without closing.
It only closes the connection on an exception, or at the end of the run.  If
it would be helpful, I can change that to open/mutate/close and repeat to
see what happens.
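
The two client strategies can be sketched as follows. This is only an
illustration under assumptions: Conn is a hypothetical stand-in for the Thrift
client connection, IngestJig is not the actual test jig code, and the returned
value is just the number of connections each strategy opens.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch of the two connection lifecycles; not the real jig.
class IngestJig {

    // Stand-in for a client connection; not a real Thrift or Hector type.
    interface Conn extends AutoCloseable {
        void batchMutate(String batch);

        @Override
        void close();
    }

    // Current behavior: one connection for the whole run,
    // closed at the end of the run or when an exception escapes.
    static int runSingleConnection(Supplier<Conn> open, List<String> batches) {
        try (Conn conn = open.get()) {
            for (String batch : batches) {
                conn.batchMutate(batch);
            }
        }
        return 1;
    }

    // Proposed variant: open, mutate, close, and repeat for every batch.
    static int runOpenMutateClose(Supplier<Conn> open, List<String> batches) {
        int opened = 0;
        for (String batch : batches) {
            try (Conn conn = open.get()) {
                opened++;
                conn.batchMutate(batch);
            }
        }
        return opened;
    }
}
```

In both variants try-with-resources closes the connection even when
batchMutate throws, so neither should leave client sockets behind.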

Thanks
Thomas Downing
