On Wed, Aug 30, 2017 at 5:10 PM, Ivan Iliev <ivan.iliev.il...@gmail.com> wrote:
> Hello everyone,
>
> We are using Cassandra 3.9 for storing quite a lot of data produced from
> our tester machines.
>
> Occasionally, we are seeing issues with apps not being able to communicate
> with Cassandra nodes, returning the following errors (captured in
> servicemix logs):
>
>> by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (no host was tried)
>>     at com.datastax.driver.core.RequestHandler.reportNoMoreHosts(RequestHandler.java:218)
>>     at com.datastax.driver.core.RequestHandler.access$1000(RequestHandler.java:43)
>>     at com.datastax.driver.core.RequestHandler$SpeculativeExecution.sendRequest(RequestHandler.java:284)
>>     at com.datastax.driver.core.RequestHandler.startNewExecution(RequestHandler.java:115)
>>     at com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:91)
>>     at com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:132)
>>     ... 107 more
>
> As a result, apps that try to send data to cassandra get crashed due to
> running out of memory and we have to restart the containers in which they
> run.
>
> So far I have not been able to identify what might be the cause for this
> as nothing (at least I could not find anything relevant on the timestamps)
> in the cassandra debug and system logs.
>
> Could you share some insight on this? What to check and where to start
> from, in order to troubleshoot this.

We've seen this error once on AWS EC2, when Cassandra was configured with Ec2MultiRegionSnitch but the application code didn't use the EC2MultiRegionAddressTranslator[1,2]. What happened to us was that whenever the node the client first tried to connect to was unavailable, the driver wouldn't even try to contact the other nodes, since it had somehow figured out that it wouldn't be able to reach them.

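To illustrate what address translation means in this context, here is a minimal, self-contained sketch. It does not use the driver at all: the class `StaticAddressTranslator`, its `map` helper and the IP addresses are made up for illustration. As far as I understand it, with Ec2MultiRegionSnitch the nodes advertise their public IPs, while a client in the same region may need the private ones, and the translator is the hook that rewrites one into the other. The driver's own EC2MultiRegionAddressTranslator resolves addresses through DNS rather than a static table like this one.

```java
import java.net.InetSocketAddress;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in that mimics the idea of an address translator:
// rewrite the address a node advertises into one the client can reach.
// In driver 3.x the real translator is enabled on the cluster builder via
//   Cluster.builder().withAddressTranslator(new EC2MultiRegionAddressTranslator())
final class StaticAddressTranslator {
    // Advertised (public) IP -> IP reachable from the client (private).
    private final Map<String, String> publicToPrivate = new HashMap<>();

    // Register one mapping; returns this so calls can be chained.
    StaticAddressTranslator map(String publicIp, String privateIp) {
        publicToPrivate.put(publicIp, privateIp);
        return this;
    }

    // Return the mapped address, keeping the port; pass unknown addresses through.
    InetSocketAddress translate(InetSocketAddress advertised) {
        String mapped = publicToPrivate.get(advertised.getHostString());
        if (mapped == null) {
            return advertised;
        }
        return InetSocketAddress.createUnresolved(mapped, advertised.getPort());
    }

    public static void main(String[] args) {
        // Example addresses only (documentation ranges), not real nodes.
        StaticAddressTranslator translator =
                new StaticAddressTranslator().map("203.0.113.10", "10.0.0.5");
        InetSocketAddress advertised =
                InetSocketAddress.createUnresolved("203.0.113.10", 9042);
        InetSocketAddress reachable = translator.translate(advertised);
        System.out.println(reachable.getHostString() + ":" + reachable.getPort());
    }
}
```

Without such a rewrite the client only knows addresses it may not be able to reach, which would be consistent with the "no host was tried" part of the error.
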
I don't recall all the details now, but after studying the driver code[3] we found that configuring address translation would fix the problem, which it did for us. I guess you might be hitting this very issue or a similar one.

Hope this helps,
--
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 127-59-707

[1] http://docs.datastax.com/en/drivers/java/3.2/com/datastax/driver/core/policies/EC2MultiRegionAddressTranslator.html
[2] https://docs.datastax.com/en/developer/java-driver/3.3/manual/address_resolution/
[3] https://github.com/datastax/java-driver/