Mike,

Is that where you've bisected it to having been introduced?
I'll see what I can do, but doubt it, since we've long since upgraded prod to 2.2.4 (and stage before that) and the tests I'm running were for a new feature.

On Fri, 4 Mar 2016 03:54 Mike Heffner, <m...@librato.com> wrote:

> Emils,
>
> I realize this may be a big downgrade, but are your timeouts reproducible
> under Cassandra 2.1.4?
>
> Mike
>
> On Thu, Feb 25, 2016 at 10:34 AM, Emīls Šolmanis <emils.solma...@gmail.com>
> wrote:
>
>> Having had a read through the archives, I missed this at first, but this
>> seems to be *exactly* like what we're experiencing.
>>
>> http://www.mail-archive.com/user@cassandra.apache.org/msg46064.html
>>
>> Only difference is we're getting this for reads and using CQL, but the
>> behaviour is identical.
>>
>> On Thu, 25 Feb 2016 at 14:55 Emīls Šolmanis <emils.solma...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> We're having a problem with concurrent requests. It seems that whenever
>>> we try resolving more than ~ 15 queries at the same time, one or two get
>>> a read timeout and then succeed on a retry.
>>>
>>> We're running Cassandra 2.2.4 accessed via the 2.1.9 Datastax driver on
>>> AWS.
>>>
>>> What we've found while investigating:
>>>
>>> * this is not db-wide. Trying the same pattern against another table
>>>   everything works fine.
>>> * it fails 1 or 2 requests regardless of how many are executed in
>>>   parallel, i.e., it's still 1 or 2 when we ramp it up to ~ 120 concurrent
>>>   requests and doesn't seem to scale up.
>>> * the problem is consistently reproducible. It happens both under
>>>   heavier load and when just firing off a single batch of requests for
>>>   testing.
>>> * tracing the faulty requests says everything is great. An example
>>>   trace: https://gist.github.com/emilssolmanis/41e1e2ecdfd9a0569b1a
>>> * the only peculiar thing in the logs is there's no acknowledgement of
>>>   the request being accepted by the server, as seen in
>>>   https://gist.github.com/emilssolmanis/242d9d02a6d8fb91da8a
>>> * there's nothing funny in the timed out Cassandra node's logs around
>>>   that time as far as I can tell, not even in the debug logs.
>>>
>>> Any ideas about what might be causing this, pointers to server config
>>> options, or how else we might debug this would be much appreciated.
>>>
>>> Kind regards,
>>> Emils
>>>
>
> --
>
> Mike Heffner <m...@librato.com>
> Librato, Inc.