Re: RE: batch_mutate failed: out of sequence response

Dan Washusen Mon, 18 Apr 2011 17:48:48 -0700

An example scenario (that is now fixed in Pelops):
1. Attempt to write a column with a null value.
2. Cassandra throws a TProtocolException, which renders the connection
   useless for future operations.
3. Pelops returns the corrupt connection to the pool.
4. A second read operation is attempted with the corrupt connection and
   Cassandra throws an ApplicationException.
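
To illustrate the fix, here is a minimal sketch of the "discard the
connection on anything except NotFoundException" policy. It is not the
actual Pelops code: PoolSketch, Pool, Connection and Operation are
hypothetical names, and the two exception classes are local stand-ins for
the real Thrift/Cassandra types so the example compiles on its own.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class PoolSketch {
    // Hypothetical stand-ins for the exception types named in this thread.
    static class NotFoundException extends Exception {}
    static class TProtocolException extends Exception {}

    // Hypothetical connection; a real one would wrap a Thrift transport.
    static class Connection {
        final int id;
        boolean open = true;
        Connection(int id) { this.id = id; }
        void close() { open = false; }
    }

    interface Operation<T> {
        T execute(Connection c) throws Exception;
    }

    static class Pool {
        private final Deque<Connection> idle = new ArrayDeque<>();
        private int nextId = 0;

        synchronized Connection borrow() {
            Connection c = idle.poll();
            return c != null ? c : new Connection(nextId++);
        }

        synchronized void giveBack(Connection c) { idle.push(c); }

        synchronized int idleCount() { return idle.size(); }

        // Runs one operation on one borrowed connection. The connection goes
        // back to the pool only on success or NotFoundException; any other
        // failure closes it so it can never be handed out again.
        <T> T run(Operation<T> op) throws Exception {
            Connection c = borrow();
            boolean reusable = false;
            try {
                T result = op.execute(c);
                reusable = true;
                return result;
            } catch (NotFoundException e) {
                reusable = true; // a missing row does not corrupt the connection
                throw e;
            } finally {
                if (reusable) {
                    giveBack(c);
                } else {
                    c.close();
                }
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Pool pool = new Pool();
        try {
            // Simulates the write with a null column value.
            pool.run(c -> { throw new TProtocolException(); });
        } catch (TProtocolException e) {
            System.out.println("write failed, idle connections: " + pool.idleCount());
        }
        // The follow-up read gets a fresh connection, not the corrupt one.
        Integer id = pool.run(c -> c.id);
        System.out.println("read used connection " + id);
    }
}
```

The point of the boolean flag is that the decision defaults to "discard":
a new exception type that nobody thought about closes the connection rather
than poisoning the pool.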



A Pelops test case for this can be found here:
https://github.com/s7/scale7-pelops/blob/3fe7584a24bb4b62b01897a814ef62415bd2fe43/src/test/java/org/scale7/cassandra/pelops/MutatorIntegrationTest.java#L262

Cheers,
-- 
Dan Washusen
On Tuesday, 19 April 2011 at 10:28 AM, Jonathan Ellis wrote: 
> Any idea what's causing the original TPE?
> 
> On Mon, Apr 18, 2011 at 6:22 PM, Dan Washusen <d...@reactive.org> wrote:
> > It turns out that once a TProtocolException is thrown from Cassandra the
> > connection is useless for future operations. Pelops was closing connections
> > when it detected TimedOutException, TTransportException and
> > UnavailableException but not TProtocolException. We have now changed Pelops
> > to close connections in all cases *except* NotFoundException.
> > 
> > Cheers,
> > --
> > Dan Washusen
> > 
> > On Friday, 8 April 2011 at 7:28 AM, Dan Washusen wrote:
> > 
> > Pelops uses a single connection per operation from a pool that is backed by
> > Apache Commons Pool (assuming you're using Cassandra 0.7). I'm not saying
> > it's perfect but it's NOT sharing a connection over multiple threads.
> > Dan Hendry mentioned that he sees these errors. Is he also using Pelops?
> > From his comment about retrying I'd assume not...
> > 
> > --
> > Dan Washusen
> > 
> > On Thursday, 7 April 2011 at 7:39 PM, Héctor Izquierdo Seliva wrote:
> > 
> > On Wed, 06-04-2011 at 21:04 -0500, Jonathan Ellis wrote:
> > 
> > "out of sequence response" is thrift's way of saying "I got a response
> > for request Y when I expected request X."
> > 
> > my money is on using a single connection from multiple threads. don't do
> > that.
> > 
> > I'm not using thrift directly, and my application is single-threaded, so I
> > guess this is Pelops' fault somehow. Since I managed to tame memory
> > consumption the problem has not appeared again, but it always happened
> > during a stop-the-world GC. Could it be that the message was sent
> > instead of being dropped by the server when the client assumed it had
> > timed out?
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
>
