Jaydeep,

No, we don't use any lightweight transactions.

Mike
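(For context, a lightweight transaction is any conditional write, i.e. an INSERT or UPDATE with an IF clause; it runs a Paxos round before the write and adds extra round trips, which is why it is a common timeout suspect. A hypothetical CQL sketch of the kind of statement being ruled out here; the keyspace, table, and columns are invented:

    -- Hypothetical; the IF clause is what makes this a lightweight
    -- transaction (Paxos-backed) rather than a plain write.
    INSERT INTO metrics.samples (key, ts, value)
    VALUES ('m1', 42, 3.0)
    IF NOT EXISTS;

A plain INSERT without the IF clause skips Paxos entirely.)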
On Wed, Feb 17, 2016 at 6:44 PM, Jaydeep Chovatia
<chovatia.jayd...@gmail.com> wrote:

> Are you guys using lightweight transactions in your write path?
>
> On Thu, Feb 11, 2016 at 12:36 AM, Fabrice Facorat
> <fabrice.faco...@gmail.com> wrote:
>
>> Are your commitlog and data directories on the same disk? If yes, you
>> should put the commitlog on a separate disk that doesn't see much
>> other IO.
>>
>> Other IO can have a large impact on your commitlog writes and may
>> even cause them to block.
>>
>> An example of the impact IO can have, even on async writes:
>> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic
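(Fabrice's suggestion maps to two settings in cassandra.yaml. A minimal sketch, assuming a dedicated second volume; the mount points below are hypothetical, not taken from this thread:

    # Keep the commitlog on its own low-IO volume so commitlog appends
    # never queue behind data-file flushes or compaction. Paths are
    # examples only.
    data_file_directories:
        - /mnt/data/cassandra/data
    commitlog_directory: /mnt/commitlog/cassandra/commitlog

Both are standard cassandra.yaml keys; only the paths need to change.)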
>>
>> 2016-02-11 0:31 GMT+01:00 Mike Heffner <m...@librato.com>:
>> > Jeff,
>> >
>> > We have both commitlog and data on a 4TB EBS volume with 10k IOPS.
>> >
>> > Mike
>> >
>> > On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa
>> > <jeff.ji...@crowdstrike.com> wrote:
>> >>
>> >> What disk size are you using?
>> >>
>> >> From: Mike Heffner
>> >> Reply-To: "user@cassandra.apache.org"
>> >> Date: Wednesday, February 10, 2016 at 2:24 PM
>> >> To: "user@cassandra.apache.org"
>> >> Cc: Peter Norton
>> >> Subject: Re: Debugging write timeouts on Cassandra 2.2.5
>> >>
>> >> Paulo,
>> >>
>> >> Thanks for the suggestion; we ran some tests against CMS and saw the
>> >> same timeouts. On that note, though, we are going to try doubling
>> >> the instance sizes and testing with double the heap (even though
>> >> current usage is low).
>> >>
>> >> Mike
>> >>
>> >> On Wed, Feb 10, 2016 at 3:40 PM, Paulo Motta
>> >> <pauloricard...@gmail.com> wrote:
>> >>>
>> >>> Are you using the same GC settings as the staging 2.0 cluster? If
>> >>> not, could you try using the default GC settings (CMS) and see if
>> >>> that changes anything? This is just a wild guess, but there have
>> >>> been reports of G1-caused instability with small heap sizes
>> >>> (< 16GB; see CASSANDRA-10403 for more context). Please ignore if
>> >>> you already tried reverting to CMS.
>> >>>
>> >>> 2016-02-10 16:51 GMT-03:00 Mike Heffner <m...@librato.com>:
>> >>>>
>> >>>> Hi all,
>> >>>>
>> >>>> We've recently embarked on a project to update our Cassandra
>> >>>> infrastructure running on EC2. We are long-time users of 2.0.x
>> >>>> and are testing a move to version 2.2.5 running in a VPC with
>> >>>> EBS. Our test setup is a 3-node, RF=3 cluster supporting a small
>> >>>> write load (a mirror of our staging load).
>> >>>>
>> >>>> We are writing at QUORUM and, while p95s look good compared to
>> >>>> our staging 2.0.x cluster, we are seeing frequent write
>> >>>> operations that time out at the max write_request_timeout_in_ms
>> >>>> (10 seconds). CPU across the cluster is < 10% and EBS write load
>> >>>> is < 100 IOPS. Cassandra is running on the Oracle JDK 8u60 with
>> >>>> G1GC, and GC pauses are all under 500ms.
>> >>>>
>> >>>> We run on c4.2xl instances with GP2 EBS attached storage for the
>> >>>> data and commitlog directories. The nodes use EC2 enhanced
>> >>>> networking and have the latest Intel network driver module. We
>> >>>> are running on HVM instances using Ubuntu 14.04.2.
>> >>>>
>> >>>> Our schema is 5 tables, all with COMPACT STORAGE. Each table is
>> >>>> similar to the definition here:
>> >>>> https://gist.github.com/mheffner/4d80f6b53ccaa24cc20a
>> >>>>
>> >>>> This is our cassandra.yaml:
>> >>>> https://gist.github.com/mheffner/fea80e6e939dd483f94f#file-cassandra-yaml
>> >>>>
>> >>>> As mentioned, we use 8u60 with G1GC and have applied many of the
>> >>>> GC settings from Al Tobey's tuning guide. This is our upstart
>> >>>> config with the JVM and other CPU settings:
>> >>>> https://gist.github.com/mheffner/dc44613620b25c4fa46d
>> >>>>
>> >>>> We've used several of the sysctl settings from Al's guide as
>> >>>> well: https://gist.github.com/mheffner/ea40d58f58a517028152
>> >>>>
>> >>>> Our client application can write either Thrift batches via the
>> >>>> Astyanax driver or async CQL INSERTs via the DataStax Java
>> >>>> driver.
>> >>>>
>> >>>> For testing against Thrift (our legacy infra uses this) we write
>> >>>> batches of anywhere from 6 to 1500 rows at a time. Our p99 for
>> >>>> batch execution is around 45ms and our maximum (p100) sits below
>> >>>> 150ms, except when it periodically spikes to the full 10 seconds.
>> >>>>
>> >>>> Testing the same write path using CQL writes instead shows
>> >>>> similar behavior: low p99s except for periodic full timeouts. We
>> >>>> enabled tracing for several operations but were unable to get a
>> >>>> trace that completed successfully; Cassandra started logging many
>> >>>> messages like:
>> >>>>
>> >>>> INFO [ScheduledTasks:1] - MessagingService.java:946 - _TRACE
>> >>>> messages were dropped in last 5000 ms: 52499 for internal timeout
>> >>>> and 0 for cross node timeout
>> >>>>
>> >>>> And all the traces contained rows with a null source_elapsed
>> >>>> value:
>> >>>> https://gist.githubusercontent.com/mheffner/1d68a70449bd6688a010/raw/0327d7d3d94c3a93af02b64212e3b7e7d8f2911b/trace.out
>> >>>>
>> >>>> We've exhausted as many configuration-option permutations as we
>> >>>> can think of. This cluster does not appear to be under any
>> >>>> significant load, and latencies seem to fall largely into two
>> >>>> bands: low normal or max timeout. This seems to imply that
>> >>>> something is getting stuck and timing out at the max write
>> >>>> timeout.
>> >>>>
>> >>>> Any suggestions on what to look for? We had debug logging enabled
>> >>>> for a while but didn't see any message that pointed to something
>> >>>> obvious. Happy to provide any more information that may help.
>> >>>>
>> >>>> We are pretty much at the point of sprinkling debug statements
>> >>>> around the code to track down what could be blocking.
>> >>>>
>> >>>> Thanks,
>> >>>>
>> >>>> Mike
>> >>>>
>> >>>> --
>> >>>> Mike Heffner <m...@librato.com>
>> >>>> Librato, Inc.
>> >>
>> >> --
>> >> Mike Heffner <m...@librato.com>
>> >> Librato, Inc.
>> >
>> > --
>> > Mike Heffner <m...@librato.com>
>> > Librato, Inc.
>>
>> --
>> Close the World, Open the Net
>> http://www.linux-wizard.net

--
Mike Heffner <m...@librato.com>
Librato, Inc.
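(Since the thread describes reproducing the timeouts over both write paths, here is a minimal sketch of the async CQL path Mike describes, using the DataStax Java driver at QUORUM; the contact point, keyspace, and schema are hypothetical, not taken from this thread:

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.PreparedStatement;
    import com.datastax.driver.core.ResultSetFuture;
    import com.datastax.driver.core.Session;

    import java.util.ArrayList;
    import java.util.List;

    public class AsyncWriteSketch {
        public static void main(String[] args) {
            // Hypothetical contact point and schema; not from the thread.
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("metrics")) {

                // Prepare once, write many times at QUORUM, matching the
                // consistency level used in the tests above.
                PreparedStatement insert = session.prepare(
                        "INSERT INTO samples (key, ts, value) VALUES (?, ?, ?)");
                insert.setConsistencyLevel(ConsistencyLevel.QUORUM);

                // Fire async inserts, then block for completion so
                // per-request latency (and any timeout) is observable.
                List<ResultSetFuture> futures = new ArrayList<>();
                for (int i = 0; i < 1000; i++) {
                    futures.add(session.executeAsync(
                            insert.bind("m1", (long) i, Math.random())));
                }
                for (ResultSetFuture f : futures) {
                    // Throws WriteTimeoutException for a timed-out write.
                    f.getUninterruptibly();
                }
            }
        }
    }

Each getUninterruptibly() surfaces a WriteTimeoutException for any request that hits the 10-second cap, which makes the periodic spikes easy to isolate from the low-latency band.)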