Re: Ring connection timeouts with 2.2.6

2016-07-23 Thread Mike Heffner
ikes up to 20-30, when the > normal average load is around 3-4. So far I haven't found any good reason, > but I'm going to try otc_coalescing_strategy: disabled tomorrow. > > - Garo > > On Fri, Jul 15, 2016 at 6:16 PM, Mike Heffner wrote: > >> Just to fo

Re: Ring connection timeouts with 2.2.6

2016-07-15 Thread Mike Heffner
Jul 5, 2016 at 12:14 PM, Mike Heffner wrote: > Jeff, > > Thanks, yeah we updated to the 2.16.4 driver version from source. I don't > believe we've hit the bugs mentioned in earlier driver versions. > > Mike > > On Mon, Jul 4, 2016 at 11:16 PM, Jeff Jirsa > w

Re: Ring connection timeouts with 2.2.6

2016-07-05 Thread Mike Heffner
epending on your instance types / hypervisor choice, you may want to > ensure you’re not seeing that bug. > > > > *From: *Mike Heffner > *Reply-To: *"user@cassandra.apache.org" > *Date: *Friday, July 1, 2016 at 1:10 PM > *To: *"user@cassandra.apache.org"

Re: Ring connection timeouts with 2.2.6

2016-07-01 Thread Mike Heffner
en any >> noticeable change in heap. >> >> On Thu, Jun 23, 2016 at 10:38 AM, Mike Heffner wrote: >> >>> Hi, >>> >>> We have a 12 node 2.2.6 ring running in AWS, single DC with RF=3, that >>> is sitting at <25% CPU, doing mostly writes

Re: Ring connection timeouts with 2.2.6

2016-06-25 Thread Mike Heffner
One thing to add, if we do a rolling restart of the ring the timeouts disappear entirely for several hours and performance returns to normal. It's as if something is leaking over time, but we haven't seen any noticeable change in heap. On Thu, Jun 23, 2016 at 10:38 AM, Mike Heffner wr

Ring connection timeouts with 2.2.6

2016-06-23 Thread Mike Heffner
hts on what to look for? Can we increase thread count/pool sizes for the messaging service? Thanks, Mike -- Mike Heffner Librato, Inc.

Re: Consistent read timeouts for bursts of reads

2016-03-04 Thread Mike Heffner
wrote: > Mike, > > Is that where you've bisected it to having been introduced? > > I'll see what I can do, but doubt it, since we've long since upgraded prod > to 2.2.4 (and stage before that) and the tests I'm running were for a new > feature. > > > On

Re: Consistent read timeouts for bursts of reads

2016-03-03 Thread Mike Heffner
seen in >> https://gist.github.com/emilssolmanis/242d9d02a6d8fb91da8a >> * there's nothing funny in the timed out Cassandra node's logs around >> that time as far as I can tell, not even in the debug logs. >> >> Any ideas about what might be causing this, pointers to server config >> options, or how else we might debug this would be much appreciated. >> >> Kind regards, >> Emils >> >> -- Mike Heffner Librato, Inc.

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-24 Thread Mike Heffner
pending on the driver, these may now be allowing 32k streams per > connection(!) as detailed in v3 of the native protocol: > > https://github.com/apache/cassandra/blob/cassandra-2.1/doc/native_protocol_v3.spec#L130-L152 > > > > On Fri, Feb 19, 2016 at 8:48 AM, Mike Heffner wrote: >
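For context on the "32k streams per connection" figure quoted above: a minimal sketch, assuming the stream id is a signed integer of which the client may only use the non-negative values (the native protocol reserves negative ids for server-initiated messages). The helper name `max_client_streams` is made up for illustration.

```python
def max_client_streams(id_bytes: int) -> int:
    """Count the non-negative values of a signed integer that is id_bytes wide."""
    return 2 ** (8 * id_bytes - 1)


# One-byte stream id (native protocol v1/v2) versus two-byte stream id (v3).
print(max_client_streams(1))
print(max_client_streams(2))
```

The jump from a one-byte to a two-byte id is what lets a single connection carry far more in-flight requests, which is the concern raised in the quoted reply.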

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-19 Thread Mike Heffner
, in clause > etc? > > > Anuj > > > Sent from Yahoo Mail on Android > <https://overview.mail.yahoo.com/mobile/?.src=Android> > > On Thu, 18 Feb, 2016 at 8:45 pm, Mike Heffner > wrote: > Alain, > > Thanks for the suggestions. > > Sure, tpstats are he

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-18 Thread Mike Heffner
te: >>> >>>> Are your commitlog and data on the same disk ? If yes, you should put >>>> commitlogs on a separate disk which doesn't have a lot of IO. >>>> >>>> Other IO may have great impact on your commitlog writing and >>>>

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-18 Thread Mike Heffner
st on that earlier. Thanks, Mike On Wed, Feb 10, 2016 at 2:51 PM, Mike Heffner wrote: > Hi all, > > We've recently embarked on a project to update our Cassandra > infrastructure running on EC2. We are long time users of 2.0.x and are > testing out a move to version 2.2.5 running o

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-17 Thread Mike Heffner
it may even block. >> >> An example of impact IO may have, even for Async writes: >> >> https://engineering.linkedin.com/blog/2016/02/eliminating-large-jvm-gc-pauses-caused-by-background-io-traffic >> >> 2016-02-11 0:31 GMT+01:00 Mike Heffner : >> > Jef

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-10 Thread Mike Heffner
Jeff, We have both commitlog and data on a 4TB EBS with 10k IOPS. Mike On Wed, Feb 10, 2016 at 5:28 PM, Jeff Jirsa wrote: > What disk size are you using? > > > > From: Mike Heffner > Reply-To: "user@cassandra.apache.org" > Date: Wednesday, February

Re: Debugging write timeouts on Cassandra 2.2.5

2016-02-10 Thread Mike Heffner
DRA-10403 > for more context). Please ignore if you already tried reverting back to CMS. > > 2016-02-10 16:51 GMT-03:00 Mike Heffner : > >> Hi all, >> >> We've recently embarked on a project to update our Cassandra >> infrastructure running on EC2. We are l

Debugging write timeouts on Cassandra 2.2.5

2016-02-10 Thread Mike Heffner
't see any msg that pointed to something obvious. Happy to provide any more information that may help. We are pretty much at the point of sprinkling debug around the code to track down what could be blocking. Thanks, Mike -- Mike Heffner Librato, Inc.

Re: Significant drop in storage load after 2.1.6->2.1.8 upgrade

2015-07-19 Thread Mike Heffner
e drop. > > However, the discussion on > https://issues.apache.org/jira/browse/CASSANDRA-9683 seems to be similar > to what you saw and that is currently being investigated. > > On Fri, Jul 17, 2015 at 10:24 AM, Mike Heffner wrote: > >> Hi all, >> >> I've

Significant drop in storage load after 2.1.6->2.1.8 upgrade

2015-07-17 Thread Mike Heffner
'org.apache.cassandra.io.compress.LZ4Compressor'} AND dclocal_read_repair_chance = 0.1 AND default_time_to_live = 0 AND gc_grace_seconds = 0 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND read_repair_chance = 0.0 AND speculative_retry = '99.0PERCENTILE'; Thanks, Mike -- Mike Heffner Librato, Inc.

Re: How to column slice with CQL + 1.2

2014-07-18 Thread Mike Heffner
umn2, column3) > (1, 3, 4) > AND (column1) < (2) > > > On Thu, Jul 17, 2014 at 6:01 PM, Mike Heffner wrote: > >> Michael, >> >> So if I switch to: >> >> SELECT * FROM CF WHERE key='X' AND column1=1 AND column2=3 AND column3>4 >

Re: How to column slice with CQL + 1.2

2014-07-17 Thread Mike Heffner
D column1=1 AND column2=3 AND > column3>4 AND column1<=2; > > On Thu, Jul 17, 2014 at 6:23 PM, Mike Heffner wrote: > > What is the proper way to perform a column slice using CQL with 1.2? > > > > I have a CF with a primary key X and 3 composite columns (A, B,

How to column slice with CQL + 1.2

2014-07-17 Thread Mike Heffner
lumn1=1 AND column2=3 AND column3>4 AND column1<=2; fails with: DoGetMeasures: column1 cannot be restricted by both an equal and an inequal relation This is against Cassandra 1.2.16. What is the proper way to perform this query? Cheers, Mike -- Mike Heffner Librato, Inc.
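The slice being asked for can be read as a range over composite column names, which sort lexicographically. The following is only a sketch of that ordering in Python, not of how Cassandra evaluates the query: the sample data and the helper `column_slice` are hypothetical, and it assumes the intent is "everything after (1, 3, 4), up to and including column1 = 2".

```python
# Hypothetical composite column names (column1, column2, column3) in one row.
columns = [(1, 2, 9), (1, 3, 4), (1, 3, 5), (1, 4, 0), (2, 0, 0), (2, 7, 1), (3, 0, 0)]


def column_slice(cols, start_exclusive, column1_max):
    """Return columns strictly after start_exclusive whose first component is <= column1_max.

    Python compares tuples component by component, left to right, which is the
    same lexicographic ordering used here to model composite column names.
    """
    return [c for c in sorted(cols) if c > start_exclusive and c[0] <= column1_max]


selected = column_slice(columns, (1, 3, 4), 2)
print(selected)
```

Seen this way, the slice is a single contiguous range with a lower bound on the full tuple and an upper bound on its first component, which is why mixing `column1=1` with `column1<=2` in one query is rejected.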

How to restart bootstrap after a failed streaming due to Broken Pipe (1.2.16)

2014-06-09 Thread Mike Heffner
om streaming it seems that simply restarting will inevitably hit this problem again. Cheers, Mike -- Mike Heffner Librato, Inc.

Re: Failed decommission

2013-08-25 Thread Mike Heffner
ill "up" but "leaving" but I > can't seem to get it out of the ring. For example: > > % nodetool removenode 6ba2c7d4-713e-4c14-8df8-f861fb211b0d > Exception in thread "main" java.lang.UnsupportedOperationException: Node / > 10.0.0.3 is alive and owns this ID. Use decommission command to remove it > from the ring > > Any ideas? > > /Janne -- Mike Heffner Librato, Inc.

Re: Decommission faster than bootstrap

2013-08-22 Thread Mike Heffner
at is the reason to have this time difference? For both operations, > what it is time-consuming the data streaming from (or to) other node, right? >Thanks in advance. > > Att. > > *Rodrigo Felix de Almeida* > LSBD - Universidade Federal do Ceará > Project Manager > MBA, CSM, CSPO, SCJP > -- Mike Heffner Librato, Inc.

Re: High performance hardware with lot of data per node - Global learning about configuration

2013-07-11 Thread Mike Heffner
ssandra-env.sh >> > >> > I am not sure about how to tune the heap, so I mainly use defaults >> > >> > MAX_HEAP_SIZE="8G" >> > HEAP_NEWSIZE="400M" (I tried with higher values, and it produced bigger >> GC times (1600 ms instead of < 200 ms now with 400M) >> > >> > -XX:+UseParNewGC >> > -XX:+UseConcMarkSweepGC >> > -XX:+CMSParallelRemarkEnabled >> > -XX:SurvivorRatio=8 >> > -XX:MaxTenuringThreshold=1 >> > -XX:CMSInitiatingOccupancyFraction=70 >> > -XX:+UseCMSInitiatingOccupancyOnly >> > >> > Does this configuration seems coherent ? Right now, performance are >> correct, latency < 5ms almost all the time. What can I do to handle more >> data per node and keep these performances or get even better once ? >> > >> > I know this is a long message but if you have any comment or insight >> even on part of it, don't hesitate to share it. I guess this kind of >> comment on configuration is usable by the entire community. >> > >> > Alain >> > >> >> > > > -- > > Mike Heffner > Librato, Inc. > > > -- Mike Heffner Librato, Inc.
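As a rough aid for reading the quoted settings: with `-XX:SurvivorRatio=8`, eden is eight times the size of one survivor space, and the young generation holds eden plus two survivor spaces. The sketch below just does that arithmetic for `HEAP_NEWSIZE="400M"`; the function name is made up, and the real JVM aligns and rounds these sizes, so treat the numbers as approximate.

```python
def young_gen_layout(new_size_mb: int, survivor_ratio: int) -> dict:
    """Split the young generation into eden and one survivor space.

    Assumes young gen = eden + 2 survivor spaces, with eden / survivor = survivor_ratio.
    """
    survivor = new_size_mb / (survivor_ratio + 2)
    eden = survivor * survivor_ratio
    return {"eden_mb": eden, "survivor_mb": survivor}


# Values taken from the quoted cassandra-env.sh settings.
layout = young_gen_layout(400, 8)
print(layout)
```

With `-XX:MaxTenuringThreshold=1`, objects that survive one young collection are promoted on the next, so the small survivor spaces matter mostly as a brief staging area before promotion to the old generation.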

Re: High performance hardware with lot of data per node - Global learning about configuration

2013-07-11 Thread Mike Heffner
UseParNewGC > > -XX:+UseConcMarkSweepGC > > -XX:+CMSParallelRemarkEnabled > > -XX:SurvivorRatio=8 > > -XX:MaxTenuringThreshold=1 > > -XX:CMSInitiatingOccupancyFraction=70 > > -XX:+UseCMSInitiatingOccupancyOnly > > > > Does this configuration seems coherent ? Right now, performance are > correct, latency < 5ms almost all the time. What can I do to handle more > data per node and keep these performances or get even better once ? > > > > I know this is a long message but if you have any comment or insight > even on part of it, don't hesitate to share it. I guess this kind of > comment on configuration is usable by the entire community. > > > > Alain > > > > -- Mike Heffner Librato, Inc.

Re: manually removing sstable

2013-07-10 Thread Mike Heffner
ever has expired, > and we'd rather just manually get rid of those. > > T# > -- Mike Heffner Librato, Inc.

Re: High performance hardware with lot of data per node - Global learning about configuration

2013-07-09 Thread Mike Heffner
nitiatingOccupancyOnly > > Does this configuration seems coherent ? Right now, performance are > correct, latency < 5ms almost all the time. What can I do to handle more > data per node and keep these performances or get even better once ? > > I know this is a long message but if you have any comment or insight even > on part of it, don't hesitate to share it. I guess this kind of comment on > configuration is usable by the entire community. > > Alain > > -- Mike Heffner Librato, Inc.

Re: Streaming performance with 1.2.6

2013-07-02 Thread Mike Heffner
indicate that the sending node is limiting our streaming rate. Mike On Tue, Jul 2, 2013 at 3:00 PM, Mike Heffner wrote: > Sankalp, > > Parallel sstableloader streaming would definitely be valuable. > > However, this ring is currently using vnodes and I was surprised to see > th

Re: Streaming performance with 1.2.6

2013-07-02 Thread Mike Heffner
> https://issues.apache.org/jira/browse/CASSANDRA-4784 > > > On Tue, Jul 2, 2013 at 7:35 AM, Mike Heffner wrote: > >> >> On Mon, Jul 1, 2013 at 10:06 PM, Mike Heffner wrote: >> >>> >>> The only changes we've made to the config (aside from di

Re: Streaming performance with 1.2.6

2013-07-02 Thread Mike Heffner
On Mon, Jul 1, 2013 at 10:06 PM, Mike Heffner wrote: > > The only changes we've made to the config (aside from dirs/hosts) are: > Forgot to include we've changed this as well: -partitioner: org.apache.cassandra.dht.Murmur3Partitioner +partitioner: org.apache.cassandra.dh

Streaming performance with 1.2.6

2013-07-01 Thread Mike Heffner
ansfers with rsync. Any suggestions for what to adjust to see better streaming performance? 5% of what a single rsync can do seems somewhat limited. Thanks, Mike -- Mike Heffner Librato, Inc.

Re: Upgrade 1.1.2 -> 1.1.6

2012-11-20 Thread Mike Heffner
On Tue, Nov 20, 2012 at 2:49 PM, Rob Coli wrote: > On Mon, Nov 19, 2012 at 7:18 PM, Mike Heffner wrote: > > We performed a 1.1.3 -> 1.1.6 upgrade and found that all the logs > replayed > > regardless of the drain. > > Your experience and desire for different (expected

Re: Upgrade 1.1.2 -> 1.1.6

2012-11-20 Thread Mike Heffner
nathan Ellis told that a drain would avoid > this issue, It seems like it doesn't. > > @Rob > > You understood precisely the 2 issues I met during the upgrade. I am sad > to see none of them is yet resolved and probably wont. > > > 2012/11/20 Mike Heffner > >

Re: Upgrade 1.1.2 -> 1.1.6

2012-11-19 Thread Mike Heffner
sandra.yaml.bak > 140 vim /etc/cassandra/cassandra.yaml > 141 service cassandra start > > After both of these updates I saw my current counters increase without any > reason. > > Did I do anything wrong ? > > Alain > > -- Mike Heffner Librato, Inc.

Re: Hinted Handoff runs every ten minutes

2012-11-08 Thread Mike Heffner
en Pierce wrote: > > I'm running 1.1.5; the bug says it's fixed in 1.0.9/1.1.0. > > > > How can I check to see why it keeps running HintedHandoff? > you have tombstone is system.HintsColumnFamily use list command in > cassandra-cli to check > > -- Mike Heffner Librato, Inc.

Re: Migrating data from a 0.8.8 -> 1.1.2 ring

2012-07-24 Thread Mike Heffner
On Mon, Jul 23, 2012 at 1:25 PM, Mike Heffner wrote: > Hi, > > We are migrating from a 0.8.8 ring to a 1.1.2 ring and we are noticing > missing data post-migration. We use pre-built/configured AMIs so our > preferred route is to leave our existing production 0.8.8 untouched a

Migrating data from a 0.8.8 -> 1.1.2 ring

2012-07-23 Thread Mike Heffner
wards. Any assistance would be appreciated. Thanks! Mike -- Mike Heffner Librato, Inc.

Wildcard character for CF in access.properties?

2011-04-14 Thread Mike Heffner
Is there a wildcard for the COLUMNFAMILY field in `access.properties`? I'd like to split read-write and read-only access between my backend and frontend users, respectively, however the full list of CFs is not known a priori. I'm using 0.7.4. Cheers, Mike -- Mike Heffner