Inexplicable dropped messages

2020-11-18 Thread Gediminas Blazys
Hello, Not so long ago, through no doing of our own, we started observing an increased number of dropped read messages and we can't find an explanation for it. Perhaps you'll have some ideas where to look to try and decipher this. C* cluster: 7 DCs (5 of them with 18 nodes and 2 with 60) C* ver

Re: Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-03 Thread Anumod Mullachery
ves wrote: >>> You can get dropped message statistics over JMX. for example nodetool >>> tpstats has a counter for dropped hints from startup. that would be the >>> preferred method for tracking this info, rather than parsing logs >>> >>> On 2 Nov. 2017 6

Re: Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-02 Thread kurt greaves
this info, rather than parsing logs >> >> On 2 Nov. 2017 6:24 am, "Anumod Mullachery" >> wrote: >> >> >> Hi All, >> >> In cassandra v 2.1.15 , I'm able to pull the hints drop and dropped >> messages from cassandra.log as below-

Re: Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-02 Thread Anumod Mullachery
rtup. that would be the > preferred method for tracking this info, rather than parsing logs > > On 2 Nov. 2017 6:24 am, "Anumod Mullachery" > wrote: > > > Hi All, > > In cassandra v 2.1.15 , I'm able to pull the hints drop and dropped > messages from cassan

Re: Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-01 Thread kurt greaves
v 2.1.15 , I'm able to pull the hints drop and dropped messages from cassandra.log as below- dropped hints--> "/opt/xcal/apps/cassandra/logs/cassandra.log <https://splunk.ccp.cable.comcast.com/en-US/app/search/search?q=search%20ring%3A%3A*%20%20%20NOT%20ring%3A%3AXHOMEcls_P

Cassandra 3.10 - Hints & Dropped messages logs Vs Cass 2.x version

2017-11-01 Thread Anumod Mullachery
Hi All, In cassandra v 2.1.15 , I'm able to pull the hints drop and dropped messages from cassandra.log as below- dropped hints--> "/opt/xcal/apps/cassandra/logs/cassandra.log <https://splunk.ccp.cable.comcast.com/en-US/app/search/search?q=search%20ring%3A%3A*%20%20

Re: Dropped messages on random nodes.

2017-01-22 Thread Dikang Gu
Btw, the C* version is 2.2.5, with several backported patches. On Sun, Jan 22, 2017 at 10:36 PM, Dikang Gu wrote: > Hello there, > > We have a 100 nodes ish cluster, I find that there are dropped messages on > random nodes in the cluster, which caused error spikes and P99 latency

Dropped messages on random nodes.

2017-01-22 Thread Dikang Gu
Hello there, We have a cluster of roughly 100 nodes, and I find that there are dropped messages on random nodes in the cluster, which cause error spikes and P99 latency spikes as well. I tried to figure out the cause. I do not see any obvious bottleneck in the cluster, the C* nodes still have plenty of

Re: Flush activity and dropped messages

2016-08-26 Thread Patrick McFadin
; space. >> >> In 2.1, none of these concerns apply. >> >> >> On 24 August 2016 at 23:40, Vasileios Vlachos > > wrote: >> >>> Hello, >>> >>> >>> >>> >>> >>> We have an 8-node cluster sprea

Re: Flush activity and dropped messages

2016-08-26 Thread Vasileios Vlachos
>> >> >> >> >> >> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We >> run C* 2.0.17 on Ubuntu 12.04 at the moment. >> >> >> >> >> Our C# application often throws logs, which correlate with dropped >> mess

Re: Flush activity and dropped messages

2016-08-26 Thread Vasileios Vlachos
>> >> >> >> >> >> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We >> run C* 2.0.17 on Ubuntu 12.04 at the moment. >> >> >> >> >> Our C# application often throws logs, which correlate with dropped >> mes

Re: Flush activity and dropped messages

2016-08-25 Thread Patrick McFadin
; >> >> >> >> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We >> run C* 2.0.17 on Ubuntu 12.04 at the moment. >> >> >> >> >> Our C# application often throws logs, which correlate with dropped >> messages (counter

Re: Flush activity and dropped messages

2016-08-25 Thread Benedict Elliott Smith
llo, > > > > > > We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We run > C* 2.0.17 on Ubuntu 12.04 at the moment. > > > > > Our C# application often throws logs, which correlate with dropped > messages (counter mutations usually) in our lo

Flush activity and dropped messages

2016-08-24 Thread Vasileios Vlachos
Hello, We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We run C* 2.0.17 on Ubuntu 12.04 at the moment. Our C# application often throws logs, which correlate with dropped messages (counter mutations usually) in our logs. We think that if a specific mutation stays in the

Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Jeff Ferland
> On Nov 2, 2015, at 11:35 AM, Nate McCall wrote: > Forgive me, but what is CMS? > > Sorry - ConcurrentMarkSweep garbage collector. Ah, my brain was trying to think in terms of something Cassandra specific. I have full GC logging on and since moving to G1, I haven’t had any >500ms GC cycles

Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Nate McCall
> > > Forgive me, but what is CMS? > Sorry - ConcurrentMarkSweep garbage collector. > > No. I’ve tried some mitigations since tuning thread pool sizes and GC, but > the problem begins with only an upgrade of Cassandra. No other system > packages, kernels, etc. > > > From what 2.0 version did yo

Re: Cassandra stalls and dropped messages not due to GC

2015-11-02 Thread Jeff Ferland
Having caught a node in an undesirable state, many of my threads are reading like this: "SharedPool-Worker-5" #875 daemon prio=5 os_prio=0 tid=0x7f3e14196800 nid=0x96ce waiting on condition [0x7f3ddb835000] java.lang.Thread.State: WAITING (parking) at sun.misc.Unsafe.park(Nativ

Re: Cassandra stalls and dropped messages not due to GC

2015-10-30 Thread Nate McCall
Does tpstats show unusually high counts for blocked flush writers? As Sebastian suggests, running ttop will paint a clearer picture about what is happening within C*. I would however recommend going back to CMS in this case as that is the devil we all know and more folks will be able to offer advi

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Jeff Ferland
Upgraded from 2.0.x. Using the other commit log sync method and 10 seconds. Enabling batch mode is like swallowing a grenade. It’s starting to look to me like it’s possibly related to brief IO spikes that are smaller than my usual graphing granularity. It feels surprising to me that these would

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
Only if you actually change cassandra.yaml (that was the change in 2.1.6 which is why it matters what version he upgraded from) > On Oct 29, 2015, at 10:06 PM, Sebastian Estevez > wrote: > > The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6 and > Jeff's running 2.1.11.

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Sebastian Estevez
The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6 and Jeff's running 2.1.11. @Jeff How often does this happen? Can you watch ttop as soon as you notice increased read/write latencies? wget > https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar > java -

Re: Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Graham Sanderson
you didn’t say what you upgraded from, but if it is 2.0.x, then look at CASSANDRA-9504 If so and you use commitlog_sync: batch Then you probably want to set commitlog_sync_batch_window_in_ms: 1 (or 2) Note I’m only slightly convinced this is the cause because of your READ_REPAIR issues (though i

Cassandra stalls and dropped messages not due to GC

2015-10-29 Thread Jeff Ferland
Using DSE 4.8.1 / 2.1.11.872, Java version 1.8.0_66 We upgraded our cluster this weekend and have been having issues with dropped mutations since then. Intensely investigating a single node and toying with settings has revealed that GC stalls don’t make up enough time to explain the 10 seconds

2 hour bout of pending gossip, pending mutations, high CPU, high ParNew, dropped messages

2014-04-24 Thread Thunder Stumpges
ery couple hundred ms. CMS gen seemed OK at 4GB of 6GB and not much remaining after the par-new collection: 'Heap after GC invocations=151586 (full 137): par new generation total 1887488K, used 147K" ** Backed up Mutations in Mutation stage of TPStats and dropped messages: 201

Re: getting dropped messages in log even with no one running

2014-03-24 Thread Brian Tarbox
the other nodes went right back into the "dropping messages" state. Help please. Brian On Mon, Mar 24, 2014 at 10:01 AM, Brian Tarbox wrote: > I'm getting "messages dropped" messages in my cluster even when (like > right now) there are no clients running against t

getting dropped messages in log even with no one running

2014-03-24 Thread Brian Tarbox
I'm getting "messages dropped" messages in my cluster even when (like right now) there are no clients running against the cluster. 1) who could be generating the traffic if there are no clients? 2) is there a way to list active clients...on the off chance that there is a client I d

getting lots of dropped messages/requests/mutations but only on 2 of 6 servers

2014-03-20 Thread Brian Tarbox
I have a six node cluster (running m2-2xlarge instances in AWS) with RF=3 and I'm seeing two of the six nodes reporting lots of dropped messages. The six machines are identical (created from same AWS AMI) so this local behavior has me puzzled. BTW this is mostly happening when I'm r

Re: Mutation Dropped Messages

2012-03-06 Thread aaron morton
...@thelastpickle.com] > Sent: Monday, March 05, 2012 11:07 PM > To: user@cassandra.apache.org > Subject: Re: Mutation Dropped Messages > > I increased the size of the cluster also the concurrent_writes parameter. > Still there is a node which keeps on dropping the mutation m

RE: Mutation Dropped Messages

2012-03-06 Thread Tiwari, Dushyant
Subject: Re: Mutation Dropped Messages I increased the size of the cluster and also the concurrent_writes parameter. Still there is a node which keeps on dropping the mutation messages. Ensure all the nodes have the same spec, and the nodes have the same config. In a virtual environment consider moving the

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
; Thanks, > Dushyant > > From: aaron morton [mailto:aa...@thelastpickle.com] > Sent: Monday, March 05, 2012 4:15 PM > To: user@cassandra.apache.org > Subject: Re: Mutation Dropped Messages > > 1. Which parameters to tune in the config files? – Especially looking > f

RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
orton [mailto:aa...@thelastpickle.com] Sent: Monday, March 05, 2012 4:15 PM To: user@cassandra.apache.org Subject: Re: Mutation Dropped Messages 1. Which parameters to tune in the config files? - Especially looking for heavy writes The node is overloaded. It may be because there are not enough node

RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Thanks a lot for the concurrent_writes hint that really improves the throughput. Do you mean dropped messages and no timedoutexception will mean the data is written somewhere in the cluster and by taking corrective measures desired CL can be achieved? From: aaron morton [mailto:aa

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
s created by dropped messages are repaired via reads at high CL, HH (in 1.+), Read Repair or Anti Entropy. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote: > Hi All, > > While benchmark

Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hi All, While benchmarking Cassandra I found "Mutation Dropped" messages in the logs. Now I know this is a good old question. It would be really great if someone could provide a checklist to recover when such a thing happens. I am looking for answers to the following questi

Re: What causes dropped messages?

2011-08-16 Thread Jeremy Hanna
http://wiki.apache.org/cassandra/FAQ#dropped_messages As to what's causing them - look in the logs and it will do the equivalent of a nodetool tpstats right after the "dropped messages" messages. That should give you a clue as to why there are dropped messages - which thread pools are backe

What causes dropped messages?

2011-08-16 Thread David Boxenhorn
How can I tell what's causing dropped messages? Is it just too much activity? I'm not getting any other, more specific messages, just these: WARN [ScheduledTasks:1] 2011-08-15 11:33:26,136 MessagingService.java (line 504) Dropped 1534 MUTATION messages in the last 5000ms WARN [Schedu
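The WARN line quoted in this message has a regular shape, so it can be tallied per message type. Below is a minimal Python sketch that assumes every line follows the single 2011 sample shown above; the function names `parse_dropped` and `total_dropped` are hypothetical, and other Cassandra versions may log a different format. Elsewhere in these threads, `nodetool tpstats` and JMX are recommended over log parsing, so treat this only as a quick look at existing logs.

```python
import re
from collections import Counter

# Pattern inferred from the one sample line quoted in the message above
# (an assumption: other Cassandra versions may word this line differently).
DROPPED_RE = re.compile(r"Dropped (\d+) (\w+) messages in the last (\d+)ms")


def parse_dropped(line):
    """Return (message_type, count, window_ms), or None if the line does not match."""
    match = DROPPED_RE.search(line)
    if match is None:
        return None
    count, message_type, window_ms = match.groups()
    return message_type, int(count), int(window_ms)


def total_dropped(lines):
    """Sum the dropped counts per message type over an iterable of log lines."""
    totals = Counter()
    for line in lines:
        parsed = parse_dropped(line)
        if parsed is not None:
            message_type, count, _window_ms = parsed
            totals[message_type] += count
    return totals


sample = (
    "WARN [ScheduledTasks:1] 2011-08-15 11:33:26,136 MessagingService.java "
    "(line 504) Dropped 1534 MUTATION messages in the last 5000ms"
)
print(parse_dropped(sample))  # ('MUTATION', 1534, 5000)
```

Passing an open log file to `total_dropped` gives a per-type total, which shows whether drops are concentrated in one message type such as MUTATION.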

Re: Dropped messages

2011-08-07 Thread Philippe
Thanks Aaron. The first paragraph is very clear however the 2nd paragraph leaves me wondering regarding counter columns in my setup. I am writing at CL.ALL and reading at CL.ONE so if I get dropped messages, it will show up as Timeouts on the client side so possibly the mutation was not run on

Re: Dropped messages

2011-08-06 Thread aaron morton
Just added this to the wiki http://wiki.apache.org/cassandra/FAQ#dropped_messages Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 6 Aug 2011, at 10:53, Philippe wrote: > Hi, > I see lines like this in my log file > INFO [Schedu

Dropped messages

2011-08-05 Thread Philippe
Hi, I see lines like this in my log file INFO [ScheduledTasks:1] 2011-08-06 00:51:57,650 MessagingService.java (line 586) 358 MUTATION messages dropped in server lifetime INFO [ScheduledTasks:1] 2011-08-06 00:51:57,658 MessagingService.java (line 586) 297 READ messages dropped in server lifetime