Hello,

Not so long ago, without any change on our side, we started observing an
increased number of dropped read messages, and we can't find an explanation
for it; perhaps you'll have some ideas about where to look to decipher this.

C* cluster: 7 DCs (5 of them with 18 nodes and 2 with 60)
C* ver
ves wrote:
>>> You can get dropped message statistics over JMX. For example, nodetool
>>> tpstats has a counter for dropped hints from startup. That would be the
>>> preferred method for tracking this info, rather than parsing logs.
>>>
>>> On 2 Nov. 2017 6:24 am, "Anumod Mullachery" wrote:
>>
>> Hi All,
>>
>> In cassandra v 2.1.15, I'm able to pull the hints drop and dropped
>> messages from cassandra.log as below-
>>
>> dropped hints -->
>> "/opt/xcal/apps/cassandra/logs/cassandra.log
>> <https://splunk.ccp.cable.comcast.com/en-US/app/search/search?q=search%20ring%3A%3A*%20%20%20NOT%20ring%3A%3AXHOMEcls_P
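Following the tpstats suggestion above, here is a minimal sketch of pulling the dropped-message counters from `nodetool tpstats` output instead of grepping cassandra.log. The table format is assumed from 2.x-era tpstats output (a trailing "Message type / Dropped" section); treat it as illustrative, not a drop-in tool:

```python
import re

def dropped_counts(tpstats_output: str) -> dict:
    """Parse the trailing 'Message type / Dropped' table of
    `nodetool tpstats` output into {message_type: dropped_count}."""
    counts = {}
    in_dropped = False
    for line in tpstats_output.splitlines():
        # The dropped-message table starts after the 'Message type' header.
        if line.strip().startswith("Message type"):
            in_dropped = True
            continue
        if in_dropped:
            m = re.match(r"\s*([A-Z_]+)\s+(\d+)\s*$", line)
            if m:
                counts[m.group(1)] = int(m.group(2))
    return counts

# Example against a captured tpstats tail (format assumed):
sample = """Message type           Dropped
RANGE_SLICE                  0
MUTATION                  1534
READ                         0
"""
print(dropped_counts(sample))
```

Since these counters are cumulative from startup, polling periodically and diffing consecutive snapshots gives a drop rate per message type.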
Btw, the C* version is 2.2.5, with several backported patches.
On Sun, Jan 22, 2017 at 10:36 PM, Dikang Gu wrote:
> Hello there,
>
> We have a cluster of roughly 100 nodes, and I find that there are dropped
> messages on random nodes in the cluster, which caused error spikes and P99
> latency spikes as well.
>
> I tried to figure out the cause. I do not see any obvious bottleneck in
> the cluster; the C* nodes still have plenty of space.
>>
>> In 2.1, none of these concerns apply.
>>
>> On 24 August 2016 at 23:40, Vasileios Vlachos wrote:
>>
>>> Hello,
>>>
>>> We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We
>>> run C* 2.0.17 on Ubuntu 12.04 at the moment.
Hello,

We have an 8-node cluster spread out in 2 DCs, 4 nodes in each one. We run
C* 2.0.17 on Ubuntu 12.04 at the moment.

Our C# application often logs errors which correlate with dropped messages
(counter mutations, usually) in our logs. We think that if a specific
mutation stays in the
> On Nov 2, 2015, at 11:35 AM, Nate McCall wrote:
>> Forgive me, but what is CMS?
>
> Sorry - ConcurrentMarkSweep garbage collector.

Ah, my brain was trying to think in terms of something Cassandra-specific. I
have full GC logging on and, since moving to G1, I haven’t had any >500ms GC
cycles.
> Forgive me, but what is CMS?

Sorry - ConcurrentMarkSweep garbage collector.
> No. I’ve tried some mitigations since, tuning thread pool sizes and GC, but
> the problem begins with only an upgrade of Cassandra. No other system
> packages, kernels, etc.

From what 2.0 version did you upgrade?
Having caught a node in an undesirable state, many of my threads read like
this:

"SharedPool-Worker-5" #875 daemon prio=5 os_prio=0 tid=0x7f3e14196800
nid=0x96ce waiting on condition [0x7f3ddb835000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
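When many workers look like the dump above, a quick histogram of thread states across the whole jstack output makes the pattern obvious. A small sketch (the dump format is standard HotSpot jstack output; the helper name is my own):

```python
import re
from collections import Counter

def thread_state_histogram(jstack_dump: str) -> Counter:
    """Count java.lang.Thread.State lines in a jstack dump, so a node
    stuck with many parked SharedPool workers stands out at a glance."""
    states = re.findall(
        r"java\.lang\.Thread\.State: (\S+(?: \(\w+\))?)", jstack_dump)
    return Counter(states)

# Example on a tiny captured dump:
sample = '''"SharedPool-Worker-5" #875 daemon prio=5 os_prio=0
   java.lang.Thread.State: WAITING (parking)
"SharedPool-Worker-6" #876 daemon prio=5 os_prio=0
   java.lang.Thread.State: WAITING (parking)
"CompactionExecutor:1" #42 daemon prio=1
   java.lang.Thread.State: RUNNABLE
'''
print(thread_state_histogram(sample))
```

A healthy node usually shows a mix of RUNNABLE and WAITING; a pileup of identical `WAITING (parking)` SharedPool workers points at a starved or blocked stage.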
Does tpstats show unusually high counts for blocked flush writers?

As Sebastian suggests, running ttop will paint a clearer picture of what
is happening within C*. I would, however, recommend going back to CMS in
this case, as that is the devil we all know and more folks will be able to
offer advice.
Upgraded from 2.0.x. Using the other commitlog sync method (periodic) and 10
seconds. Enabling batch mode is like swallowing a grenade.

It’s starting to look to me like it’s possibly related to brief IO spikes
that are smaller than my usual graphing granularity. It feels surprising to
me that these would
Only if you actually change cassandra.yaml (that was the change in 2.1.6,
which is why it matters what version he upgraded from).

> On Oct 29, 2015, at 10:06 PM, Sebastian Estevez wrote:
>
> The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6 and
> Jeff's running 2.1.11.
The thing about the CASSANDRA-9504 theory is that it was solved in 2.1.6
and Jeff's running 2.1.11.
@Jeff
How often does this happen? Can you watch ttop as soon as you notice
increased read/write latencies?
wget https://bintray.com/artifact/download/aragozin/generic/sjk-plus-0.3.6.jar
java -
You didn’t say what you upgraded from, but if it is 2.0.x, then look at
CASSANDRA-9504.
If so and you use
commitlog_sync: batch
Then you probably want to set
commitlog_sync_batch_window_in_ms: 1 (or 2)
Note I’m only slightly convinced this is the cause because of your READ_REPAIR
issues (though i
Using DSE 4.8.1 / 2.1.11.872, Java version 1.8.0_66
We upgraded our cluster this weekend and have been having issues with dropped
mutations since then. Intensely investigating a single node and toying with
settings has revealed that GC stalls don’t make up enough time to explain the
10 seconds
every couple hundred ms. CMS gen seemed OK at 4GB of 6GB and not much
remaining after the par-new collection:

Heap after GC invocations=151586 (full 137):
 par new generation   total 1887488K, used 147K

** Backed up mutations in the Mutation stage of tpstats and dropped
messages: 201
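Headers like the "Heap after GC invocations=…" line above can be extracted mechanically when sifting large GC logs. A small sketch, assuming HotSpot `-XX:+PrintHeapAtGC`-style output (the helper name is my own):

```python
import re

def gc_invocations(line: str):
    """Pull (total_invocations, full_gcs) out of a 'Heap after GC' header
    line from HotSpot -XX:+PrintHeapAtGC output; None if not a header."""
    m = re.search(r"Heap after GC invocations=(\d+) \(full (\d+)\)", line)
    return (int(m.group(1)), int(m.group(2))) if m else None

print(gc_invocations("Heap after GC invocations=151586 (full 137):"))
```

Diffing the full-GC count between two log snapshots tells you quickly whether full collections, rather than par-new pauses, line up with the dropped-mutation windows.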
the other nodes went right back into the "dropping messages" state.

Help please.

Brian

On Mon, Mar 24, 2014 at 10:01 AM, Brian Tarbox wrote:
> I'm getting "messages dropped" messages in my cluster even when (like
> right now) there are no clients running against the cluster.
I'm getting "messages dropped" messages in my cluster even when (like right
now) there are no clients running against the cluster.
1) who could be generating the traffic if there are no clients?
2) is there a way to list active clients...on the off chance that there is
a client I d
I have a six-node cluster (running m2.2xlarge instances in AWS) with RF=3,
and I'm seeing two of the six nodes reporting lots of dropped messages.
The six machines are identical (created from the same AWS AMI), so this
localized behavior has me puzzled.

BTW, this is mostly happening when I'm r
...@thelastpickle.com]
> Sent: Monday, March 05, 2012 11:07 PM
> To: user@cassandra.apache.org
> Subject: Re: Mutation Dropped Messages
>
> I increased the size of the cluster, and also the concurrent_writes
> parameter. Still there is a node which keeps on dropping the mutation
> messages.

Ensure all the nodes have the same spec, and the nodes have the same config.
In a virtual environment, consider moving the
> Thanks,
> Dushyant
>
> From: aaron morton [mailto:aa...@thelastpickle.com]
> Sent: Monday, March 05, 2012 4:15 PM
> To: user@cassandra.apache.org
> Subject: Re: Mutation Dropped Messages
>
> 1. Which parameters to tune in the config files? - Especially looking for
> heavy writes

The node is overloaded. It may be because there are not enough nodes
Thanks a lot for the concurrent_writes hint - that really improves the
throughput. Do you mean that dropped messages with no TimedOutException
mean the data is written somewhere in the cluster, and that by taking
corrective measures the desired CL can be achieved?

From: aaron morton [mailto:aa
Inconsistencies created by dropped messages are repaired via reads at high
CL, HH (in 1.+), Read Repair or Anti Entropy.

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 5/03/2012, at 11:32 PM, Tiwari, Dushyant wrote:

> Hi All,
>
> While benchmarking Cassandra I found "Mutation Dropped" messages in the
> logs. Now I know this is a good old question. It would be really great if
> someone could provide a checklist to recover when such a thing happens. I
> am looking for answers to the following questions
http://wiki.apache.org/cassandra/FAQ#dropped_messages

As to what's causing them - look in the logs, and Cassandra will log the
equivalent of a nodetool tpstats right after the dropped-messages warnings.
That should give you a clue as to why there are dropped messages - which
thread pools are backed up.
How can I tell what's causing dropped messages?
Is it just too much activity? I'm not getting any other, more specific
messages, just these:
WARN [ScheduledTasks:1] 2011-08-15 11:33:26,136 MessagingService.java (line
504) Dropped 1534 MUTATION messages in the last 5000ms
WARN [Schedu
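WARN lines like the one quoted above are easy to turn into a drop rate for graphing. A minimal sketch, matching the MessagingService line format shown (the helper name is my own):

```python
import re

LOG_RE = re.compile(r"Dropped (\d+) (\w+) messages in the last (\d+)ms")

def parse_drop_line(line: str):
    """Extract (count, message_type, window_ms) from a MessagingService
    'Dropped ... messages' WARN line; None if the line doesn't match."""
    m = LOG_RE.search(line)
    if not m:
        return None
    return int(m.group(1)), m.group(2), int(m.group(3))

line = ("WARN [ScheduledTasks:1] 2011-08-15 11:33:26,136 MessagingService.java "
        "(line 504) Dropped 1534 MUTATION messages in the last 5000ms")
count, msg_type, window_ms = parse_drop_line(line)
print(f"{msg_type}: {count / (window_ms / 1000):.0f} drops/sec")
```

Plotting this per message type over time shows whether drops are a constant trickle or align with compaction, repair, or GC events.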
Thanks Aaron.

The first paragraph is very clear; however, the second paragraph leaves me
wondering about counter columns in my setup.

I am writing at CL.ALL and reading at CL.ONE, so if I get dropped messages
it will show up as timeouts on the client side, so possibly the mutation was
not run on
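The CL.ALL-write / CL.ONE-read reasoning above is an instance of the standard replica-overlap rule: reads are guaranteed to see the latest acknowledged write when R + W > RF. A tiny sketch of that arithmetic (RF=3 chosen purely for illustration):

```python
def quorum_overlap(rf: int, write_cl: int, read_cl: int) -> bool:
    """True when every read replica set must intersect every acknowledged
    write replica set, i.e. R + W > RF."""
    return read_cl + write_cl > rf

# CL.ALL writes (W=3) + CL.ONE reads (R=1) against RF=3: overlap holds,
# but a single dropped mutation fails the whole write (client timeout).
print(quorum_overlap(3, 3, 1))
# CL.ONE writes + CL.ONE reads against RF=3: no overlap guarantee.
print(quorum_overlap(3, 1, 1))
```

This is why a dropped mutation under CL.ALL surfaces as a client-side timeout rather than silent inconsistency: the write was never acknowledged at the requested level.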
Just added this to the wiki
http://wiki.apache.org/cassandra/FAQ#dropped_messages
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 6 Aug 2011, at 10:53, Philippe wrote:
> Hi,
> I see lines like this in my log file
> INFO [Schedu
Hi,
I see lines like this in my log file
INFO [ScheduledTasks:1] 2011-08-06 00:51:57,650 MessagingService.java (line
586) 358 MUTATION messages dropped in server lifetime
INFO [ScheduledTasks:1] 2011-08-06 00:51:57,658 MessagingService.java (line
586) 297 READ messages dropped in server lifetime