Nope. And what really puzzles me is that the trace really shows the
difference between the queries. The fast queries only request a read from
one replica, while the slow queries request reads from multiple replicas
(and not only ones local to the DC).
On 07/08/2022 14:02, Stéphane Alleaume wrote:
Hi
Is there some GC which could affect the coordinator node?
Kind regards
Stéphane
On Sun, 7 Aug 2022 at 13:41, Raphael Mazelier <r...@futomaki.net> wrote:
Thanks for the answer, but I was well aware of this. I use LocalOne
as the consistency level.
My client connects to a local seed, then chooses a local
coordinator (as far as I can understand the trace log).
Then for a batch of requests I get approximately 98% of requests
handled in 2-3 ms in the local DC with one read request, and 2% handled
by many nodes (according to the trace), which takes way longer (250 ms).
?
On 06/08/2022 14:30, Bowen Song via user wrote:
See the diagram below. Your problem almost certainly arises from
step 4, in which an incorrect consistency level set by the client
caused the coordinator node to send the READ command to nodes in
other DCs.
The load balancing policy only affects steps 2 and 3, not steps 1 or 4.
You should change the consistency level to
LOCAL_ONE/LOCAL_QUORUM/etc. to fix the problem.
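For example, with the Python driver the fix looks roughly like this (just
a sketch, assuming driver 3.x with execution profiles; the DC name and
contact points below are placeholders):

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    # Keep the DC-aware/token-aware load balancing (steps 2 and 3), and set a
    # LOCAL_* consistency level so the coordinator only reads from replicas
    # in its own DC (step 4).
    profile = ExecutionProfile(
        load_balancing_policy=TokenAwarePolicy(
            DCAwareRoundRobinPolicy(local_dc='eu-west-1')),
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    cluster = Cluster(contact_points=['10.0.0.1', '10.0.0.2'],
                      execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    session = cluster.connect()

The counterpart in gocql is setting the cluster's Consistency to a LOCAL_*
level (e.g. gocql.LocalOne) alongside the host selection policy.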
On 05/08/2022 22:54, Bowen Song wrote:
The DCAwareRoundRobinPolicy/TokenAwareHostPolicy controls which
Cassandra coordinator node the client sends queries to, not the
nodes it connects to, nor the nodes that perform the actual read.
A client sends a CQL read query to a coordinator node, and the
coordinator node parses the CQL query and sends READ requests to
other nodes in the cluster based on the consistency level.
Have you checked the consistency level of the session (and the
query if applicable)? Is it prefixed with "LOCAL_"? If not, the
coordinator will send the READ requests to non-local DCs.
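For instance, with the Python driver you can check or override it per
statement like this (a quick sketch; the table name, id and the existing
"session" object are assumptions):

    from cassandra import ConsistencyLevel
    from cassandra.query import SimpleStatement

    # Anything not prefixed with LOCAL_ (ONE, QUORUM, ...) allows the
    # coordinator to contact replicas in remote DCs.
    stmt = SimpleStatement(
        "SELECT * FROM my_ks.my_table WHERE id = %s",
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    print(ConsistencyLevel.value_to_name[stmt.consistency_level])  # LOCAL_ONE
    rows = session.execute(stmt, (some_id,))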
On 05/08/2022 19:40, Raphael Mazelier wrote:
Hi Cassandra Users,
I'm relatively new to Cassandra, and first I have to say I'm
really impressed by the technology.
Good design, and a lot of material to understand the internals
(the O'Reilly book helps a lot, as well as The Last Pickle blog posts).
I have a multi-datacenter C* cluster (US, Europe, Singapore)
with eight nodes in each (two seeds per region), two racks
in EU and Singapore, three in US. Everything is deployed in AWS.
We have a keyspace configured with NetworkTopologyStrategy and two
replicas in every region, like this: {'class':
'NetworkTopologyStrategy', 'ap-southeast-1': '2', 'eu-west-1':
'2', 'us-east-1': '2'}
Investigating some performance issues, I noticed strange things
in my experiments.
What we expect is very low latency, 3-5 ms max, for this specific
select query. So we want every read to be local to each
datacenter.
We configure DCAwareRoundRobinPolicy(local_dc=DC) in Python,
and the same in Go:
gocql.TokenAwareHostPolicy(gocql.DCAwareRoundRobinPolicy("DC"))
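On the Python side the setup roughly looks like this (a simplified sketch;
the contact points and DC name are placeholders):

    from cassandra.cluster import Cluster
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    # Token-aware routing on top of a DC-aware policy pinned to the local DC,
    # mirroring the Go setup above.
    cluster = Cluster(
        contact_points=['10.0.0.1', '10.0.0.2'],   # local seeds
        load_balancing_policy=TokenAwarePolicy(
            DCAwareRoundRobinPolicy(local_dc='eu-west-1')),
    )
    session = cluster.connect()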
Testing a bit with two short programs (I can provide them) in Go
and Python, I noticed very strange results. Basically I run the
same query over and over with a very limited set of ids.
The first results were surprising, because the very first queries
were always more than 250 ms, and after stressing C* a bit
(playing with the sleep between queries) I can achieve a good ratio
of queries at 3-4 ms (what I expected).
My guess was that the long queries were somehow not executed locally
(or at least involved multi-datacenter queries) while the short ones were.
Activating tracing in my programs (like enabling TRACING in cqlsh)
kind of confirms my suspicion.
(I will provide traces as an attachment.)
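In the Python program the tracing is done roughly like this (a sketch; the
query, table and id are placeholders):

    # Ask the coordinator to record a trace for this query, then fetch it.
    rs = session.execute("SELECT * FROM my_ks.my_table WHERE id = %s",
                         (some_id,), trace=True)
    trace = rs.get_query_trace()
    print("coordinator:", trace.coordinator, "duration:", trace.duration)
    for event in trace.events:
        # Each event shows which node did what, and after how many
        # microseconds.
        print(event.source, event.source_elapsed, event.description)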
My question is: why does C* sometimes try to read non-locally? How
can we disable that? What is the criterion for this?
(BTW, I'm really not a fan of this multi-region design, for these
very specific kinds of issues...)
Also a side question: why is C* so slow to connect? It's like
it's trying to reach every node in each DC (we only provide
local seeds, however). Sometimes it takes more than 20 s...
Any help appreciated.
Best,
--
Raphael Mazelier