Well, answering to myself: this is not related to read_repair_chance. Setting both parameters to 0 changes nothing either. So the question remains: why does C* from time to time want to make multiple reads on a non-local DC?
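To narrow it down I am now logging which host (and which DC) the driver sends each query to. A rough sketch with gocql's QueryObserver (seed addresses, DC name and query below are placeholders; this only shows the coordinator the driver picks, while the server-side trace still shows what that coordinator does afterwards):

package main

import (
    "context"
    "log"

    "github.com/gocql/gocql"
)

// dcLogger prints the host each query was sent to, its datacenter and the
// query latency, which makes it easy to spot queries leaving the local DC.
type dcLogger struct{}

func (dcLogger) ObserveQuery(_ context.Context, q gocql.ObservedQuery) {
    if q.Host != nil {
        log.Printf("%s -> %s (dc=%s) in %s",
            q.Statement, q.Host.ConnectAddress(), q.Host.DataCenter(), q.End.Sub(q.Start))
    }
}

func main() {
    // Placeholder seeds and DC name; adjust to the real cluster.
    cluster := gocql.NewCluster("10.0.1.10", "10.0.1.11")
    cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
        gocql.DCAwareRoundRobinPolicy("eu-west-1"))
    cluster.QueryObserver = dcLogger{}

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    // Placeholder query, just to have something observed.
    var version string
    if err := session.Query(`SELECT release_version FROM system.local`).Scan(&version); err != nil {
        log.Fatal(err)
    }
}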
On 06/08/2022 12:31, Raphael Mazelier wrote:
Well, I tried it (I already have a whiteListFilter); it changed nothing, but it's more convenient than using a whiteListFilter (it speeds up the connection time).
So from time to time (depending on the frequency of my requests) I still get slow requests, and I notice in the trace that C* tries to read on another DC.
Btw it's not limited to gocql (I get pretty much the same results in Python).
I wonder if it's related to the read_repair_chance and dclocal_read_repair_chance parameters. But as I understand the documentation, read repair should not be in the blocking path of a query?
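To double-check what the table actually has, here is a small sketch that reads the two settings back (assuming a 3.x cluster where these columns still exist in system_schema.tables; seed, keyspace and table names are placeholders):

package main

import (
    "fmt"
    "log"

    "github.com/gocql/gocql"
)

func main() {
    // Placeholder seed, keyspace and table names.
    cluster := gocql.NewCluster("10.0.1.10")
    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    var rrc, dclocalRRC float64
    err = session.Query(
        `SELECT read_repair_chance, dclocal_read_repair_chance
           FROM system_schema.tables
          WHERE keyspace_name = ? AND table_name = ?`,
        "my_keyspace", "my_table").Scan(&rrc, &dclocalRRC)
    if err != nil {
        log.Fatal(err)
    }
    fmt.Printf("read_repair_chance=%f dclocal_read_repair_chance=%f\n", rrc, dclocalRRC)
}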
--
Raphael Mazelier
On 05/08/2022 23:13, Jim Shaw wrote:
I remember gocql.DataCentreHostFilter was used. Try adding it to see whether reads stay in the local DC only in your case.
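Something like this (a rough sketch from memory; seed addresses and DC name are placeholders):

package main

import (
    "log"

    "github.com/gocql/gocql"
)

func main() {
    // Placeholder seeds and DC name.
    cluster := gocql.NewCluster("10.0.1.10", "10.0.1.11")
    // With this filter the driver only connects to hosts in the given DC.
    cluster.HostFilter = gocql.DataCentreHostFilter("eu-west-1")
    cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
        gocql.DCAwareRoundRobinPolicy("eu-west-1"))

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()
}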
Thanks,
James
On Fri, Aug 5, 2022 at 2:40 PM Raphael Mazelier <r...@futomaki.net>
wrote:
Hi Cassandra Users,
I'm relatively new to Cassandra, and first I have to say I'm really impressed by the technology. Good design, and a lot of material to help understand the internals (the O'Reilly book helps a lot, as do the thelastpickle blog posts).
I have a multi-datacenter C* cluster (US, Europe, Singapore) with eight nodes in each region (two seeds per region), two racks in EU and Singapore, three in the US. Everything is deployed in AWS.
We have a keyspace configured with NetworkTopologyStrategy and two replicas in every region, like this: {'class': 'NetworkTopologyStrategy', 'ap-southeast-1': '2', 'eu-west-1': '2', 'us-east-1': '2'}
Investigating some performance issues, I noticed strange things in my experiments:
What we expect is very low latency, 3-5 ms max, for this specific select query. So we want every read to be local to its datacenter.
We configure DCAwareRoundRobinPolicy(local_dc=DC) in Python, and the same in Go:
gocql.TokenAwareHostPolicy(gocql.DCAwareRoundRobinPolicy("DC"))
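For reference, the Go setup looks roughly like this (placeholder seeds and keyspace; the consistency level shown is the LOCAL_* one we aim for, since gocql's default is plain QUORUM):

package main

import (
    "log"

    "github.com/gocql/gocql"
)

func main() {
    // Placeholder seeds; in reality each region's program gets its local seeds.
    cluster := gocql.NewCluster("10.0.1.10", "10.0.1.11")
    cluster.Keyspace = "my_keyspace" // placeholder
    cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
        gocql.DCAwareRoundRobinPolicy("eu-west-1"))
    // gocql defaults to QUORUM; with RF 2+2+2 that waits on remote replicas,
    // while a LOCAL_* level keeps the coordinator waiting only on its own DC.
    cluster.Consistency = gocql.LocalQuorum

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()
}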
Testing a bit with two short programs (I can provide them) in Go and Python, I noticed very strange results. Basically I run the same query over and over with a very limited set of ids.
The first results were surprising: the very first queries were always more than 250 ms, and then, by stressing C* a bit (playing with the sleep between queries), I can achieve a good ratio of queries at 3-4 ms (what I expected).
My guess was that the long queries were somehow executed non-locally (or at least involve multi-datacenter reads), and the short ones were not. Activating tracing in my programs (like enabling tracing in cqlsh) kind of confirms my suspicion. (I will provide traces as an attachment.)
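The tracing in Go is roughly this, using gocql's built-in trace writer (seed, DC name and query are placeholders):

package main

import (
    "log"
    "os"

    "github.com/gocql/gocql"
)

func main() {
    // Placeholder seed and DC name.
    cluster := gocql.NewCluster("10.0.1.10")
    cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
        gocql.DCAwareRoundRobinPolicy("eu-west-1"))

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()

    // Dumps the server-side trace events (what cqlsh TRACING ON shows)
    // to stdout for every traced query.
    tracer := gocql.NewTraceWriter(session, os.Stdout)

    // Placeholder query; the real test runs the select on the real table.
    var version string
    if err := session.Query(`SELECT release_version FROM system.local`).
        Trace(tracer).Scan(&version); err != nil {
        log.Fatal(err)
    }
}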
My question is: why does C* sometimes try to read non-locally? How can we disable that? What are the criteria for it?
(Btw, I'm really not a fan of this multi-region design, precisely because of this kind of issue...)
Also a side question: why is C* so slow to connect? It's as if it's trying to reach every node in each DC (we only provide local seeds, however). Sometimes it takes more than 20s...
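A sketch of the gocql knobs I am experimenting with on that front (placeholder seeds; note that DisableInitialHostLookup is a trade-off, since the driver then only uses the hosts it is given):

package main

import (
    "log"
    "time"

    "github.com/gocql/gocql"
)

func main() {
    // Placeholder local seeds only.
    cluster := gocql.NewCluster("10.0.1.10", "10.0.1.11")
    cluster.PoolConfig.HostSelectionPolicy = gocql.TokenAwareHostPolicy(
        gocql.DCAwareRoundRobinPolicy("eu-west-1"))

    // Fail fast instead of waiting on unreachable or remote nodes.
    cluster.ConnectTimeout = 2 * time.Second
    cluster.Timeout = 2 * time.Second
    // Skip the initial ring discovery: the driver will stick to the hosts
    // listed above instead of trying to learn (and dial) the whole cluster.
    cluster.DisableInitialHostLookup = true

    session, err := cluster.CreateSession()
    if err != nil {
        log.Fatal(err)
    }
    defer session.Close()
}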
Any help appreciated.
Best,
--
Raphael Mazelier