You're right, this option is not new, sorry. Could this option still be useful?
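For reference, tracing can also be requested per query from the client side, without touching the cluster-wide trace probability. Below is a minimal sketch using the DataStax Python driver (the driver already used in this thread); the contact point, keyspace, table, and id are placeholders, not values from this cluster:

    from cassandra.cluster import Cluster

    cluster = Cluster(['127.0.0.1'])          # placeholder: a local seed
    session = cluster.connect('my_keyspace')  # placeholder keyspace

    # trace=True records a trace for this query only; the server-side
    # "nodetool settraceprobability" setting is left untouched.
    rows = session.execute("SELECT * FROM my_table WHERE id = 42", trace=True)

    trace = rows.get_query_trace()
    for event in trace.events:
        # event.source is the node that emitted the event; a remote-DC IP
        # here means the read left the local DC.
        print(event.source, event.source_elapsed, event.description)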
On Sun, Aug 7, 2022, at 22:18, Bowen Song via user <user@cassandra.apache.org> wrote:

> Do you mean "nodetool settraceprobability"? This is not exactly new; I
> remember it was available in Cassandra 2.x.
>
> On 07/08/2022 20:43, Stéphane Alleaume wrote:
>
> I think perhaps you already know, but I read that you can now trace only
> a % of all queries. I will try to find the name of this functionality
> (in a newer Cassandra release).
>
> Hope it will help.
> Kind regards,
> Stéphane
>
> On Sun, Aug 7, 2022, at 20:26, Raphael Mazelier <r...@futomaki.net> wrote:
>
>> > "Read repair is in the blocking read path for the query, yep"
>>
>> OK, interesting. This is not what I understood from the documentation.
>> And I use the LOCAL_ONE consistency level.
>>
>> I enabled tracing (see the attachment of my first msg), but I didn't
>> see read repair in the trace (and btw I tried to completely disable it
>> on my table by setting both read_repair_chance and
>> dclocal_read_repair_chance to 0).
>>
>> The problem when enabling tracing in cqlsh is that I only get slow
>> results. To see fast answers I need to iterate faster on my queries.
>>
>> I can provide traces again for analysis. I got something more readable
>> in Python.
>>
>> Best,
>>
>> --
>>
>> Raphael
>>
>> On 07/08/2022 19:30, C. Scott Andreas wrote:
>>
>> > but still, as I understand the documentation, the read repair should
>> > not be in the blocking path of a query?
>>
>> Read repair is in the blocking read path for the query, yep. At quorum
>> consistency levels, the read repair must complete before returning a
>> result to the client, to ensure the data returned would be visible on
>> subsequent reads that address the remainder of the quorum.
>>
>> If you enable tracing - either for a single CQL statement that is
>> expected to be slow, or probabilistically from the server side to catch
>> a slow query in the act - that will help identify what's happening.
>>
>> - Scott
>>
>> On Aug 7, 2022, at 10:25 AM, Raphael Mazelier <r...@futomaki.net> wrote:
>>
>> Nope. And what really puzzles me is that the traces clearly show the
>> difference between queries. The fast queries only request a read from
>> one replica, while the slow queries request reads from multiple
>> replicas (and not only ones local to the DC).
>>
>> On 07/08/2022 14:02, Stéphane Alleaume wrote:
>>
>> Hi,
>>
>> Is there some GC which could affect the coordinator node?
>>
>> Kind regards,
>> Stéphane
>>
>> On Sun, Aug 7, 2022, at 13:41, Raphael Mazelier <r...@futomaki.net> wrote:
>>
>>> Thanks for the answer, but I was well aware of this. I use LOCAL_ONE
>>> as the consistency level.
>>>
>>> My client connects to a local seed, then chooses a local coordinator
>>> (as far as I can understand from the trace log).
>>>
>>> Then, for a batch of requests, I get approximately 98% of requests
>>> handled in 2-3 ms in the local DC with one read request, and 2%
>>> handled by many nodes (according to the trace) and therefore way
>>> longer (250 ms).
>>>
>>> ?
>>>
>>> On 06/08/2022 14:30, Bowen Song via user wrote:
>>>
>>> See the diagram below. Your problem almost certainly arises from step
>>> 4, in which an incorrect consistency level set by the client caused
>>> the coordinator node to send the READ command to nodes in other DCs.
>>>
>>> The load balancing policy only affects steps 2 and 3, not steps 1 or 4.
>>>
>>> You should change the consistency level to LOCAL_ONE/LOCAL_QUORUM/etc.
>>> to fix the problem.
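A minimal sketch of what Bowen suggests here, using the DataStax Python driver: pin both the load balancing policy and the default consistency level to the local DC. The contact point is a placeholder; 'eu-west-1' is one of the DC names from the keyspace described later in the thread:

    from cassandra import ConsistencyLevel
    from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
    from cassandra.policies import DCAwareRoundRobinPolicy, TokenAwarePolicy

    profile = ExecutionProfile(
        # steps 2/3: pick a token-aware coordinator in the local DC
        load_balancing_policy=TokenAwarePolicy(
            DCAwareRoundRobinPolicy(local_dc='eu-west-1')),
        # step 4: keep the coordinator's replica reads inside the local DC
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )
    cluster = Cluster(['10.0.0.1'],  # placeholder: a local seed
                      execution_profiles={EXEC_PROFILE_DEFAULT: profile})
    session = cluster.connect()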
>>> On 05/08/2022 22:54, Bowen Song wrote:
>>>
>>> The DCAwareRoundRobinPolicy/TokenAwareHostPolicy controls which
>>> Cassandra coordinator node the client sends queries to - not the nodes
>>> it connects to, nor the nodes that perform the actual read.
>>>
>>> A client sends a CQL read query to a coordinator node; the coordinator
>>> node parses the CQL query and sends READ requests to other nodes in
>>> the cluster based on the consistency level.
>>>
>>> Have you checked the consistency level of the session (and of the
>>> query, if applicable)? Is it prefixed with "LOCAL_"? If not, the
>>> coordinator will send the READ requests to non-local DCs.
>>>
>>> On 05/08/2022 19:40, Raphael Mazelier wrote:
>>>
>>> Hi Cassandra Users,
>>>
>>> I'm relatively new to Cassandra, and first I have to say I'm really
>>> impressed by the technology.
>>>
>>> Good design, and a lot of material to understand the internals (the
>>> O'Reilly book helps a lot, as do The Last Pickle blog posts).
>>>
>>> I have a multi-datacenter C* cluster (US, Europe, Singapore) with
>>> eight nodes in each region (two seeds in each), two racks in EU and
>>> Singapore, and three in the US. Everything is deployed in AWS.
>>>
>>> We have a keyspace configured with the network topology strategy and
>>> two replicas in every region, like this:
>>> {'class': 'NetworkTopologyStrategy', 'ap-southeast-1': '2',
>>> 'eu-west-1': '2', 'us-east-1': '2'}
>>>
>>> Investigating a performance issue, I noticed strange things in my
>>> experiments.
>>>
>>> What we expect is very low latency, 3-5 ms max, for this specific
>>> SELECT query. So we want every read to be local to each datacenter.
>>>
>>> We configure DCAwareRoundRobinPolicy(local_dc=DC) in Python, and the
>>> same in Go:
>>> gocql.TokenAwareHostPolicy(gocql.DCAwareRoundRobinPolicy("DC")).
>>>
>>> Testing a bit with two short programs (I can provide them) in Go and
>>> Python, I noticed very strange results. Basically, I run the same
>>> query over and over against a very limited set of ids.
>>>
>>> The first results were surprising, because the very first queries
>>> always took more than 250 ms, and after stressing C* a bit (playing
>>> with the sleep between queries) I can achieve a good ratio of queries
>>> at 3-4 ms (what I expected).
>>>
>>> My guess was that the long queries were somehow executed non-locally
>>> (or at least involved multi-datacenter requests) and the short ones
>>> were not.
>>>
>>> Activating tracing in my programs (like enabling tracing in cqlsh)
>>> kind of confirms my suspicion.
>>>
>>> (I will provide traces in attachment.)
>>>
>>> My questions: why does C* sometimes try to read non-locally? How can
>>> we disable this? What is the criterion for it?
>>>
>>> (Btw, I'm really not a fan of this multi-region design, for these very
>>> specific kinds of issues...)
>>>
>>> Also, a side question: why is C* so slow to connect? It's as if it's
>>> trying to reach every node in each DC (we only provide local seeds,
>>> however). Sometimes it takes more than 20 seconds...
>>>
>>> Any help appreciated.
>>>
>>> Best,
>>>
>>> --
>>>
>>> Raphael Mazelier
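For completeness, a sketch of the kind of test loop Raphael describes (run the same query over and over, flag the slow outliers, and print their traces to see which nodes served them), assuming a `session` built as in the earlier sketch; the table name, id set, and 50 ms threshold are made up for illustration:

    import time

    from cassandra import ConsistencyLevel
    from cassandra.query import SimpleStatement

    stmt = SimpleStatement(
        "SELECT * FROM my_table WHERE id = %s",
        consistency_level=ConsistencyLevel.LOCAL_ONE,
    )

    for key in [1, 2, 3] * 100:                # small, fixed set of ids
        start = time.perf_counter()
        rows = session.execute(stmt, (key,), trace=True)
        elapsed_ms = (time.perf_counter() - start) * 1000

        if elapsed_ms > 50:                    # arbitrary "slow" threshold
            trace = rows.get_query_trace()
            print(f"slow query ({elapsed_ms:.1f} ms) for id={key}:")
            for event in trace.events:
                # a remote-DC IP here means the read left the local DC
                print(f"  {event.source} {event.source_elapsed} {event.description}")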