Mutation dropped and Read-Repair performance issue

2020-12-19 Thread sunil pawar
Hi All, We are facing problems of failure of Read-Repair stages with error Digest Mismatch and count is 300+ per day per node. At the same time, we are experiencing node is getting overloaded for a quick couple of seconds due to long GC pauses (of around 7-8 seconds). We are not running a repair

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Tobias Eriksson
2020 at 17:27 To: cassandra Subject: Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls Yes, it's possible. A typical JVM GC pause for most configs is on the order of 50-200ms. If you have a host do a small collection/pause, then the read at #4 is basically r

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Jeff Jirsa
DATE) >2. Data replicated by Cassandra, but will not finish before (4) below >3. Wait 80 ms on average >4. Data read again with QUORUM i.e asking for at least 2 out of 3 nodes >for result, and now ONE replies with inaccurate data >5. (4) triggers a READ REPAIR

Re: Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Erick Ramirez
Did you mean LOCAL_QUORUM? Because QUORUM will require 4 out of 6 replicas, not 2 out of 3. :) But it sounds like you are using QUORUM because you said it syncs to all nodes in DC2. To answer your question, RR *can* be triggered if you're reading before the replicas are *eventually* consistent. Ch
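
The replica counts in the reply above (4 out of 6 for QUORUM, 2 out of 3 for LOCAL_QUORUM) follow from simple majority arithmetic. A minimal sketch, assuming two datacenters with a replication factor of 3 each; the names `DC1` and `DC2` and both helper functions are illustrative only and are not part of any Cassandra API:

```python
def quorum(replication_factors):
    """Replicas needed for QUORUM: a majority of all replicas in all datacenters."""
    return sum(replication_factors.values()) // 2 + 1


def local_quorum(replication_factors, dc):
    """Replicas needed for LOCAL_QUORUM: a majority of the replicas in one datacenter."""
    return replication_factors[dc] // 2 + 1


# Assumed topology: replication factor 3 in each of two datacenters.
rf = {"DC1": 3, "DC2": 3}

print(quorum(rf))               # 4
print(local_quorum(rf, "DC1"))  # 2
```

With this topology a QUORUM read has to wait for replicas in the remote datacenter, while a LOCAL_QUORUM read does not.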

Could a READ REPAIR really be triggered even if there avg 80 ms between calls

2020-09-01 Thread Tobias Eriksson
QUORUM i.e asking for at least 2 out of 3 nodes for result, and now ONE replies with inaccurate data 5. (4) triggers a READ REPAIR 6. The READ REPAIR now syncs to ALL nodes also in DC2 So my question is: Is it really possible that Cassandra within 80 ms is not able to replicate to all 3 nodes

Re: Why a READ REPAIR ?

2020-08-12 Thread Erick Ramirez
You can check for the string "digest mismatch" in the logs. Similarly, you can track the RR stats in nodetool netstats and the dropped mutations in nodetool tpstats. To be clear though, RRs are a side-effect of nodes either dropping mutations or being unresponsive so they miss mutations. RRs do *n

Re: Why a READ REPAIR ?

2020-08-12 Thread Tobias Eriksson
Thanx Erick Is there a way to turn on tracing based on certain criteria, I would like to start tracing when there is some sort of failure, i.e. in this case when a READ REPAIR is triggered as I would like to know why we sometimes can’t reach one of the nodes -Tobias From: Erick Ramirez Reply

Re: Why a READ REPAIR ?

2020-08-11 Thread Erick Ramirez
um : SELECT * FROM products WHERE id = ABC123 > > READ 2 with Local One : SELECT * FROM products WHERE id = ABC123 > > > > Would read (2) be blocked by the READ REPAIR that was done by read (1) > > As I understand that the read repair is working not on the whole table but > o

Re: Why a READ REPAIR ?

2020-08-11 Thread manish khandelwal
Hi Tobias READ2 will not be blocked by READ repair of READ1. Regards Manish On Tue, Aug 11, 2020 at 6:02 PM Tobias Eriksson wrote: > Thanx Erick, > > Perhaps this is super obvious but I need a confirmation as you say “…not > subsequent reads for other data unrelated to the read be

Re: Why a READ REPAIR ?

2020-08-11 Thread Tobias Eriksson
= ABC123 READ 2 with Local One : SELECT * FROM products WHERE id = ABC123 Would read (2) be blocked by the READ REPAIR that was done by read (1) As I understand that the read repair is working not on the whole table but on the partition key it had problems with -Tobias From: Erick Ramirez Reply

Re: Why a READ REPAIR ?

2020-08-11 Thread Erick Ramirez
> > If a READ triggers a READ REPAIR, and then if we do an additional READ > would then that BLOCK until the “first” READ REPAIR would be done ? > > -Tobias > Not all read repairs are blocking RRs (aka foreground RRs). There are also background RRs which by definition are no

Re: Why a READ REPAIR ?

2020-08-11 Thread Tobias Eriksson
If a READ triggers a READ REPAIR, and then if we do an additional READ would then that BLOCK until the “first” READ REPAIR would be done ? -Tobias From: Jeff Jirsa Reply to: "user@cassandra.apache.org" Date: Tuesday, 11 August 2020 at 07:30 To: cassandra Subject: Re: Why a R

Re: Why a READ REPAIR ?

2020-08-10 Thread Jeff Jirsa
Your schema may have read repair (non-blocking, background) set to 10% (0.1, for dclocal). You may have GC pauses causing writes (or reads) to be delayed. You may be hitting a cassandra bug. Would need the `TRACING` output to know for sure. On Mon, Aug 10, 2020 at 10:10 PM Tobias Eriksson

Why a READ REPAIR ?

2020-08-10 Thread Tobias Eriksson
Hi We have a Cassandra solution with 2 DCs where each DC has >30 nodes From time to time we see problems with READ REPAIR, but I am stuck with the analysis We have a pattern for these faults where we do 1. INSERT with Local Quorum (2 out of 3) 2. Wait for 0.5 - 1 seconds time window

Re: disable debug message on read repair

2020-03-10 Thread Paul Chandler
Hi Gil, All the logging is controlled via logback. You can change the level of any type of message. Take a look here for some more details: https://docs.datastax.com/en/cassandra-oss/3.0/cassandra/configuration/configLoggingLevels.html

Re: disable debug message on read repair

2020-03-10 Thread Gil Ganz
That's one option, I wish there was a way to disable just that and not the entire debug log level, there are some things there I would like to keep. On Sun, Mar 8, 2020 at 6:41 PM Jeff Jirsa wrote: > There are likely two log configs - one for debug.log and one for > system.log. Disable the deb

Re: disable debug message on read repair

2020-03-08 Thread Jeff Jirsa
There are likely two log configs - one for debug.log and one for system.log. Disable the debug.log one, or change org.apache.cassandra.service to log at INFO instead Nobody needs to see every digest mismatch and that someone thought this was a good idea is amazing to me. Someone should jira th

Re: disable debug message on read repair

2020-03-08 Thread Gil Ganz
Thanks Shalom, I know why these read repairs are happening, and they will continue to happen for some time, even if I will run a full repair. I would like to disable these warning messages. On Sun, Mar 8, 2020 at 10:19 AM Shalom Sagges wrote: > Hi Gil, > > You can run a full repair on your clust

Re: disable debug message on read repair

2020-03-08 Thread Shalom Sagges
Hi Gil, You can run a full repair on your cluster. But if these messages come back again, you need to check what's causing these data inconsistencies. On Sun, Mar 8, 2020 at 10:11 AM Gil Ganz wrote: > Hey all > I have a lot of debug message about read repairs in my debug log : > > DEBUG [ReadR

disable debug message on read repair

2020-03-08 Thread Gil Ganz
Hey all I have a lot of debug message about read repairs in my debug log : DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-28476014476640, 000400871130303a3

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
doesn't seem to be the same, it looks like just less than 10% of the read traffic. the query i originally posted was one that we captured and used as an example. every time i would run it at local_quorum, all, quorum... it would do a read repair. the record hasn't been updated for a

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Jeff Jirsa
>> >> >> >> >> >> *From:* Patrick Lee [mailto:patrickclee0...@gmail.com] >> *Sent:* Wednesday, October 16, 2019 12:22 PM >> *To:* user@cassandra.apache.org >> *Subject:* Re: Constant blocking read repair for such a tiny table >> >>

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
ee0...@gmail.com] > *Sent:* Wednesday, October 16, 2019 12:22 PM > *To:* user@cassandra.apache.org > *Subject:* Re: Constant blocking read repair for such a tiny table > > > > haven't really figured this out yet. it's not a big problem but it is > annoying for s

RE: Constant blocking read repair for such a tiny table

2019-10-16 Thread ZAIDI, ASAD
atrick Lee [mailto:patrickclee0...@gmail.com] Sent: Wednesday, October 16, 2019 12:22 PM To: user@cassandra.apache.org Subject: Re: Constant blocking read repair for such a tiny table haven't really figured this out yet. it's not a big problem but it is annoying for sure! the cluster w

Re: Constant blocking read repair for such a tiny table

2019-10-16 Thread Patrick Lee
just not 100% sure. just 1 table, out of all the ones on the cluster has this behavior. repair has been run few times via reaper. even did a nodetool compact on the nodes (since this table is like 1GB per node..) . just don't see why there would be any inconsistency that would trigger read rep

Re: Constant blocking read repair for such a tiny table

2019-10-15 Thread Alain RODRIGUEZ
o upward to 20ms to 50ms.. the only >> odd thing i see is just that there are constant read repairs that follow >> the same traffic pattern on the reads, which shows constant writes on the >> table (from the read repairs), which after read repair or just normal full >> repairs (

Oversized Read Repair Mutations

2019-10-14 Thread Isaac Reath (BLOOMBERG/ 731 LEX)
Hi Cassandra users, Recently on some of our production clusters we have run into the following error: 2019-10-11 15:14:46,803 DataResolver.java:507 - Encountered an oversized (x/y) read repair mutation for table. Which is described in this jira: https://issues.apache.org/jira/browse

Re: Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
rn on the reads, which shows constant writes on the > table (from the read repairs), which after read repair or just normal full > repairs (all full through reaper, never ran any incremental repair) i would > expect it to not have any mismatches. the other 5 tables they use on the > clu

Re: Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
e only odd thing i see is just that there are constant read repairs that follow the same traffic pattern on the reads, which shows constant writes on the table (from the read repairs), which after read repair or just normal full repairs (all full through reaper, never ran any incremental repair) i wo

RE: Constant blocking read repair for such a tiny table

2019-10-03 Thread John Belliveau
PM To: user@cassandra.apache.org Subject: Constant blocking read repair for such a tiny table I have a cluster that is running 3.11.4 ( was upgraded a while back from 2.1.16 ).  what I see is a steady rate of read repair which is about 10% constantly on only this 1 table.  Repairs have been run

Constant blocking read repair for such a tiny table

2019-10-03 Thread Patrick Lee
I have a cluster that is running 3.11.4 ( was upgraded a while back from 2.1.16 ). what I see is a steady rate of read repair which is about 10% constantly on only this 1 table. Repairs have been run (actually several times). The table does not have a lot of writes to it so after repair, or

Re: read repair with consistency one

2018-04-25 Thread Grzegorz Pietrusza
Hi Ben Thanks a lot. From my analysis of the code it looks like you are right. When global read repair kicks in all live endpoints are queried for data, regardless of consistency level. Only EACH_QUORUM is treated differently. Cheers Grzegorz 2018-04-22 1:45 GMT+02:00 Ben Slater : > I have

Re: read repair with consistency one

2018-04-21 Thread Ben Slater
eers Ben On Sat, 21 Apr 2018 at 22:20 Grzegorz Pietrusza wrote: > I haven't asked about "regular" repairs. I just wanted to know how read > repair behaves in my configuration (or is it doing anything at all). > > 2018-04-21 14:04 GMT+02:00 Rahul Singh : > >

Re: read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
I haven't asked about "regular" repairs. I just wanted to know how read repair behaves in my configuration (or is it doing anything at all). 2018-04-21 14:04 GMT+02:00 Rahul Singh : > Read repairs are one anti-entropy measure. Continuous repairs is another. > If you do repai

Re: read repair with consistency one

2018-04-21 Thread Rahul Singh
Read repairs are one anti-entropy measure. Continuous repairs are another. If you do repairs via Reaper or your own method it will resolve your discrepancies. On Apr 21, 2018, 3:16 AM -0400, Grzegorz Pietrusza , wrote: > Hi all > > I'm a bit confused with how read repair works in

read repair with consistency one

2018-04-21 Thread Grzegorz Pietrusza
Hi all I'm a bit confused with how read repair works in my case, which is: - multiple DCs with RF 1 (NetworkTopologyStrategy) - reads with consistency ONE The article #1 says that read repair in fact runs RF reads for some percent of the requests. Let's say I have read_repair_chance =

Re: Blocking read repair giving consistent data but not repairing existing data

2017-12-11 Thread Michael Semb Wever
on key, 2 clustering key > for a row but 3 other normal values are null. > > When doing consistency level all query we get complete view of the row and > in the tracing output it says that inconsistency found in digest and read > repair is sent out to the nodes. > <

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-27 Thread Jeff Jirsa
which table the > DigestMismatchException happens? > No, the read repair stats we provide are not per table, so if it’s not in the log, it’s not apparent. Feel free to open a jira to ask for it to be added to the log message. > Can the AsyncRepairRunner be triggered if read and writes for

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-27 Thread Artur Siekielski
triggered if read and writes for all other tables are done with CL=LOCAL_QUORUM (RF=3)? I assumed in that case async read repair is not done even if dclocal_read_repair_chance > 0. Could it be that the async repair runs for that case and it's executed faster than the background syncing to m

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Artur Siekielski
It was set to the default 99PERCENTILE, I changed it to NONE but the exceptions are still logged (for the same table). I'm assuming node restarts are not required for that ALTER. On 10/26/2017 05:13 PM, Jeff Jirsa wrote: Is speculative retry enabled? --

Re: Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Jeff Jirsa
Is speculative retry enabled? -- Jeff Jirsa > On Oct 26, 2017, at 3:19 AM, Artur Siekielski wrote: > > Hi, > > we have one table for which reads and writes are done with CL=ONE. The table > contains counters. We wanted to disable async read repair for the table (to >

Getting DigestMismatchExceptions despite setting read repair chances to zero

2017-10-26 Thread Artur Siekielski
Hi, we have one table for which reads and writes are done with CL=ONE. The table contains counters. We wanted to disable async read repair for the table (to lessen cluster load and to avoid DigestMismatchExceptions in debug.log). After altering the table with read_repair_chance=0

Re: Does async read repair happen when using CL.LOCAL_QUORUM?

2017-09-25 Thread Lutaya Shafiq Holmes
endpoints.size() && n == endpoints.size())` > > > Whereas n is the received data from endpoints in local datacenter. > > In that case, the async repair runner won’t be created, thus only foreground > read repair is possible to happen (when DigestMismatchException is raised)

Does async read repair happen when using CL.LOCAL_QUORUM?

2017-09-24 Thread 孟靖
he received data from endpoints in local datacenter. In that case, the async repair runner won’t be created, thus only foreground read repair is possible to happen (when DigestMismatchException is raised) when CL = LOCAL_QUORUM. Is it true, or am I missing something here? Btw, the cassandra

Re: Tomstones impact on repairs both anti-entropy and read repair

2016-11-16 Thread Alain RODRIGUEZ
Hi, > My question to the community is will tombstone cause issues in data > consistency across the DCs. It might, if your repairs are not succeeding for some reason or not running fully (all the token ranges) within gc_grace_second (parameter at the table level) I wrote a blog post and talked

Tomstones impact on repairs both anti-entropy and read repair

2016-11-14 Thread K F
Hi Folks, I have a table that has lot of tombstones generated and has caused inconsistent data across various datacenters. we run anti-entropy repairs and also have read_repair_chance tuned-up during our non busy hours. But yet when we try to compare data residing in various replicas across DCs,

Can I monitor Read Repair from the logs

2016-11-04 Thread James Rothering
What should I grep for in the logs to see if read repair is happening on a table?
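
Other replies in this listing suggest searching the logs for the string "digest mismatch", and one post quotes a `ReadRepairStage` DEBUG line containing it. A minimal sketch of counting such lines per day, assuming the log line layout shown in that post; the function name is made up for illustration, the first sample line is abbreviated from the quoted log line, and the second sample line is invented:

```python
import re
from collections import Counter

# Capture the date of any line that mentions a digest mismatch (case-insensitive).
MISMATCH = re.compile(r"(\d{4}-\d{2}-\d{2})\s.*digest mismatch", re.IGNORECASE)


def count_digest_mismatches(lines):
    """Return a Counter mapping each date to its number of digest mismatch lines."""
    counts = Counter()
    for line in lines:
        match = MISMATCH.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample_lines = [
    # Abbreviated from the DEBUG line quoted elsewhere in this listing.
    "DEBUG [ReadRepairStage:346] 2020-03-08 08:09:12,959 ReadCallback.java:242 - Digest mismatch: ...",
    # Invented line with no mismatch, included only to show that it is ignored.
    "INFO  [main] 2020-03-08 08:09:13,000 Example.java:1 - unrelated message",
]

print(dict(count_digest_mismatches(sample_lines)))  # {'2020-03-08': 1}
```

To run it against a real file, pass an open file object instead of the sample list; the location of the debug log depends on the installation and is not assumed here.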

RE: Question on Read Repair

2016-11-03 Thread Anubhav Kale
Subject: Re: Question on Read Repair Yes: https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f58de1328e713eaf53c/src/java/org/apache/cassandra/db/ConsistencyLevel.java#L286 From: Anubhav Kale mailto:anubhav.k...@microsoft.com>> Reply-To: "user@cassandra.apache.org&

Re: Scenarios when blocking read repair takes place

2016-10-17 Thread siddharth verma
is? > Mankapur? > > Krishna > > On Oct 14, 2016 12:15 PM, "siddharth verma" > wrote: > >> Hi, >> Does blocking read repair take place only when we read on the primary key >> or >> does it take place in the following scenarios as well? >

Re: Scenarios when blocking read repair takes place

2016-10-15 Thread Krishna Chandra Prajapati
Hi which side is this? Mankapur? Krishna On Oct 14, 2016 12:15 PM, "siddharth verma" wrote: > Hi, > Does blocking read repair take place only when we read on the primary key > or > does it take place in the following scenarios as well? > > Consistency ALL > 1.

Scenarios when blocking read repair takes place

2016-10-13 Thread siddharth verma
Hi, Does blocking read repair take place only when we read on the primary key or does it take place in the following scenarios as well? Consistency ALL 1. select * from ks.table_name 2. select * from ks.table_name where token(pk) >= ? and token(pk) <= ? While using manual paging or aut

Re: Question on Read Repair

2016-10-11 Thread Jeff Jirsa
ndra.apache.org" Subject: RE: Question on Read Repair Thank you. Interesting detail. Does it work the same way for other consistency levels as well ? From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Tuesday, October 11, 2016 10:29 AM To: user@cassandra.apache.org Subject:

RE: Question on Read Repair

2016-10-11 Thread Anubhav Kale
Thank you. Interesting detail. Does it work the same way for other consistency levels as well ? From: Jeff Jirsa [mailto:jeff.ji...@crowdstrike.com] Sent: Tuesday, October 11, 2016 10:29 AM To: user@cassandra.apache.org Subject: Re: Question on Read Repair If the failuredetector knows that the

Re: Question on Read Repair

2016-10-11 Thread Edward Capriolo
) might start a read process. One of the three nodes may not respond within the read timeout window.Call the end of the read timeout window time('3) Note: Anti-entropy read-repair like Read repair is set to only happen a fraction of requests. Note: Anti-entropy read-repair is (async) not guarantee

Re: Question on Read Repair

2016-10-11 Thread Jeff Jirsa
To: "user@cassandra.apache.org" Subject: Question on Read Repair Hello, This is more of a theory / concept question. I set CL=ALL and do a read. Say one replica was down, will the rest of the replicas get repaired as part of this ? (I am hoping the answer

Question on Read Repair

2016-10-11 Thread Anubhav Kale
Hello, This is more of a theory / concept question. I set CL=ALL and do a read. Say one replica was down, will the rest of the replicas get repaired as part of this ? (I am hoping the answer is yes). Thanks !

Blocking read repair giving consistent data but not repairing existing data

2016-05-12 Thread Bhuvan Rawal
y for a row but 3 other normal values are null. When doing consistency level all query we get complete view of the row and in the tracing output it says that inconsistency found in digest and read repair is sent out to the nodes. <*Exact error in tracing : Digest

RE: Read Repair

2015-07-08 Thread Ashic Mahtab
8 Jul 2015 15:06:46 -0700 Subject: Re: Read Repair From: rc...@eventbrite.com To: user@cassandra.apache.org; naidusp2...@yahoo.com On Wed, Jul 8, 2015 at 2:07 PM, Saladi Naidu wrote: Suppose I have a row of existing data with set of values for attributes I call this State1, and issue an update

Re: Read Repair

2015-07-08 Thread Robert Coli
ng nodes. As there is no Rollback, Node1 row attributes will > remain new state, State2 and rest of the nodes row will have old state, > State1. If I do a Read and Cassandra detects state difference, it will > issue a Read repair which will result in new state, State2 being propagated >

Read Repair

2015-07-08 Thread Saladi Naidu
state, State2 and rest of the nodes row will have old state, State1. If I do a Read and Cassandra detects state difference, it will issue a Read repair which will result in new state, State2 being propagated to other nodes. But from a application point of view the update never happened because

RE: Read Repair in cassandra

2015-04-07 Thread Jan Karlsson
The request would return with the latest data. The read request would fire against node 1 and node 3. The coordinator would get answers from both and would merge the answers and return the latest. Then read repair might run to update node 3. QUORUM does not take into consideration whether an

Read Repair in cassandra

2015-04-07 Thread ankit tyagi
Hi All, I have a doubt regarding read repair while reading data. I am using QUORUM for both read and write operations with RF 3 for strong consistency suppose while writing data node1 and node2 replicate the data but it doesn't get replicated to node3 because of various factors. coordinator

Re: read repair across DC and latency

2014-11-21 Thread Tyler Hobbs
spect read_repair > chance may have something to do wit it. > Anything we can look into and see what may cause the latency spike when we > have large number of same cql hitting the server? > I doubt read repair is related. I would try tracing a few of your queries. -- Tyler Hobbs DataStax <http://datastax.com/>

Re: read repair across DC and latency

2014-11-19 Thread Jimmy Lin
s wrote: > > On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin wrote: > >> I have read that read repair suppose to be running as background, but >> does the co-ordinator node need to wait for the response(along with other >> normal read tasks) before return the entire result

Re: read repair across DC and latency

2014-11-18 Thread Tyler Hobbs
On Sun, Nov 16, 2014 at 5:13 PM, Jimmy Lin wrote: > I have read that read repair suppose to be running as background, but > does the co-ordinator node need to wait for the response(along with other > normal read tasks) before return the entire result back to the caller? > Fo

read repair across DC and latency

2014-11-16 Thread Jimmy Lin
I have a CF that use the default, read_repair_chance (0.1) and dc_read_repair_chance(0). Our read and write is all local_quorum, on one of the 2 DC, replication of 3. so a read will have 10% chance trigger a read repair to other DC. # I have read that read repair suppose to be running as

Re: Understanding about Cassandra read repair with QUORUM

2014-01-16 Thread Aaron Morton
> I have following understanding about Cassandra read repair: Read Repair is an automatic process that reads from more nodes than necessary during a normal read and checks and repairs differences in the background. It’s different to “repair” or Anti Entropy that you run with nodetool rep

Understanding about Cassandra read repair with QUORUM

2014-01-11 Thread chovatia jaydeep
Hi, I have following understanding about Cassandra read repair: * If we write with QUORUM and read with QUORUM then we do not need to externally (nodetool) trigger read repair.  * Since we are reading + writing with QUORUM then it is safe to set "read_repair_cha

Re: Read repair

2013-10-31 Thread Baskar Duraikannu
Yes, it helps. Thanks --- Original Message --- From: "Aaron Morton" Sent: October 31, 2013 3:51 AM To: "Cassandra User" Subject: Re: Read repair (assuming RF 3 and NTS is putting a replica in each rack) > Rack1 goes down and some writes happen in quorum against ra

Re: Read repair

2013-10-31 Thread Aaron Morton
> mins, there is no quorum until failed rack comes back up. > > Hope this explains the scenario. > From: Aaron Morton > Sent: ‎10/‎28/‎2013 2:42 AM > To: Cassandra User > Subject: Re: Read repair > >> As soon as it came back up, due to some human error, rack1 goes

RE: Read repair

2013-10-29 Thread Baskar Duraikannu
hour and 30 mins, there is no quorum until failed rack comes back up. Hope this explains the scenario. From: Aaron Morton<mailto:aa...@thelastpickle.com> Sent: ‎10/‎28/‎2013 2:42 AM To: Cassandra User<mailto:user@cassandra.apache.org> Subject: Re: Read

Re: Read repair

2013-10-27 Thread Aaron Morton
of the nodes available and would be able to achieve a QUORUM. > Just to minimize the issues, we are thinking of running read repair manually > every night. If you are reading and writing at QUORUM and the cluster does not have a QUORUM of nodes available writes will not be processed. Duri

Re: manual read repair

2013-10-27 Thread Aaron Morton
> We have seen read repair take very long time even for few GBs Read Repair is a process that runs during a read to repair differences in the background. It’s active on (by default) 10% of the reads. I assume you mean nodetool repair (aka anti entropy). It runs in two phases, first

manual read repair

2013-10-25 Thread Baskar Duraikannu
We have seen read repair take very long time even for few GBs of data even though we don't see disk or network bottlenecks. Do you use any specific configuration to speed up read repairs?

Read repair

2013-10-25 Thread Baskar Duraikannu
for some rows it is possible that Quorum cannot be established. Just to minimize the issues, we are thinking of running read repair manually every night. Is this a good idea? How often do you perform read repair on your cluster?

Re: Waiting on read repair?

2013-03-20 Thread aaron morton
91890130 >> bytes) for commitlog position ReplayPosition(segmentId=1363628611044, >> position=21069295) >> 168050:2013-03-18 17:53:55,948 INFO [FlushWriter:3] >> org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents >> (Memtable.java:458) - Completed flushing >

Re: Waiting on read repair?

2013-03-19 Thread Jasdeep Hundal
og position ReplayPosition(segmentId=1363628611047, > position=4213859) > 168052:2013-03-18 17:53:55,966 INFO [FlushWriter:3] > org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents > (Memtable.java:458) - Completed flushing > /mnt/test/jasdeep/counters/jasdeep-counte

Re: Waiting on read repair?

2013-03-19 Thread aaron morton
,966 INFO [FlushWriter:3] > org.apache.cassandra.db.Memtable$FlushRunnable.writeSortedContents > (Memtable.java:458) - Completed flushing > /mnt/test/jasdeep/counters/jasdeep-counters-ia-1204-Data.db (342 > bytes) for commitlog position ReplayPosition(segmentId=1363628611047, > posit

Re: Waiting on read repair?

2013-03-18 Thread Jasdeep Hundal
sdeep On Mon, Mar 18, 2013 at 10:24 AM, aaron morton wrote: > 1. With a ConsistencyLevel of quorum, does > FBUtilities.waitForFutures() wait for read repair to complete before > returning? > > No > That's just a utility method. > Nothing on the read path waits for Read

Re: Waiting on read repair?

2013-03-18 Thread aaron morton
> 1. With a ConsistencyLevel of quorum, does > FBUtilities.waitForFutures() wait for read repair to complete before > returning? No That's just a utility method. Nothing on the read path waits for Read Repair, and controlled by read_repair_chance CF property, it's all async to

Waiting on read repair?

2013-03-15 Thread Jasdeep Hundal
I've got a couple of questions related issues I'm encountering using Cassandra under a heavy write load: 1. With a ConsistencyLevel of quorum, does FBUtilities.waitForFutures() wait for read repair to complete before returning? 2. When read repair applies a mutation, it needs to obtain

Re: Read-repair working, repair not working?

2013-02-11 Thread aaron morton
> Dropped mutations in a multi DC setup may be a sign of network congestion or >> overloaded nodes. >> >> >>> - Could anybody suggest anything specific to look at to see why >>> the repair operations aren’t having the desired effect? >>>

Re: Read-repair working, repair not working?

2013-02-11 Thread Brian Fleming
ns aren’t having the desired effect? > I would first build a test case to ensure correct operation when using strong > consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I > assume you are not using LOCAL_QUORUM because that is 2 and you would not > have any redunda

Re: Read-repair working, repair not working?

2013-02-10 Thread aaron morton
operation when using strong consistency. i.e. QUORUM write and read. Because you are using RF 2 per DC I assume you are not using LOCAL_QUORUM because that is 2 and you would not have any redundancy in the DC. > > - Would increasing logging level to ‘DEBUG’ show read-repair

Read-repair working, repair not working?

2013-02-10 Thread Brian Fleming
consistency & availability: I’d request data, nothing would be returned, I would then re-request the data and it would correctly be returned: i.e. read-repair appeared to be occurring. However running repairs on the nodes didn’t resolve this (I tried general ‘*repair’* commands as well as targ

Re: neither 'nodetool repair' nor 'hinted hanoff/read repair' work for secondary indexes

2013-02-05 Thread Alexei Bakanov
ssing rows for userId %s, data length is > %d'%(userId, len(data))) > Exception: missing rows for userId 256, data length is 0 > > $ ccm cli > [default@unknown] use testks; > Authenticated to keyspace: testks > [default@testks] get cf1 where 'indexedColumn'=

neither 'nodetool repair' nor 'hinted hanoff/read repair' work for secondary indexes

2013-02-01 Thread Alexei Bakanov
h is %d'%(userId, len(data))) Exception: missing rows for userId 256, data length is 0 $ ccm cli [default@unknown] use testks; Authenticated to keyspace: testks [default@testks] get cf1 where 'indexedColumn'='userId_256'; 0 Row Returned. Elapsed time: 47 msec(s). $ p

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Aaron Turner
inline... On Mon, Oct 1, 2012 at 7:46 PM, Hiller, Dean wrote: > Thanks, (actually knew it was configurable) BUT what I don't get is why I > have to run a repair. IF all nodes became consistent on the delete, it > should not be possible to get a forgotten delete, correct. The forgotten > delete w

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
ir once per/gc_grace period. >>You won't see empty/deleted rows go away until they're compacted away. >> >>On Mon, Oct 1, 2012 at 6:32 PM, Hiller, Dean >>wrote: >>> I know there is a 10 day limit if you have a node out of the cluster >>>where yo

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
you always need to run repair once per/gc_grace period. >You won't see empty/deleted rows go away until they're compacted away. > >On Mon, Oct 1, 2012 at 6:32 PM, Hiller, Dean wrote: >> I know there is a 10 day limit if you have a node out of the cluster >>where yo

Re: read-repair and deletes / forgotten deletes

2012-10-01 Thread Aaron Turner
t if you have a node out of the cluster where > you better be running read-repair or you end up with forgotten deletes, but > what about on a clean cluster with all nodes always available? Shouldn't the > deletes eventually take place or does one have to keep running read-repair

read-repair and deletes / forgotten deletes

2012-10-01 Thread Hiller, Dean
I know there is a 10 day limit if you have a node out of the cluster where you better be running read-repair or you end up with forgotten deletes, but what about on a clean cluster with all nodes always available? Shouldn't the deletes eventually take place or does one have to keep ru

Re: Question on Read Repair

2012-09-18 Thread Vijay
reads at LOCAL_QUORUM in DC1, will > read repair happen on the replicas in DC2? > > Thanks > -Raj >

Question on Read Repair

2012-09-16 Thread Raj N
Hi, I have a 2 DC setup(DC1:3, DC2:3). All reads and writes are at LOCAL_QUORUM. The question is if I do reads at LOCAL_QUORUM in DC1, will read repair happen on the replicas in DC2? Thanks -Raj

Re: Node crashing during read repair

2012-07-03 Thread Robin Verlangen
Hi Aaron, This was the first error. It occurred a couple of times after this. We did a hardware upgrade on the server and increased the max heap size. Now running fine. Seems that 1.1.1 uses a little more memory, or our data set just grew ;-) Thank you for your time! 2012/7/3 aaron morton > Is

Re: Node crashing during read repair

2012-07-02 Thread aaron morton
Is this still an issue ? It looks like something shut down the messaging service. Was there anything else in the logs ? Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 28/06/2012, at 3:49 AM, Robin Verlangen wrote: > Hi there, > > Toda

Node crashing during read repair

2012-06-27 Thread Robin Verlangen
Hi there, Today I found one node (running 1.1.1 in a 3 node cluster) being dead for the third time this week, it died with the following message: ERROR [ReadRepairStage:3] 2012-06-27 14:28:30,929 AbstractCassandraDaemon.java (line 134) Exception in thread Thread[ReadRepairStage:3,5,main] java.uti

Re: read-repair?

2012-02-04 Thread Mr.Quintero

Re: read-repair?

2012-02-02 Thread Peter Schuller
> sorry to be dense, but which is it?  do i get the old version or the new > version?  or is it indeterminate? Indeterminate, depending on which nodes happen to be participating in the read. Eventually you should get the new version, unless the node that took the new version permanently crashed wi

Re: read-repair?

2012-02-02 Thread Guy Incognito
oius write at quorum failed (since it only made it to one node), so this is not a violation of the contract. Once node 2 and/or 3 return their response, read repair (if it is active) will cause re-read and re-conciliation followed by a row mutation being send to the nodes to correct the column.
