Re: unable to repair

2021-05-31 Thread Jeff Jirsa
> On May 30, 2021, at 2:12 AM, Sébastien Rebecchi wrote: > > Hello, > > I have a more general question about that; I cannot find a clear answer. > > In my use case I have many tables (around 10k new tables created per month) > and they are created from many clients and only dynamically

Re: unable to repair

2021-05-30 Thread Sébastien Rebecchi
I was not aware of such limitations. Thank you for your answer. Sébastien On Sun, May 30, 2021 at 18:52, Bowen Song wrote: > This sounds like a really bad idea. > > In Cassandra 4.0 RC1, when you have more than 150 tables or 40 keyspaces (code > reference >

Re: unable to repair

2021-05-30 Thread Bowen Song
This sounds like a really bad idea. In Cassandra 4.0 RC1, when you have more than 150 tables or 40 keyspaces (code reference), Cassandra will warn you about
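
To make the figures in this message concrete, here is a minimal Python sketch that compares table and keyspace counts against the thresholds quoted above (150 tables, 40 keyspaces). The function name and the way the counts are supplied are hypothetical and for illustration only; this is not how Cassandra itself implements the warning.

```python
from typing import List

# Thresholds as quoted in the message above for Cassandra 4.0 RC1.
TABLE_WARN_THRESHOLD = 150
KEYSPACE_WARN_THRESHOLD = 40


def schema_size_warnings(table_count: int, keyspace_count: int) -> List[str]:
    """Return one warning string per quoted threshold that is exceeded.

    Hypothetical helper: the counts are passed in by the caller and are
    not read from a live cluster.
    """
    warnings = []
    if table_count > TABLE_WARN_THRESHOLD:
        warnings.append(
            f"{table_count} tables exceeds the quoted threshold of "
            f"{TABLE_WARN_THRESHOLD}"
        )
    if keyspace_count > KEYSPACE_WARN_THRESHOLD:
        warnings.append(
            f"{keyspace_count} keyspaces exceeds the quoted threshold of "
            f"{KEYSPACE_WARN_THRESHOLD}"
        )
    return warnings


if __name__ == "__main__":
    # One month of the workload described in the thread: about 10k new tables.
    for line in schema_size_warnings(table_count=10_000, keyspace_count=1):
        print(line)
```

With roughly 10k new tables per month, the workload described in the thread would pass the quoted table threshold many times over within the first month.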

Re: unable to repair

2021-05-30 Thread Sébastien Rebecchi
Hello, I have a more general question about that; I cannot find a clear answer. In my use case I have many tables (around 10k new tables created per month) and they are created from many clients and only dynamically, with several clients creating the same tables simultaneously. What is the recommende

Re: unable to repair

2021-05-28 Thread Sébastien Rebecchi
Thank you for your answer. If I still send all my create operations from many clients, but always to the same single coordinator node, would that prevent schema mismatch? Sébastien. On Fri, May 28, 2021 at 01:14, Kane Wilson wrote: > Which client operations could trigger schema change at node level

Re: unable to repair

2021-05-27 Thread Kane Wilson
> > Which client operations could trigger a schema change at node level? Do you > mean that, for example, creating a new table triggers a schema change globally, not > only at the level of a single keyspace or table? > Yes, any DDL statement (creating tables, altering, dropping, etc) triggers a schema change across the cluste

Re: unable to repair

2021-05-27 Thread Sébastien Rebecchi
OK, I will check that, thank you! Sébastien On Thu, May 27, 2021 at 11:07, Bowen Song wrote: > Hi Sébastien, > > > The error message you shared came from the repair coordinator node's > log, and it's the result of failures reported by 3 other nodes. If you > could have a look at the 3 nodes lis

Re: unable to repair

2021-05-27 Thread Bowen Song
Hi Sébastien, The error message you shared came from the repair coordinator node's log, and it's the result of failures reported by 3 other nodes. If you could have a look at the 3 nodes listed in the error message - 135.181.222.100, 135.181.217.109 and 135.181.221.180, you should be able to

Re: unable to repair

2021-05-26 Thread Sébastien Rebecchi
Sorry Kane, I am a little bit confused; we are talking about schema version at node level. Which client operations could trigger a schema change at node level? Do you mean that, for example, creating a new table triggers a schema change globally, not only at the level of a single keyspace or table? Sébastien On Thu, May 27

Re: unable to repair

2021-05-26 Thread Sébastien Rebecchi
I don't have schema changes, except keyspace and table creation. But these are indeed done from multiple sources, with a "create if not exists" statement, on demand. Thank you for your answer, I will try to see if I could precreate them then. As for the schema mismatch, what is the best way of

Re: unable to repair

2021-05-26 Thread Kane Wilson
> > I have had that error sometimes when schemas mismatch but also when all > schemas match. So I think this is not the only cause. > Have you checked the logs for errors on 135.181.222.100, 135.181.217.109, and 135.181.221.180? They may give you some better information about why they are sending bad

Re: unable to repair

2021-05-26 Thread Sébastien Rebecchi
Thank you for your answer. I have had that error sometimes when schemas mismatch but also when all schemas match. So I think this is not the only cause. By the way, what could cause such a schema mismatch? I would like to know what should or should not be done in order to keep schema agreements between

Re: unable to repair

2021-05-26 Thread Dipan Shah
Hello Sebastien, Not sure but have you checked the output of "nodetool describecluster"? A schema mismatch or node unavailability might result in this. Thanks, Dipan Shah From: Sébastien Rebecchi Sent: Wednesday, May 26, 2021 7:35 PM To: user@cassandra.apache
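
A minimal Python sketch of the check suggested above: given the text printed by `nodetool describecluster`, group hosts by schema version and report whether more than one version is present. The parsing assumes the output lists each version as a UUID followed by a bracketed host list, and that unreachable nodes appear under an `UNREACHABLE` key; both are assumptions about the output format, and the helper names, sample UUIDs, and addresses are made up for illustration.

```python
import re
from typing import Dict, List

# Assumed line shape: "<uuid-or-UNREACHABLE>: [host1, host2]".
_SCHEMA_LINE = re.compile(r"^\s*([0-9a-fA-F-]{36}|UNREACHABLE):\s*\[(.*)\]\s*$")


def schema_versions(describecluster_output: str) -> Dict[str, List[str]]:
    """Map each schema version found in the text to the hosts reporting it."""
    versions: Dict[str, List[str]] = {}
    for line in describecluster_output.splitlines():
        match = _SCHEMA_LINE.match(line)
        if match:
            version, hosts = match.groups()
            versions[version] = [h.strip() for h in hosts.split(",") if h.strip()]
    return versions


def has_schema_disagreement(versions: Dict[str, List[str]]) -> bool:
    """True when reachable nodes report more than one schema version."""
    reachable = [v for v in versions if v != "UNREACHABLE"]
    return len(reachable) > 1


if __name__ == "__main__":
    # Made-up sample text, not output captured from a real cluster.
    sample = (
        "Cluster Information:\n"
        "\tName: Example Cluster\n"
        "\tSchema versions:\n"
        "\t\t86afa796-d883-3932-aa73-6b017cef0d19: [192.0.2.1, 192.0.2.2]\n"
        "\t\t11111111-2222-3333-4444-555555555555: [192.0.2.3]\n"
    )
    found = schema_versions(sample)
    for version, hosts in found.items():
        print(version, hosts)
    print("schema disagreement:", has_schema_disagreement(found))
```

In practice the input text would come from running `nodetool describecluster` on one of the nodes; a single reachable version means the nodes agree on the schema.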

Re: Unable to repair a node

2011-08-19 Thread Peter Schuller
> Somewhere I remember discussions about issues with the merkle tree range > splitting or some such that resulted in repair always thinking a little bit > of data was out of sync. https://issues.apache.org/jira/browse/CASSANDRA-2324 - fixed for early 0.8. I don't *think* there's a known open bug t

Re: Unable to repair a node

2011-08-19 Thread Peter Schuller
> I've now run 7 repairs in a row on this keyspace and every single one has > finished successfully but performed streams between all nodes. This keyspace > was written to over the course of several weeks, sometimes with How much data is streamed, do you know? Mainly interesting is if there is a

Re: Unable to repair a node

2011-08-18 Thread aaron morton
Somewhere I remember discussions about issues with the merkle tree range splitting or some such that resulted in repair always thinking a little bit of data was out of sync. If you want to get a better idea about what's been transferred turn the logging up to DEBUG or turn it up just for org.ap

Re: Unable to repair a node

2011-08-17 Thread Philippe
I have a smallish keyspace on my 3 node, RF=3 cluster. My cluster has no read/write traffic while I am testing repairs. I am running the 0.8.4 Debian packages on Ubuntu. I've now run 7 repairs in a row on this keyspace and every single one has finished successfully but performed streams between al

Re: Unable to repair a node

2011-08-16 Thread Philippe
> > ctrl-c will not stop the repair. > Ok, so that's why I've been seeing logs of repairs on other CFs That's probably the 2280 issue. Data from all CF's is streamed over > Ah, I get it now. Thanks > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorto

Re: Unable to repair a node

2011-08-16 Thread aaron morton
ctrl-c will not stop the repair. You can kind of check things by looking at netstat compactionstats, that will just tell you if there are compactions backing up. Not necessarily if they are validation compactions used during repairs. You can trawl the logs to look for messages from the AntiEntropy

Re: Unable to repair a node

2011-08-16 Thread Philippe
One last thought: what happens when you ctrl-c a nodetool repair? Does it stop the repair on the server? If not, then I think I have multiple repairs still running. Is there any way to check this? Thanks 2011/8/16 Philippe > Even more interesting behavior : a repair on a CF has consequences

Re: Unable to repair a node

2011-08-16 Thread Philippe
Thanks for the pointers, responses inline. On Tue, Aug 16, 2011 at 3:48 PM, Philippe wrote: > > I have been able to repair some small column families by issuing a repair > > [KS] [CF]. When testing on the ring with no writes at all, it still takes > > about 2 repairs to get "consistent" logs for

Re: Unable to repair a node

2011-08-16 Thread Philippe
Even more interesting behavior: a repair on a CF has consequences on other CFs. I didn't expect that. There are no writes being issued to the cluster yet the logs indicate that - SSTableReader has opened dozens and dozens of files, most of them unrelated to the CF being repaired - compa

Re: Unable to repair a node

2011-08-16 Thread Jonathan Ellis
On Tue, Aug 16, 2011 at 3:48 PM, Philippe wrote: > I have been able to repair some small column families by issuing a repair > [KS] [CF]. When testing on the ring with no writes at all, it still takes > about 2 repairs to get "consistent" logs for all AES requests. I think I linked these in anoth

Re: Unable to repair a node

2011-08-16 Thread Philippe
I'm still trying different stuff. Here are my latest findings, maybe someone will find them useful: - I have been able to repair some small column families by issuing a repair [KS] [CF]. When testing on the ring with no writes at all, it still takes about 2 repairs to get "consistent" log

Re: Unable to repair a node

2011-08-14 Thread Philippe
@Teijo : thanks for the procedure, I hope I won't have to do that Peter, I'll answer inline. Thanks for the detailed answer. > > the number of SSTables for some keyspaces goes dramatically up (from 3 or > 4 > > to several dozens). > > Typically with a long running compaction, such as that trigge

Re: Unable to repair a node

2011-08-14 Thread Teijo Holzer
Forgot to mention, you want to check the following in cassandra.yaml on the node that you bootstrap before you initiate the bootstrap: * Ensure that the initial_token is set to the correct value (see nodetool) * Ensure that the seeds list doesn't contain the IP of the node you are trying to boo

Re: Unable to repair a node

2011-08-14 Thread Peter Schuller
> oh i know you can run rf 3 on a 3 node cluster. more i thought that if you > have one fail you have less nodes than the rf, so the cluster is at less > than rf, and writes might be disabled or something like that, while at 4 you > still have met the rf... A node failing is independent of RF. *De

Re: Unable to repair a node

2011-08-14 Thread Peter Schuller
Sorry about the lack of response to your actual issue. I'm afraid I don't have an exhaustive analysis, but some quick notes: > balanced ring but the other nodes are at 60GB. Each repair basically > generates thousands of pending compactions of various types (SSTable build, > minor, major & validat

Re: Unable to repair a node

2011-08-14 Thread Teijo Holzer
Hi, I took the following steps to get a node that refused to repair back under control. WARNING: This resulted in some data loss for us, YMMV with your replication factor * Turn off all row & key caches via cassandra-cli * Set "disk_access_mode: standard" in cassandra.yaml * Kill Cassandra on

Re: Unable to repair a node

2011-08-14 Thread Philippe
No, it depends on the consistency level. It's different: for example, QUORUM = 2 for RF=3. Anyway, does anyone have an answer to my real issue? Thanks 2011/8/14 Stephen Connolly > oh i know you can run rf 3 on a 3 node cluster. more i thought that if you > have one fail you have less nodes than the
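
The "QUORUM = 2 for RF=3" figure above follows from the usual quorum rule, floor(RF / 2) + 1. A minimal Python sketch, with hypothetical helper names, showing why a 3-node, RF=3 cluster that loses one node can still serve QUORUM requests but not requests that need all replicas:

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def can_satisfy(required_replicas: int, live_replicas: int) -> bool:
    """True when enough replicas are alive to meet the requirement."""
    return live_replicas >= required_replicas


if __name__ == "__main__":
    rf = 3
    live = 2  # one of the three replicas is down
    print("QUORUM needs", quorum(rf), "replicas")
    print("QUORUM possible:", can_satisfy(quorum(rf), live))
    print("ALL possible:", can_satisfy(rf, live))
```

Availability therefore depends on how many replicas the chosen consistency level requires, not on the node count matching RF.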

Re: Unable to repair a node

2011-08-14 Thread Stephen Connolly
oh i know you can run rf 3 on a 3 node cluster. more i thought that if you have one fail you have less nodes than the rf, so the cluster is at less than rf, and writes might be disabled or something like that, while at 4 you still have met the rf... - Stephen --- Sent from my Android phone, so ra

Re: Unable to repair a node

2011-08-14 Thread Philippe
5 hours later, the number of pending compactions shot up to 8k as usual, the number of SSTables for another keyspace shot up to 160 (from 4). At 4pm, a daily cron job that runs repair starts on that same node and all of a sudden, the number of pending compactions went down to 4k and the number of

Re: Unable to repair a node

2011-08-14 Thread Peter Schuller
> i am always wondering why people run clusters with number of nodes == rf > > i thought you needed to have number of nodes > rf to have any sensible > behaviour... but i am no expert at all No. The only requirement is that the number of nodes be >= RF, since clearly in a cluster with fewer nodes

Re: Unable to repair a node

2011-08-14 Thread Stephen Connolly
i am always wondering why people run clusters with number of nodes == rf i thought you needed to have number of nodes > rf to have any sensible behaviour... but i am no expert at all - Stephen --- Sent from my Android phone, so random spelling mistakes, random nonsense words and other nonsense a