Thanks, Jeff, for your response.
Do you see any risk in the following approach?
1. Stop the node.
2. Remove all sstable files from
/var/lib/cassandra/data/keyspace/tablename-23dfadf32adf33d33s333s33s3s33
directory.
3. Start the node.
4. Run full repair on this particular table
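A minimal sketch of those four steps with standard tooling; the service name,
paths and keyspace/table names below are placeholders, not taken from this thread:

# 1. stop the node
sudo systemctl stop cassandra
# 2. remove the SSTable files for just this table
sudo rm -f /var/lib/cassandra/data/<keyspace>/<table>-<table_id>/*
# 3. start the node again
sudo systemctl start cassandra
# 4. run a full repair on just this table
nodetool repair -full <keyspace> <table>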
I wanted to go thi
Agree this is both strictly possible and more common with LCS. The only
thing that's strictly correct to do is treat every corrupt sstable
exception as a failed host, and replace it just like you would a failed
host.
On Thu, Feb 13, 2020 at 10:55 PM manish khandelwal <
manishkhandelwa...@gmail.co
Thanks Erick
I would like to explain how data resurrection can take place with a single
SSTable deletion.
Consider this case of a table with Leveled Compaction Strategy:
1. Data A is written a long time back.
2. Data A is deleted and a tombstone is created.
3. After GC grace, the tombstone is purgeable.
4. No
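A minimal sketch of that timeline; the keyspace/table names, replication
factor and gc_grace value are assumptions, not taken from this thread:

cqlsh <<'CQL'
CREATE KEYSPACE IF NOT EXISTS ks
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};
CREATE TABLE IF NOT EXISTS ks.t (id int PRIMARY KEY, val text)
    WITH compaction = {'class': 'LeveledCompactionStrategy'}
    AND gc_grace_seconds = 864000;              -- 10 days (the default)
INSERT INTO ks.t (id, val) VALUES (1, 'A');     -- "Data A"; eventually flushed into an (older) SSTable
DELETE FROM ks.t WHERE id = 1;                  -- tombstone lands in a newer SSTable
CQL
# once gc_grace_seconds have elapsed the tombstone is purgeable; if the SSTable
# holding it goes away (deleted or compacted out) while an older SSTable still
# holds 'A', the deletion is forgotten and 'A' can come back.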
The log shows that the problem occurs when decompressing the SSTable,
but there's not much actionable info from it.
> I would like to know what the "ordinary hammer" would be in this case. Do
> you want to suggest that deleting only the corrupt sstable file (in this
> case mc-1234-big-*.db) would be sufficient?
Hi Erick
Thanks for your quick response. I have attached the full stacktrace, which
shows the exception during the validation phase of the table repair.
I would like to know what the "ordinary hammer" would be in this case. Do
you want to suggest that deleting only the corrupt sstable file (in this
case mc-1234-big-*.db) would be sufficient?
Feels that way and most people don’t do it, but definitely required for strict
correctness.
> On Feb 13, 2020, at 8:57 PM, Erick Ramirez wrote:
>
>
> Interesting... though it feels a bit extreme unless you're dealing with a
> cluster that's constantly dropping mutations. In which case, you
It will achieve the outcome you are after but I doubt anyone would
recommend that approach. It's like using a sledgehammer when an ordinary
hammer would suffice. And if you were hitting some bug then you'd run into
the same problem anyway.
Can you post the full stack trace? It might provide us som
Hi Eric
Thanks for reply.
The reason for the corruption is unknown to me. I just found the corrupt
table when a scheduled repair failed with logs showing:
ERROR [ValidationExecutor:16] 2020-01-21 19:13:18,123
CassandraDaemon.java:228 - Exception in thread
Thread[ValidationExecutor:16,1,main]org.apach
Interesting... though it feels a bit extreme unless you're dealing with a
cluster that's constantly dropping mutations. In which case, you have
bigger problems anyway. :)
Option 1 is only strictly safe if you run repair while the down replica is
down (otherwise you violate quorum consistency guarantees).
Option 2 is probably easier to manage and won't require any special effort
to avoid violating consistency.
I'd probably go with option 2.
On Thu, Feb 13, 2020 at
Thank you for the advice!
Best!
Sergio
On Thu, Feb 13, 2020, 7:44 PM Erick Ramirez
wrote:
> Option 1 is a cheaper option because the cluster doesn't need to rebalance
> (with the loss of a replica) post-decommission then rebalance again when
> you add a new node.
>
> The hints directory on EB
Not a problem. And I've just responded on the new thread. Cheers! 👍
>
Option 1 is a cheaper option because the cluster doesn't need to rebalance
(with the loss of a replica) post-decommission then rebalance again when
you add a new node.
The hints directory on EBS is irrelevant because it would only contain
mutations to replay to down replicas if the node was a coor
Thank you very much for this helpful information!
I opened a new thread for the other question :)
Sergio
Il giorno gio 13 feb 2020 alle ore 19:22 Erick Ramirez <
erick.rami...@datastax.com> ha scritto:
> I want to have more than one seed node in each DC, so unless I don't
>> restart the node af
>
> I want to have more than one seed node in each DC, but unless I restart
> the node after changing the seed_list on that node, it will not become a
> seed.
That's not really going to hurt you if you have other seeds in other DCs.
But if you're willing to take the hit from the restart the
We have i3.xlarge instances with the data directory on an ephemeral XFS
filesystem, and *hints*, *commit_log* and *saved_caches* on an EBS volume.
Whenever AWS is going to retire the instance due to degraded hardware
performance, is it better:
Option 1)
- Nodetool drain
- Stop cassandra
Right now yes I have one seed per DC.
I want to have more than one seed node in each DC, but unless I restart
the node after changing the seed_list on that node, it will not become a
seed.
Do I need to update the seed_list across all the nodes, even in separate DCs,
and perform a rolling restart?
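(For reference, the mechanics of such a rolling change usually look something
like the sketch below; the yaml path, service name and IPs are assumptions,
and this isn't a recommendation on whether the change is needed.)

# on each node, one node at a time:
# 1. update the seeds entry in cassandra.yaml, e.g.
#      seed_provider: ... seeds: "10.0.1.10,10.0.2.10"
sudo vi /etc/cassandra/cassandra.yaml
# 2. drain and restart so the node picks up the new seed list
nodetool drain && sudo systemctl restart cassandra
# 3. wait until the node is back in UN before touching the next one
nodetool status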
>
> 1) If I don't restart the node after changing the seed list, it will
> never become a seed, and I would like to be sure that I don't find myself
> in a spot where I have no seed nodes, which would mean I cannot add a
> node to the cluster.
Are you saying you only have 1 seed node in t
Thank you very much for your response!
2 things:
1) If I don't restart the node after changing the seed list, it will never
become a seed, and I would like to be sure that I don't find myself in a
spot where I have no seed nodes, which would mean I cannot add a node to
the cluster.
2) We
>
> I decommissioned this node and did all the steps mentioned except
> the -Dcassandra.replace_address, and now it is streaming correctly!
That works too but I was trying to avoid the rebalance operations (like
streaming to restore replica counts) since they can be expensive.
So basically
I decommissioned this node and did all the steps mentioned except
the -Dcassandra.replace_address, and now it is streaming correctly!
So basically, if I want this new node as a seed, should I add its IP address
after it has joined the cluster and after:
- nodetool drain
- restart cassandra?
I deact
>
> Should I do something to fix it or leave it as is?
It depends on what your intentions are. I would use the "replace" method to
build it correctly. At a high level:
- remove the IP from its own seeds list
- delete the contents of data, commitlog and saved_caches
- add the replace flag in cassand
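A rough sketch of those high-level steps on the node being rebuilt; the paths,
service name and IP are placeholders, and where the flag goes (cassandra-env.sh
vs jvm.options) depends on how the node is installed:

sudo systemctl stop cassandra
# make sure this node's own IP is NOT in the seeds list in cassandra.yaml
sudo rm -rf /var/lib/cassandra/data/* \
            /var/lib/cassandra/commitlog/* \
            /var/lib/cassandra/saved_caches/*
# example placement of the replace flag in cassandra-env.sh
echo 'JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<ip_of_node_being_replaced>"' \
    | sudo tee -a /etc/cassandra/cassandra-env.sh
sudo systemctl start cassandra
# remove the flag again once the replacement has finished streaming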
Thanks for your fast reply!
No repairs are running!
https://cassandra.apache.org/doc/latest/faq/index.html#does-single-seed-mean-single-point-of-failure
I added the node's own IP and the IPs of the existing seeds, and then I
started Cassandra.
So the right procedure is not to add the new node to the seed list?
>
> I wanted to add a new node to the cluster and it looks to be working fine,
> but instead of waiting 2-3 hours while it streamed roughly 100GB of data, it
> immediately went to the UN (UP and NORMAL) state.
>
Are you running a repair? I can't see how it's possibly receiving 100GB
since it won't bootstrap.
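A couple of quick ways to see what the node is actually doing (the log path is
an assumption for a package install):

nodetool netstats      # any active streaming sessions?
nodetool status        # UN state and load per node
grep -iE 'joining|bootstrap' /var/log/cassandra/system.log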
Should I do something to fix it or leave it as is?
On Thu, Feb 13, 2020, 5:29 PM Jon Haddad wrote:
> Seeds don't bootstrap, don't list new nodes as seeds.
>
> On Thu, Feb 13, 2020 at 5:23 PM Sergio wrote:
>
>> Hi guys!
>>
>> I don't know how but this is the first time that I see such behavior. I
>
Seeds don't bootstrap, don't list new nodes as seeds.
On Thu, Feb 13, 2020 at 5:23 PM Sergio wrote:
> Hi guys!
>
> I don't know how but this is the first time that I see such behavior. I
> wanted to add a new node in the cluster and it looks to be working fine but
> instead to wait for 2-3 hours
Hi guys!
I don't know how, but this is the first time that I see such behavior. I
wanted to add a new node to the cluster and it looks to be working fine, but
instead of waiting 2-3 hours while it streamed roughly 100GB of data, it
immediately went to the UN (UP and NORMAL) state.
I saw a bunch of exceptions in t
Paul, if you do a sstabledump in C* 3.0 (before upgrading) and compare it
to the dump output after upgrading to C* 3.11 then you will see that the
cell names in the outputs are different. This is the symptom of the broken
serialization header which leads to various exceptions during compactions
and
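A sketch of that comparison; the SSTable filename below is just an example:

# on 3.0, before the upgrade
sstabledump mc-1234-big-Data.db > before.json
# ...upgrade the node to 3.11...
sstabledump mc-1234-big-Data.db > after.json
diff before.json after.json   # differing cell names point at the broken header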
You need to stop C* in order to run the offline sstable scrub utility.
That's why it's referred to as "offline". :)
Do you have any idea on what caused the corruption? It's highly unusual
that you're thinking of removing all the files for just one table.
Typically if the corruption was a result of
- Verify that nodetool upgradesstables has completed successfully on all
nodes from any previous upgrade
- Turn off repairs and any other streaming operations (add/remove nodes)
- Nodetool drain on the node that needs to be stopped (seeds first,
preferably)
- Stop an un-upgraded n
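For each node, the drain/stop part of that checklist typically looks something
like this (service name and upgrade mechanism are assumptions):

nodetool drain                 # flush memtables, stop accepting writes
sudo systemctl stop cassandra
# upgrade the Cassandra binaries (package manager / tarball), then:
sudo systemctl start cassandra
nodetool upgradesstables       # rewrite SSTables in the new format once up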
Hi
I see a corrupt SSTable in one of my keyspace tables on one node. The cluster
has 3 nodes with replication factor 3. The Cassandra version is 3.11.2.
I am thinking along the following lines to resolve the corrupt SSTable issue:
1. Run nodetool scrub.
2. If step 1 fails, run the offline sstablescrub.
3. If step 2 fails,
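For steps 1 and 2, the standard tooling looks roughly like this (keyspace and
table names are placeholders):

# step 1: online scrub, node stays up
nodetool scrub <keyspace> <table>
# step 2: offline scrub, Cassandra must be stopped first
sudo systemctl stop cassandra
sstablescrub <keyspace> <table>
sudo systemctl start cassandra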
thank you
On Thu, Feb 13, 2020 at 6:30 AM Durity, Sean R
wrote:
> I will just add-on that I usually reserve security changes as the primary
> exception where app downtime may be necessary with Cassandra. (DSE has some
> Transitional tools that are useful, though.) Sometimes a short outage is
> p
Since ping is ICMP, not TCP, you probably want to investigate a mix of TCP and
CPU stats to see what is behind the slow pings. I’d guess you are getting
network impacts beyond what the ping times are hinting at. ICMP isn’t subject
to retransmission, so your TCP situation could be far worse than
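A few hedged starting points for that digging, assuming a Linux host:

netstat -s | grep -iE 'retrans|segments'   # TCP retransmission counters
ss -ti                                     # per-connection RTT, cwnd, retransmits
mpstat -P ALL 1 5                          # per-core CPU usage
sar -n DEV 1 5                             # NIC throughput and errors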
Hi all,
I have looked at the release notes for the upcoming release 3.11.6 and seen
the part about corruption of frozen UDT types during upgrades from 3.0.
We have a number of clusters using UDTs and have been upgrading to 3.11.4 and
haven't noticed any problems.
In the ticket (CASSANDRA-15035
+1 on nodetool drain. I added that to our upgrade automation and it really
helps with post-upgrade start-up time.
Sean Durity
From: Erick Ramirez
Sent: Wednesday, February 12, 2020 10:29 PM
To: user@cassandra.apache.org
Subject: Re: [EXTERNAL] Cassandra 3.11.X upgrades
Yes to the steps. The on
I will just add-on that I usually reserve security changes as the primary
exception where app downtime may be necessary with Cassandra. (DSE has some
Transitional tools that are useful, though.) Sometimes a short outage is
preferred over a longer, more-complicated attempt to keep the app up. And
>
> Last question: In all your experiences, how high can the latency (simple
> ping response times) go before it becomes a problem? (Obviously the lower
> the better, but is there some sort of cut-off/formula where problems can be
> expected intermittently, like the connection resets?)
Unfortunately