I think it does on DROP KEYSPACE. We had a recent enough snapshot, so it wasn't a big deal to recover. However, we didn't have a snapshot from when the keyspace disappeared.
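For context on the snapshot behaviour discussed here: Cassandra's `auto_snapshot` option in cassandra.yaml (enabled by default) takes a snapshot before a table is truncated or dropped, which also covers the tables of a dropped keyspace. A sketch of checking for such snapshots; this is a CLI fragment that needs a live node, and `mykeyspace`/`mytable` are placeholder names, not ones from this thread:

```shell
# cassandra.yaml: take a snapshot before TRUNCATE/DROP (default: true)
#   auto_snapshot: true

# List snapshots present on this node (names vary by version/operation)
nodetool listsnapshots

# Recovery amounts to recreating the schema, copying the snapshot's
# SSTables back under the table's data directory, then loading them:
# cp /var/lib/cassandra/data/mykeyspace/mytable-*/snapshots/<snapshot>/* \
#    /var/lib/cassandra/data/mykeyspace/mytable-*/
nodetool refresh mykeyspace mytable
```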
@Romain: I believe you are correct about reliability. We just had a "repair --full" fail and lock up the CPU of one of the nodes at 100%. This occurred on a fairly new keyspace that only has writes. We are also now receiving a very high percentage of read timeouts. ... might be time to rebuild the cluster.

On Fri, Mar 3, 2017 at 2:34 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>
> On Fri, Mar 3, 2017 at 7:56 AM, Romain Hardouin <romainh...@yahoo.fr> wrote:
>
>> I suspect a lack of 3.x reliability. Cassandra could have given up with
>> dropped messages, but not with a "drop keyspace". I mean, I have already
>> seen Spark jobs with too many executors produce a high load average on a
>> DC. I have seen a C* node with a 1 min. load average of 140 that could
>> still serve a P99 read latency of 40 ms. But I have never seen a
>> disappearing keyspace. There are old tickets regarding C* 1.x, but as far
>> as I remember those were due to a create/drop/create keyspace sequence.
>>
>> On Friday, March 3, 2017 at 1:44 PM, George Webster <webste...@gmail.com> wrote:
>>
>> Thank you for your reply, and good to know about the debug statement. I
>> haven't
>>
>> We never dropped or re-created the keyspace before. We haven't even
>> performed writes to that keyspace in months. I also checked the
>> permissions of the Apache user; it had read-only access.
>>
>> Unfortunately, I reverted from a backup recently, so I cannot say for
>> sure anymore whether I saw anything in the system tables before the
>> revert.
>>
>> Anyway, hopefully it was just a fluke. We have some crazy ML libraries
>> running on it; maybe Cassandra just gave up? Oh well, Cassandra is a
>> champ and we haven't really had issues with it before.
>>
>> On Thu, Mar 2, 2017 at 6:51 PM, Romain Hardouin <romainh...@yahoo.fr> wrote:
>>
>> Did you inspect the system tables to see if there are any traces of your
>> keyspace? Did you ever drop and re-create this keyspace before that?
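On Romain's suggestion to inspect the system tables: in Cassandra 3.x, schema metadata lives in the `system_schema` keyspace. A quick way to check whether any trace of the keyspace survives; this is a CLI fragment that needs a live node, and `mykeyspace` is a placeholder name:

```shell
# List all keyspaces the node currently knows about
cqlsh -e "SELECT keyspace_name FROM system_schema.keyspaces;"

# Check whether any table definitions remain for the missing keyspace
cqlsh -e "SELECT keyspace_name, table_name FROM system_schema.tables \
          WHERE keyspace_name = 'mykeyspace';"
```

Since schema is replicated, it can be worth running this against each node to see whether they disagree.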
>>
>> The lines appear in the debug log because the failure-detector interval
>> is > 2 seconds (the logged values are in nanoseconds). You can override
>> the intervals via the -Dcassandra.fd_initial_value_ms and
>> -Dcassandra.fd_max_interval_ms properties. Are you sure you didn't have
>> these lines in the debug logs before? I used to see them a lot before
>> increasing the intervals to 4 seconds.
>>
>> Best,
>>
>> Romain
>>
>> On Tuesday, February 28, 2017 at 6:25 PM, George Webster <webste...@gmail.com> wrote:
>>
>> Hey Cassandra Users,
>>
>> We recently encountered an issue where a keyspace just disappeared. I was
>> curious if anyone has had this occur before and can provide some insight.
>>
>> We are using Cassandra 3.10, with 2 DCs of 3 nodes each. The data is
>> still located in the storage folder, but the keyspace is no longer
>> visible in Cassandra.
>>
>> I searched the logs for any hints of errors or commands being executed
>> that could have caused the loss of a keyspace. Unfortunately, I found
>> nothing. The only unusual thing I saw in the logs was a series of read
>> timeouts that occurred right around when the keyspace went away.
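The failure-detector properties Romain mentions are plain JVM system properties, so they can be appended to the startup options (e.g. in cassandra-env.sh). A minimal config fragment, assuming you want the 4-second values Romain settled on; the property names are taken straight from the thread, the values are in milliseconds:

```shell
# Raise the FailureDetector interval thresholds from the 2 s default to
# 4 s, silencing the "Ignoring interval time" debug lines for intervals
# under that bound.
JVM_OPTS="$JVM_OPTS -Dcassandra.fd_initial_value_ms=4000"
JVM_OPTS="$JVM_OPTS -Dcassandra.fd_max_interval_ms=4000"
```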
>> Since then I see numerous entries in the debug log like the following:
>>
>> DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 - Ignoring interval time of 2155674599 for /x.x.x..12
>> DEBUG [GossipStage:1] 2017-02-28 18:14:16,580 FailureDetector.java:457 - Ignoring interval time of 2945213745 for /x.x.x.81
>> DEBUG [GossipStage:1] 2017-02-28 18:14:19,590 FailureDetector.java:457 - Ignoring interval time of 2006530862 for /x.x.x..69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:27,434 FailureDetector.java:457 - Ignoring interval time of 3441841231 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:29,588 FailureDetector.java:457 - Ignoring interval time of 2153964846 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:33,582 FailureDetector.java:457 - Ignoring interval time of 2588593281 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:37,588 FailureDetector.java:457 - Ignoring interval time of 2005305693 for /x.x.x.69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:38,592 FailureDetector.java:457 - Ignoring interval time of 2009244850 for /x.x.x.82
>> DEBUG [GossipStage:1] 2017-02-28 18:14:43,584 FailureDetector.java:457 - Ignoring interval time of 2149192677 for /x.x.x.69
>> DEBUG [GossipStage:1] 2017-02-28 18:14:45,605 FailureDetector.java:457 - Ignoring interval time of 2021180918 for /x.x.x.85
>> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 - Ignoring interval time of 2436026101 for /x.x.x.81
>> DEBUG [GossipStage:1] 2017-02-28 18:14:46,432 FailureDetector.java:457 - Ignoring interval time of 2436187894 for /x.x.x.82
>>
>> During the time of the disappearing keyspace we had two concurrent
>> activities:
>> 1) A Spark job (via HDP 2.5.3 in YARN) that was performing a countByKey.
>> It was using the keyspace that disappeared. The operation crashed.
>> 2) We created a new keyspace to test out a schema.
>> The only "fancy" thing in that keyspace is a few materialized view
>> tables. Data was being loaded into that keyspace during the crash. The
>> load process was extracting information and then just writing to
>> Cassandra.
>>
>> Any ideas? Has anyone seen this before?
>>
>> Thanks,
>> George
>>
> Cassandra takes snapshots for certain events. Does this extend to DROP
> KEYSPACE commands? Maybe it should.
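As Romain notes, the "Ignoring interval time" values in the debug log above are in nanoseconds. A small sketch for converting them to seconds when eyeballing logs, using one of the lines from the thread (the /x.x.x..12 address is the poster's own obfuscation):

```shell
# Convert the nanosecond intervals in FailureDetector "Ignoring interval
# time" lines to seconds, for comparison against the 2 s default threshold.
awk '/Ignoring interval time/ { printf "%s %.2f s\n", $NF, $(NF-2)/1e9 }' <<'EOF'
DEBUG [GossipStage:1] 2017-02-28 18:14:12,580 FailureDetector.java:457 - Ignoring interval time of 2155674599 for /x.x.x..12
EOF
# Prints: /x.x.x..12 2.16 s
```

Every interval in the listing above is just over the 2-second bound, consistent with sustained load (the Spark job and the bulk load) slowing gossip processing rather than anything node-specific.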