If you always drain you won't have any commit logs.

On Thu, Nov 4, 2021 at 2:57 PM Elliott Sims <elli...@backblaze.com> wrote:
>
> To deal with this, I've just made a very small Bash script that looks at
> commitlog age, then set the script as an "ExecStartPre=" in systemd:
>
> if [[ -d '/opt/cassandra/data/data' && $(/usr/bin/find /opt/cassandra/data/commitlog/ -name 'CommitLog*.log' -mtime -8 | wc -l) -eq 0 ]]; then
>     >&2 echo "ERROR: precheck failed, Cassandra data too old"
>     exit 10
> fi
>
> First conditional is to reduce false-positives on brand new machines with
> no data.
> I suspect it'll false-positive if your writes are extremely rare (that is,
> basically read-only), but at that point you may not need it at all.
> (adjust as needed for your grace period and paths)
>
> On Thu, Nov 4, 2021 at 12:54 AM Berenguer Blasi <berenguerbl...@gmail.com>
> wrote:
>
> > Apologies, I missed Paulo's reply on my email client threading funnies...
> >
> > On 4/11/21 7:50, Berenguer Blasi wrote:
> > > What about an hourly heartbeat 'lastSeenAlive' timestamp? my 2cts.
> > >
> > > On 3/11/21 21:53, Stefan Miklosovic wrote:
> > >> Hi,
> > >>
> > >> We see a lot of cases out there when a node was down for longer than
> > >> the GC period and once that node is up there are a lot of zombie data
> > >> issues ... you know the story.
> > >>
> > >> We would like to implement some kind of a check which would detect
> > >> this so that node would not start in the first place so no issues
> > >> would be there at all and it would be up to operators to figure out
> > >> first what to do with it.
> > >>
> > >> There are a couple of ideas we were exploring with various pros and
> > >> cons and I would like to know what you think about them.
> > >>
> > >> 1) Register a shutdown hook on "drain". This is already there (1).
> > >> "drain" method is doing quite a lot of stuff and this is called on
> > >> shutdown so our idea is to write a timestamp to system.local into a
> > >> new column like "lastly_drained" or something like that and it would
> > >> be read on startup.
> > >>
> > >> The disadvantage of this approach, or all approaches via shutdown
> > >> hooks, is that it will react only to SIGTERM and SIGINT. If that
> > >> node is killed via SIGKILL, JVM just stops and there is basically
> > >> nothing we have any guarantee of that would leave some traces behind.
> > >>
> > >> If it is killed and that value is not overwritten, on the next startup
> > >> it might happen that it would be older than 10 days so it will falsely
> > >> evaluate it should not be started.
> > >>
> > >> 2) Doing this on startup, you would check how old all your sstables
> > >> and commit logs are, if no file was modified less than 10 days ago you
> > >> would abort start, there is a pretty big chance that your node did at
> > >> least something in 10 days, there does not need to be anything added
> > >> to system tables or similar and it would be just another StartupCheck.
> > >>
> > >> The disadvantage of this is that some dev clusters, for example, may
> > >> run more than 10 days and they are just sitting there doing absolutely
> > >> nothing at all, nobody interacts with them, nobody is repairing them,
> > >> they are just sitting there. So when nobody talks to these nodes, no
> > >> files are modified, right?
> > >>
> > >> It seems like there is not a silver bullet here, what is your opinion
> > >> on this?
> > >>
> > >> Regards
> > >>
> > >> (1)
> > >> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L786-L799
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
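For comparison with the commitlog-age precheck quoted above, here is a rough sketch of how the hourly 'lastSeenAlive' heartbeat idea could look as a pair of shell functions. Everything in it is an assumption for illustration: the file path, the function names, the exit codes, and the 10-day default are made up here and are not Cassandra features or anything agreed on in this thread.

```shell
#!/usr/bin/env bash
# Heartbeat sketch. HEARTBEAT_FILE, GRACE_SECONDS, and both function names
# are hypothetical; adjust to your own paths and grace period.

HEARTBEAT_FILE="${HEARTBEAT_FILE:-/var/lib/cassandra/last_seen_alive}"
GRACE_SECONDS="${GRACE_SECONDS:-864000}"   # 10 days, matching the thread's example

# Record a timestamp (epoch seconds). Writes to a temp file and renames it,
# so a crash mid-write does not leave a truncated heartbeat behind.
# Args: [file] [timestamp]
write_heartbeat() {
    local file="${1:-$HEARTBEAT_FILE}"
    local now="${2:-$(date +%s)}"
    printf '%s\n' "$now" > "${file}.tmp" && mv -f "${file}.tmp" "$file"
}

# Decide whether the node may start.
# Returns 0 if startup is allowed, 10 if the heartbeat is older than the
# grace period, 11 if the heartbeat file is unreadable as a number.
# Args: [file] [grace_seconds] [now]
check_heartbeat() {
    local file="${1:-$HEARTBEAT_FILE}"
    local grace="${2:-$GRACE_SECONDS}"
    local now="${3:-$(date +%s)}"
    local last

    # No heartbeat yet: assume a brand new machine and allow startup.
    [ -f "$file" ] || return 0

    last="$(cat "$file")"
    case "$last" in
        ''|*[!0-9]*)
            echo "ERROR: heartbeat file $file is not a timestamp" >&2
            return 11
            ;;
    esac

    if [ $((now - last)) -gt "$grace" ]; then
        echo "ERROR: precheck failed, node last seen alive more than ${grace}s ago" >&2
        return 10
    fi
    return 0
}
```

A wrapper script that calls check_heartbeat and exits with its return code could be hooked in as "ExecStartPre=" the same way as the script above. The writer has to run only while the node is actually up, for example hourly as `systemctl is-active --quiet cassandra && write_heartbeat` (the unit name "cassandra" is an assumption); otherwise the timestamp keeps advancing while the node is down. Compared to a shutdown hook, a SIGKILL costs at most about one heartbeat interval of accuracy, and compared to file mtimes it does not depend on the node receiving writes.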