How about a last_checkpoint (or better name) system.local column that is updated periodically (e.g. every minute) and on drain? This would give a lower bound on when the node was last live, without requiring an external marker file.
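
Roughly what I have in mind, as a minimal sketch rather than a patch. Everything here is hypothetical: the LastCheckpoint class, its method names, and the in-memory AtomicReference standing in for the system.local column are illustrations only, not existing Cassandra code. The threshold is passed in by the caller; the 10 days mentioned in the quoted message below matches the default gc_grace_seconds, but that setting is per-table, so a real check would have to decide which value to compare against.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical sketch; none of these names exist in Cassandra. */
final class LastCheckpoint {
    // Stand-in for the proposed system.local column (assumption: a real
    // implementation would persist this value to disk, not keep it in memory).
    private final AtomicReference<Instant> column = new AtomicReference<>();

    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "last-checkpoint");
                t.setDaemon(true); // never keep the JVM alive just for this
                return t;
            });

    /** Records "now" immediately and then once per period (must be positive). */
    void start(Duration period) {
        scheduler.scheduleAtFixedRate(this::touch, 0, period.toMillis(), TimeUnit.MILLISECONDS);
    }

    /** Overwrites the stored timestamp with the current time. */
    void touch() {
        column.set(Instant.now());
    }

    /** Called on drain: stop the periodic task and record a final timestamp. */
    void drain() {
        scheduler.shutdown();
        touch();
    }

    Instant lastCheckpoint() {
        return column.get();
    }

    /**
     * Startup check: true if the last recorded checkpoint is older than the
     * threshold. A missing value (fresh node, or first start after upgrade)
     * is treated as "no evidence", so the node is allowed to start.
     */
    static boolean downTooLong(Instant lastCheckpoint, Instant now, Duration threshold) {
        if (lastCheckpoint == null) {
            return false;
        }
        return Duration.between(lastCheckpoint, now).compareTo(threshold) > 0;
    }
}
```

After a SIGKILL the stored value is at most one period stale, so the check can overestimate the downtime by about a minute but should not report a false "down for 10 days" the way a drain-only timestamp can.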

On Wed, 3 Nov 2021 at 18:03 Stefan Miklosovic <stefan.mikloso...@instaclustr.com> wrote:

> The third option would be to have some thread running in the
> background "touching" some (empty) marker file, it is the most simple
> solution but I do not like the idea of this marker file, it feels
> dirty, but hey, while it would be opt-in feature for people knowing
> what they want, why not right ...
>
> On Wed, 3 Nov 2021 at 21:53, Stefan Miklosovic
> <stefan.mikloso...@instaclustr.com> wrote:
> >
> > Hi,
> >
> > We see a lot of cases out there when a node was down for longer than
> > the GC period and once that node is up there are a lot of zombie data
> > issues ... you know the story.
> >
> > We would like to implement some kind of a check which would detect
> > this so that node would not start in the first place so no issues
> > would be there at all and it would be up to operators to figure out
> > first what to do with it.
> >
> > There are a couple of ideas we were exploring with various pros and
> > cons and I would like to know what you think about them.
> >
> > 1) Register a shutdown hook on "drain". This is already there (1).
> > "drain" method is doing quite a lot of stuff and this is called on
> > shutdown so our idea is to write a timestamp to system.local into a
> > new column like "lastly_drained" or something like that and it would
> > be read on startup.
> >
> > The disadvantage of this approach, or all approaches via shutdown
> > hooks, is that it will only react only on SIGTERM and SIGINT. If that
> > node is killed via SIGKILL, JVM just stops and there is basically
> > nothing we have any guarantee of that would leave some traces behind.
> >
> > If it is killed and that value is not overwritten, on the next startup
> > it might happen that it would be older than 10 days so it will falsely
> > evaluate it should not be started.
> >
> > 2) Doing this on startup, you would check how old all your sstables
> > and commit logs are, if no file was modified less than 10 days ago you
> > would abort start, there is pretty big chance that your node did at
> > least something in 10 days, there does not need to be anything added
> > to system tables or similar and it would be just another StartupCheck.
> >
> > The disadvantage of this is that some dev clusters, for example, may
> > run more than 10 days and they are just sitting there doing absolutely
> > nothing at all, nobody interacts with them, nobody is repairing them,
> > they are just sitting there. So when nobody talks to these nodes, no
> > files are modified, right?
> >
> > It seems like there is not a silver bullet here, what is your opinion on this?
> >
> > Regards
> >
> > (1) https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/service/StorageService.java#L786-L799
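
For comparison, option 2 from the quoted message (aborting start when no sstable or commit log file was modified within the threshold) could be sketched as below. This is only an illustration under assumptions: NewestFileCheck and its methods are hypothetical names, the caller supplies the directories to scan, and nothing here uses the real StartupCheck interface.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;
import java.util.stream.Stream;

/** Hypothetical sketch of a file-age startup check; not Cassandra code. */
final class NewestFileCheck {

    /** Returns the most recent modification time of any regular file under the given roots. */
    static Optional<Instant> newestModification(List<Path> roots) throws IOException {
        Instant newest = null;
        for (Path root : roots) {
            if (!Files.isDirectory(root)) {
                continue; // a missing directory contributes no evidence
            }
            try (Stream<Path> files = Files.walk(root)) {
                Iterator<Path> it = files.iterator();
                while (it.hasNext()) {
                    Path p = it.next();
                    if (!Files.isRegularFile(p)) {
                        continue;
                    }
                    Instant modified = Files.getLastModifiedTime(p).toInstant();
                    if (newest == null || modified.isAfter(newest)) {
                        newest = modified;
                    }
                }
            }
        }
        return Optional.ofNullable(newest);
    }

    /** True if the newest file is older than the threshold; no files at all means "allow start". */
    static boolean shouldAbortStart(Optional<Instant> newest, Instant now, Duration threshold) {
        return newest
                .map(m -> Duration.between(m, now).compareTo(threshold) > 0)
                .orElse(false);
    }
}
```

The idle dev cluster concern raised in the quoted message applies to this sketch unchanged: if nothing on disk is modified for longer than the threshold, the check would refuse to start a node that was in fact running the whole time.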