Hi Bowen,

Yes, there are a large number of "Skipping delete of FINALIZED LocalSession" messages.
We have a script that repairs ranges, stepping through the complete token range in 5 days; this should create 1600 ranges over the 5 days. It runs commands like this (a sketch of this kind of driver script is included further down):

nodetool -h localhost -p 7199 repair -pr -st +09152150533683801432 -et +09154639946886262655

I am also seeing lots of "Auto deleting repair session LocalSession" messages - these seem to be deleting the rows with a repairedAt value more than 5 days old, so it seems like that part is working correctly, it is just taking 5 days to delete them.

Thanks

Paul

> On 24 Jan 2022, at 22:12, Bowen Song <bo...@bso.ng> wrote:
> 
> From the source code I've read, by default Cassandra will run a cleanup of
> the system.repairs table every 10 minutes; any row related to a repair that
> completed over 1 day ago will be automatically removed. I highly doubt
> that you have run 75,000 repairs in the 24 hours prior to shutting down that
> node, because that's nearly one repair every second.
> 
> Do you see any logs like these?
> 
> Auto failing timed out repair session...
> Skipping delete of FINALIZED LocalSession ... because it has not been
> superseded by a more recent session
> Skipping delete of LocalSession ... because it still contains sstables
> 
> They are the logs from the cleanup() method in
> https://github.com/apache/cassandra/blob/6709111ed007a54b3e42884853f89cabd38e4316/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java#L416
> which indicate that a record was not deleted during cleanup for one of a
> number of reasons.
> 
> On 24/01/2022 19:45, Paul Chandler wrote:
>> Hi Bowen,
>> 
>> Yes, there do seem to be a lot of rows; on one of the upgraded clusters
>> there are 75,000 rows.
>> 
>> I have been experimenting on a test cluster; this has about a 5 minute
>> pause, and around 15,000 rows.
>> 
>> If I clear the system.repairs table (by deleting the sstables) then this
>> does not pause at all, so that seems to fix the problem. However, I don't
>> really understand what the implications are of just removing that data.
>> 
>> Thanks
>> 
>> Paul
>> 
>>> On 24 Jan 2022, at 18:50, Bowen Song <bo...@bso.ng> wrote:
>>> 
>>> Hmm, interesting... Try "select * from system.repairs;" in cqlsh on a
>>> slow-starting node - do you get a lot of rows? This is the most obvious
>>> loop run (indirectly) by ActiveRepairService.start().
>>> 
>>> On 24/01/2022 13:30, Romain Anselin wrote:
>>>> 
>>>> Hi everyone,
>>>> 
>>>> We generated a JFR profile of the startup phase of Cassandra with Paul,
>>>> and it would appear that the time is spent in the ActiveRepairService
>>>> within the main thread (11 minutes of execution of the "main" thread in
>>>> his environment, vs 15s in mine). This was introduced in CASSANDRA-9143,
>>>> based on a "git blame" of the source code:
>>>> https://github.com/apache/cassandra/blob/6709111ed007a54b3e42884853f89cabd38e4316/src/java/org/apache/cassandra/service/CassandraDaemon.java#L381
>>>> 
>>>> That seems to match the gap we see in the logs, where the time is spent
>>>> just before the "Preloaded" statement, which comes just after that point
>>>> in the CassandraDaemon code.
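For reference, here is a minimal sketch of the kind of driver script Paul describes above. The slice arithmetic, pacing, and host details are assumptions, not his actual script; the nodetool invocation mirrors the command he quoted:

#!/usr/bin/env bash
# Walk the full Murmur3 token ring in 1600 subrange repairs, paced so that
# one complete pass takes roughly 5 days (numbers per Paul's description).

SLICES=1600
SLEEP=$(( 5 * 24 * 3600 / SLICES ))   # ~270s between slices -> ~5 days/pass
RING="18446744073709551616"           # 2^64, the width of the Murmur3 ring
ST="-9223372036854775808"             # minimum Murmur3 token

for i in $(seq 1 "$SLICES"); do
    # use bc: the ring width exceeds bash's signed 64-bit arithmetic
    ET=$(echo "$ST + $RING / $SLICES" | bc)
    [ "$i" -eq "$SLICES" ] && ET="9223372036854775807"   # close the last slice
    nodetool -h localhost -p 7199 repair -pr -st "$ST" -et "$ET"
    ST="$ET"
    sleep "$SLEEP"
done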
>>>> INFO [main] 2022-01-20 09:44:40,198 StorageService.java:830 - Token metadata: Normal Tokens:
>>>> ... <skipping non "main" thread logs> ...
>>>> WARN [Messaging-EventLoop-3-1] 2022-01-20 09:45:13,243 NoSpamLogger.java:95 - /IP1:7000->/IP1:7000-SMALL_MESSAGES-[no-channel] dropping message of type SCHEMA_VERSION_REQ whose timeout expired before reaching the network
>>>> INFO [main] 2022-01-20 09:55:01,134 QueryProcessor.java:150 - Preloaded 0 prepared statements
>>>> 
>>>> It remains to determine what that code is doing in more detail, and why
>>>> it is taking longer and longer on each startup.
>>>> 
>>>> We also exported the sstablemetadata from all sstables on one node, and
>>>> at this stage we can see we have 300 sstables out of 577 with "Repaired
>>>> at:" set:
>>>> 
>>>> cd /var/lib/cassandra/data
>>>> find . -name '*Data*' | while read datf; do echo ++++ $datf ++++; sudo -u cassandra sstablemetadata $datf; done >> ~/sstablemetadata.txt
>>>> cqlsh -e "paging off; select * from system.repairs" >> ~/repairs.out
>>>> 
>>>> $ egrep 'Repaired at: 1' sstablemetadata.txt | wc -l
>>>> 300
>>>> $ egrep 'Repaired at:' sstablemetadata.txt | wc -l
>>>> 577
>>>> 
>>>> More info to come.
>>>> 
>>>> Regards - Romain
>>>> 
>>>> On 19/01/2022 13:10, Paul Chandler wrote:
>>>>> Hi Bowen,
>>>>> 
>>>>> Thanks for the reply. These have been our normal shutdowns, so we do a
>>>>> nodetool drain before restarting the service, so I would have thought
>>>>> there should not be any commit logs to replay.
>>>>> 
>>>>> However, there are these messages for one commit log, but it looks like
>>>>> it finished quickly and correctly:
>>>>> 
>>>>> INFO [main] 2022-01-19 10:08:22,811 CommitLog.java:173 - Replaying /var/lib/cassandra/commitlog/CommitLog-7-1642094921295.log
>>>>> WARN [main] 2022-01-19 10:08:22,839 CommitLogReplayer.java:305 - Origin of 2 sstables is unknown or doesn't match the local node; commitLogIntervals for them were ignored
>>>>> ... repeated about 10 times ...
>>>>> WARN [main] 2022-01-19 10:08:22,842 CommitLogReplayer.java:305 - Origin of 3 sstables is unknown or doesn't match the local node; commitLogIntervals for them were ignored
>>>>> INFO [main] 2022-01-19 10:08:22,853 CommitLogReader.java:256 - Finished reading /var/lib/cassandra/commitlog/CommitLog-7-1642094921295.log
>>>>> INFO [main] 2022-01-19 10:08:22,882 CommitLog.java:175 - Log replay complete, 0 replayed mutations
>>>>> 
>>>>> Thanks
>>>>> 
>>>>> Paul
>>>>> 
>>>>>> On 19 Jan 2022, at 13:03, Bowen Song <bo...@bso.ng> wrote:
>>>>>> 
>>>>>> Nothing obvious from the logs you posted.
>>>>>> 
>>>>>> Generally speaking, replaying the commit log is often the culprit when
>>>>>> a node takes a long time to start. I have seen many nodes with large
>>>>>> memtable and commit log size limits spend over half an hour replaying
>>>>>> the commit log. I usually do a "nodetool flush" before shutting down
>>>>>> the node to help speed up the start time if the shutdown is planned.
>>>>>> There isn't much you can do about an unexpected shutdown, such as a
>>>>>> server crash; when that happens, the only reasonable thing to do is
>>>>>> wait for the commit log replay to finish. You should see log entries
>>>>>> related to replaying commit logs if this is the case.
>>>>>> 
>>>>>> However, if you don't find any logs related to replaying commit logs,
>>>>>> the cause may be completely different.
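A minimal sketch of the planned-shutdown sequence Bowen describes; the systemd unit name is an assumption for a typical package install:

nodetool flush                 # write memtables to sstables, so replay has little to do
nodetool drain                 # stop accepting writes and flush the commit log
sudo systemctl stop cassandra  # then stop the service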
>>>>>> On 19/01/2022 11:54, Paul Chandler wrote:
>>>>>>> Hi all,
>>>>>>> 
>>>>>>> We have upgraded a couple of clusters from 3.11.6, and now we are
>>>>>>> having issues when we restart the nodes.
>>>>>>> 
>>>>>>> A node will either hang or take 10-30 minutes to restart; these are
>>>>>>> the last messages we have in the system.log:
>>>>>>> 
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,267 FileUtils.java:545 - Deleting file during startup: /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-184-big-Summary.db
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,268 LogTransaction.java:240 - Unfinished transaction log, deleting /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-185-big-Data.db
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,268 FileUtils.java:545 - Deleting file during startup: /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-185-big-Summary.db
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,269 LogTransaction.java:240 - Unfinished transaction log, deleting /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-186-big-Data.db
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,270 FileUtils.java:545 - Deleting file during startup: /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-186-big-Summary.db
>>>>>>> INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,272 LogTransaction.java:240 - Unfinished transaction log, deleting /var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb_txn_unknowncompactiontype_bc501d00-790f-11ec-9f80-858854746758.log
>>>>>>> INFO [MemtableFlushWriter:2] 2022-01-19 10:08:23,289 LogTransaction.java:240 - Unfinished transaction log, deleting /var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb_txn_flush_bc52dc20-790f-11ec-9f80-858854746758.log
>>>>>>> 
>>>>>>> The debug log has messages from DiskBoundaryManager.java at the same
>>>>>>> time, then it just has the following messages:
>>>>>>> 
>>>>>>> DEBUG [ScheduledTasks:1] 2022-01-19 10:28:09,430 SSLFactory.java:354 - Checking whether certificates have been updated []
>>>>>>> DEBUG [ScheduledTasks:1] 2022-01-19 10:38:09,431 SSLFactory.java:354 - Checking whether certificates have been updated []
>>>>>>> DEBUG [ScheduledTasks:1] 2022-01-19 10:48:09,431 SSLFactory.java:354 - Checking whether certificates have been updated []
>>>>>>> DEBUG [ScheduledTasks:1] 2022-01-19 10:58:09,431 SSLFactory.java:354 - Checking whether certificates have been updated []
>>>>>>> 
>>>>>>> It seems to get worse after each restart, until it gets to the state
>>>>>>> where it just hangs, and then the only thing to do is to re-bootstrap
>>>>>>> the node.
>>>>>>> 
>>>>>>> Once I had re-bootstrapped all the nodes in the cluster, I thought the
>>>>>>> cluster was stable, but now one of the nodes is hanging again.
>>>>>>> 
>>>>>>> Does anyone have any ideas what is causing the problem?
>>>>>>> 
>>>>>>> Thanks
>>>>>>> 
>>>>>>> Paul Chandler
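Pulling the diagnostics from this thread together, a quick triage sketch for a slow-starting node; paths are package defaults, and the log patterns are taken from the messages quoted above:

# 1. How many rows will ActiveRepairService.start() have to iterate over?
cqlsh -e "paging off; select count(*) from system.repairs;"

# 2. Is the periodic cleanup task declining to delete finalized sessions?
grep -E 'Skipping delete of (FINALIZED )?LocalSession|Auto deleting repair session' \
    /var/log/cassandra/system.log | tail

# 3. How many sstables still carry incremental repair metadata?
find /var/lib/cassandra/data -name '*Data.db' \
    -exec sstablemetadata {} \; 2>/dev/null | grep -c 'Repaired at: 1'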