From the source code I've read, by default Cassandra runs a cleanup of
the system.repairs table every 10 minutes, and any row belonging to a
repair that completed more than 1 day ago is automatically removed.
I highly doubt that you ran 75,000 repairs in the 24 hours prior to
shutting down that node, because that's nearly one repair every second.
Do you see any logs like these?
* Auto failing timed out repair session...
* Skipping delete of FINALIZED LocalSession ... because it has not
been superseded by a more recent session
* Skipping delete of LocalSession ... because it still contains sstables
Those are the logs from the cleanup() method in
https://github.com/apache/cassandra/blob/6709111ed007a54b3e42884853f89cabd38e4316/src/java/org/apache/cassandra/repair/consistent/LocalSessions.java#L416
which indicate that a record was not deleted during the cleanup for one
of a number of reasons.
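For reference, the retention rule can be sketched like this (a minimal sketch of the deletion criterion only, not the actual Cassandra code; the real logic lives in LocalSessions.cleanup() and also checks the conditions in the log messages above):

```shell
# Sketch of the retention rule: a completed repair session row becomes
# eligible for deletion once it completed more than one day ago.
now=$(date +%s)
completed_at=$(( now - 2*86400 ))   # example: repair finished two days ago
cutoff=$(( now - 86400 ))           # one-day retention window
if [ "$completed_at" -lt "$cutoff" ]; then
  verdict=deletable
else
  verdict=kept
fi
echo "$verdict"
```

So a large row count after a shutdown suggests rows are surviving the cleanup for one of the reasons in those log messages, not that the repairs are recent.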
On 24/01/2022 19:45, Paul Chandler wrote:
Hi Bowen,
Yes, there does seem to be a lot of rows; on one of the upgraded
clusters there are 75,000 rows.
I have been experimenting on a test cluster; this has about a 5 minute
pause and around 15,000 rows.
If I clear the system.repairs table (by deleting the sstables) then
the startup does not pause at all, so that seems to fix the problem.
However I don’t really understand what the implications are of just
removing that data.
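Before (and after) clearing anything, it may be worth just counting the rows so the impact is measurable; a hedged sketch, assuming cqlsh can reach the local node:

```shell
# Count the system.repairs rows that startup will have to iterate over.
# Requires a running node reachable by cqlsh; run against each slow node.
cqlsh -e "paging off; SELECT COUNT(*) FROM system.repairs;"
```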
Thanks
Paul
On 24 Jan 2022, at 18:50, Bowen Song <bo...@bso.ng> wrote:
Hmm, interesting... Try "select * from system.repairs;" in cqlsh on a
slow-starting node. Do you get a lot of rows? This is the most
obvious loop run (indirectly) by ActiveRepairService.start().
On 24/01/2022 13:30, Romain Anselin wrote:
Hi everyone,
We generated a JFR profile of the startup phase of Cassandra with Paul, and it would appear that
the time is spent in ActiveRepairService within the main thread (11 min of execution of the
"main" thread in his environment, vs 15 s in mine), which was introduced in
CASSANDRA-9143 based on a git blame of the source code:
https://github.com/apache/cassandra/blob/6709111ed007a54b3e42884853f89cabd38e4316/src/java/org/apache/cassandra/service/CassandraDaemon.java#L381
That seems to match the gap we see in the logs, where the time is spent just before
the "Preloaded" statement, which comes just after that point in the
CassandraDaemon code.
INFO [main] 2022-01-20 09:44:40,198 StorageService.java:830 - Token metadata:
Normal Tokens:
... <skipping non "main" thread logs> ...
WARN [Messaging-EventLoop-3-1] 2022-01-20 09:45:13,243 NoSpamLogger.java:95 -
/IP1:7000->/IP1:7000-SMALL_MESSAGES-[no-channel] dropping message of type
SCHEMA_VERSION_REQ whose timeout expired before reaching the network
INFO [main] 2022-01-20 09:55:01,134 QueryProcessor.java:150 - Preloaded 0
prepared statements
It remains to determine what it is doing in more detail and why it's taking
longer and longer on startup.
We also exported the sstablemetadata from all sstables on one node, and at this stage we
can see we have 300 sstables out of 577 with "Repaired at:" set.
    cd /var/lib/cassandra/data
    find . -name '*Data*' | while read datf; do
      echo ++++ $datf ++++
      sudo -u cassandra sstablemetadata $datf
    done >> ~/sstablemetadata.txt
    cqlsh -e "paging off; select * from system.repairs" >> ~/repairs.out

    $ egrep 'Repaired at: 1' sstablemetadata.txt | wc -l
    300
    $ egrep 'Repaired at:' sstablemetadata.txt | wc -l
    577
More info to come
Regards - Romain
On 19/01/2022 13:10, Paul Chandler wrote:
Hi Bowen,
Thanks for the reply. These were our normal shutdowns, so we do a nodetool
drain before restarting the service, so I would have thought there should not
be any commit logs.
However, there are these messages for one commit log, but it looks like it
finished quickly and correctly:
INFO [main] 2022-01-19 10:08:22,811 CommitLog.java:173 - Replaying
/var/lib/cassandra/commitlog/CommitLog-7-1642094921295.log
WARN [main] 2022-01-19 10:08:22,839 CommitLogReplayer.java:305 - Origin of 2
sstables is unknown or doesn't match the local node; commitLogIntervals for
them were ignored
Repeated about 10 times
WARN [main] 2022-01-19 10:08:22,842 CommitLogReplayer.java:305 - Origin of 3
sstables is unknown or doesn't match the local node; commitLogIntervals for
them were ignored
INFO [main] 2022-01-19 10:08:22,853 CommitLogReader.java:256 - Finished
reading /var/lib/cassandra/commitlog/CommitLog-7-1642094921295.log
INFO [main] 2022-01-19 10:08:22,882 CommitLog.java:175 - Log replay complete,
0 replayed mutations
Thanks
Paul
On 19 Jan 2022, at 13:03, Bowen Song<bo...@bso.ng> wrote:
Nothing obvious from the logs you posted.
Generally speaking, replaying the commit log is often the culprit when a node takes a long
time to start. I have seen many nodes with large memtable and commit log size limits
spend over half an hour replaying the commit log. I usually do a "nodetool
flush" before shutting down the node to help speed up the start time if the shutdown
was planned. There isn't much you can do about an unexpected shutdown, such as a server
crash. When that happens, the only reasonable thing to do is wait for the commit log
replay to finish. You should see log entries related to replaying commit logs if this is
the case.
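For a planned restart, the sequence looks something like this (a sketch; the systemd unit name "cassandra" is an assumption, adjust to your setup):

```shell
# Flush memtables to sstables so little remains in the commit log,
# then drain: stop accepting writes and flush what's left.
nodetool flush
nodetool drain
# Stop the service (unit name is an assumption for this sketch).
sudo systemctl stop cassandra
```

A drained node should then report "0 replayed mutations" on the next start, as in your log below.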
However, if you don't find any logs related to replaying commit logs, the cause
may be completely different.
On 19/01/2022 11:54, Paul Chandler wrote:
Hi all,
We have upgraded a couple of clusters from 3.11.6, and now we are having issues
when we restart the nodes.
A node will either hang or take 10-30 minutes to restart; these are the last
messages we have in the system.log:
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,267 FileUtils.java:545 -
Deleting file during startup:
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-184-big-Summary.db
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,268 LogTransaction.java:240 -
Unfinished transaction log, deleting
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-185-big-Data.db
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,268 FileUtils.java:545 -
Deleting file during startup:
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-185-big-Summary.db
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,269 LogTransaction.java:240 -
Unfinished transaction log, deleting
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-186-big-Data.db
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,270 FileUtils.java:545 -
Deleting file during startup:
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb-186-big-Summary.db
INFO [NonPeriodicTasks:1] 2022-01-19 10:08:23,272 LogTransaction.java:240 -
Unfinished transaction log, deleting
/var/lib/cassandra/data/system/table_estimates-176c39cdb93d33a5a2188eb06a56f66e/nb_txn_unknowncompactiontype_bc501d00-790f-11ec-9f80-85
8854746758.log
INFO [MemtableFlushWriter:2] 2022-01-19 10:08:23,289 LogTransaction.java:240
- Unfinished transaction log, deleting
/var/lib/cassandra/data/system/local-7ad54392bcdd35a684174e047860b377/nb_txn_flush_bc52dc20-790f-11ec-9f80-858854746758.log
The debug log has messages from DiskBoundaryManager.java at the same time, then
it just has the following messages:
DEBUG [ScheduledTasks:1] 2022-01-19 10:28:09,430 SSLFactory.java:354 -
Checking whether certificates have been updated []
DEBUG [ScheduledTasks:1] 2022-01-19 10:38:09,431 SSLFactory.java:354 -
Checking whether certificates have been updated []
DEBUG [ScheduledTasks:1] 2022-01-19 10:48:09,431 SSLFactory.java:354 -
Checking whether certificates have been updated []
DEBUG [ScheduledTasks:1] 2022-01-19 10:58:09,431 SSLFactory.java:354 -
Checking whether certificates have been updated []
It seems to get worse after each restart, until it gets to the state where
it just hangs, and then the only thing to do is to re-bootstrap the node.
Once I had re-bootstrapped all the nodes in the cluster, I thought the cluster
was stable, but I have now got a case where one of the nodes is hanging
again.
Does anyone have any ideas what is causing the problems?
Thanks
Paul Chandler