Hi, all,
We had a failed HDD on one node, and that node was shut down pending repair.
Cassandra is now down on 4 other nodes and cannot start up because of errors
like the following. Could this be caused by the original stopped node?
ERROR [main] 2022-12-12 14:58:10,838 LogReplicaSet.java:145 - Mismatched line
in file
nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log: got
'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]'
expected
'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]',
giving up
ERROR [main] 2022-12-12 14:58:10,838 LogFile.java:161 - Failed to read records
for transaction log
[nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
ERROR [main] 2022-12-12 14:58:10,840 LogTransaction.java:551 - Unexpected disk
state: failed to read transaction log
[nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
Files and contents follow:
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
COMMIT:[,0,0][2613697770]
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]
***Does not match
<ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]>
in first replica file
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
COMMIT:[,0,0][2613697770]
ERROR [main] 2022-12-12 14:58:10,841 CassandraDaemon.java:911 - Cannot remove
temporary or obsoleted files for hades.prod_md5_sha1 due to a problem with
transaction log files. Please check records with problems in the log messages
above and fix them. Refer to the 3.0 upgrading instructions in NEWS.txt for a
description of transaction log files.
Running sstableutil only returned the same errors:
ERROR 15:35:52,217 Mismatched line in file
nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log: got
'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]'
expected
'ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]',
giving up
ERROR 15:35:52,219 Failed to read records for transaction log
[nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
ERROR 15:35:52,220 Unexpected disk state: failed to read transaction log
[nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log in
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c,
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c]
Files and contents follow:
/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
COMMIT:[,0,0][2613697770]
/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb_txn_anticompactionafterrepair_5865e530-7a18-11ed-950f-954f6819a607.log
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67416-big-,0,8][1963077611]
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67417-big-,0,8][3940068469]
***Does not match
<ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]>
in first replica file
ADD:[/mnt/data01/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67418-big-,0,8][2798461787]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67405-big-,1665045804823,8][1428695358]
REMOVE:[/mnt/data02/cassandra/data/hades/prod_md5_sha1-bb5bdca002b111edb9761fc3bb7c847c/nb-67402-big-,1665050002894,8][2407633150]
COMMIT:[,0,0][2613697770]
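To make the difference between the two replica copies of the transaction log easier to see, here is a minimal Python sketch that compares the records line by line. It only reports differences and does not modify anything. The function name `replica_differences` and the shortened sample records are my own for illustration (the real records carry the full /mnt/data01 and /mnt/data02 paths shown above). It does not validate the per-record checksums.

```python
import difflib


def replica_differences(first, second):
    """Compare two lists of transaction log records.

    Returns one (tag, only_in_first, only_in_second) tuple for every
    region where the two replicas disagree. The tag is one of
    'insert', 'delete' or 'replace', as produced by difflib.
    """
    matcher = difflib.SequenceMatcher(a=first, b=second, autojunk=False)
    differences = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            differences.append((tag, first[i1:i2], second[j1:j2]))
    return differences


# Sample records, abbreviated from the log output above (paths shortened).
data02_records = [
    "ADD:[nb-67416-big-,0,8][1963077611]",
    "ADD:[nb-67418-big-,0,8][2798461787]",
    "COMMIT:[,0,0][2613697770]",
]
data01_records = [
    "ADD:[nb-67416-big-,0,8][1963077611]",
    "ADD:[nb-67417-big-,0,8][3940068469]",
    "ADD:[nb-67418-big-,0,8][2798461787]",
    "COMMIT:[,0,0][2613697770]",
]

for tag, only_first, only_second in replica_differences(data02_records, data01_records):
    print(tag, "data02 only:", only_first, "data01 only:", only_second)
```

With the abbreviated records, the only difference reported is the extra nb-67417 ADD record in the data01 copy, which matches what the error output shows.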
Is there a simple and clean way to fix this? I.e., for the "got" and "expected"
lines, can I just add the "expected" line to the transaction log file and
remove the problem "got" line?