Full repair with -pr option getting stuck on Cassandra 4.0.10

2024-02-08 Thread manish khandelwal
In a two-datacenter cluster (11 nodes in each) we are seeing repair get
stuck. When a full repair with -pr is triggered on a particular keyspace, the
repair session is lost and Cassandra never returns for that session.
There are no "WARN" or "ERROR" entries in the Cassandra logs, and tpstats
shows no dropped messages. How can we debug such a scenario? What else
could we look into?

Regards
Manish


SStables stored in directory with different table ID than the one found in system_schema.tables

2024-02-08 Thread Michalis Kotsiouros (EXT) via user
Hello community,
I am running Cassandra 3.11.13 on SLES 12.5.
I have noticed the following errors in the logs:
Datacenter A
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
cfId d8c1bea0-82ed-11ee-8ac8-1513e17b60b1. If a table was just created, this is 
likely due to the schema not being fully propagated.  Please wait for schema 
agreement on table creation.
Datacenter B
org.apache.cassandra.db.UnknownColumnFamilyException: Couldn't find table for 
cfId 0fedabd0-11f7-11ea-9450-e3ff59b2496b. If a table was just created, this is 
likely due to the schema not being fully propagated.  Please wait for schema 
agreement on table creation.

This error results in failure of all streaming tasks.
I have checked the sstables directories and I see that:

In Datacenter A the sstables directory is:
-0fedabd0-11f7-11ea-9450-e3ff59b2496b

In Datacenter B the sstables directories are:
-0fedabd011f711ea9450e3ff59b2496b
-d8c1bea082ed11ee8ac81513e17b60b1
In this datacenter, although the -d8c1bea082ed11ee8ac81513e17b60b1
directory is more recent, it is empty and all sstables are stored under
-0fedabd011f711ea9450e3ff59b2496b

I have also checked the system_schema.tables in all Cassandra nodes and I see 
that for the specific table the ID is consistent across all nodes and it is:
d8c1bea0-82ed-11ee-8ac8-1513e17b60b1
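
The comparison described above can be sketched in Python. This is only an illustration, assuming the usual on-disk layout in which a table's data directory is named <table>-<ID as 32 hex characters without hyphens>; the table name "mytable" and the helper names are hypothetical and do not come from the cluster in question.

```python
import uuid


def dir_suffix(table_id: str) -> str:
    """Return the table ID as 32 hex characters (no hyphens)."""
    return uuid.UUID(table_id).hex


def classify_dirs(dir_names, schema_id):
    """Split directory names into those matching the schema ID and the rest."""
    expected = dir_suffix(schema_id)
    matching, other = [], []
    for name in dir_names:
        # Assumed form: <table>-<id>; take the part after the last '-'.
        suffix = name.rsplit("-", 1)[-1].strip()
        (matching if suffix == expected else other).append(name)
    return matching, other


# ID reported by system_schema.tables on all nodes
schema_id = "d8c1bea0-82ed-11ee-8ac8-1513e17b60b1"
# Directories seen in Datacenter B ("mytable" is a placeholder name)
dirs = [
    "mytable-0fedabd011f711ea9450e3ff59b2496b",
    "mytable-d8c1bea082ed11ee8ac81513e17b60b1",
]

matching, other = classify_dirs(dirs, schema_id)
print("matches schema ID:", matching)
print("different ID:", other)
```

The sketch only reports which directory carries the ID from system_schema.tables; it does not move or rename anything.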

So it seems that the schema is in a bit of a mess in all my datacenters. I am
less interested in understanding how it ended up in this state than in how
to recover.
Both datacenters seem to have this inconsistency between the ID stored in
system_schema.tables and the one used in the sstables directory.
Do you have any proposal on how to recover?
I have thought of renaming the directory from
-0fedabd011f711ea9450e3ff59b2496b to
-d8c1bea082ed11ee8ac81513e17b60b1, but it does not look safe and I would not want
to risk my data since this is a production system.

Thank you in advance.

BR
Michail Kotsiouros


RE: SStables stored in directory with different table ID than the one found in system_schema.tables

2024-02-08 Thread Michalis Kotsiouros (EXT) via user
Hello everyone,
I have found this post online, and it seems to be recent:
"Mismatch between Cassandra table uuid in linux file directory and
system_schema.tables" (Stack Overflow)
The description seems to match my problem.
In that post, the proposal is to copy the sstables to the directory with the ID found
in system_schema.tables. I think this is equivalent to my idea of renaming
the directories.
Has anyone seen this before? Do you consider those approaches safe?

BR
MK

From: Michalis Kotsiouros (EXT)
Sent: February 08, 2024 11:33
To: user@cassandra.apache.org
Subject: SStables stored in directory with different table ID than the one 
found in system_schema.tables


Regarding Cassandra 4 Support End time

2024-02-08 Thread ranju goel
Hi All,

According to the download page (https://cassandra.apache.org/_/download.html), Cassandra
4.0 will be maintained until the release of 5.1 (tentatively July 2024).
Since Cassandra 5 is yet to be released, can we expect support for Cassandra 4.0.x
to be extended? This information will help us plan our
upgrade.

Thanks & Regards
Ranju