Re: Issue replacing a dead node

2025-05-27 Thread Courtney
One last update: After kicking it some more, it finally fully joined the cluster. The server was rebooted a third time, and after that it eventually reached the UN state. I wish I had kept the link, but I had read that someone had a similar issue joining a node to a cluster with 4.1.x and the

Re: Issue replacing a dead node

2025-05-23 Thread Courtney
Some updates after getting back to this. I did hardware tests and could not find any hardware issues. Instead of trying a replace, I went the route of removing the dead node entirely and then adding in a new node. The new node is still joining, but I am hitting some oddities in the log. When

Re: Issue replacing a dead node

2025-05-16 Thread Sebastian Marsching
To add on to what Bowen already wrote, if you cannot find any reason in the logs at all, I would retry using different hardware. In the recent past I have seen two cases where strange Cassandra problems were actually caused by broken hardware (in both cases, a faulty memory module caused the

Re: Issue replacing a dead node

2025-05-16 Thread Courtney
y had open connections to the server still. The hardware is new. A disk issue already would be odd, but perhaps not surprising if that was the case. On 5/16/25 7:27 PM, Bowen Song via user wrote: In my experience, failed bootstrap / node replacement always leave some traces in the logs. At the

Re: Issue replacing a dead node

2025-05-16 Thread Bowen Song via user
anything that can cause the process to fail and doesn't leave a trace in the log. BTW, the relevant logs can be hours before the symptom becomes visible, because a failed streaming session does not cause Cassandra to immediately abort other active streaming sessions, and the remaining acti

Re: Issue replacing a dead node

2025-05-15 Thread Courtney
I checked all the logs and really couldn't find anything. I couldn't find any sort of errors in dmesg, system.log, debug.log, gc.log (maybe up the log level?), systemd journal...the logs are totally clean. It just stops gossiping all of a sudden at 22GB of data each time, then the

Re: Issue replacing a dead node

2025-05-15 Thread Bowen Song via user
logs, dmesg, systemd journal, etc., on the new node, and other nodes in the cluster too. Also, I would try `nodetool bootstrap resume` on the replacement node. On 12/05/2025 09:53, Courtney wrote: Hello everyone, I have a cluster with 2 datacenters. I am using GossipingPropertyFileSnitch

Issue replacing a dead node

2025-05-12 Thread Courtney
Hello everyone, I have a cluster with 2 datacenters. I am using GossipingPropertyFileSnitch as my endpoint snitch. Cassandra version 4.1.8. One datacenter is fully Ubuntu 24.04 and OpenJDK 11 and another is Ubuntu 20.04 on OpenJDK 8. A seed node died in my second DC running Ubuntu 20.04

Maximum number of tables in a cluster

2025-04-29 Thread Sébastien Rebecchi via user
Hello I've heard that it's considered good practice to keep a cluster under 200 tables. Is that still true with Cassandra 5? Best regards, Sébastien.

Re: Batch Queries Timeout When Private IP of a Node Fails in Multi-DC Cassandra 4.1.4

2025-03-25 Thread manish khandelwal
public interface, the affected node still appears up, leading to batch query timeouts. I would like to seek suggestions on the best approach to filter out such nodes (i.e., nodes with a down private interface) to prevent query disruptions. Some possible approaches I am considering are: 1. Identifyi

Batch Queries Timeout When Private IP of a Node Fails in Multi-DC Cassandra 4.1.4

2025-02-04 Thread manish khandelwal
Hi All I have a Cassandra 4.1.4 cluster with two data centers, each having 3 nodes. The configuration is: listen_address = private IP, broadcast_address = public IP, listen_on_broadcast_address = true, prefer_local = true *Issue Observed:* - We execute a multi-partition batch query with

Re: Migration Cassandra to a new data center

2024-11-05 Thread Bowen Song via user
Hinted hand off is a best effort approach, and relying on it alone is a bad idea. Hints can get lost due to a number of reasons, such as getting too old or too big, or the node storing the hints dies. You should rely on regular repair to guarantee the correctness of the data. You may use

Re: Migration Cassandra to a new data center

2024-11-05 Thread edi mari
Thank you for your reply, Bowen. Correct, the questions were about migrating the server hardware to a new location, not the Cassandra DC. Wouldn’t it be a good idea to use the hints to complete the data to DC3? I'll extend the hint window (e.g., to one week) and allow the other data centers

Re: Migration Cassandra to a new data center

2024-11-05 Thread Bowen Song via user
You just confirmed my suspicion. You are indeed referring to both physical location of servers and the logical Cassandra DC with the same term here. The questions are related to the procedure of migrating the server hardware to a new location, not the Cassandra DC. Assuming that the IP

Re: Migration Cassandra to a new data center

2024-11-05 Thread edi mari
Each physical data center corresponds to a "logical" Cassandra DC (a group of nodes). In our situation, we need to move one of our physical data centers (i.e., the server rooms) to a new location, which will involve an extended period of downtime. Thanks Edi On Tue, Nov 5, 2024 at 1:2

Re: Migration Cassandra to a new data center

2024-11-05 Thread Bowen Song via user
From the way you wrote this, I suspect the name DC may have a different meaning here. Are you talking about the physical location (i.e. server rooms), or the Cassandra DC (i.e. group of nodes for replication purposes)? On 05/11/2024 11:01, edi mari wrote: Hello, We have a Cassandra cluster

Migration Cassandra to a new data center

2024-11-05 Thread edi mari
Hello, We have a Cassandra cluster deployed across three different data centers, with each data center (DC1, DC2, and DC3) hosting 50 Cassandra nodes. We are currently saving one replica in each data center. We plan to migrate DC3, including storage and servers, to a new data center. 1. What

Re: Change num_tokens in a live cluster

2024-05-16 Thread Gábor Auth
Hi, On Thu, 16 May 2024, 17:40 Bowen Song via user, wrote: > Replacing nodes one by one in the existing DC is not the same as replacing > an entire DC. > > For example, if you change from 256 vnodes to 4 vnodes on a 100 nodes > single DC cluster. Before you start, each node

Re: Change num_tokens in a live cluster

2024-05-16 Thread Bowen Song via user
Replacing nodes one by one in the existing DC is not the same as replacing an entire DC. For example, suppose you change from 256 vnodes to 4 vnodes on a 100-node single-DC cluster. Before you start, each node owns ~1% of the cluster's data. But after changing 99 nodes, the last remaining
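The imbalance described in this message can be estimated with a few lines of arithmetic. This is only a rough sketch: it assumes a node's expected share of the data is proportional to its number of tokens (reasonable for randomly placed vnodes, but not exact), and the helper name `expected_ownership` is made up for illustration.

```python
def expected_ownership(tokens_per_node):
    """Expected fraction of the ring per node, ASSUMING ownership is
    proportional to each node's token count (a simplification)."""
    total = sum(tokens_per_node)
    return [t / total for t in tokens_per_node]

# Before the change: 100 nodes with 256 vnodes each.
before = expected_ownership([256] * 100)

# After replacing 99 nodes with 4-vnode nodes, one 256-vnode node remains.
after = expected_ownership([4] * 99 + [256])

print(f"per node before:     {before[0]:.1%}")   # 1.0%
print(f"last old node after: {after[-1]:.1%}")   # 39.3%
```

Under that assumption the last unreplaced node would temporarily be responsible for roughly 39% of the data (256 of 652 tokens), which is the kind of hotspot the one-by-one approach risks.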

Re: Change num_tokens in a live cluster

2024-05-16 Thread Gábor Auth
Hi, On Thu, 16 May 2024, 16:55 Jon Haddad, wrote: > Unless your cluster is very small, using the method of adding / removing > nodes will eventually result in putting a much larger portion of your > dataset on a very few number of nodes. I *highly* discourage this. > It has ~15 GB

Re: Change num_tokens in a live cluster

2024-05-16 Thread Gábor Auth
Hi, On Thu, 16 May 2024, 10:37 Bowen Song via user, wrote: > You can also add a new DC with the desired number of nodes and num_tokens > on each node with auto bootstrap disabled, then rebuild the new DC from the > existing DC before decommission the existing DC. This method only

Re: Change num_tokens in a live cluster

2024-05-16 Thread Jon Haddad
Unless your cluster is very small, using the method of adding / removing nodes will eventually result in putting a much larger portion of your dataset on a very small number of nodes. I *highly* discourage this. The only correct, safe path is Bowen's suggestion of adding another D

Re: Change num_tokens in a live cluster

2024-05-16 Thread Bowen Song via user
You can also add a new DC with the desired number of nodes and num_tokens on each node with auto bootstrap disabled, then rebuild the new DC from the existing DC before decommissioning the existing DC. This method only needs to copy data once, and can copy from/to multiple nodes concurrently

Change num_tokens in a live cluster

2024-05-16 Thread Gábor Auth
Hi. Is there a newer/easier workflow to change num_tokens in an existing cluster than add a new node to the cluster with the other num_tokens value and decommission an old one, repeat and rinse through all nodes? -- Bye, Gábor AUTH

Streaming a working session with 5.0 - UCS

2024-03-05 Thread Jon Haddad
Hey everyone, Today starting at 10am PT I'm going to be streaming my session messing with 5.0, looking at UCS. I'm doing this with my easy-cass-lab and easy-cass-stress tools using a build of C* from last night. I'll also show some of the cool things you can do with my tools.

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-19 Thread Gowtham S
Thanks for your valuable reply, will check. Thanks and regards, Gowtham S On Mon, 19 Feb 2024 at 15:46, Bowen Song via user wrote: > You can have a read at > https://www.datastax.com/blog/cassandra-anti-patterns-queues-and-queue-datasets > > Your table schema does not incl

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-19 Thread Bowen Song via user
You can have a read at https://www.datastax.com/blog/cassandra-anti-patterns-queues-and-queue-datasets Your table schema does not include the most important piece of information - the partition keys (and clustering keys, if any). Keep in mind that you can only efficiently query Cassandra by

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-18 Thread Gowtham S
Hi Bowen, > which is a well documented anti-pattern. Can you please explain more on this? I'm not aware of it. It will be helpful to make decisions. Please find the table schema below. *Table schema* TopicName - text, Partition - int, MessageUUID - text, Actual data - text, OccurredTime - T

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-17 Thread Slater, Ben via user
TBH, this sounds to me like a very expensive (in terms of effort) way to deal with whatever Kafka unreliability you’re having. We have lots of both Kafka and Cassandra clusters under management and I have no doubt that Kafka is capable of being as reliable as Cassandra (and both are capable

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-17 Thread Bowen Song via user
Hi Gowtham, On the face of it, it sounds like you are planning to use Cassandra for a queue-like application, which is a well documented anti-pattern. If that's not the case, can you please show the table schema and some example queries? Cheers, Bowen On 17/02/2024 08:44, Gowtham S

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-17 Thread Gowtham S
With that approach I assume you will use Cassandra as a queue. You have to > be careful about modeling and should use multiple partitions may be based > on hour or fixed size partitions. > > Another thing is that Kafka has really high throughput so you should plan > how many Cassandra n

Re: Requesting Feedback for Cassandra as a backup solution.

2024-02-17 Thread CPC
Hi, we implemented the same strategy for one of our customers. Since 2016 we have had one downtime in one DC because of high temperature (the whole physical DC shut down). With that approach I assume you will use Cassandra as a queue. You have to be careful about modeling and should use multiple partitions may

Requesting Feedback for Cassandra as a backup solution.

2024-02-17 Thread Gowtham S
Dear Cassandra Community, I am reaching out to seek your valuable feedback and insights on a proposed solution we are considering for managing Kafka outages using Cassandra. At our organization, we heavily rely on Kafka for real-time data processing and messaging. However, like any technology

Re: Cassandra Summit. What a week!

2023-12-22 Thread Patrick McFadin
Back again with an update for everyone asking for the talk recordings. The Linux Foundation has broken up each talk into individual videos and made a playlist. You can find it here: https://www.youtube.com/playlist?list=PLbzoR-pLrL6rgDn-2Yo-d5liFEnuCCi85 Patrick On Mon, Dec 18, 2023 at 11:41 AM

Cassandra Summit. What a week!

2023-12-18 Thread Patrick McFadin
YouTube channel. They aren’t easy to find because they are stream recordings per room for an entire day. Takes a bit of searching to find a talk. We have some folks working on getting them indexed and possibly split up. I’ll send an email when it’s been worked out. On the tail of last week's e

Re: Who wants a free Cassandra t-shirt?

2023-07-22 Thread Patrick McFadin
that’s happening. Thanks for the feedback! On Fri, Jul 21, 2023 at 6:59 PM Deepak Vohra wrote: > > Suggestion: The link should display some message if I already took the > survey such as - "You have already completed the survey". Instead, it lets > you take a survey again.

Re: Who wants a free Cassandra t-shirt?

2023-07-21 Thread Deepak Vohra via user
Suggestion: The link should display some message if I already took the survey such as - "You have already completed the survey". Instead, it lets you take a survey again. On Friday, July 21, 2023 at 08:48:27 p.m. EDT, Patrick McFadin wrote: We have about another week l

Re: Who wants a free Cassandra t-shirt?

2023-07-21 Thread guo Maxwell
It seems I've never had one… 😸 On Sat, Jul 22, 2023 at 8:48 AM, Patrick McFadin wrote: > We have about another week left on the user survey I posted last week. The > response has been slow, so it's time to get things in gear. > > I found a box of Cassandra t-shirts that will make an excellen

Who wants a free Cassandra t-shirt?

2023-07-21 Thread Patrick McFadin
We have about another week left on the user survey I posted last week. The response has been slow, so it's time to get things in gear. I found a box of Cassandra t-shirts that will make an excellent thank you for anyone filling out the survey. Once the survey window closes, I'll pic

LocalQuorum requiring 3 replicas with RF=3 when a node is joining

2023-07-19 Thread Luciano Greiner
Hello. The company I am working for has a small production cluster with the following setup: 2 DCs, 3 nodes each, RF = 3 (all keyspaces), num_tokens = 4, repairs are up to date. We are in the process of adding more nodes to the cluster, although while testing our procedures on a staging cluster (same
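The behaviour named in the subject line can be illustrated with the quorum arithmetic. The sketch below is not taken from the thread: it assumes, as the Cassandra write path is commonly described, that each pending (joining) replica for a token range is added to the number of acknowledgements a write must wait for. The function names are hypothetical.

```python
def quorum(rf: int) -> int:
    """Standard quorum size for a replication factor."""
    return rf // 2 + 1

def write_block_for(rf: int, pending_replicas: int) -> int:
    """Acks a quorum-level write waits for, ASSUMING pending (joining)
    replicas are added on top of the normal quorum."""
    return quorum(rf) + pending_replicas

print(quorum(3))              # 2: normal LOCAL_QUORUM with RF=3
print(write_block_for(3, 1))  # 3: while one node is joining the range
```

If that assumption holds, a LOCAL_QUORUM write with RF=3 needs three acknowledgements for ranges that have one pending replica, so a single slow or unavailable replica can cause write failures until the join completes.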

Re: Is there a way to find out if a server is part of application connection string?

2023-06-07 Thread Miles Garnsey
> Hi, > > We have a cluster with many applications connecting to it. > We need to decommission few of the servers from the cluster . > Without asking the application team, is there any way to know the ips > of the application connection string? > Does cassandra logs (system or de

Re: Is there a way to find out if a server is part of application connection string?

2023-06-06 Thread Miklosovic, Stefan
Hi Surbhi, maybe looking into system_views.clients virtual table in case you are on a cluster of version 4.0+ would be helpful? That table contains all clients connected to that particular Cassandra node having "address" and "hostname" columns as well as "username"

Is there a way to find out if a server is part of application connection string?

2023-06-06 Thread Surbhi Gupta
Hi, We have a cluster with many applications connecting to it. We need to decommission a few of the servers from the cluster. Without asking the application team, is there any way to know the IPs in the application connection string? Do the Cassandra logs (system or debug) record this information somewhere

Re: Request To Allow For a New Account Creation in ASF JIRA

2023-04-20 Thread Priya Sharma
Hello Ranju, You can use the ASF self-help portal to request a Jira account https://selfserve.apache.org/jira-account.html On Fri, 21 Apr 2023 at 11:03, Ranju Jain via user wrote: > > Hi, > > > > I need to create an ASG JIRA account. Please guide me steps. > > > >

Request To Allow For a New Account Creation in ASF JIRA

2023-04-20 Thread Ranju Jain via user
Hi, I need to create an ASF JIRA account. Please guide me through the steps. Regards Ranju

Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-12 Thread Ralph Boehme
f the existing paxos implementation (under contention, under latency, under cluster strain) can cause undefined states. I believe that a subsequent serial read will deterministically resolve the state (look at cassandra-12126), but that has a cost (both the extra operation and the code complexity)

Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-12 Thread Jeff Jirsa
Are you always inserting into the same partition (with contention) or different ? Which version are you using ? The short tldr is that the failure modes of the existing paxos implementation (under contention, under latency, under cluster strain) can cause undefined states. I believe that a

Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-12 Thread Ralph Boehme
sync clocks or long stop-the-world GC pauses. hm, I'll check the logs, but I can reproduce this 100% on an idle test cluster just by running a simple test client that generates a smallish workload where just 2 processes on a single host hammer the Cassandra cluster with LWTs. nothing i

Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-11 Thread Ralph Boehme
pauses. hm, I'll check the logs, but I can reproduce this 100% on an idle test cluster just by running a simple test client that generates a smallish workload where just 2 processes on a single host hammer the Cassandra cluster with LWTs. Maybe LWTs are not meant to be used this way? BTW,

Re: CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-11 Thread Bowen Song via user
fast, I think it's worth mentioning that LWT comes with additional cost and is much slower than a straightforward INSERT/UPDATE. You should avoid using it if possible. For example, if all of the Cassandra clients (samba servers) are running on the same machine, it may be far more efficie

CAS operation result is unknown - proposal accepted by 1 but not a quorum

2023-04-11 Thread Ralph Boehme
mentation of the SMB filesharing protocol from Microsoft, we have some specific requirements wrt to database behaviour: - fast - fast - fast - highly consistent, iow linearizable We got away without a linearizable database as historically the SMB protocol and the SMB client implementations w

Re: Adding an IPv6-only server to a dual-stack cluster

2022-11-18 Thread Bowen Song via user
Not that simple. By making a node listen on both IPv4 and IPv6, it will accept connections from both, but other nodes will still only try to connect to this node on the address it is broadcasting. That means if a node is broadcasting an IPv4 address, then all other nodes in the cluster

Re: Adding an IPv6-only server to a dual-stack cluster

2022-11-18 Thread Lapo Luchini
esses for you" case? On 2022-11-16 14:03, Bowen Song via user wrote: I would expect that you'll need NAT64 in order to have a cluster with mixed nodes between IPv6-only servers and dual-stack servers that are broadcasting their IPv4 addresses. Once all IPv4-broadcasting dual-stack

Re: Adding an IPv6-only server to a dual-stack cluster

2022-11-16 Thread Bowen Song via user
I would expect that you'll need NAT64 in order to have a cluster with mixed nodes between IPv6-only servers and dual-stack servers that are broadcasting their IPv4 addresses. Once all IPv4-broadcasting dual-stack nodes are replaced with nodes either IPv6-only or dual-stack but broadca

Adding an IPv6-only server to a dual-stack cluster

2022-11-09 Thread Lapo Luchini
I have a (3.11) cluster running on IPv4 addresses on a set of dual-stack servers; I'd like to add a new IPv6-only server to the cluster… is it possible to have the dual-stack ones answer on IPv6 addresses as well (while keeping the single IPv4 address as broadcast_address, I guess)?

Re: Denylisting with a composite partition key

2022-10-25 Thread Cheng Wang via user
set to >> QUORUM: >> >> WARN [main] 2022-10-25 11:57:27,238 NoSpamLogger.java:108 - Attempting >> to load denylist and not enough nodes are available for a QUORUM refresh. >> Reload the denylist when unavailable nodes are recovered to ensure your >> denylist remains in

Re: Denylisting with a composite partition key

2022-10-25 Thread Cheng Wang via user
iling because the denylist_consistency_level was set to > QUORUM: > > WARN [main] 2022-10-25 11:57:27,238 NoSpamLogger.java:108 - Attempting to > load denylist and not enough nodes are available for a QUORUM refresh. > Reload the denylist when unavailable nodes are recovered to ensure your > denylist r

Re: Denylisting with a composite partition key

2022-10-25 Thread Aaron Ploetz
Works! So I was running on my *local*, and all of my attempts to add to the denylist were failing because the denylist_consistency_level was set to QUORUM: WARN [main] 2022-10-25 11:57:27,238 NoSpamLogger.java:108 - Attempting to load denylist and not enough nodes are available for a QUORUM

Re: Denylisting with a composite partition key

2022-10-21 Thread Aaron Ploetz
Awesome. Thank you, Cheng! I’ll give this a shot and let you know. Thanks, Aaron > On Oct 21, 2022, at 12:45 AM, Cheng Wang wrote: > >  > Hi Aaron, > > After reading through the code, I finally figured out the issue. So back to > your original question where you f

Re: Denylisting with a composite partition key

2022-10-20 Thread Cheng Wang via user
meters doesn't exist in bean org.apache.cassandra.db:type=StorageProxy It's not a Cassandra issue since it failed at the JMX parser stage, even before it goes to the Cassandra internal StorageProxy::denylistKey method. Yes, you got the right gist. It's because of the extra space be

Re: Denylisting with a composite partition key

2022-10-20 Thread Aaron Ploetz
No worries, Cheng! So I actually pivoted a little and adjusted my example table to use a single integer-based partition key. aaron@cqlsh:stackoverflow> SELECT ks_name, table_name, blobAsint(key) FROM system_distributed.partition_denylist WHERE ks_name='stackoverflow' A
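For readers wondering what `blobAsint(key)` is decoding in the query above: a CQL `int` is stored as a 4-byte big-endian value, so the denylist row for an integer partition key holds those four bytes. The sketch below shows that encoding. The composite layout in the second function (2-byte length, value, one end-of-component byte per component) reflects my understanding of how Cassandra serializes composite partition keys and should be treated as an assumption; it says nothing about how the JMX `denylistKey` operation parses its string arguments. Both function names are made up for illustration.

```python
import struct

def int_key_blob(value: int) -> bytes:
    """4-byte big-endian encoding of a CQL int."""
    return struct.pack(">i", value)

def composite_key_blob(components) -> bytes:
    """ASSUMED composite layout: for each component,
    <2-byte length><value bytes><0x00 end-of-component>."""
    out = b""
    for raw in components:
        out += struct.pack(">H", len(raw)) + raw + b"\x00"
    return out

month = int_key_blob(202210)
print(month.hex())  # 000315e2

# Example values taken from the weather_sensor_data rows in this thread.
key = composite_key_blob(["Minneapolis,MN".encode("utf-8"), month])
print(key.hex())
```

Decoding the 4-byte blob with `struct.unpack(">i", ...)` gives back 202210, which is the same conversion `blobAsint` performs on the stored key.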

Re: Denylisting with a composite partition key

2022-10-19 Thread Cheng Wang via user
Hi Aaron, Sorry for the late reply, was dealing with a production issue (maybe another topic for Cassandra Summit :-)). Are you running on your local machine? Then yes, you do need to enable the config for all the following enable_partition_denylist: true enable_denylist_writes: true

Re: Denylisting with a composite partition key

2022-10-19 Thread Aaron Ploetz
2 > Minneapolis,MN | 202210 | 2022-10-17 11:00:00.00+ |2 > > (7 rows) > > As you can see, I can still select the partition. I was really hoping one > of those combinations would do it. > > Looking at the StorageProxyTest.java in the project, I saw that it was >

Re: Denylisting with a composite partition key

2022-10-17 Thread Aaron Ploetz
I was really hoping one of those combinations would do it. Looking at the StorageProxyTest.java in the project, I saw that it was delimited by a colon ":", which is why I tried that, too. Still looking for the right way to enter both of those keys. Thanks, Aaron On Mon, Oct 17, 20

Re: Denylisting with a composite partition key

2022-10-17 Thread Cheng Wang via user
low | weather_sensor_data | 'Minneapolis, MN', 202210 On Mon, Oct 17, 2022 at 2:30 PM Cheng Wang wrote: > Hi Aaron, > > Yes, you can directly insert into the system_distributed.partition_denylist > instead of using JMX. Jordan wrote a blog post for denylist > > https://c

Re: Denylisting with a composite partition key

2022-10-17 Thread Cheng Wang via user
Hi Aaron, Yes, you can directly insert into the system_distributed.partition_denylist instead of using JMX. Jordan wrote a blog post about the denylist: https://cassandra.apache.org/_/blog/Apache-Cassandra-4.1-Denylisting-Partitions.html As for the syntax error, one workaround is to put $$ around it, like

Denylisting with a composite partition key

2022-10-17 Thread Aaron Ploetz
tKey with 4 parameters doesn't exist in bean org.apache.cassandra.db:type=StorageProxy Obviously, it's reading the space between "Minneapolis," and "MN" as a delimiter. What's the right way to handle commas, spaces, and composite keys for this? Also, is there another way to accomplish this without using JMX? Thanks, Aaron

Re: Change the compression algorithm on a production table at runtime

2022-09-19 Thread C. Scott Andreas
Thanks for reaching out. Changing the compressor for a table is both safe and common. Future flushes / compactions will use the new codec as SSTables are written, and SSTables currently present on disk will remain readable with the previous codec. You may also want to take a look at the

Change the compression algorithm on a production table at runtime

2022-09-19 Thread Eunsu Kim
Hi all. According to https://docs.datastax.com/en/cql-oss/3.3/cql/cql_reference/cqlAlterTable.html, it can be very problematic to modify the compaction strategy on a table running in production. Similarly,

RE: Erroneous node. - node is not a member of the

2022-08-24 Thread Marc Hoppins
Update again: The dc1-cass14 node stopped accepting/bootstrapping/streaming early on and now there is just a bunch of WARN [OptionalTasks:1] 2022-08-24 13:10:16,761 CassandraRoleManager.java:344 - CassandraRoleManager skipped default role setup: some nodes were not ready INFO [OptionalTasks

RE: Erroneous node. - node is not a member of the

2022-08-24 Thread Marc Hoppins
Sent: Wednesday, August 24, 2022 9:35 AM To: user@cassandra.apache.org Subject: RE: Erroneous node. - node is not a member of the Also, I just had some changes made to the cass.yml config, so I thought that if I did a rolling restart of the nodes it might help the problem. Now I have a startup problem with

RE: Erroneous node. - node is not a member of the

2022-08-24 Thread Marc Hoppins
Also, I just had some changes made to the cass.yml config, so I thought that if I did a rolling restart of the nodes it might help the problem. Now I have a startup problem with an existing node...with a similar name. Original problem node = dc2-cass14 Existing node = dc1-cass14 and am getting: ERROR

Erroneous node. - node is not a member of the

2022-08-24 Thread Marc Hoppins
Hi all, I added a node but forgot to specify the correct rack so I stopped the join and removed it. When I tried adding it again it was taking a LONG time to join. I tried draining before stopping the service but that failed. I killed the process and cleared the directories but the cluster

Re: removing a drive - 4.0.1

2022-06-09 Thread Joe Obernberger
When a drive fails in a large cluster and you don't immediately have a replacement drive, is it OK to just remove the drive from cassandra.yaml and restart the node?  Will the missing data (assuming RF=3) be re-replicated? I have disk_failure_policy set to "best_effort", but

Last week to submit a talk to ApacheCon New Orleans and the Cassandra track

2022-05-17 Thread Mick Semb Wever
ApacheCon North America will be held October 3-6, at the Sheraton Hotel in New Orleans. The CFP closes this weekend! https://www.apachecon.com/acna2022/cfp.html It will be fantastic to catch up with as many of you as possible. Even better will be the talks you share with us, but you gotta submit

Re: JMX exposing non-standard java classes, to fix requires a breaking change

2022-05-05 Thread David Capwell
Filed CASSANDRA-17580 and have a patch ready which will hide non-native java exceptions from leaking into JMX (Cassandra and third-party libraries), this is enabled by default but can be disabled via config or system property Config: jmx_hide_non_java_exceptions: false System property

Re: JMX exposing non-standard java classes, to fix requires a breaking change

2022-04-06 Thread David Capwell
concern here is maintenance, you can fix it once but it will break again. I feel we need a more general solution, open to input here! Looking into our own NodeTool we have two error codes - 1 that catches a > few java and airlift exceptions and 2 for other errors. My guess is we aim > to throw only

Re: JMX exposing non-standard java classes, to fix requires a breaking change

2022-04-06 Thread Ekaterina Dimitrova
things in our codebase. Looking into our own NodeTool we have two error codes - 1 that catches a few java and airlift exceptions and 2 for other errors. My guess is we aim to throw only exceptions that will lead to exit code 1, no? I am not sure. GetColumnIndexSize was added in trunk and I see it

JMX exposing non-standard java classes, to fix requires a breaking change

2022-04-05 Thread David Capwell
hrows org.apache.cassandra.exceptions.ConfigurationException, removing that exception does not break binary compatibility, but does break source as javac will say catching it isn't allowed as it doesn't throw. If you call the method without Cassandra jars the method will work properly until a ConfigurationException is thrown,

Re: removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
1/7/2022 4:38 PM, Dmitry Saprykin wrote: There is a jira ticket describing your situation https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-14793 I may be wrong but it seems that system directories are pinned to the first data directory in cassandra.yaml by default. When you

Re: removing a drive - 4.0.1

2022-01-07 Thread Dmitry Saprykin
There is a jira ticket describing your situation https://issues.apache.org/jira/plugins/servlet/mobile#issue/CASSANDRA-14793 I may be wrong but it seems that system directories are pinned to the first data directory in cassandra.yaml by default. When you removed the first item from the list, system data

Re: removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
/cassandra     - /data/7/cassandra #    - /data/8/cassandra the node starts up OK.  I assume it will recover the missing data during a repair? -Joe On 1/7/2022 4:13 PM, Mano ksio wrote: Hi, you may have already tried, but this may help. https://stackoverflow.com/questions/29323709/unable-to-start

Re: removing a drive - 4.0.1

2022-01-07 Thread Mano ksio
Hi, you may have already tried, but this may help. https://stackoverflow.com/questions/29323709/unable-to-start-cassandra-node-already-exists Can you elaborate a little on 'If I remove a drive other than the first one'? What does it mean? On Fri, Jan 7, 2022 at 2:52 PM Joe Obernberger wr

removing a drive - 4.0.1

2022-01-07 Thread Joe Obernberger
Hi All - I have a 13 node cluster running Cassandra 4.0.1.  If I stop a node, edit the cassandra.yaml file, comment out the first drive in the list, and restart the node, it fails to start saying that a node already exists in the cluster with the IP address. If I put the drive back into the

Re: [E] Re: Anyone connecting the Cassandra on a server

2021-11-29 Thread Saha, Sushanta K
Thanks Bowen! Sushanta On Fri, Nov 19, 2021 at 4:01 PM Bowen Song wrote: > This could be two questions with different answers: > > > 1. Is there anyone / who is connected to the Cassandra server right now? > > Use the netstat or ss command and check the active TCP connections on > native

Re: Incremental repairs getting stuck a lot

2021-11-26 Thread James Brown
I filed this as CASSANDRA-17172 <https://issues.apache.org/jira/browse/CASSANDRA-17172> On Fri, Nov 26, 2021 at 5:33 PM Dinesh Joshi wrote: > Could you file a jira with the details? > > Dinesh > > On Nov 26, 2021, at 2:40 PM, James Brown wrote: > >  >

Re: Incremental repairs getting stuck a lot

2021-11-26 Thread Dinesh Joshi
Could you file a jira with the details? Dinesh > On Nov 26, 2021, at 2:40 PM, James Brown wrote: > >  > We're on 4.0.1 and switched to incremental repairs a couple of months ago. > They work fine about 95% of the time, but once in a while a session will get > st

Incremental repairs getting stuck a lot

2021-11-26 Thread James Brown
We're on 4.0.1 and switched to incremental repairs a couple of months ago. They work fine about 95% of the time, but once in a while a session will get stuck and will have to be cancelled (with `nodetool repair_admin cancel -s `). Typically the session will be in REPAIRING but nothing

Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Bowen Song
This could be two questions with different answers: 1. Is there anyone / who is connected to the Cassandra server right now? Use the netstat or ss command and check the active TCP connections on native port (default is 9042) 2. Is there anyone / who is connecting to the Cassandra servers be

Re: [E] Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Saha, Sushanta K
Thanks a lot Soumya, Surbhi, and Paul. Appreciate your help! Sushanta On Fri, Nov 19, 2021 at 2:19 PM Paul Chandler wrote: > I wrote a blog post describing how to do this a few years ago: > http://www.redshots.com/who-is-connecting-to-a-cassandra-cluster/

Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Paul Chandler
I wrote a blog post describing how to do this a few years ago: http://www.redshots.com/who-is-connecting-to-a-cassandra-cluster/ Sent from my iPhone > On 19 Nov 2021, at 18:13, Saha, Sushanta K > wrote: > >  > I need to shutdown an old Apache Cassandra server for good

Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Surbhi Gupta
You can use tcpdump On Fri, 19 Nov 2021 at 10:34, Soumya Jena wrote: > You can just do a netstat on port 9042 to see if anything connected . > > Something like > netstat -anp | grep 9042 . > > Or you can also check for read/write client requests metrics . You can > check i

Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Soumya Jena
You can just run netstat on port 9042 to see if anything is connected. Something like netstat -anp | grep 9042. Or you can check the read/write client request metrics. You can check whether specific tables are taking reads or writes. There is also a metric for the number of connected clients
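If you want to script the netstat check rather than eyeball it, here is a minimal sketch in Python. It assumes the usual Linux `netstat -anp` column layout (Proto, Recv-Q, Send-Q, Local Address, Foreign Address, State) and works on captured text, so the function name `established_clients` and the sample addresses are made up for illustration only.

```python
def established_clients(netstat_output, port=9042):
    """Return the sorted remote IPs with an ESTABLISHED connection to the local port."""
    clients = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: Proto Recv-Q Send-Q Local-Address Foreign-Address State [PID/Program]
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue  # skips the header line and non-TCP sockets
        local, remote, state = fields[3], fields[4], fields[5]
        if state != "ESTABLISHED":
            continue  # LISTEN, TIME_WAIT, etc. are not live clients
        # rsplit on the last colon isolates the port, also for IPv6-style addresses.
        if local.rsplit(":", 1)[-1] == str(port):
            clients.add(remote.rsplit(":", 1)[0])
    return sorted(clients)


# Hypothetical sample output, not taken from a real server.
sample = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:9042            0.0.0.0:*               LISTEN      1234/java
tcp        0      0 10.0.0.5:9042           10.0.0.9:51234          ESTABLISHED 1234/java
tcp        0      0 10.0.0.5:9042           10.0.0.9:51240          ESTABLISHED 1234/java
tcp        0      0 10.0.0.5:7000           10.0.0.6:40112          ESTABLISHED 1234/java
"""

print(established_clients(sample))  # ['10.0.0.9']
```

Note this only shows who is connected at the moment the output was captured. A client that connects once a day would be missed unless you sample repeatedly.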

Anyone connecting the Cassandra on a server

2021-11-19 Thread Saha, Sushanta K
I need to shut down an old Apache Cassandra server for good. It is running 3.0.x. Is there any way I can determine if anyone is still connecting to the Cassandra instance running on this server? Thanks Sushanta

Re: Enabling SSL on a live cluster

2021-11-12 Thread Shaurya Gupta
> wrote: > > > > Hi Shaurya, > > > > On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta > wrote: > >> > >> Hi, > >> > >> We want to enable node-to-node SSL on a live cluster. Could it be done > without any down time ? > > >

Re: Enabling SSL on a live cluster

2021-11-12 Thread Kiran mk
04 PM Tolbert, Andy wrote: > > Hi Shaurya, > > On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta wrote: >> >> Hi, >> >> We want to enable node-to-node SSL on a live cluster. Could it be done >> without any down time ? > > > Yup, this is definitely doa

Re: Enabling SSL on a live cluster

2021-11-12 Thread Shaurya Gupta
Thanks Andy! It was very helpful. On Wed, Nov 10, 2021 at 12:04 PM Tolbert, Andy wrote: > Hi Shaurya, > > On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta > wrote: > >> Hi, >> >> We want to enable node-to-node SSL on a live cluster. Could it be done >> with

Re: Enabling SSL on a live cluster

2021-11-09 Thread Tolbert, Andy
Hi Shaurya, On Tue, Nov 9, 2021 at 11:57 PM Shaurya Gupta wrote: > Hi, > > We want to enable node-to-node SSL on a live cluster. Could it be done > without any down time ? > Yup, this is definitely doable for both internode and client connections. You will have to bounce your

Enabling SSL on a live cluster

2021-11-09 Thread Shaurya Gupta
Hi, We want to enable node-to-node SSL on a live cluster. Could it be done without any downtime? Would the nodes which have been restarted be able to communicate with the nodes which have not yet come up, and vice versa? Regards -- Shaurya Gupta

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-09 Thread Bowen Song
sponse: *num_tokens* defines the number of vnodes a node can have. Default is 256. The *initial token* range is predefined (for Murmur, -2**63 to 2**63-1). So if you have one node in a cluster (which does not make sense) with num_tokens as 256, then you will have 256 vnodes. Scaling up will increase the total num
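To make the num_tokens arithmetic concrete, here is a rough sketch in Python. It assumes each node simply picks num_tokens random tokens in the Murmur range quoted above; Cassandra's real token allocation may work differently, and the helper names `random_tokens` and `build_ring` are hypothetical.

```python
import random

# Murmur token range as quoted in the message above.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1


def random_tokens(num_tokens, rng):
    """Pick num_tokens distinct tokens uniformly at random (an assumption, not Cassandra's algorithm)."""
    tokens = set()
    while len(tokens) < num_tokens:
        tokens.add(rng.randint(MIN_TOKEN, MAX_TOKEN))
    return sorted(tokens)


def build_ring(nodes, num_tokens, seed=0):
    """Return the ring as a token-sorted list of (token, node) pairs, one pair per vnode."""
    rng = random.Random(seed)
    ring = []
    for node in nodes:
        for token in random_tokens(num_tokens, rng):
            ring.append((token, node))
    ring.sort()
    return ring


one_node = build_ring(["node1"], num_tokens=256)
two_nodes = build_ring(["node1", "node2"], num_tokens=256)

print(len(one_node))   # 256 vnodes with a single node
print(len(two_nodes))  # 512 vnodes once a second node joins
```

The point is only that the total vnode count grows with the number of nodes (nodes times num_tokens), while the token range itself stays fixed.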
