Re: Cassandra on ZFS: disable compression?

2021-02-09 Thread Bowen Song
I'm running Cassandra 3.11 on ZFS on Linux. I use the ZFS compression and have disabled the Cassandra SSTable compression. I can't comment on the possible pros you've mentioned, because I didn't do enough tests on them. I know the compression ratio is marginally better with the same compressio

Re: owns (effective)? Cassandra 4 b4

2021-02-10 Thread Bowen Song
In your example, the nodetool output depends on whether the keyspace name is present in the "nodetool status" command line parameters. "nodetool status" command without a keyspace name may not show the effective ownerships, but it should always show the effective ownership information when you

Re: number of racks in a deployment with VMs

2021-02-15 Thread Bowen Song
The reason the zoho (and my) emails go to the spam box is that the Apache mailing list software is messing around with the DKIM signature and the "From:" address. I have created INFRA-21415 for this. On 15/02/2021 22:36, Kane Wilson wr

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
The first thing I'd check is the server log. The log may contain vital information about the cause of it, and there may be different ways to recover from it depending on the cause. Also, please allow me to ask a seemingly obvious question: do you have a backup? On 01/03/2021 09:34, Mar

Re: Recovery after server crash 4.0b3

2021-03-01 Thread Bowen Song
Has the IP address changed? If the IP address hasn't changed and the data is still on disk, you should be able to start this node and it will become available again. Note: you may need to repair this node after that. However, if the IP address has changed as the result of replacing the serve

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
space and I found this very directory full of files where the modification timestamp was the same as the first error I got in the log. On Mon, 1 Mar 2021 at 12:13, Bowen Song wrote: The first thing I'd check is the server log. The log may contain vital inform

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
not a native English speaker :) I usually "remove" snapshots via 'nodetool clearsnapshot' or cassandra-reaper user interface. On Mon, 1 Mar 2021 at 12:39, Bowen Song wrote: What was the warning? Is it related to the disk failure policy? Could you please

Re: MISSING keyspace

2021-03-01 Thread Bowen Song
in.system.issuetabpanels%3Aall-tabpanel <https://issues.apache.org/jira/browse/CASSANDRA-14957?page=com.atlassian.jira.plugin.system.issuetabpanels%3Aall-tabpanel>, maybe they are correlated... Thank you @Bowen and @Erick On Mon, 1 Mar 2021 at 13:39, Bowen Song wrote:

Re: underutilized servers

2021-03-05 Thread Bowen Song
Based on my personal experience, the combination of slow read queries and low CPU usage is often an indicator of bad table schema design (e.g.: large partitions) or bad query (e.g. without partition key). Check the Cassandra logs first, is there any long stop-the-world GC? tombstone warning? an

Re: underutilized servers

2021-03-06 Thread Bowen Song
Hi Erick, Please allow me to disagree on this. A node dropping reads and writes doesn't always mean the disk is the bottleneck. I have seen the same behaviour when a node had excessive STW GCs and a lot of timeouts, and I have also seen writes get dropped because the size of the mutation ex

Re: underutilized servers

2021-03-06 Thread Bowen Song
    0.00 PAXOS_PROPOSE_RSP    0  0.00 0.00  0.00  0.00 Attila Wind http://www.linkedin.com/in/attilaw Mobile: +49 176 43556932 On 05.03.2021 17:45, Bowen Song wrote: Based on my personal experience,

Re: Node removal causes spike in pending native-transport requests and clients suffer

2021-03-11 Thread Bowen Song
May I ask why you scale your Cassandra cluster vertically instead of horizontally as recommended? I'm asking because I have dealt with a vertically scaled cluster before. It was because they had query performance issues and blamed the hardware for not being strong enough. Scaling vertically had help

Re: Barman equivalent for Cassandra?

2021-03-12 Thread Bowen Song
You can have a separate DC, so a physical destruction of an entire DC (such as a fire 😉) will not result in data loss; you can turn on automatic snapshot on truncate & drop table to help prevent some data losses caused by bugs and human errors; you can also have a cron job to take snapshots (an

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely caused by the query itself. On 12/03/2021 13:39, Joe Obernberger wrote: Thank you Paul and Erick.  The keyspace is defi

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
s.hasNext()) where itRs is an iterator over a select query from another table.  I'm iterating over a result set from a select and inserting those results via executeAsync. -Joe On 3/12/2021 9:07 AM, Bowen Song wrote: Millions of rows in a single query? That sounds like a bad idea to me.

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
oming from HBase where we'd run map reduce jobs. Thank you. -Joe On 3/12/2021 9:07 AM, Bowen Song wrote: Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely

Re: No node was available to execute query error

2021-03-12 Thread Bowen Song
  Average tombstones per slice (last five minutes): NaN     Maximum tombstones per slice (last five minutes): 0     Dropped Mutations: 0 Thanks again! -Joe On 3/12/2021 11:01 AM, Bowen Song wrote: The fact that sleep-then-retry works is just another indicator that it's likely a GC pause

Re: No node was available to execute query error

2021-03-15 Thread Bowen Song
part of the primary key used to determine partition size? -Joe On 3/12/2021 5:27 PM, Bowen Song wrote: The partition size min/avg/max of 8409008/15096925/25109160 bytes looks fine for the table fieldcounts, but the number of partitions is a bit worrying. Only 3 partitions? Are you expecting

Re: Restore of system_auth data to new cluster

2021-03-15 Thread Bowen Song
It's safe to restore the system_auth keyspace. The salted_hash in the system_auth.roles table stores the bcrypt salted hashed passwords. The data in this column actually contains both the salt and the hash. On 15/03/2021 17:00, Who Dadddy wrote: Hi Everyone, I need to nuke a cluster and want

Re: No node was available to execute query error

2021-03-15 Thread Bowen Song
lions). So - select * from table where source=? The number of unique source values is small - maybe 1000 Whereas each source may have billions of UUIDs. -Joe On 3/15/2021 11:18 AM, Bowen Song wrote: To be clear, this CREATE TABLE ... PRIMARY KEY (k1, k2); is the same as: CREATE TABLE

Re: No node was available to execute query error

2021-03-15 Thread Bowen Song
t's java do blocks of n records, but is that the best way? -joe On 3/15/2021 1:42 PM, Bowen Song wrote: I personally try to avoid using secondary indexes, especially in large clusters. SI is not scalable, because a SI query doesn't have the partition key information, Cassand

Re: Full repair results in uneven data distribution

2021-03-16 Thread Bowen Song
That sounds like the combined results from the anti-compaction and the size amplification from the default SizeTieredCompactionStrategy. If you keep repeating those steps, the disk usage will eventually stop growing. Of course, that's not an excuse to keep repeating it. To fix this (if you rea

Re: Fatal Java error when starting cassandra

2021-03-18 Thread Bowen Song
Try downgrading your JDK to 1.8.0_241. On 18/03/2021 06:45, Manu Chadha wrote: Hi Cassandra doesn’t start on my PC. I recently changed my machine. I simply copied the old Cassandra folder in the new machine. When I start Cassandra, I get error. Sorry for dumping the whole thing here but I do

Re: Best strategy to run repair

2021-03-23 Thread Bowen Song
It gets complicated when you have RF > 1, because a subrange can involve multiple nodes, and a node can't run multiple repairs concurrently for the same table. Multiple DCs also complicate things, as the describering command doesn't separate the token ranges by DC. In my opinion, it's best to

Re: Dont want to split sstables for repaired and non repaired while repairing with -pr option

2021-03-25 Thread Bowen Song
You can use nodetool repair with either -dc DC or -st START -et END to avoid the anticompaction. I would highly recommend trying Cassandra Reaper, it makes life a lot easier. On 24/03/2021 20:29, Surbhi Gupta wrote: Hi, I dont want to split sstables ,repair

Re: Backup cassandra and restore. Best practices

2021-04-06 Thread Bowen Song
Medusa: "Support for local storage, Google Cloud Storage (GCS) and AWS S3 through Apache Libcloud. Can be extended to support other storage providers supported by Apache Libcloud," and Apache Libcloud supports minio

Re: Huge single-node DCs (?)

2021-04-08 Thread Bowen Song
I'm sure there are a lot of pitfalls. A few of them in my mind right now: * With a single node, you will completely lose the benefit of high availability from Cassandra. Not only will hardware failure result in downtime, routine maintenance (such as software upgrade) can also result in d

Re: Huge single-node DCs (?)

2021-04-08 Thread Bowen Song
This is off-topic. But if your goal is to maximise storage density while also ensuring data durability and availability, this is what you should be looking at: * hardware: https://www.backblaze.com/blog/open-source-data-storage-server/ * architecture and software: https://www.backblaze.co

Re: Error connecting to cassandra through php/browser

2021-04-10 Thread Bowen Song
What do you mean "through the browser"? Is it a php page that runs on something like Apache/Nginx? If that's the case, and the same code and the same setup works on Ubuntu 18 but not CentOS 7, I would recommend having a look at the SELinux logs. On 10/04/2021 14:01, Shabu Khan wrote: Hel

Re: Error connecting to cassandra through php/browser

2021-04-11 Thread Bowen Song
/etc/selinux/config -Shabu On Sun, Apr 11, 2021 at 1:26 AM Bowen Song wrote: What do you mean "through the browser"? Is it a php page that runs on something like Apache/Nginx? If that's the case, and the same code and the same setup works on Ubuntu 18 but not CentOS 7, I would recommen

Re: Query timed out after PT1M

2021-04-13 Thread Bowen Song
The error message is clear, it was a DriverTimeoutException, and it was because the query timed out after one minute. Note: "PT1M" means a period of one minute, see https://en.wikipedia.org/wiki/ISO_8601#Durations If you need help from u
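As an illustration of the duration notation mentioned in this message, here is a minimal Python sketch that turns a simple ISO 8601 duration such as "PT1M" into a timedelta. The helper name parse_iso8601_duration is made up for this example and is not part of any Cassandra driver; it only handles the day, hour, minute and second designators, not years, months or weeks.

```python
import re
from datetime import timedelta

# Hypothetical helper for illustration only; supports PnDTnHnMnS forms.
_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?"
    r"(?:(?P<minutes>\d+)M)?"
    r"(?:(?P<seconds>\d+(?:\.\d+)?)S)?)?$"
)


def parse_iso8601_duration(text: str) -> timedelta:
    """Convert a simple ISO 8601 duration string into a timedelta."""
    match = _DURATION_RE.match(text)
    # Keep only the components that were actually present in the string.
    parts = {
        name: float(value)
        for name, value in (match.groupdict() if match else {}).items()
        if value is not None
    }
    if not parts:
        raise ValueError(f"unsupported or empty duration: {text!r}")
    return timedelta(**parts)


if __name__ == "__main__":
    # "PT1M" is the value quoted in the driver's timeout message.
    print(parse_iso8601_duration("PT1M").total_seconds())
```

Read this way, "timed out after PT1M" is simply a 60 second timeout.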

Re: Query timed out after PT1M

2021-04-13 Thread Bowen Song
imum live cells per slice (last five minutes): 0     Average tombstones per slice (last five minutes): NaN     Maximum tombstones per slice (last five minutes): 0     Dropped Mutations: 0 -Joe On 4/13/2021 12:35 PM, Bowen Song wrote: The error m

Re: Memory requirements for Cassandra reaper

2021-05-04 Thread Bowen Song
Hi Surbhi, I don't know the memory requirements, but speaking from my observation, a single Cassandra Reaper instance with an external postgres database storage backend, and managing a single small Cassandra cluster, the Cassandra Reaper's Java process memory usage is slightly short of 1GB.

Compatibility between Cassandra 3.11 and cqlsh from Cassandra 4.0 RC1

2021-05-04 Thread Bowen Song
Hi all, I was using the cqlsh from Cassandra 4.0 RC1 and trying to connect to a Cassandra 3.11 cluster, and it does not appear to be working correctly. Specifically, the "desc" command does not work at all. Steps to reproduce: # ensure you have docker installed and running # ru

Re: Compatibility between Cassandra 3.11 and cqlsh from Cassandra 4.0 RC1

2021-05-04 Thread Bowen Song
.org/jira/projects/CASSANDRA/summary> with the provided reproduction steps. Thanks, Paulo On Tue, 4 May 2021 at 18:22, Bowen Song wrote: Hi all, I was using the cqlsh from Cassandra 4.0 RC1 and trying to connect to a Cassandra 3.11 cluster, and it does not appear t

Re: RC1 - Counters

2021-05-05 Thread Bowen Song
This sounds like the clocks on your Cassandra servers are not in sync. Can you please ensure all Cassandra servers have their clock synced (usually via NTP) and retry this? On 05/05/2021 14:42, Joe Obernberger wrote: Want to add - I am seeing this in the log: INFO  [ScheduledTasks:1] 2021-05-0

Re: Unsubscribe

2021-05-06 Thread Bowen Song
It's user-unsubscr...@cassandra.apache.org not .com On 06/05/2021 17:13, vishal kharjul wrote: Hello, I need help with how to unsubscribe this mailing list. I tried sending email to user-unsubscr...@cassandra.apache.com and it didn't work. Pleas

Re: unable to repair

2021-05-27 Thread Bowen Song
Hi Sébastien, The error message you shared came from the repair coordinator node's log, and it's the result of failures reported by 3 other nodes. If you could have a look at the 3 nodes listed in the error message - 135.181.222.100, 135.181.217.109 and 135.181.221.180, you should be able to

Re: unable to repair

2021-05-30 Thread Bowen Song
This sounds like a really bad idea. In Cassandra 4.0 RC1, when you have more than 150 tables or 40 keyspaces (code reference), Cassandra will warn you about

Re: Soon After Starting c* Process: CPU 100% for java Process

2021-07-02 Thread Bowen Song
On 01/07/2021 23:41, Elliott Sims wrote: Also for narrowing down performance issues, I've had good luck with the "ttop" module of Swiss Java Knife and with the async-profiler tool: https://github.com/jvm-profiling-tools/async-profiler

Re: Cassandra commitlog corruption on hard shutdown

2021-07-26 Thread Bowen Song
I have seen the same error in Cassandra 3.x too, and in fact quite a few times. On a few occasions, I opened the corrupted commit log file in a hex editor, and it was filled with a lot of 0x00s. I believe it was caused by the combination of the way Cassandra flushes the commit log + the way XF

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
Hello Chahat, First, can you please make sure the Cassandra user used by the application is not "cassandra"? Because the "cassandra" user uses QUORUM consistency level to read the auth tables. Then, can you please make sure the replication strategy is set correctly for the system_auth names

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
d stop time 0.001592 seconds. 2021-07-27 03:04:45,222 INFO gcstats:58 - Application Thread stop time 0.001618 seconds. 2021-07-27 03:05:08,907 INFO gcstats:58 - Application Thread stop time 0.001624 seconds. 2021-07-27 03:06:34,436 INFO gcstats:58 - Ap

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
:15:59,744 MessagingService.java:1013 - READ messages were dropped in last 5000 ms: 287 for internal timeout and 0 for cross node timeout * Also, checking cfstats and proxyhistograms is in progress; I will update in case anything is suspicious. On Tue, 27 Jul 2021 at 18:09, Bowen S

Re: Permission/Role Cache causing timeouts in apps.

2021-07-27 Thread Bowen Song
ax_threads: 320 JVM_OPTS="$JVM_OPTS -Dcassandra.max_queued_native_transport_requests=3072" Do you think it would be advisable to tune the number in the above params to have lesser load on the node? On Tue, 27 Jul 2021 at 20:13, Bowen Song wrote: Wow, 15 seconds timeout? That'

Re: cassandra 4.0 java 11 support

2021-07-27 Thread Bowen Song
Experimental means anything can happen - dragons, unicorns, ... On 27/07/2021 21:32, CPC wrote: Hi , At cassandra site https://cassandra.apache.org/doc/latest/cassandra/new/java11.html , it says java 11 support is experiment

Re: High memory usage during nodetool repair

2021-07-28 Thread Bowen Song
Could it be related to https://issues.apache.org/jira/browse/CASSANDRA-14096 ? On 28/07/2021 13:55, Amandeep Srivastava wrote: Hi team, My Cluster configs: DC1 - 9 nodes, DC2 - 4 nodes Node configs: 12 core x 96GB ram x 1 TB HDD Repair params: -full -pr -local Cassandra version: 3.11.4 I'm ru

Re: Issue with native protocol

2021-07-29 Thread Bowen Song
Can you please run the following query on the problematic node? select peer, host_id, release_version from system.peers where release_version < '3.0.0' allow filtering; I suspect you have a "ghost" 2.x node in the system.peers table on this problematic node. The "ghost" node will not sh

Re: [RELEASE] Apache Cassandra 3.11.11 released

2021-07-30 Thread Bowen Song
Hello, I'm getting the following error when trying to run the 'cqlsh' command on a Cassandra server after upgrading from 3.11.9 to 3.11.11 on CentOS 7: $ cqlsh Traceback (most recent call last):   File "/usr/bin/cqlsh.py", line 169, in     from cqlshlib import cql3handling, cql

Re: Reduce num_tokens on single node cluster

2021-07-30 Thread Bowen Song
Do you ever intend to add nodes to this single node cluster? If not, I don't see the number of tokens matter at all. However, if you really want to change it and don't mind downtime, you can do this: 1. make a backup of the data 2. completely destroy the node with all data in it 3.

Re: Reduce num_tokens on single node cluster

2021-07-30 Thread Bowen Song
Since you have only one node, sstableloader is unnecessary. Copying/moving the data directory back to the right place and restarting Cassandra or running 'nodetool refresh' is sufficient. Do not restore the 'system' keyspace, but do restore the other system keyspaces, such as 'system_auth' and 'system_

Re: Large number of tiny sstables flushed constantly

2021-08-12 Thread Bowen Song
Hello Jiayong, Using multiple disks in a RAID0 for Cassandra data directory is not recommended. You will get better fault tolerance and often better performance too with multiple data directories, one on each disk. If you stick with RAID0, it's not 4 disks, it's 1 from Cassandra's point of

Re: Large number of tiny sstables flushed constantly

2021-08-13 Thread Bowen Song
meters in the yaml that we could tune for this. Thanks again, Jiayong Sun On Thursday, August 12, 2021, 04:55:51 AM PDT, Bowen Song wrote: Hello Jiayong, Using multiple disks in a RAID0 for Cassandra data directory is not recommended. You will get better fault tolerance and often bett

Re: Large number of tiny sstables flushed constantly

2021-08-13 Thread Bowen Song
rkload but its Partition Size is around 30 MB. There are a couple small tables with the Max Partition Size over several hundreds of MB but their total data size just about a few GB. Any thoughts? Thanks, Jiayong On Friday, August 13, 2021, 03:32:45 AM PDT, Bowen Song wrote: Hi Jiayong,

Re: Large number of tiny sstables flushed constantly

2021-08-13 Thread Bowen Song
essages at beginning of this email thread). Thanks for all your thoughts and I really appreciate. Thanks, Jiayong Sun On Friday, August 13, 2021, 01:36:21 PM PDT, Bowen Song wrote: Hi Jiayong, That doesn't really match the situation described in the SO question. I suspected it was rela

Re: Large number of tiny sstables flushed constantly

2021-08-15 Thread Bowen Song
this issue has been occurring causing many node lost gossip. We have to set up a daily cron job to clear the older hints from disk, but not sure if this would hurt data inconsistency among nodes and DCs. Thoughts? Thanks, Jiayong Sun On Friday, August 13, 2021, 03:39:44 PM PDT, Bowen Song wrote

Re: Large number of tiny sstables flushed constantly

2021-08-16 Thread Bowen Song
e depending on many factors but I was wondering if there is any kind of rule of thumb? Thanks, Jiayong Sun On Sunday, August 15, 2021, 05:58:11 AM PDT, Bowen Song wrote: Hi Jiayong, Based on this statement: "We see the commit logs switched about 10 times per minutes" I'd als

Re: Large number of tiny sstables flushed constantly

2021-08-16 Thread Bowen Song
remove it due to some other issue. We have started using num_token: 16 as standard for any new clusters. Thanks, Jiayong Sun On Monday, August 16, 2021, 02:46:24 AM PDT, Bowen Song wrote: Hello Jiayong, "There is only one major table taking 90% of writes." In your case, what ma

Re: [UPGRADATION] Apache Cassandra from version 3.0.9 to 4.0.0

2021-09-06 Thread Bowen Song
Hello Ashish, I'm slightly worried about this: "Since I won't be needing physical DC anymore so instead of upgrading it I will simply discard that DC" This sounds like you are planning to add GCP 3.x to the existing cluster, and upgrade GCP to 4.0, then decommission the existing DC without

Re: Unable to Gossip

2021-09-10 Thread Bowen Song
Hello Joe, These logs indicate the clocks are out of sync (by over 4.2 hours) between the new node and the seed nodes: INFO  [ScheduledTasks:1] 2021-09-10 11:14:26,567 MessagingMetrics.java:206 - GOSSIP_DIGEST_SYN messages were dropped in last 5000 ms: 0 internal and 1 cross node. Me

Re: Data read size from table

2021-09-11 Thread Bowen Song
Hello Ashish, I don't think Cassandra exposes any metrics like that via the JMX interface (which is where the Prometheus JMX exporter is getting the metrics from). However, you do have a few other options to achieve the same goal, such as request tracing (nodetool settraceprobability), slow

Re: Hints streaming of dead node

2021-09-12 Thread Bowen Song
Hi Roy, I assume you are talking about "nodetool removenode", not "nodetool decommission". In the case of "removenode", the SSTables are streamed from the remaining live replica nodes (if RF>1), not the dead node. Because of that, the hints for the dead node are irrelevant. I hope that answer

Re: Hints streaming of dead node

2021-09-12 Thread Bowen Song
is it the same for node replacement? On Sun, Sep 12, 2021, 21:21 Bowen Song wrote: Hi Roy, I assume you are talking about "nodetool removenode", not "nodetool decommission". In the case of "removenode", the SSTables are streamed from the remaini

Re: COUNTER timeout

2021-09-15 Thread Bowen Song
Check the logs on the Cassandra servers first. Many different things can cause the same result, and you will have to dig in deeper to discover the true cause. On 14/09/2021 23:55, Joe Obernberger wrote: I'm getting a lot of the following errors during ingest of data: com.datastax.oss.driver.a

Re: COUNTER timeout

2021-09-15 Thread Bowen Song
Well, the log says cross node timeout, latency a bit over 44 seconds. Here are a few of the most likely causes: 1. The clocks are not in sync - please check the time on each server, and ensure NTP client is running on all Cassandra servers 2. Long stop the world GC pauses - please check the GC logs an

Re: High disk usage casaandra 3.11.7

2021-09-17 Thread Bowen Song
Assuming your total disk space is a lot bigger than 50GB in size (accounting for disk space amplification, commit log, logs, OS data, etc.), I would suspect the disk space is being used by something else. Have you checked that the disk space is actually being used by the cassandra data director

Re: High disk usage casaandra 3.11.7

2021-09-17 Thread Bowen Song
..i removed the same .. its purely data.. On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote: Assuming your total disk space is a lot bigger than 50GB in size (accounting for disk space amplification, commit log, logs, OS data, etc.), I would suspect

Re: High disk usage casaandra 3.11.7

2021-09-17 Thread Bowen Song
ol decommission/removenode and added back one node and it came back to 22Gb. Can't run major compaction as not much space is left. On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote: Okay, so how big exactly is the data on disk? You said removing and adding a new

Re: TWCS on Non TTL Data

2021-09-17 Thread Bowen Song
If you use TWCS with TTL, the old SSTables won't be compacted, the entire SSTable file will get dropped after it expires. I don't think you will need to manage the compaction or cleanup at all, as they are automatic. There's no space limit on the table holding the near-term data other than the

Re: High disk usage casaandra 3.11.7

2021-09-17 Thread Bowen Song
hanks. Application deletes data every 48hrs of older data. Auto compaction works but as space is full ..errorlog only says not enough space to run compaction. On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote: If major compaction is failing due to disk space c

Re: High disk usage casaandra 3.11.7

2021-09-18 Thread Bowen Song
, 2021, Abdul Patel <abd786...@gmail.com> wrote: 48hrs deletion is deleting data older than 48hrs. LCS was used as it's more of a write once and read many application. On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote: Congrat

Re: TWCS on Non TTL Data

2021-09-20 Thread Bowen Song
nt is that I am NOT using TTL and I need to keep the data, so when I do the switch to TWCS, will the old files be recompacted or they will remain the same and only new data coming in will use TWCS? *From:* Bowen Song *Sent:* Friday, September 17, 2021 9:04 PM *To:* user@cassandra.apache.org *Su

Re: Does Open-Source Cassandra 4 support mutual-TLS ?

2021-09-22 Thread Bowen Song
Out of curiosity, I have two further questions. 1. I know the client can *optionally* provide a certificate for the TLS handshake, but is it possible to *require* the client to provide a certificate? 2. Does Cassandra check that the username matches the client certificate? E.g. TLS client c

Re: Does Open-Source Cassandra 4 support mutual-TLS ?

2021-09-22 Thread Bowen Song
CQL username, will continue to work as usual. Cheers, Bowen On 22/09/2021 15:40, Dinesh Joshi wrote: On Sep 22, 2021, at 2:02 AM, Bowen Song wrote: Out of curiosity, I have two further questions. 1. I know the client can *optionally* provide a certificate for the TLS handshake, but is it po

Re: Latest Supported RedHat Linux version for Cassandra 3.11

2021-09-27 Thread Bowen Song
The document page looks pretty old. RHEL 7.8, 7.9 and 8 and Debian 10 have been available for some years, but the page is still referencing RHEL 7.7 and Debian 9. However, the page has made the point: "Cassandra runs on a wide array of Linux distributions including (but not limited to)" that

Re: TTL and disk space releasing

2021-10-06 Thread Bowen Song
What is the table's default TTL? (Note: it may be different than the TTL of the data in the table) On 06/10/2021 09:42, Michel Barret wrote: Hello, I try to use cassandra (3.11.5) with 8 nodes (in single datacenter). I use one simple table, all data are inserted with 31 days TTL (the data

Re: TTL and disk space releasing

2021-10-06 Thread Bowen Song
ng to have in that table. On 06/10/2021 16:34, Michel Barret wrote: Hi, it's not set before. I set it to ensure all data have a ttl. Thanks for your help. On 06/10/2021 at 13:47, Bowen Song wrote: What is the table's default TTL? (Note: it may be different than the TTL of the da

Re: Problem with www.apache.org/dist/cassandra/KEYS?

2021-10-07 Thread Bowen Song
Well... $ wget -qO - https://www.apache.org/dist/cassandra/KEYS | wc -c 0 The first part of the command is clearly not working. Removing the "-q", and "wget" shows the error message: $ wget -O - https://www.apache.org/dist/cassandra/KEYS --2021-10-07 12:03:34-- https://www.apache.

Re: Problem with www.apache.org/dist/cassandra/KEYS?

2021-10-07 Thread Bowen Song
The "workaround" (it's a proper fix) is to upgrade the libgnutls30 package. On 07/10/2021 15:24, rhys.campb...@swisscom.com wrote: Thanks for that. This command is actually spat out by the apt_key ansible module….  but I'm sure there's a way around it. Cheers, R

Re: Trouble After Changing Replication Factor

2021-10-11 Thread Bowen Song
You have RF=3 and both read & write CL=1, which means you are asking Cassandra to give up strong consistency in order to gain higher availability and perhaps slightly faster speed, and that's what you get. If you want to have strong consistency, you will need to make sure (read CL + write CL) > R
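The usual rule of thumb behind this advice is that a read is guaranteed to see the latest acknowledged write only when the number of replicas answering the read plus the number acknowledging the write is greater than the replication factor. A minimal Python sketch of that arithmetic follows; the function names are hypothetical and the arguments are plain replica counts, not driver constants.

```python
def quorum(rf: int) -> int:
    """Number of replicas in a quorum for a given replication factor."""
    return rf // 2 + 1


def is_strongly_consistent(read_replicas: int, write_replicas: int, rf: int) -> bool:
    """True when every read set must overlap every write set."""
    return read_replicas + write_replicas > rf


if __name__ == "__main__":
    rf = 3
    # The setup described in the message: read CL=1 and write CL=1.
    print(is_strongly_consistent(1, 1, rf))
    # QUORUM reads combined with QUORUM writes.
    print(is_strongly_consistent(quorum(rf), quorum(rf), rf))
```

With RF=3, one replica for reads plus one for writes gives 2, which is not greater than 3, so a read may land on a replica that has not yet received the write. Two plus two gives 4, so a quorum read always overlaps a quorum write.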

Re: Trouble After Changing Replication Factor

2021-10-12 Thread Bowen Song
when the data is requested from the 3rd (new) replica, it is not there and an empty record is returned with read CL1. What can I do to force this data to be synced to all replicas as it should be, so that a read CL1 request will actually return a correct result? Thanks *From:* Bowen Song *Sent:* M

Re: Stop long running queries in Cassandra 3.11.x or Cassandra 4.x

2021-10-12 Thread Bowen Song
That will depend on whether you have cross_node_timeout enabled. However, I have to point out that setting the timeout to 15ms is perhaps not a good idea, the JVM GC can easily cause a lot of timeouts. On 12/10/2021 18:20, S G wrote: ok, when a coordinator node sends timeout to the client, does it mea

Re: TWCS not cleaning up data as fast as expected

2021-10-15 Thread Bowen Song
I noticed the table default TTL is 1 day, but the SSTable's max TTL is 3 days. Any idea why this would happen? Does any INSERT/UPDATE statement have a TTL longer than the table's default? The min timestamp is also very odd, it's in 2017. Do you insert data using very old timestamps? On 15/10/2

Re: update cassandra.yaml file on number of cluster nodes

2021-10-15 Thread Bowen Song
We have Cassandra on bare-metal servers, and we manage our servers via Ansible. In this use case, we create an Ansible playbook to update the servers one by one, change the cassandra.yaml file, restart Cassandra, and wait for it to finish the restart, and then wait for a few minutes before movi

Re: How to find traffic profile per client on a Cassandra server?

2021-10-25 Thread Bowen Song
For older versions (Cassandra < 4.0) you can use `nodetool settraceprobability` to get a sample of queries (or all queries, if your cluster can handle the extra load). For newer versions (>= 4.0) you can use the above, or the new Full Query Logging

Re: How does a node decide where each of its vnodes will be replicated to?

2021-11-09 Thread Bowen Song
Just need to correct one thing: the num_tokens default *WAS* 256. The new default is 16 in Cassandra 4. On 09/11/2021 07:17, manish khandelwal wrote: Just to add on to your response:

Re: gc throughput

2021-11-16 Thread Bowen Song
Do you have any performance issues, such as long STW GC pauses or high p99.9 latency? If not, then you shouldn't tune the GC for the sake of it. However, if you do have performance issues related to GC, regardless of what the GC metric you are looking at is saying, you will need to address the iss

Re: Hint file getting stuck

2021-11-16 Thread Bowen Song
I think your problem is likely not the "stuck" hints, but the write requests in them. The reason those write requests ended up in the hint file is because they have failed before. They are likely to fail again when they are retried if the failure was caused by the write requests themselves in

Re: Anyone connecting the Cassandra on a server

2021-11-19 Thread Bowen Song
This could be two questions with different answers: 1. Is there anyone / who is connected to the Cassandra server right now? Use the netstat or ss command and check the active TCP connections on native port (default is 9042) 2. Is there anyone / who is connecting to the Cassandra servers be

Re: Added node - now queries time out

2021-12-03 Thread Bowen Song
The load on the new server looks clearly wrong. Are you sure this node has fully bootstrapped / rebuilt? If not, the large amount of streaming activity triggered by read repair may be enough to cause timeouts. Please check the new server's log and make sure it did not fail any streaming session

Re: Node failed after drive failed

2021-12-11 Thread Bowen Song
Hi Joe, In case of a single disk failure, you should not remove the data directory from the cassandra.yaml file. Instead, you should replace the failed disk with a new empty disk. See https://docs.datastax.com/en/cassandra-oss/3.x/cassandra/operations/opsRecoverUsingJBOD.html for the steps.

Re: Node failed after drive failed

2021-12-11 Thread Bowen Song
Hi Joss, To unsubscribe from this mailing list, please send an email to user-unsubscr...@cassandra.apache.org, not the mailing list itself (user@cassandra.apache.org). On 09/12/2021 16:14, Joss wrote: unsubscribe On Mon, 6 Dec 2021 at 14:12, Joe Obernberger wrote: Hi All - one node

Re: Node failed after drive failed

2021-12-13 Thread Bowen Song
'll just delete all the cassandra data on that node and have it rejoin as a new node. -Joe On 12/11/2021 3:44 PM, Bowen Song wrote: Hi Joe, In case of a single disk failure, you should not remove the data directory from the cassandra.yaml file. Instead, you should replace the

Re: Log4j vulnerability

2021-12-13 Thread Bowen Song
Do you mean the log4j-over-slf4j-#.jar? If so, please read: http://slf4j.org/log4shell.html On 13/12/2021 23:48, Rahul Reddy wrote: Hello, I see this jar  log4j-over-slf4j-1.7.7.jar does it have any impact on it? Why that jar is used for ? On Sat, Dec 11, 2021 at 12:45 PM Brandon Willia

Re: about memory problem in write heavy system..

2022-01-11 Thread Bowen Song
Many thanks. On 10 Jan 2022 at 8:53 PM, Bowen Song wrote: Anything special about the table you stopped writing to? I'm wondering how you determined that the table was the cause of the memory usage increase. > For the latest version (3.11.11) upgrade, can the two versions coexist in the clust

Re: Hanging repairs in Cassandra

2022-01-18 Thread Bowen Song
The entry in the debug.log is not specific to a repair session, and it could also be caused by reasons other than network connectivity issue, such as long STW GC pauses. I usually don't start troubleshooting an issue from the debug log, as it can be rather noisy. The system.log is a better star

Re: Hanging repairs in Cassandra

2022-01-18 Thread Bowen Song
etween. Regards Manish On Tue, Jan 18, 2022 at 4:39 PM Bowen Song wrote: The entry in the debug.log is not specific to a repair session, and it could also be caused by reasons other than network connectivity issue, such as long STW GC pauses. I usually don't start troublesh

Re: Hanging repairs in Cassandra

2022-01-18 Thread Bowen Song
-acknowledged Regards Manish On Tue, Jan 18, 2022, 18:18 Bowen Song wrote: Keep reading the log on the initiator and the node sending the merkle tree, does anything follow that? FYI, not all log entries have the repair ID in them, therefore please read the relevant logs in the chronological

Re: Hanging repairs in Cassandra

2022-01-19 Thread Bowen Song
" parameters. Since then I've switched to use Cassandra Reaper and have never had similar issues. On 19/01/2022 02:22, manish khandelwal wrote: Agree with you on that. Just wanted to highlight that I am experiencing the same behavior. Regards Manish On Tue, Jan 18, 2022, 22:50 B

Re: Cassandra 4.0 hanging on restart

2022-01-19 Thread Bowen Song
Nothing obvious from the logs you posted. Generally speaking, replaying commit log is often the culprit when a node takes a long time to start. I have seen many nodes with large memtable and commit log size limit spending over half an hour replaying the commit log. I usually do a "nodetool flu
