Unexplained stuck memtable flush

2024-11-05 Thread Bowen Song via dev
Hi all, We have a cluster running Cassandra 4.1.1. We are seeing the memtable flush randomly getting stuck. This has happened twice in the last 10 days, to two different nodes in the same cluster. This started to happen after we enabled CDC, and each time it got stuck, there was at least one

Re: 【DISCUSS】The configuration of Commitlog archiving

2024-08-30 Thread Bowen Song via dev
I'm not sure what the concern is here. Is it a malicious user exploiting this? Or human error with unintended consequences? For a malicious user, in order to exploit this, an attacker needs to be able to write to the config file. The config file on Linux by default is owned by the root user and

Re: [DISCUSS] inotify for detection of manually removed snapshots

2024-08-09 Thread Bowen Song via dev
same if they had manually deleted some SSTable files (they shouldn't). On 09/08/2024 11:16, Štefan Miklošovič wrote: We could indeed do that. Does your suggestion mean that there should not be a problem with caching it all once explicitly stated like that? On Fri, Aug 9, 2024 at 12:01 PM

Re: [DISCUSS] inotify for detection of manually removed snapshots

2024-08-09 Thread Bowen Song via dev
Has anyone considered simply updating the documentation saying this? "Removing the snapshot files directly from the filesystem may break things. Always use the `nodetool` command or JMX to remove snapshots." On 09/08/2024 09:18, Štefan Miklošovič wrote: If we consider caching it all to be too

Re: [DISCUSS] Adding support for BETWEEN operator

2024-05-14 Thread Bowen Song via dev
Ranged update sounds like a disaster for compaction and read performance. Imagine compacting or reading some SSTables in which a large number of overlapping but non-identical ranges were updated with different values. Just thinking about it gives me a headache. Ranged delete is much sim

Re: Schema Disagreement Issue for Cassandra 4.1

2024-04-01 Thread Bowen Song via dev
It sounds worthy of a Jira ticket. On 01/04/2024 06:23, Cheng Wang via dev wrote: Hello, I have recently encountered a problem concerning schema disagreement in Cassandra 4.1. It appears that the schema versions do not reconcile as expected. The issue can be reproduced by following these st

Re: Default table compression defined in yaml.

2024-03-19 Thread Bowen Song via dev
I believe the `foobar_in_kb: 123` format in the cassandra.yaml file is deprecated, and the new format is `foobar: 123KiB`. Is there a need to introduce new settings entries with the deprecated format only to be removed at a later version? On 18/03/2024 14:39, Claude Warren, Jr via dev wrote:

Re: [DISCUSS] What SHOULD we do when we index an inet type that is ipv4?

2024-03-07 Thread Bowen Song via dev
em in two separate columns on the same table, and none of this matters. If they are mixed, it feels like we should at least have the option to make them comparable, kind of like we have the option to make text case-insensitive or unicode normalized right now. On Wed, Mar 6, 2024 at 4:35 PM Bow

Re: [DISCUSS] What SHOULD we do when we index an inet type that is ipv4?

2024-03-06 Thread Bowen Song via dev
Technically, 127.0.0.1 (IPv4) is not 0:0:0:0:0::7f00:0001 (IPv6), but their values are equal. Just like 1.0 (double) is not 1 (int), but their values are equal. So, what is the meaning of "=" in CQL? On 06/03/2024 21:36, David Capwell wrote: So, was reviewing SAI and found we convert ipv4
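The distinction drawn in the message above, between two addresses being the same type and merely having equal values, can be illustrated with a small sketch. It uses Python's standard `ipaddress` module and assumes the IPv6 form in question is the IPv4-mapped address `::ffff:127.0.0.1`; the helper `same_value` is hypothetical and is not how Cassandra or SAI compares `inet` values.

```python
import ipaddress

# Assumption: the IPv6 counterpart is the IPv4-mapped form ::ffff:127.0.0.1.
v4 = ipaddress.ip_address("127.0.0.1")
mapped = ipaddress.ip_address("::ffff:127.0.0.1")


def same_value(a, b):
    """Hypothetical value-based comparison: unwrap IPv4-mapped IPv6 first."""
    def normalize(addr):
        # ipv4_mapped is None unless the address is in ::ffff:0:0/96.
        if isinstance(addr, ipaddress.IPv6Address) and addr.ipv4_mapped is not None:
            return addr.ipv4_mapped
        return addr
    return normalize(a) == normalize(b)


# Type-aware equality keeps the two address families distinct.
strict_equal = (v4 == mapped)
# Value-based equality treats the mapped address as its IPv4 equivalent.
value_equal = same_value(v4, mapped)
print(strict_equal, value_equal)
```

Which of the two behaviours "=" should have in CQL is exactly the open question of the thread; the sketch only shows that both are implementable and give different answers.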

Re: [DISCUSS] New CQL command/option for listing roles with superuser privileges

2024-02-29 Thread Bowen Song via dev
I believe that opens the door to this kind of situation: 1. create superuser role "role1" 2. create superuser role "role2" 3. add "role2" to members of "role1" 4. remove "role2" from the members of "role1" 5. "role2" now inexplicitly lost the superuser state TBH, my preferred solution is making

Re: Table name length limit in Cassandra

2024-02-22 Thread Bowen Song via dev
Hi Gaurav, I would be less worried about performance issues than interoperability issues. Other tools/client libraries do not expect this, and it may cause them to behave unexpectedly (e.g. truncating/crashing/...). If you can, try to get rid of common prefix/suffix, and use abbreviations where po

Re: [DISCUSS] Add subscription mangement instructions to user@, dev@ message footers

2024-01-22 Thread Bowen Song via dev
eaking, that's not forwarding, but sending a new email with the original email's content, subject, sender name (but not address), etc. information copied over. I believe the mailing list software this mailing list is using also supports such a feature. For example, this email's "

Re: [DISCUSS] Add subscription mangement instructions to user@, dev@ message footers

2024-01-22 Thread Bowen Song via dev
Adding a footer or modifying the email content in any way will break the DKIM signature of the email if it has one. Since the mailing list's mail server will forward the emails to the recipients, the SPF check will fail too. Failing the DKIM signature & SPF check will result in the email likely

Re: [DISCUSS] Maintain backwards compatibility after dependency upgrade in the 5.0

2023-06-28 Thread Bowen Song via dev
IMHO, anyone upgrading software between major versions should expect to see breaking changes. Introducing breaking or major changes is the whole point of bumping major version numbers. Since the library upgrade needs to happen sooner or later, I don't see any reason why it should not happen in

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-06 Thread Bowen Song via dev
> /I'm quite happy to leave things as they are if that is the consensus./ +1 to the above On 06/04/2023 14:54, Mike Adamson wrote: My apologies. I started this discussion off the back of a usability discussion around new user accessibility to Cassandra and the premise that there is an initial

Re: [DISCUSS] Introduce DATABASE as an alternative to KEYSPACE

2023-04-04 Thread Bowen Song via dev
I personally prefer to use the name "keyspace", because it avoids the confusion between the "database software/server" and the "collection of tables in a database". "An SQL database" can mean different things in different contexts, but "a Cassandra keyspace" always means the same thing. On 04/0

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-23 Thread Bowen Song via dev
simple for him to type ten times with different table names which I think at first Only set with argument ks keyspace name is enough. When we just want to see eight tables in the ks, the user should just type eight table name which ignore two table may be enough. Bowen Song via dev, on 23 March 2023,

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-23 Thread Bowen Song via dev
ers specifying option -ks and -tbs, but tablestats don't. On Wed, 22 Mar 2023 at 23:35, Josh McKenzie wrote: Agree w/Bowen. I think the straight forward simplicity of "clear inclusion and exclusion semantics, default to include all in scope excepting things that are explicitly ignored

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-22 Thread Bowen Song via dev
2 AM, Josh McKenzie wrote: We could also consider augmenting the tool with new named arguments with the functionality you described and leave the positional usage intact. On Thu, Mar 16, 2023, at 6:43 AM, Bowen Song via dev wrote: The documented command options ar

Re: [DISCUSS] Change the useage of nodetool tablehistograms

2023-03-16 Thread Bowen Song via dev
The documented command options are: nodetool tablehistograms [<keyspace> <table> | <keyspace.table>] That means one parameter will be treated as dot separated keyspace and table. Alternatively, two parameters will be treated as the keyspace and table respectively. To remain compatible with the documented behaviour, my s
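The two positional forms described in the message above (a single dot-separated argument, or separate keyspace and table arguments) can be sketched as follows. This is a minimal illustration in Python, not nodetool's actual parsing code; the function name `parse_table_args` and its error messages are hypothetical.

```python
def parse_table_args(args):
    """Return (keyspace, table) from the positional arguments.

    Hypothetical helper: one argument is read as 'keyspace.table',
    two arguments are read as keyspace and table respectively.
    """
    if len(args) == 1:
        keyspace, sep, table = args[0].partition(".")
        if not sep or not keyspace or not table:
            raise ValueError("expected <keyspace.table>, got %r" % args[0])
        return keyspace, table
    if len(args) == 2:
        return args[0], args[1]
    raise ValueError("expected <keyspace> <table> or <keyspace.table>")


# Both documented forms resolve to the same pair.
dotted = parse_table_args(["ks1.events"])
separate = parse_table_args(["ks1", "events"])
print(dotted, separate)
```

Any new named options would have to coexist with this positional behaviour for the documented usage to keep working, which is the compatibility concern raised in the thread.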

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Bowen Song via dev
icitly having speculative execution handle the retry against another replica while repair of that range happens. But that feels suboptimal to me when a better framework is on the horizon. -- Abe On Mar 9, 2023, at 8:23 AM, Bowen Song via dev wrote: Hi Jeremiah, I'm fully aware of that,

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Bowen Song via dev
he retry against another replica while repair of that range happens. But that feels suboptimal to me when a better framework is on the horizon. -- Abe On Mar 9, 2023, at 8:23 AM, Bowen Song via dev wrote: Hi Jeremiah, I'm fully aware of that, which is why I said that deleting the affe

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-09 Thread Bowen Song via dev
tombstone covering).  Then you can stream from the other nodes to get the data back. -Jeremiah On Mar 8, 2023, at 7:24 AM, Bowen Song via dev wrote: At the moment, when a read error, such as unrecoverable bit error or data corruption, occurs in the SSTable data files, regardless of the dis

Re: [DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread Bowen Song via dev
's smoke there's fire - I wouldn't expect a drive reporting uncorrectable errors / filesystem corruption to be long for this world. Can you say more about the scenarios you have in mind? – Scott On Mar 8, 2023, at 5:24 AM, Bowen Song via dev wrote: At the moment, when a read error,

[DISCUSS] Enhanced Disk Error Handling

2023-03-08 Thread Bowen Song via dev
At the moment, when a read error, such as unrecoverable bit error or data corruption, occurs in the SSTable data files, regardless of the disk_failure_policy configuration, manual (or to be precise, external) intervention is required to recover from the error. Commonly, there are two approaches to

Re: [PROPOSAL] Moving deb/rpm repositories from downloads.apache.org to apache.jfrog.io

2022-08-11 Thread Bowen Song via dev
I see. In that case, sticking to the original plan makes more sense. On 11/08/2022 22:46, Mick Semb Wever wrote: We should have the new domain/URL created before the final move is made, and redirecting to the existing download.apache.org for the time be

Re: [PROPOSAL] Moving deb/rpm repositories from downloads.apache.org to apache.jfrog.io

2022-08-11 Thread Bowen Song via dev
level of (superfluous) signing on top of that, which we do not currently have. Kind Regards, Brandon On Thu, Aug 11, 2022 at 4:20 PM Bowen Song via dev wrote: In that case, the move from signed RPM/DEB to unsigned can be quite problematic for some enterprise users. On 11/08/2022 22:16, Jerem

Re: [PROPOSAL] Moving deb/rpm repositories from downloads.apache.org to apache.jfrog.io

2022-08-11 Thread Bowen Song via dev
.  See the ASF release policy for more information. https://www.apache.org/legal/release-policy.html#compiled-packages On Aug 11, 2022, at 4:12 PM, Bowen Song via dev wrote: I'm a bit unclear what's the scope of this change. Is it limited to the "*-bin.tar.gz" files on

Re: [PROPOSAL] Moving deb/rpm repositories from downloads.apache.org to apache.jfrog.io

2022-08-11 Thread Bowen Song via dev
I'm a bit unclear what's the scope of this change. Is it limited to the "*-bin.tar.gz" files only? I would assume the RPM/DEB packages are considered as parts of the "official releases", and aren't affected by this change. Am I right? On 11/08/2022 21:59, Mick Semb Wever wrote: > /Thes

Re: [PROPOSAL] Moving deb/rpm repositories from downloads.apache.org to apache.jfrog.io

2022-08-11 Thread Bowen Song via dev
> /These repositories and their binaries are "convenience binaries" and not the official Cassandra source binaries/ Then where are the official binaries? On 11/08/2022 21:40, Mick Semb Wever wrote: The proposal is to move our official debian and redhat repositories from downloads.apache.org

Re: Unsubscribe

2022-08-09 Thread Bowen Song via dev
To unsubscribe from this mailing list, you'll need to send an email to dev-unsubscr...@cassandra.apache.org On 09/08/2022 12:52, Schmidtberger, Brian M. (STL) wrote: unsubscribe + BRIAN SCHMIDTBERGER Software Engineering Senior Advisor, Core Engineering, Express Scripts M: 785.766.7450 EV

Re: [DISCUSS] Deprecate and remove resumable bootstrap and decommission

2022-08-03 Thread Bowen Song via dev
tigated by the much faster bootstrapping times without the correctness risks. On Wed, Aug 3, 2022, at 6:21 PM, Bowen Song via dev wrote: That would have to be assessed on a case by case basis. * When the code doesn't delete data, which means there's a zero probability of resurrecting dele

Re: [DISCUSS] Deprecate and remove resumable bootstrap and decommission

2022-08-03 Thread Bowen Song via dev
get these nodes to finish joining the cluster. Was this before or after the addition of zero copy streaming? The premise is that the pain point resumable bootstrap targets is mitigated by the much faster bootstrapping times without the correctness risks. On Wed, Aug 3, 2022, at 6:21 PM, Bowen So

Re: [DISCUSS] Deprecate and remove resumable bootstrap and decommission

2022-08-03 Thread Bowen Song via dev
't). On 03/08/2022 23:11, Jeff Jirsa wrote: The hypothetical concern described is around potential data resurrection - would you still use resumable bootstrap if you knew that data deleted during those STW pauses was improperly resurrected? On Wed, Aug 3, 2022 at 2:40 PM Bowen Song via

Re: [DISCUSS] Deprecate and remove resumable bootstrap and decommission

2022-08-03 Thread Bowen Song via dev
I have benefited from the resumable bootstrap before, and I'm in favour of keeping the feature around. I've had streaming failures due to long STW GC pauses on some bootstrapping nodes, and I had to resume the bootstrap once or twice in order to get these nodes to finish joining the cluster. The

Re: [DISCUSS] Improve Commitlog write path

2022-07-26 Thread Bowen Song via dev
es for too long. With lower throughput large system can ingest more data. Does it make sense ? Thanks, Amit *From:* Bowen Song via dev *Sent:* Friday, July 22, 2022 4:37 PM *To:* dev@cassandra.apache.org *Subject:* Re: [DISCUSS] Improve Commitlog write path [CAUTION: External Email] Hi

Re: [DISCUSS] Improve Commitlog write path

2022-07-22 Thread Bowen Song via dev
reflecting in score. Do you think multi-threading is good to have now ? else please suggest if I need to test further. Thanks, Amit *From:* Bowen Song via dev *Sent:* Wednesday, July 20, 2022 4:13 PM *To:* dev@cassandra.apache.org *Subject:* Re: [DISCUSS] Improve Commitlog write path [CAUTI

Re: [DISCUSS] Improve Commitlog write path

2022-07-20 Thread Bowen Song via dev
From my past experience, the bottleneck for insert heavy workload is likely to be compaction, not commit log. You initially may see commit log as the bottleneck when the table size is relatively small, but as the table size increases, compaction will likely take its place and become the new bot