Re: Cassandra commitlog corruption on hard shutdown

2022-04-04 Thread Leon Zaruvinsky
n (Cassandra write performance >2ms did not seem to be a bottleneck) - We've had *zero* commitlog corruption errors since we rolled this out to our fleet 6 months ago!! Previously using batch commitlog, we faced 1-2 corruptions per month. Cheers, Leon On Tue, Aug 3, 2021 at 11:39 PM Leon Za

Re: Faster bulk keyspace creation

2022-03-09 Thread Leon Zaruvinsky
de, you don't need to worry about schema disagreement it may > cause, as the server side internally will ensure the consistency of the > schema. > > On 09/03/2022 18:35, Leon Zaruvinsky wrote: > > Hey folks, > > > > A step in our Cassandra restore process is to re

Faster bulk keyspace creation

2022-03-09 Thread Leon Zaruvinsky
Hey folks, A step in our Cassandra restore process is to re-create every keyspace that existed in the backup in a brand new cluster. Because these creations are sequential, and because we have _a lot_ of keyspaces, this ends up being the slowest part of our restore. We already have some optimiza

Re: Cassandra commitlog corruption on hard shutdown

2021-08-03 Thread Leon Zaruvinsky
0s as was suggested later, then >> https://issues.apache.org/jira/browse/CASSANDRA-11995 addresses at least >> one cause/case of that particular bug. >> >> >> >> On Mon, Jul 26, 2021 at 3:11 PM Leon Zaruvinsky >> wrote: >> >>> And for completenes

Re: Cassandra commitlog corruption on hard shutdown

2021-07-26 Thread Leon Zaruvinsky
es at least > one cause/case of that particular bug. > > > > On Mon, Jul 26, 2021 at 3:11 PM Leon Zaruvinsky > wrote: > >> And for completeness, a sample stack trace: >> >> ERROR [2021-07-21T02:11:01.994Z] >> org.apache.cassandra.db.commitlog.CommitLo

Re: Cassandra commitlog corruption on hard shutdown

2021-07-26 Thread Leon Zaruvinsky
org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:741) On Mon, Jul 26, 2021 at 6:08 PM Leon Zaruvinsky wrote: > Currently we're using commitlog_batch: > > commitlog_sync: batch > commitlog_sync_batch_window_in_ms: 2 > commitlog_segment_size_in_mb: 32 > > d

Re: Cassandra commitlog corruption on hard shutdown

2021-07-26 Thread Leon Zaruvinsky
ct cassandra to cleanup and start cleanly, which version > are you running? > > > > On Mon, Jul 26, 2021 at 1:00 PM Leon Zaruvinsky > wrote: > >> Hi Cassandra community, >> >> We (and others) regularly run into commit log corruptions that are caused >

Cassandra commitlog corruption on hard shutdown

2021-07-26 Thread Leon Zaruvinsky
Hi Cassandra community, We (and others) regularly run into commit log corruptions that are caused by Cassandra, or the underlying infrastructure, being hard restarted. I suspect that this is because it happens in the middle of a commitlog file write to disk. Could anyone point me at resources /

Re: GC pauses way up after single node Cassandra 2.2 -> 3.11 binary upgrade

2020-10-27 Thread Leon Zaruvinsky
> Our JVM options are unchanged between 2.2 and 3.11 >> > > For the sake of clarity, do you mean: > (a) you're using the default JVM options in 3.11 and it's different to the > options you had in 2.2? > (b) you've copied the same JVM options you had in 2.2 to 3.11? > (b), which are the default opt

Re: GC pauses way up after single node Cassandra 2.2 -> 3.11 binary upgrade

2020-10-27 Thread Leon Zaruvinsky
Thanks Erick. Our JVM options are unchanged between 2.2 and 3.11, and we have disk access mode set to standard. Generally we’ve maintained all configuration between the two versions. Read throughput (rate, bytes read/range scanned, etc.) seems fairly consistent before and after the upgrade ac

GC pauses way up after single node Cassandra 2.2 -> 3.11 binary upgrade

2020-10-27 Thread Leon Zaruvinsky
Hi, I'm attempting an upgrade of Cassandra 2.2.18 to 3.11.6, but had to abort because of major performance issues associated with GC pauses. Details: 3 node cluster, RF 3, 1 DC ~2TB data per node Heap Size: 12G / New Size: 5G I didn't even get very far in the upgrade - I just upgraded a binary o

Re: Difference in num_tokens between Cassandra 2 and 3?

2020-08-06 Thread Leon Zaruvinsky
Thanks Erick, that confirms my suspicion. Cheers! On Thu, Aug 6, 2020 at 8:55 PM Erick Ramirez wrote: > C* 3.0 added a new algorithm that optimised the token allocation > (CASSANDRA-7032) [1] with allocate_tokens_for_keyspace in cassandra.yaml > (originally allocate_tokens_keyspace but renamed)

Difference in num_tokens between Cassandra 2 and 3?

2020-08-06 Thread Leon Zaruvinsky
Hi, I'm currently investigating an upgrade for our Cassandra cluster from 2.2 to 3.11, and as part of that would like to understand if there is any change in how the cluster behaves w.r.t number of tokens. For historical reasons, we have num_tokens set very high but want to make sure that this is

Re: Is deleting live sstable safe in this scenario?

2020-05-27 Thread Leon Zaruvinsky
> specific_hosts` >> >> On Wed, May 27, 2020 at 9:06 AM Nitan Kainth >> wrote: >> >>> I didn't get you Leon, >>> >>> But, the simple thing is just to follow the steps and you will be fine. >>> You can't run the repair if the

Re: Is deleting live sstable safe in this scenario?

2020-05-27 Thread Leon Zaruvinsky
being pedantic > here, that means stop the host, while it’s stopped repair the surviving > replicas, then bootstrap a replacement on top of the same tokens) > > > > > On May 26, 2020, at 4:46 PM, Leon Zaruvinsky > wrote: > > > > > > Hi all, > > >

Is deleting live sstable safe in this scenario?

2020-05-26 Thread Leon Zaruvinsky
Hi all, I'm looking to understand Cassandra's behavior in an sstable corruption scenario, and what the minimum amount of work is that needs to be done to remove a bad sstable file. Consider: 3 node, RF 3 cluster, reads/writes at quorum SSTable corruption exception on one node at keyspace1/table1/

Re: Cassandra build failing after Central Repository HTTPS

2020-01-15 Thread Leon Zaruvinsky
the error below. It's curious that Junit was published to the http but not https repository. Either way, thanks for the assistance in debugging! On Wed, Jan 15, 2020 at 8:19 PM Leon Zaruvinsky wrote: > I agree that something feels wonky on the Circle container. I was able to > succe

Re: Cassandra build failing after Central Repository HTTPS

2020-01-15 Thread Leon Zaruvinsky
I agree that something feels wonky on the Circle container. I was able to successfully build locally. I SSHed into the container, cleared out .m2/repository and still can't get it to build. Using ant 1.10.7 with environment: openjdk version "1.8.0_222" OpenJDK Runtime Environment (build 1.8.0

Re: Cassandra build failing after Central Repository HTTPS

2020-01-15 Thread Leon Zaruvinsky
-pick sha 63ff65a8dd3a31e500ae5ec6232f1f9eade6fa3d which was > committed after the 2.2.14 release tag. > > https://github.com/apache/cassandra/commit/63ff65a8dd3a31e500ae5ec6232f1f9eade6fa3d > > -- > Kind regards, > Michael > >> On 1/15/20 5:44 PM, Leon Zaruvinsky wrote: >> Hey

Cassandra build failing after Central Repository HTTPS

2020-01-15 Thread Leon Zaruvinsky
Hey all, I'm having trouble building Cassandra 2.2.14 on CircleCI since Central Repo has stopped supporting http.* * https://central.sonatype.org/articles/2020/Jan/15/501-https-required-error/ I've updated the build.properties and build.xml files to use https. However, it seems that the

Breaking up major compacted Sstable with TWCS

2019-07-11 Thread Leon Zaruvinsky
Hi, We are switching a table to run using TWCS. However, after running the alter statement, we ran a major compaction without understanding the implications. Now, while new sstables are properly being created according to the time window, there is a giant sstable sitting around waiting for expi