Suppose I have a Cassandra cluster where the data is skewed such that
one node has 40% more data than the other nodes, even though the tokens
were distributed uniformly when the cluster was created.
Now, to make the data uniform, do I have to recalculate the tokens and
assign them to the nodes in the cluster? The
Hi all,
We've noticed there is a new feature for streaming changed data to other
streaming services. Doc:
http://cassandra.apache.org/doc/latest/operating/cdc.html
We are evaluating the stability (and maturity) of this feature, and
possibly integrating it with Kafka (via its connector). H
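Per the CDC documentation linked above, change data capture is enabled per table; a minimal sketch (keyspace and table names here are hypothetical, and cdc_enabled must also be set in cassandra.yaml):

```sql
-- cassandra.yaml must have: cdc_enabled: true
-- (flushed CommitLog segments then land in cdc_raw_directory)

-- Enable CDC on an existing table:
ALTER TABLE my_keyspace.events WITH cdc = true;

-- Or at creation time:
CREATE TABLE my_keyspace.events_v2 (
    id uuid PRIMARY KEY,
    payload text
) WITH cdc = true;
```

A downstream consumer (e.g. a Kafka connector) would then read the CommitLog segments that Cassandra moves into the CDC directory.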
Are you using a load balancing policy? That sounds like you are only using
node2 as a coordinator.
You should choose a partition key that gives you a uniform distribution
of partitions amongst the nodes, and refrain from having too many wide
rows / a small number of very wide partitions. If your tokens are
already uniformly distributed, recalculating them in order to achieve a better
data load bala
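As an illustration of the advice above, a composite partition key can spread what would otherwise be a single huge partition across the token ring (the schema and names below are hypothetical, a sketch only):

```sql
-- Skewed: every reading for a popular sensor lands in one partition
CREATE TABLE sensors.readings_skewed (
    sensor_id    text,
    reading_time timestamp,
    value        double,
    PRIMARY KEY (sensor_id, reading_time)
);

-- More uniform: bucketing by day makes each (sensor_id, day) pair a
-- separate, bounded partition, distributed across the ring
CREATE TABLE sensors.readings (
    sensor_id    text,
    day          date,
    reading_time timestamp,
    value        double,
    PRIMARY KEY ((sensor_id, day), reading_time)
);
```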
Hi All,
I'm in the process of learning batch operations. Here is what I tried.
Executed a CQL query against the student table (student_id is the primary key):
select student_id, position, WRITETIME(class_id), WRITETIME(position)
FROM student WHERE student_id = 's123';
student_id position writetime(cl
Dear community,
I'd like some additional info on how to modify a keyspace's replication
strategy.
My Cassandra cluster is on AWS, Cassandra 2.1.15 using vnodes; the cluster's
snitch is configured as Ec2Snitch, but the keyspace the developers created uses
replication class SimpleStrategy with a replication factor of 3.
The original write's timestamp is larger (newer) than the timestamp you're
using in your batch. Cassandra uses timestamps for conflict resolution, so
the batch write will lose.
On Wed, Sep 13, 2017 at 11:59 AM Deepak Panda
wrote:
> Hi All,
>
> Am in the process of learning batch operations. Here is what I trie
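The conflict-resolution behaviour described above can be reproduced by writing with an explicitly older timestamp (the keyspace name and values here are hypothetical, a sketch only):

```sql
-- Row written now, stamped with the current microsecond timestamp
INSERT INTO school.student (student_id, position) VALUES ('s123', 5);

-- A batch that supplies an explicitly older timestamp: its cells lose
-- to the newer existing cells, so position remains 5
BEGIN BATCH USING TIMESTAMP 1
    UPDATE school.student SET position = 9 WHERE student_id = 's123';
APPLY BATCH;
```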
I am using RoundRobin:
cluster = Cluster.builder() // ... (socket stuff, pool option stuff ...)
    .withLoadBalancingPolicy(new RoundRobinPolicy())
    .addContactPoints(hosts)
    .build();
On 09/13/2017 03:02 AM, kurt greaves wrote:
Are you us
Hi all,
I was trying to configure the Cassandra code formatter and downloaded
IntelliJ-codestyle.jar from this link:
https://wiki.apache.org/cassandra/CodeStyle
After extracting this JAR, I was able to import codestyle/Default_1_.xml
into my project and formatting seemed to work.
However, I'm wo
Hi,
the steps are:
- ALTER KEYSPACE to change your replication strategy
- "nodetool repair -pr" on ALL nodes, or a full repair ("nodetool repair")
on enough replicas to distribute and rebalance your data to the replicas
- nodetool cleanup on every node to remove superfluous data
Please note that you'd be
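The first step above might look like this when moving from SimpleStrategy to NetworkTopologyStrategy (the keyspace and data-center names are hypothetical; with Ec2Snitch the data-center name is derived from the AWS region):

```sql
-- Hypothetical keyspace; 'us-east' is the DC name Ec2Snitch derives
-- from the AWS region
ALTER KEYSPACE my_keyspace
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'us-east': 3
};
```

After this, run the repairs and the per-node cleanup described in the remaining steps.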
Hi All,
I plan to install Cassandra on premises, and we expect a load of 10 million
inserts per minute. Are there any rules of thumb for configuration, HW
requirements, memory allocation, etc.?
Thanks
Avi
The token distribution isn't going to change - the way Cassandra maps replicas
will change.
How many data centers/regions will you have when you're done? What's your RF
now? You definitely need to run repair before you ALTER, but you've got a bit
of a race here between the repairs and the ALTE
Some notes from ~7 years of running in prod below - note though that none of
this matters; the only thing that matters is benchmarking your load on your
own hardware. Definitely run benchmarks and figure out what works for you.
166k/s (10 million inserts per minute / 60 s ≈ 166,667 inserts/s) is something
you CAN hit with a 3-5 node cluster with the rig
Is it helpful to run a manual major compaction (nodetool compact) in Cassandra,
or is automatic compaction just fine?
Regards