Hi,
I have a five-node C* cluster suffering from a large number of pending
compaction tasks: 1) 571; 2) 91; 3) 367; 4) 22; 5) 232
Initially, it held one big table (table_a). With Spark, I read that
table, extended its data, and stored it in a second table, table_b. After this
copying/extending process
Hi Mikhail,
Could you please provide:
- your cluster version/topology (number of nodes, CPU, RAM available, etc.)
- what kind of underlying storage you are using
- cfstats using the -H option, because I'm never sure I'm converting bytes => GB
You are storing 1 TB per node, so long-running compaction is not r
The cluster has 5 nodes of the d2.xlarge AWS type (32 GB RAM, attached SSD
disks), running Cassandra 3.0.9.
I increased compaction throughput from 16 to 200, and the active compaction
remaining time decreased.
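As a rough sanity check on why raising the throughput cap helps (a back-of-envelope sketch, assuming the throughput values are in MB/s, which is what nodetool setcompactionthroughput takes, and using the ~1 TB per node figure mentioned earlier; it ignores merge amplification and repeated rewrites):

```python
# Back-of-envelope: time to rewrite ~1 TB of sstable data at a given
# compaction throughput (MB/s). Real compactions rewrite data more than
# once, so treat this as a lower bound, not a prediction.
def hours_to_compact(data_bytes: float, throughput_mb_s: float) -> float:
    return data_bytes / (throughput_mb_s * 1e6) / 3600

TB = 1e12
print(round(hours_to_compact(TB, 16), 1))   # throttled: → 17.4 hours
print(round(hours_to_compact(TB, 200), 1))  # raised cap: → 1.4 hours
```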
What will happen if another node joins the cluster? Will the existing nodes
move part of their SSTables to the
Your compaction time won't improve immediately simply by adding nodes
because the old data still needs to be cleaned up.
What's your end goal? Why is a spike in pending compaction tasks
following a massive write an issue? Are you seeing a dip in performance,
violating an SLA, or do you ju
Hi,
Testing a switch from SizeTiered to TimeWindow, I changed the compaction
strategy on a table with buckets of 3 days.
After switching, when I checked the min and max timestamps on SSTables, I saw
data older than the 3-day range; in my case, 30-60 days.
So when we switch from SizeTiered to TimeWindow,
Is the old data TTLed already? If not, then I don't think TWCS will know when
to delete data.
My understanding of TWCS is that data has to be written with a TTL. (Please
correct me if I'm wrong.)
Regards,
Nitan K.
Cassandra and Oracle Architect/SME
Datastax Certified Cassandra expert
Oracle 10g Certified
Yes, the data is TTLed, but I don't think that's the criterion for TWCS.
My understanding is that the data is divided into buckets based on write
timestamp.
Thanks
Pranay
On Fri, Apr 27, 2018, 1:17 PM Nitan Kainth wrote:
> Is old data TTLed already? If not, then I don't think TWCS will know when
TWCS uses the max timestamp in an SSTable to determine what to compact
together; it won't anti-compact your data. The goal is to minimize I/O.
You'll have to wait for all your mixed-timestamp SSTable data to TTL out
before TWCS's windowing kicks in optimally.
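The bucketing described above can be sketched roughly like this (an illustrative toy, not Cassandra source; the window size and the (name, max_timestamp) pairs are made-up example values):

```python
# TWCS-style grouping: each sstable lands in a fixed-size time window
# keyed by its *max* timestamp; sstables in the same window are
# candidates for compaction together.
from collections import defaultdict

def group_by_window(sstables, window_size_s):
    """sstables: iterable of (name, max_timestamp_epoch_s) pairs."""
    windows = defaultdict(list)
    for name, max_ts in sstables:
        windows[max_ts // window_size_s].append(name)
    return dict(windows)

three_days = 3 * 24 * 3600  # 259200 s, matching the 3-day bucket above
sstables = [("sst-1", 100_000), ("sst-2", 150_000), ("sst-3", 400_000)]
print(group_by_window(sstables, three_days))
# → {0: ['sst-1', 'sst-2'], 1: ['sst-3']}
```

This also shows why pre-existing SSTables written under SizeTiered look wrong after the switch: a single old SSTable spanning 30-60 days of writes still falls into just one window, decided by its max timestamp.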
http://thelastpickle.com/blog/2016/1
In cases where a table was dropped and re-added, there are now two table
directories with different UUIDs containing SSTables.
If you don't have knowledge of which one is active, how do you determine
which is the active table directory? I have tried cf_id from
system.schema_columnfamilies and that can w
Hi Mikhail,
There are a few ways to speed up compactions in the short term:
- nodetool setcompactionthroughput 0
This will unthrottle compactions, but unthrottling obviously puts
you at risk of high latency while compactions are running.
- nodetool setconcurrentcompactors 2
You usually