I noticed I did not give credit to Eric Lubow from SimpleReach. The video mentioned above is a talk he gave at the Cassandra Summit 2016 :-).
2018-01-30 9:07 GMT+00:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi Julien,
>
>> Whether skinny rows or wide rows, data for a partition key is always
>> completely updated / overwritten, i.e. every command is an insert.
>
> Inserts and updates are kind of the same thing in Cassandra for
> standard data types, as Cassandra appends the operation and does not
> actually update any past data right away. My guess is you are actually
> 'updating' existing columns, rows or partitions.
>
>> We manage data expiration with TTL set to several days.
>
> I believe that, for the reason mentioned above, this TTL only applies
> to data that would not be overwritten. Every update / reinsert resets
> the TTL timer to the new value given to the column, range, row, or
> partition.
>
>> This imposes a great load on the cluster (huge CPU consumption), and
>> this load greatly impacts the constant reads we have. Read latencies
>> are fine the rest of the time.
>
> This is expected in a write-heavy scenario. Writes do not touch the
> data disk, so the CPU is often the bottleneck in this case. Also, it
> is known that Spark (and similar distributed processing technologies)
> can harm regular transactions.
>
> Possible options to reduce the impact:
>
> - Use a specific data center for analytics, within the same cluster,
> and work locally there. Writes will still be replicated to the
> original DC (asynchronously), but it will no longer be responsible for
> coordinating the analytical jobs.
> - Use a coordinator ring to delegate most of the work to this 'proxy
> layer' between clients and Cassandra nodes (with data). A good
> starting point could be: https://www.youtube.com/watch?v=K0sQvaxiDH0.
> I am not sure how experimental or hard to deploy this architecture is,
> but it looks like a smart move to me, probably very good for some use
> cases. Maybe yours?
> - Simply limit the write speed in Spark, if doable from a service
> perspective, or add nodes, so Spark is never strong enough to break
> regular transactions (this could be very expensive).
> - Run Spark mostly during off-peak hours.
> - ... probably some more I cannot think of just now :).
>
>> Are there any best practices we should follow to ease the load when
>> importing data into C*, except
>> - reducing the number of concurrent writes and throughput on the
>> driver side
>
> Yes, as mentioned above, throttling the Spark throughput, if doable,
> is definitely a good idea. If not, you might have terrible surprises
> if someone from the dev team suddenly decides to add some more writes
> and the Cassandra side is not ready for it.
>
>> - reducing the number of compaction threads and throughput on the
>> cluster
>
> Generally the number of compaction threads is well defined by default.
> You don't want to use more than 1/4 or 1/2 of the cores available, and
> generally no more than 8. Lowering the compaction throughput is a
> double-edged sword. Yes, it would free some disk throughput
> immediately. Yet if compactions stack up, SSTables are merged slowly
> and read performance will degrade substantially, quite fast, as each
> read will have to hit a lot of files, triggering an increasing number
> of disk reads. The throughput should be set to a value that lets
> compactions keep up with the write load.
>
> If you really have to rewrite 100% of the data every day, I would
> suggest creating 10 new tables every day instead of rewriting existing
> data. Writing a new table 'MyAwesomeTable-20180130' for example, and
> then simply dropping the one from 2 or 3 days ago and cleaning the
> snapshot, might be more efficient I would say. On the client side, it
> is about adding the date (passed or calculated).
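To make the rotation idea just above a bit more concrete, here is a rough
sketch of such a daily Spark job. All keyspace, table and column names are
made up, and I used an underscore suffix such as 'my_awesome_table_20180130'
because dashes in CQL identifiers require quoting:

    import com.datastax.spark.connector._
    import com.datastax.spark.connector.cql.CassandraConnector
    import org.apache.spark.rdd.RDD
    import org.apache.spark.{SparkConf, SparkContext}
    import java.time.LocalDate
    import java.time.format.DateTimeFormatter

    object DailyTableRotation {
      private val fmt = DateTimeFormatter.ofPattern("yyyyMMdd")

      // e.g. my_awesome_table_20180130 (underscores avoid quoting in CQL)
      def tableFor(date: LocalDate): String =
        s"my_awesome_table_${date.format(fmt)}"

      // Stand-in for however today's full dataset is actually produced.
      def loadTodaysData(sc: SparkContext): RDD[(String, String)] =
        sc.parallelize(Seq("k1" -> "v1", "k2" -> "v2"))

      def main(args: Array[String]): Unit = {
        val conf  = new SparkConf().setAppName("daily-rotation")
        val sc    = new SparkContext(conf)
        val cass  = CassandraConnector(conf)
        val today = LocalDate.now()

        // Create today's table before the import (schema is made up).
        cass.withSessionDo { session =>
          session.execute(
            s"CREATE TABLE IF NOT EXISTS my_keyspace.${tableFor(today)} " +
              "(id text PRIMARY KEY, payload text)")
        }

        // The full rewrite goes into the fresh table, never over old data.
        loadTodaysData(sc).saveToCassandra(
          "my_keyspace", tableFor(today), SomeColumns("id", "payload"))

        // Drop the table from 3 days ago: its data vanishes with it, with
        // no tombstones to purge and no large rewrite compactions.
        cass.withSessionDo { session =>
          session.execute(
            s"DROP TABLE IF EXISTS my_keyspace.${tableFor(today.minusDays(3))}")
        }
      }
    }

Note that with auto_snapshot enabled (the default), DROP TABLE leaves a
snapshot behind on each node, hence the 'cleaning the snapshot' step
(nodetool clearsnapshot).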
>> In particular:
>> - is there any evidence that writing multiple tables at the same time
>> produces more load than writing the tables one at a time when tables
>> are completely written at once, as we do?
>
> I don't think so, except maybe that compactions within a single table
> cannot all be done in parallel, thus you would probably limit the load
> a bit in Cassandra. I am not even sure, a lot of progress was made in
> the past to make compactions more efficient :).
>
>> - because of the heavy writes, we use STCS. Is it the best choice
>> considering data is completely overwritten once a day? Tables contain
>> collections and UDTs.
>
> STCS sounds reasonable, I would not start tuning here. TWCS could be
> considered to evict tombstones efficiently, but as I said earlier, I
> don't think you have a lot of expired tombstones. I would guess
> compactions + coordination for writes are the cluster killers in your
> case, but please, let us know what compactions and tombstones look
> like in your cluster:
>
> - compactions: nodetool compactionstats -H / check pending compactions
> - tombstones: use sstablemetadata on the biggest / oldest SSTables, or
> use monitoring to check the ratio of droppable tombstones.
>
> It's a very specific use case I never faced, and I don't know your
> exact use case, so I am mostly guessing here. I can be wrong on some
> of the points above, but I am sure some people around will step in and
> correct me where that is the case :). I hope you'll find some useful
> information though.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> 2018-01-30 8:12 GMT+00:00 Julien Moumne <jmou...@deezer.com>:
>
>> Hello, I am looking for best practices for the following use case:
>>
>> Once a day, we insert at the same time 10 full tables (several 100 GiB
>> each) using the Spark C* driver, without batching, with CL set to ALL.
>>
>> Whether skinny rows or wide rows, data for a partition key is always
>> completely updated / overwritten, i.e. every command is an insert.
>>
>> This imposes a great load on the cluster (huge CPU consumption), and
>> this load greatly impacts the constant reads we have. Read latencies
>> are fine the rest of the time.
>>
>> Are there any best practices we should follow to ease the load when
>> importing data into C*, except
>> - reducing the number of concurrent writes and throughput on the
>> driver side
>> - reducing the number of compaction threads and throughput on the
>> cluster
>>
>> In particular:
>> - is there any evidence that writing multiple tables at the same time
>> produces more load than writing the tables one at a time when tables
>> are completely written at once, as we do?
>> - because of the heavy writes, we use STCS. Is it the best choice
>> considering data is completely overwritten once a day? Tables contain
>> collections and UDTs.
>>
>> (We manage data expiration with TTL set to several days.
>> We use SSDs.)
>>
>> Thanks!
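PS (on throttling the Spark writes): the Spark Cassandra connector exposes
a few knobs for this. A minimal sketch; the property names come from the
connector's reference configuration (they may vary across versions) and
the values are placeholders to benchmark against your read SLA:

    import org.apache.spark.SparkConf

    // Connector-level throttling for the daily import job. All values
    // below are placeholders: tune them against your read latencies.
    val conf = new SparkConf()
      .setAppName("throttled-import")
      // Cap on write throughput, in MB/s per core.
      .set("spark.cassandra.output.throughput_mb_per_sec", "2")
      // In-flight (unacknowledged) batches per Spark task.
      .set("spark.cassandra.output.concurrent.writes", "2")
      // Smaller batches put less pressure on each coordinator.
      .set("spark.cassandra.output.batch.size.rows", "50")

Since the throughput cap is per core, shrinking the number of executor
cores gives a coarser version of the same effect.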
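PS (on the TTL point): every re-insert carries its own TTL, so a daily
full rewrite restarts the expiry clock on each overwritten row. In this
sketch, which assumes the connector's WriteConf API and made-up names and
values, the whole import is written with a 3-day TTL:

    import com.datastax.spark.connector._
    import com.datastax.spark.connector.writer.{TTLOption, WriteConf}
    import org.apache.spark.rdd.RDD

    // Rows written here expire 3 days after *this* write, regardless of
    // any TTL the previous version of the same row was carrying.
    def importWithTtl(rows: RDD[(String, String)]): Unit =
      rows.saveToCassandra(
        "my_keyspace", "my_awesome_table_20180130",
        SomeColumns("id", "payload"),
        WriteConf(ttl = TTLOption.constant(3 * 24 * 3600)))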