I noticed I did not give credit to Eric Lubow from SimpleReach. The video mentioned above is a talk he gave at the Cassandra Summit 2016 :-).
2018-01-30 9:07 GMT+00:00 Alain RODRIGUEZ <arodr...@gmail.com>:

> Hi Julien,
>
>> Whether skinny rows or wide rows, data for a partition key is always
>> completely updated / overwritten, i.e. every command is an insert.
>
> Inserts and updates are kind of the same thing in Cassandra for
> standard data types, as Cassandra appends the operation and does not
> actually update any past data right away. My guess is you are actually
> 'updating' existing columns, rows or partitions.
>
>> We manage data expiration with TTL set to several days.
>
> I believe that, for the reason mentioned above, this TTL only applies
> to data that would not be overwritten. Every update / reinsert resets
> the TTL timer to the new value given to the column, range, row, or
> partition.
>
>> This imposes a great load on the cluster (huge CPU consumption), and
>> this load greatly impacts the constant reads we have. Read latencies
>> are fine the rest of the time.
>
> This is expected in a write-heavy scenario. Writes do not touch the
> data disk, so the CPU is often the bottleneck in this case. Also, it
> is known that Spark (and similar distributed processing technologies)
> can harm regular transactions.
>
> Possible options to reduce the impact:
>
> - Use a specific data center for analytics, within the same cluster,
> and work locally there. Writes will still be replicated to the
> original DC (asynchronously), but it will no longer be responsible for
> coordinating the analytical jobs.
> - Use a coordinator ring to delegate most of the work to this 'proxy
> layer' between clients and Cassandra nodes (with data). A good
> starting point could be: https://www.youtube.com/watch?v=K0sQvaxiDH0.
> I am not sure how experimental or hard to deploy this architecture is,
> but it looks like a smart move to me, probably very good for some use
> cases. Maybe yours?
> - Simply limit the write speed in Spark, if doable from a service
> perspective, or add nodes, so Spark is never strong enough to break
> regular transactions (this could be very expensive).
> - Run Spark mostly during off-peak hours.
> - ... probably some more I cannot think of just now :).
>
>> Are there any best practices we should follow to ease the load when
>> importing data into C*, except
>> - reducing the number of concurrent writes and throughput on the
>> driver side
>
> Yes, as mentioned above, throttling the Spark throughput, if doable,
> is definitely a good idea. If not, you might have terrible surprises
> if someone from the dev team suddenly decides to add some more writes
> and the Cassandra side is not ready for it.
>
>> - reducing the number of compaction threads and throughput on the
>> cluster
>
> Generally the number of compaction threads is well defined by default.
> You don't want to use more than 1/4 or 1/2 of the cores available, and
> generally no more than 8. Lowering the compaction throughput is a
> double-edged sword. Yes, it would free some disk throughput
> immediately. Yet if compactions stack up, SSTables are merged slowly
> and read performance will degrade substantially, quite fast, as each
> read will have to hit a lot of files, triggering an increasing number
> of disk reads. The throughput should be set to a value that lets
> compactions keep up with the write load.
>
> If you really have to rewrite 100% of the data every day, I would
> suggest creating 10 new tables every day instead of rewriting existing
> data. Writing a new table 'MyAwesomeTable-20180130' for example, and
> then simply dropping the one from 2 or 3 days ago and cleaning the
> snapshot, might be more efficient I would say. On the client side, it
> is about adding the date (passed or calculated).
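To make the rotation idea just above a bit more concrete, here is a rough
sketch of such a daily Spark job. All keyspace, table and column names are
made up, and I used an underscore suffix such as 'my_awesome_table_20180130'
because dashes in CQL identifiers require quoting:

    import com.datastax.spark.connector._
    import com.datastax.spark.connector.cql.CassandraConnector
    import org.apache.spark.rdd.RDD
    import org.apache.spark.{SparkConf, SparkContext}
    import java.time.LocalDate
    import java.time.format.DateTimeFormatter

    object DailyTableRotation {
      private val fmt = DateTimeFormatter.ofPattern("yyyyMMdd")

      // e.g. my_awesome_table_20180130 (underscores avoid quoting in CQL)
      def tableFor(date: LocalDate): String =
        s"my_awesome_table_${date.format(fmt)}"

      // Stand-in for however today's full dataset is actually produced.
      def loadTodaysData(sc: SparkContext): RDD[(String, String)] =
        sc.parallelize(Seq("k1" -> "v1", "k2" -> "v2"))

      def main(args: Array[String]): Unit = {
        val conf  = new SparkConf().setAppName("daily-rotation")
        val sc    = new SparkContext(conf)
        val cass  = CassandraConnector(conf)
        val today = LocalDate.now()

        // Create today's table before the import (schema is made up).
        cass.withSessionDo { session =>
          session.execute(
            s"CREATE TABLE IF NOT EXISTS my_keyspace.${tableFor(today)} " +
              "(id text PRIMARY KEY, payload text)")
        }

        // The full rewrite goes into the fresh table, never over old data.
        loadTodaysData(sc).saveToCassandra(
          "my_keyspace", tableFor(today), SomeColumns("id", "payload"))

        // Drop the table from 3 days ago: its data vanishes with it, with
        // no tombstones to purge and no large rewrite compactions.
        cass.withSessionDo { session =>
          session.execute(
            s"DROP TABLE IF EXISTS my_keyspace.${tableFor(today.minusDays(3))}")
        }
      }
    }

Note that with auto_snapshot enabled (the default), DROP TABLE leaves a
snapshot behind on each node, hence the 'cleaning the snapshot' step
(nodetool clearsnapshot).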
>> In particular:
>> - is there any evidence that writing multiple tables at the same time
>> produces more load than writing the tables one at a time when tables
>> are completely written at once, as we do?
>
> I don't think so, except maybe that compactions within a single table
> cannot all be done in parallel, thus you would probably limit the load
> a bit in Cassandra. I am not even sure, a lot of progress was made in
> the past to make compactions more efficient :).
>
>> - because of the heavy writes, we use STCS. Is it the best choice
>> considering data is completely overwritten once a day? Tables contain
>> collections and UDTs.
>
> STCS sounds reasonable, I would not start tuning here. TWCS could be
> considered to evict tombstones efficiently, but as I said earlier, I
> don't think you have a lot of expired tombstones. I would guess
> compactions + coordination for writes are the cluster killers in your
> case, but please, let us know what compactions and tombstones look
> like in your cluster:
>
> - compactions: nodetool compactionstats -H / check pending compactions
> - tombstones: use sstablemetadata on the biggest / oldest SSTables, or
> use monitoring to check the ratio of droppable tombstones.
>
> It's a very specific use case I never faced, and I don't know your
> exact use case, so I am mostly guessing here. I can be wrong on some
> of the points above, but I am sure some people around will step in and
> correct me where that is the case :). I hope you'll find some useful
> information though.
>
> C*heers,
> -----------------------
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France / Spain
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
>
> 2018-01-30 8:12 GMT+00:00 Julien Moumne <jmou...@deezer.com>:
>
>> Hello, I am looking for best practices for the following use case:
>>
>> Once a day, we insert at the same time 10 full tables (several 100 GiB
>> each) using the Spark C* driver, without batching, with CL set to ALL.
>>
>> Whether skinny rows or wide rows, data for a partition key is always
>> completely updated / overwritten, i.e. every command is an insert.
>>
>> This imposes a great load on the cluster (huge CPU consumption), and
>> this load greatly impacts the constant reads we have. Read latencies
>> are fine the rest of the time.
>>
>> Are there any best practices we should follow to ease the load when
>> importing data into C*, except
>> - reducing the number of concurrent writes and throughput on the
>> driver side
>> - reducing the number of compaction threads and throughput on the
>> cluster
>>
>> In particular:
>> - is there any evidence that writing multiple tables at the same time
>> produces more load than writing the tables one at a time when tables
>> are completely written at once, as we do?
>> - because of the heavy writes, we use STCS. Is it the best choice
>> considering data is completely overwritten once a day? Tables contain
>> collections and UDTs.
>>
>> (We manage data expiration with TTL set to several days.
>> We use SSDs.)
>>
>> Thanks!
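PS (on throttling the Spark writes): the Spark Cassandra connector exposes
a few knobs for this. A minimal sketch; the property names come from the
connector's reference configuration (they may vary across versions) and
the values are placeholders to benchmark against your read SLA:

    import org.apache.spark.SparkConf

    // Connector-level throttling for the daily import job. All values
    // below are placeholders: tune them against your read latencies.
    val conf = new SparkConf()
      .setAppName("throttled-import")
      // Cap on write throughput, in MB/s per core.
      .set("spark.cassandra.output.throughput_mb_per_sec", "2")
      // In-flight (unacknowledged) batches per Spark task.
      .set("spark.cassandra.output.concurrent.writes", "2")
      // Smaller batches put less pressure on each coordinator.
      .set("spark.cassandra.output.batch.size.rows", "50")

Since the throughput cap is per core, shrinking the number of executor
cores gives a coarser version of the same effect.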
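PS (on the TTL point): every re-insert carries its own TTL, so a daily
full rewrite restarts the expiry clock on each overwritten row. In this
sketch, which assumes the connector's WriteConf API and made-up names and
values, the whole import is written with a 3-day TTL:

    import com.datastax.spark.connector._
    import com.datastax.spark.connector.writer.{TTLOption, WriteConf}
    import org.apache.spark.rdd.RDD

    // Rows written here expire 3 days after *this* write, regardless of
    // any TTL the previous version of the same row was carrying.
    def importWithTtl(rows: RDD[(String, String)]): Unit =
      rows.saveToCassandra(
        "my_keyspace", "my_awesome_table_20180130",
        SomeColumns("id", "payload"),
        WriteConf(ttl = TTLOption.constant(3 * 24 * 3600)))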