> On Mar 2, 2020, at 2:02 AM, Oleksandr Shulgin <oleksandr.shul...@zalando.de>
> wrote:
>
>
>> On Sat, Feb 29, 2020 at 8:49 AM Jeff Jirsa <jji...@gmail.com> wrote:
>
>> If you’re really really advanced you MIGHT be able to use Spark +
>> CQLSSTableWriter to create a ton of sstables with just tombstones in them
>> representing deletes, then either nodetool refresh or sstableloader them
>> into the cluster.
>>
>> If you create sstables on the right timestamp boundaries to match your TWCS
>> windows, each one will compact with the data file of the same window and
>> delete the data.
>>
>> Will be a ton of compaction though. Not as efficient as the deleting
>> strategy. Also not sure if the offline CQLSSTableWriter actually supports
>> deletes because I’m on my phone and too lazy to check. If it doesn’t, it
>> probably wouldn’t be that hard to add.
>
> Yeah, even if that would work with the CQLSSTableWriter, the ton of
> user-defined compaction is what we would like to avoid. We are OK with
> rewriting all files once, though.
>
> Assuming we get it running on our server version: do I get it right that
> running `nodetool upgradesstables -a` is going to rewrite all the SSTable
> files subject to the defined compaction strategy?
You don’t need to do user-defined compaction here.
As soon as the data files are on the server, the next time TWCS looks for
compaction candidates (e.g. on the next flush, which “nodetool flush” will
trigger), it’ll find all of the extra sstables and start putting them into the
right windows.
Note that you have to have the sstables lined up properly - when you build
them, they must stop on the right timestamp boundaries or this doesn’t work.
You can try a day at a time though - process all of the deletes for one time
window and load them in.
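
A rough sketch of the boundary arithmetic, in Python, for anyone trying this.
It assumes windows are aligned to the Unix epoch in UTC and that write
timestamps are microseconds since the epoch; both are my assumptions, so check
them against your server version. The function names are made up for
illustration and are not part of any Cassandra tooling.

```python
from collections import defaultdict


def twcs_window_start(ts_micros, window_size=1, unit_seconds=86400):
    """Lower bound, in epoch seconds, of the window containing ts_micros.

    Assumes epoch-aligned windows; window_size=1 with unit_seconds=86400
    corresponds to one-day windows.
    """
    span = window_size * unit_seconds
    ts_seconds = ts_micros // 1_000_000
    return (ts_seconds // span) * span


def group_by_window(write_timestamps_micros, window_size=1, unit_seconds=86400):
    """Group write timestamps by the window they would fall into."""
    buckets = defaultdict(list)
    for ts in write_timestamps_micros:
        start = twcs_window_start(ts, window_size, unit_seconds)
        buckets[start].append(ts)
    return dict(buckets)
```

The idea would be to write the deletes for each bucket into their own sstables,
so that no file spans two windows, and then load one bucket (one day) at a time.
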
(Again, this presumes it works with the CQLSSTableWriter, which I haven’t
looked at in years.)