Thanks Jeff. We've been trying to find the optimal settings for our TWCS
tables. It's just two tables, and only one of them is a factor. Initially
we set the window to an hour, then increased it to a day. There still
seemed to be lots of small sstables on disk: dozens of small db files,
each only a few megabytes, all among the most recent sstables in the data
directory. As we've increased the window size and the tombstone_threshold,
the newest db files on disk have grown larger, as we would expect.
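For context, the kind of change we've been making looks roughly like this
(keyspace/table names and the specific values here are illustrative, not
our production settings):

```cql
-- Sketch only: names and values are placeholders.
ALTER TABLE events.raw_data
WITH compaction = {
  'class': 'TimeWindowCompactionStrategy',
  'compaction_window_unit': 'DAYS',   -- was 'HOURS' initially
  'compaction_window_size': '1',
  'tombstone_threshold': '0.2'
};
```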

The total size of the table in question is between 500GB and 550GB on each
node. At certain intervals all nodes seem to begin a cycle of compactions
and the number of pending tasks goes up. During this period the compactions
can consume maybe 100 or 200GB, sometimes more, and when everything
finishes we regain most of that disk space. We usually have over 500GB
free, but it can dwindle to only 150GB free. I assume solving this is about
finding the optimal TWCS settings for our TTL'd data.

The other thought is that we currently have data mixed in that does not
have a TTL, and we are strongly considering putting this data in its own
table.
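If we go that route, the non-TTL table would presumably get its own
compaction strategy rather than TWCS, since windows never fully expire if
some of their data never does. A rough sketch (table name, schema, and
strategy choice are all hypothetical):

```cql
-- Hypothetical: non-TTL data split into its own table with STCS,
-- leaving the original table purely TTL'd under TWCS.
CREATE TABLE events.permanent_data (
    id uuid,
    ts timestamp,
    payload blob,
    PRIMARY KEY (id, ts)
) WITH compaction = {'class': 'SizeTieredCompactionStrategy'};
```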

On Mon, Oct 24, 2016 at 6:38 AM, Jeff Jirsa <jeff.ji...@crowdstrike.com>
wrote:

>
>
> If you drop window size, you may force some window-major compactions (if
> you go from 1 week windows to 1 day windows, you’ll have 6 days worth of
> files start compacting into 1-day sstables).
>
> If you increase window size, you’ll likely have adjacent windows join (if
> you go from 1 day windows to 2 day windows, nearly every sstable will be
> joined with the one in the day adjacent to it).
>
>
>
> Short of altering compaction strategies, it seems unlikely that you’d see
> huge jumps where you’d run out of space. How many tables/CFs have TWCS
> enabled? How much space are you using, and how much is free?  Do you have
> hundreds with the same TWCS parameters?
>
>
>
> If you’re running very close to your capacity, you may want to consider
> dropping concurrent compactors down so fewer compaction tasks run at the
> same time. That will translate proportionally to the amount of extra disk
> you have consumed by compaction in a TWCS setting.
>
>
>
>
>
>
>
> *From: *Seth Edwards <s...@pubnub.com>
> *Reply-To: *"user@cassandra.apache.org" <user@cassandra.apache.org>
> *Date: *Sunday, October 23, 2016 at 7:03 PM
> *To: *user <user@cassandra.apache.org>
> *Subject: *Re: Question about compaction strategy changes
>
>
>
> More compactions meaning "rows to be compacted" or the actual number of
> pending compactions? I assumed that when I run nodetool compactionstats
> the number of pending tasks would line up with the number of sstables that
> will be compacted. Most of the time this is idle, then we hit spots where
> it can jump into the thousands and we end up being short a few hundred GB
> of disk space.
>
>
>
> On Sun, Oct 23, 2016 at 5:49 PM, kurt Greaves <k...@instaclustr.com>
> wrote:
>
>
>
> On 22 October 2016 at 03:37, Seth Edwards <s...@pubnub.com> wrote:
>
> We're using TWCS and we notice that if we make changes to the options to
> the window unit or size, it seems to implicitly start recompacting all
> sstables.
>
>
>
> If you increase the window unit or size you potentially increase the
> number of SSTable candidates for compaction inside each window, which is
> why you would see more compactions. If you decrease the window you
> shouldn't see any new compactions kicked off, however be aware that you
> will have SSTables covering multiple windows, so until a full cycle of your
> TTL passes your read queries won't benefit from the smaller window size.
>
>
> Kurt Greaves
>
> k...@instaclustr.com
>
> www.instaclustr.com
>
>
>
