TWCS is best for TTL, not for explicit deletes, correct?
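(For reference, a minimal sketch of the TTL-driven pattern the question refers to; the keyspace, table, columns and window size below are hypothetical, not taken from this thread:)

    -- Rows expire automatically 48 hours after insertion instead of being
    -- deleted explicitly; TWCS can then drop whole expired SSTables.
    CREATE TABLE ks.events (
        id      uuid,
        ts      timestamp,
        payload text,
        PRIMARY KEY (id, ts)
    ) WITH default_time_to_live = 172800   -- 48 hours, in seconds
      AND compaction = {
          'class': 'TimeWindowCompactionStrategy',
          'compaction_window_unit': 'HOURS',
          'compaction_window_size': '4'
      };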
On Friday, September 17, 2021, Abdul Patel <abd786...@gmail.com> wrote:

> The 48-hour deletion removes data older than 48 hours.
> LCS was used as it's more of a write-once, read-many application.
>
> On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:
>
>> Congratulations! You've just found the cause of it. Does all data get
>> deleted 48 hours after it is inserted? If so, are you sure LCS is the
>> right compaction strategy for this table? TWCS sounds like a much better
>> fit for this purpose.
>>
>> On 17/09/2021 19:16, Abdul Patel wrote:
>>
>> Thanks.
>> The application deletes data older than 48 hours.
>> Auto compaction works, but since the disk is full the error log only
>> says there is not enough space to run compaction.
>>
>> On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:
>>
>>> If major compaction is failing due to a disk space constraint, you could
>>> copy the files to another server and run a major compaction there instead
>>> (i.e. start Cassandra on the new server but do not join it to the
>>> existing cluster). If you must replace the node, at least use the
>>> '-Dcassandra.replace_address=...' parameter instead of 'nodetool
>>> decommission' followed by re-adding, because the latter changes the
>>> token ranges on the node, and that makes troubleshooting harder.
>>>
>>> 22GB of data amplifying to nearly 300GB sounds very improbable to me;
>>> there must be something else going on. Have you turned off auto
>>> compaction? Did you change the default parameters (namely 'fanout_size')
>>> for LCS? If that doesn't give you a clue, have a look at the SSTable
>>> data files: do you notice anything unusual, for example too many small
>>> files, or some extraordinarily large files? Also have a look at the
>>> logs: is there anything unusual? And do you know the application logic?
>>> Does it do a lot of deletes or updates (including 'upsert')? Writes
>>> with TTL? Does the table have a default TTL?
>>>
>>> On 17/09/2021 13:45, Abdul Patel wrote:
>>>
>>> Close to 300GB of data. Nodetool decommission/removenode and adding one
>>> node back brought it down to 22GB.
>>> Can't run major compaction as there is not much space left.
>>>
>>> On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:
>>>
>>>> Okay, so how big exactly is the data on disk? You said removing and
>>>> adding a new node gives you 20GB on disk; was that done via the
>>>> '-Dcassandra.replace_address=...' parameter? If not, the new node will
>>>> almost certainly have a different token range and will not be directly
>>>> comparable to the existing node if you have uneven partitions or a
>>>> small number of partitions in the table. Also, try a major compaction;
>>>> it's a lot easier than replacing a node.
>>>>
>>>> On 17/09/2021 12:28, Abdul Patel wrote:
>>>>
>>>> Yes, I checked and cleared all snapshots, and I also had incremental
>>>> backups in the backup folder, which I removed as well. It's purely data.
>>>>
>>>> On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:
>>>>
>>>>> Assuming your total disk space is a lot bigger than 50GB (accounting
>>>>> for disk space amplification, commit log, logs, OS data, etc.), I
>>>>> would suspect the disk space is being used by something else. Have you
>>>>> checked that the disk space is actually being used by the Cassandra
>>>>> data directory? If so, have a look at the 'nodetool listsnapshots'
>>>>> command output as well.
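(For reference, a quick way to do both checks, assuming the default data directory location of /var/lib/cassandra/data; adjust the path if your installation differs:)

    # How much space each keyspace's data directory is actually using
    du -sh /var/lib/cassandra/data/*

    # List any snapshots still holding references to old SSTables
    nodetool listsnapshots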
>>>>>
>>>>> On 17/09/2021 05:48, Abdul Patel wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> We have Cassandra with LeveledCompactionStrategy and recently found
>>>>>> the filesystem almost 90% full, but the data is only 10m records.
>>>>>> Will manual compaction work? I'm not sure it's recommended, and space
>>>>>> is also a constraint. We tried removing and re-adding one node, and
>>>>>> the data is now at 20GB, which looks appropriate.
>>>>>> So is removing/re-adding a node the only solution to reclaim space?
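(For completeness, switching the table from LCS to TWCS, as suggested earlier in the thread, is a single schema change. A minimal sketch, with a hypothetical keyspace/table name and an illustrative window size; the right window depends on the actual write and expiry pattern:)

    ALTER TABLE ks.events
      WITH compaction = {
          'class': 'TimeWindowCompactionStrategy',
          'compaction_window_unit': 'HOURS',
          'compaction_window_size': '4'
      };

Note that TWCS drops whole SSTables only once everything in them has expired via TTL; data removed by explicit DELETE still leaves tombstones that must be compacted away, which is the distinction the question at the top of the thread is getting at.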