Congratulations! You've just found the cause. Does all data get deleted 48 hours after it is inserted? If so, are you sure LCS is the right compaction strategy for this table? TWCS sounds like a much better fit for this purpose.
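
For example, switching a table to TWCS could look roughly like the sketch below. This is only an illustration, not something from this thread: the keyspace/table names 'ks' and 'events' and the 4-hour window are made-up placeholders, and TWCS works best when the 48-hour expiry is driven by TTLs rather than explicit deletes.

    cqlsh -e "ALTER TABLE ks.events WITH compaction = {
        'class': 'TimeWindowCompactionStrategy',
        'compaction_window_unit': 'HOURS',
        'compaction_window_size': '4'};"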

On 17/09/2021 19:16, Abdul Patel wrote:
Thanks.
The application deletes data that is older than 48 hours.
Auto compaction runs, but since the disk is full the error log only says there is not enough space to run the compaction.


On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

    If major compaction is failing due to disk space constraints, you
    could copy the files to another server and run a major compaction
    there instead (i.e. start Cassandra on the new server without
    joining the existing cluster). If you must replace the node, at
    least use the '-Dcassandra.replace_address=...' parameter instead
    of 'nodetool decommission' followed by re-adding the node, because
    the latter changes the token ranges on the node, and that makes
    troubleshooting harder.
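
    As a rough sketch (the address below is only a placeholder for the
    old node's IP; on a package install this line would typically go
    in cassandra-env.sh on the replacement node before starting it):

        JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=10.0.0.12"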

    22GB of data amplifying to nearly 300GB sounds highly improbable
    to me; there must be something else going on. Have you turned off
    auto compaction? Did you change the default parameters (namely
    'fanout_size') for LCS? If this doesn't give you a clue, have a
    look at the SSTable data files: do you notice anything unusual?
    For example, too many small files, or some extraordinarily large
    files. Also have a look at the logs; is there anything unusual?
    And do you know the application logic? Does it do a lot of
    deletes or updates (including 'upserts')? Writes with TTL? Does
    the table have a default TTL?
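
    If it helps, a rough way to look for those symptoms (the data
    directory path and the keyspace/table names below are placeholders
    for your own):

        ls -lhS /var/lib/cassandra/data/<ks>/<table>-*/*-Data.db | head  # largest SSTables first
        ls /var/lib/cassandra/data/<ks>/<table>-*/*-Data.db | wc -l      # number of SSTables
        sstablemetadata <path-to-one>-Data.db                            # timestamps, estimated droppable tombstones
        nodetool tablestats <ks>.<table>                                 # SSTable count, space used, tombstone stats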

    On 17/09/2021 13:45, Abdul Patel wrote:
    Close to 300GB of data. After 'nodetool decommission'/'removenode'
    and adding one node back, it came down to 22GB.
    Can't run major compaction as there isn't much space left.

    On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

        Okay, so how big exactly is the data on disk? You said
        removing and adding a new node gives you 20GB on disk; was
        that done via the '-Dcassandra.replace_address=...'
        parameter? If not, the new node will almost certainly have
        different token ranges and will not be directly comparable
        to the existing node if you have uneven partitions or a
        small number of partitions in the table. Also, try a major
        compaction; it's a lot easier than replacing a node.
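
        For reference, a major compaction of a single table is just
        'nodetool compact' with your own keyspace and table names
        filled in:

            nodetool compact <keyspace> <table>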


        On 17/09/2021 12:28, Abdul Patel wrote:
        Yes, I checked and cleared all snapshots, and I also had
        incremental backups in the backups folder, which I removed.
        It's purely data.
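
        As an aside, it may also be worth confirming that incremental
        backups are not still enabled and quietly refilling the
        backups directories:

            nodetool statusbackup    # reports whether incremental backups are running
            nodetool disablebackup   # turns them off if they are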


        On Friday, September 17, 2021, Bowen Song <bo...@bso.ng> wrote:

            Assuming your total disk space is a lot bigger than 50GB
            (accounting for disk space amplification, commit log,
            logs, OS data, etc.), I would suspect the disk space is
            being used by something else. Have you checked that the
            disk space is actually being used by the Cassandra data
            directory? If so, have a look at the 'nodetool
            listsnapshots' output as well.
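
            For example (adjust the data directory path to match
            your installation):

                df -h                             # overall filesystem usage
                du -sh /var/lib/cassandra/data/*  # per-keyspace usage in the data directory
                nodetool listsnapshots            # snapshots still holding disk space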


            On 17/09/2021 05:48, Abdul Patel wrote:

                Hello

                We have Cassandra with LeveledCompactionStrategy and
                recently found the filesystem almost 90% full, but
                the data was only 10M records.
                Will manual compaction work? I'm not sure it's
                recommended, and space is also a constraint. I tried
                removing and adding one node, and now the data is at
                20GB, which looks appropriate.
                So is the only way to reclaim space to remove/add a
                node?
