Hi everyone,

There are a lot of articles on this, and this question has probably been asked
many times already, but I am still not 100% sure.

We have a table which we reload almost in full every night with a Spark job,
using consistency LOCAL_QUORUM and a record TTL of 7 days. The TTL is there to
remove records that were not present in the last 7 imports. The table is
located in 2 DCs. We are interested only in the last record state. The
definition of the table is below. After the load, we run a repair with Reaper
on this table, which takes a lot of time and resources. We have multiple such
tables, and most of the repair time is spent on them. Running the full load
again takes less time than repairing this table.
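
To put numbers on the two time windows involved, here is a minimal sketch in
Python. It only converts units: the 7-day TTL comes from the description above
and gc_grace_seconds = 864000 from the table definition below. The variable
names are made up for illustration, and the sketch says nothing about how
Cassandra itself uses these values.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Values taken from this post (variable names are hypothetical).
ttl_days = 7               # record TTL applied by the nightly load
gc_grace_seconds = 864000  # from the table definition

# Convert both settings so they can be compared in the same unit.
ttl_seconds = ttl_days * SECONDS_PER_DAY
gc_grace_days = gc_grace_seconds / SECONDS_PER_DAY

print(ttl_seconds)               # 604800
print(gc_grace_days)             # 10.0
print(ttl_days < gc_grace_days)  # True: the TTL is shorter than gc_grace
```

So with one load per night, every record that is still in the source gets
rewritten about 7 times within one TTL window and about 10 times within one
gc_grace window.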

The question is: do we actually need to run repairs on this table at all? If
yes, how often: daily, weekly?

Thanks in advance,
Maxim.

WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
    AND compression = {'chunk_length_in_kb': '16', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';
