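Read repair triggered by a digest mismatch and speculative retry can both
cause behavior that is hard to reason about. Setting read_repair_chance to
0.0 only turns off the chance-based read repair; a digest mismatch between
the replicas queried for your consistency level still triggers one. You
usually see this when a host stops accepting writes because of a bad disk,
which you haven't described, but in general there are times when a read
will block on writes to extra replicas.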

The patch from https://issues.apache.org/jira/browse/CASSANDRA-10726
changes this behavior significantly.

The last message in this thread (about huge read repair mutations) suggests
that your writes during the bounce left some partitions badly out of sync.
Hints aren't replaying fast enough to fill in the gaps before you read, so
the read repair is timing out. After 10726, a read repair timeout wouldn't
block the read. If you're seeing read timeouts right now, you probably want
to run repair, read much smaller pages so that read repair succeeds, or
increase your commitlog segment size from 32M to 128M or so until the read
repair actually succeeds.
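
To put rough numbers on the segment size suggestion: as far as I know, in
3.x the largest mutation a node will accept (max_mutation_size_in_kb)
defaults to half of commitlog_segment_size_in_mb, so a read repair mutation
bigger than that is rejected. A quick sketch of the arithmetic in Python;
the function name is made up for illustration, and the halving rule is only
the default, which you may have overridden in cassandra.yaml:

```python
# Sketch only: assumes the default relationship in Cassandra 3.x, where
# max_mutation_size_in_kb is half of commitlog_segment_size_in_mb unless
# it has been set explicitly in cassandra.yaml.

def max_mutation_size_mb(commitlog_segment_size_mb):
    """Largest mutation (in MB) accepted under the default setting."""
    return commitlog_segment_size_mb / 2


for segment_mb in (32, 128):
    limit = max_mutation_size_mb(segment_mb)
    print(f"segment {segment_mb}M -> max mutation about {limit:.0f}M")
```

So going from 32M to 128M raises the ceiling from about 16M to about 64M
per mutation.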


On Tue, Jan 1, 2019 at 12:18 AM Vlad <qa23d-...@yahoo.com.invalid> wrote:

> Hi All and Happy New Year!!!
>
> This year started with Cassandra 3.11.3 sometimes forces level ALL despite
> query level LOCAL_QUORUM (actually there is only one DC) and it fails with
> timeout.
>
> As far as I understand, it can be caused by read repair attempts (we see
> "DigestMismatch" errors in Cassandra log), but table has no read repair
> configured:
>
>     AND bloom_filter_fp_chance = 0.01
>     AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
>     AND comment = ''
>     AND compaction = {'class':
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy',
> 'max_threshold': '32', 'min_threshold': '4'}
>     AND compression = {'chunk_length_in_kb': '64', 'class':
> 'org.apache.cassandra.io.compress.LZ4Compressor'}
>     AND crc_check_chance = 1.0
>     AND dclocal_read_repair_chance = 0.0
>     AND default_time_to_live = 0
>     AND gc_grace_seconds = 864000
>     AND max_index_interval = 2048
>     AND memtable_flush_period_in_ms = 0
>     AND min_index_interval = 128
>     AND read_repair_chance = 0.0
>     AND speculative_retry = '99PERCENTILE';
>
>
> Any suggestions?
>
> Thanks.
>
