Re: Digest mismatch

Joe Obernberger Mon, 14 Dec 2020 13:11:00 -0800

Some more info.

From Java, using the DataStax 4.9.0 driver, I'm selecting an entire table. After about 17 million rows (the table is probably around 150 million rows), I get:

com.datastax.oss.driver.api.core.servererrors.ReadFailureException: Cassandra failure during read query at consistency ONE (1 responses were required but only 0 replica responded, 1 failed)

It's almost as if the data was not written with LOCAL_QUORUM, but I've triple-checked.

If I stop writes to the table and reduce the load on Cassandra, then it (the Java program) works OK. Presto queries still fail, but that might be a Presto issue. Interestingly, they sometimes fail quickly, coming back with the 'Cassandra failure during read query' error almost immediately, but sometimes go through 140 million rows and then die.

Are regular table repairs required when using LOCAL_QUORUM? I see no nodes down and no disk failures.


-Joe

On 12/14/2020 9:41 AM, Joe Obernberger wrote:

Thanks all for the help on this. I've changed all my writes to LOCAL_QUORUM, and same with reads. Under a constant load of doing writes to a table and reads from the same table, I'm still getting the:
DEBUG [ReadRepairStage:372] 2020-12-14 09:36:09,002 ReadCallback.java:244 - Digest mismatch:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-7287062361589376757, 44535f313034335f333332353839305f323032302d31322d31325430302d31392d33312e3330335a) (054250ecd7170b1707ec36c6f1798ed0 vs 5752eec36bff050dd363b7803c500a95)
        at org.apache.cassandra.service.DigestResolver.compareResponses(DigestResolver.java:92) ~[apache-cassandra-3.11.9.jar:3.11.9]
        at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:235) ~[apache-cassandra-3.11.9.jar:3.11.9]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_272]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_272]
        at org.apache.cassandra.concurrent.NamedThreadFactory.lambda$threadLocalDeallocator$0(NamedThreadFactory.java:84) [apache-cassandra-3.11.9.jar:3.11.9]
        at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_272]
Under load this happens a lot; several times a second on each of the server nodes. I started with a new table and under light load it worked wonderfully, with no issues. But under heavy load, it still occurs. Is there a different setting?
Also, when this happens, I cannot query the table from Presto, as I then get the familiar:
"Query 20201214_143949_00000_b3fnt failed: Cassandra timeout during read query at consistency LOCAL_QUORUM (2 responses were required but only 1 replica responded)"
Changing Presto to use ONE results in an error about 1 response being required, but only 1 responding.
Any ideas?  Things to try?  Thanks!
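
For readers unfamiliar with the log line above: a digest mismatch means two replicas returned different hashes for the same partition, which then triggers a read repair. The following is only a simplified sketch of that comparison, not Cassandra's code. The class `DigestCheck` and its methods are hypothetical, the payload strings are made up, and MD5 is an assumption chosen because the logged digests are 32 hex characters long.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration only; not part of Cassandra.
final class DigestCheck {

    // Returns the MD5 hash of the payload as 32 lowercase hex characters.
    static String md5Hex(String payload) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(payload.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 must be available on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    // True when both replicas hold identical data for the partition.
    static boolean digestsMatch(String replicaA, String replicaB) {
        return md5Hex(replicaA).equals(md5Hex(replicaB));
    }

    public static void main(String[] args) {
        // Made-up row contents: replica B has not yet received the latest write.
        String replicaA = "value=42,writetime=1002";
        String replicaB = "value=41,writetime=1001";
        System.out.println(md5Hex(replicaA) + " vs " + md5Hex(replicaB));
        System.out.println("match: " + digestsMatch(replicaA, replicaB));
    }
}
```

Under a constant stream of writes, some replicas will briefly lag behind others, so mismatches of this kind at DEBUG level are not by themselves proof of lost data.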

-Joe

On 12/3/2020 12:49 AM, Erick Ramirez wrote:
    Thank you Steve - once I have the key, how do I get to a node?

Run this command to determine which replicas own the partition:

$ nodetool getendpoints <keyspace> <table> <partition_key>
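
The key printed inside DecoratedKey(...) in the digest mismatch log quoted earlier in this thread is the hex encoding of the serialized partition key, so it has to be turned back into text before it can be passed to nodetool. A minimal sketch, assuming the table has a single text partition key column (composite keys are encoded differently and would not decode this way); `HexKeyDecoder` is a hypothetical helper, not part of Cassandra or nodetool.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only.
final class HexKeyDecoder {

    // Converts a hex string such as "44535f" into the UTF-8 text it encodes ("DS_").
    static String decode(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException("hex string must have an even length");
        }
        byte[] bytes = new byte[hex.length() / 2];
        for (int i = 0; i < bytes.length; i++) {
            // Each pair of hex digits is one byte of the serialized key.
            bytes[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Key bytes copied from the DecoratedKey(...) in the digest mismatch log line.
        String hex = "44535f313034335f333332353839305f"
                   + "323032302d31322d31325430302d31392d33312e3330335a";
        System.out.println(decode(hex));
    }
}
```

For the key in that log line this prints DS_1043_3325890_2020-12-12T00-19-31.303Z, which is the value to supply as the partition key argument.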

    So if the propagation has not taken place and a node doesn't have
    the data and is the first to 'be asked' the client will get no data?
That's correct. It will not return data it doesn't have when querying with a consistency of ONE. There are limited cases where ONE is applicable. In most cases, a strong consistency of LOCAL_QUORUM is recommended to avoid the scenario you described. Cheers!
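
The reasoning behind that recommendation can be checked with simple arithmetic: a read is guaranteed to touch at least one replica that received the latest write whenever the read and write replica counts together exceed the replication factor. A rough sketch, assuming a single datacenter and a replication factor of 3 (the thread does not state the actual value); `QuorumMath` and its methods are hypothetical names.

```java
// Hypothetical illustration only; the replication factor of 3 is an assumption.
final class QuorumMath {

    // Quorum size for a replication factor: floor(rf / 2) + 1.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // True when every read set must intersect every write set: R + W > RF.
    static boolean readsSeeLatestWrite(int rf, int writeReplicas, int readReplicas) {
        return writeReplicas + readReplicas > rf;
    }

    public static void main(String[] args) {
        int rf = 3; // assumed replication factor
        int q = quorum(rf);
        System.out.println("quorum(" + rf + ") = " + q);
        System.out.println("QUORUM write + QUORUM read overlap: "
                + readsSeeLatestWrite(rf, q, q));
        System.out.println("ONE write + ONE read overlap: "
                + readsSeeLatestWrite(rf, 1, 1));
    }
}
```

With these assumptions the quorum is 2, which matches the "2 responses were required" wording in the LOCAL_QUORUM timeout reported above. Writing and reading at ONE gives 1 + 1 = 2, which does not exceed 3, so a read may land on a replica that has not yet received the write.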
