Thanks for the response.

All nodes are using NTP.
Thanks,
Jason

      From: Kai Wang <dep...@gmail.com>
 To: user@cassandra.apache.org; Jason Kania <jason.ka...@ymail.com> 
 Sent: Wednesday, March 30, 2016 10:59 AM
 Subject: Re: Inconsistent query results and node state
   
Do you have NTP setup on all nodes?

On Tue, Mar 29, 2016 at 11:48 PM, Jason Kania <jason.ka...@ymail.com> wrote:

We have encountered a query inconsistency problem wherein the following query 
returns different results sporadically with invalid values for a timestamp 
field looking like the field is uninitialized (a zero timestamp) in the query 
results.

Attempts to repair and compact have not changed the results.

select "subscriberId","sensorUnitId","sensorId","time" from 
"sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND 
"sensorId"=0 ORDER BY "time" LIMIT 10;

Invalid Query Results
subscriberId    sensorUnitId    sensorId    time
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    1969-12-31 19:00
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:10
JASKAN    0    0    2016-01-21 2:11
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22
JASKAN    0    0    2016-01-21 2:22

Valid Query Results
subscriberId    sensorUnitId    sensorId    time
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    2015-05-24 2:09
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:10
JASKAN    0    0    2015-05-24 2:11
JASKAN    0    0    2015-05-24 2:13
JASKAN    0    0    2015-05-24 2:13
JASKAN    0    0    2015-05-24 2:14

We have confirmed that the 1969-12-31 timestamp is not within the data based on 
running and number of queries so it looks like the invalid timestamp value is 
generated by the query. The query below returns no row.

select * from "sensorReadingIndex" where "subscriberId"='JASKAN' AND 
"sensorUnitId"=0 AND "sensorId"=0 AND time='1969-12-31 19:00:00-0500';

No logs are coming out but the following was observed intermittently in the 
tracing output, but not correlated to the invalid query results:

 Digest mismatch: org.apache.cassandra.service.DigestMismatchException: 
Mismatch for key DecoratedKey(-7563144029910940626, 
00064a41534b414e000004000000000000040000000000) 
(be22d379c18f75c2f51dd6942d2f9356 vs da4e95d571b41303b908e0c5c3fff7ba) 
[ReadRepairStage:3179] | 2016-03-29 23:12:35.025000 | 192.168.10.10 |
An error from the debug log that might be related is:
org.apache.cassandra.service.DigestMismatchException: Mismatch for key 
DecoratedKey(-4908797801227889951, 4a41534b414e) 
(6a6c8ab013d7757e702af50cbdae045c vs 2ece61a01b2a640ac10509f4c49ae6fb)
        at 
org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) 
~[apache-cassandra-3.0.3.jar:3.0.3]
        at 
org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225)
 ~[apache-cassandra-3.0.3.jar:3.0.3]
        at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
[na:1.8.0_74]
        at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
[na:1.8.0_74]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74]

The tracing files are attached and seem to show that in the failed case, 
content is skipped because of tombstones if we understand it correctly. This 
could be an inconsistency problem on 192.168.10.9 Unfortunately, attempts to 
compact on 192.168.10.9 only give the following error without any stack trace 
detail and are not fixed with repair.

root@cutthroat:/usr/local/bin/analyzer/bin# nodetool compact
error: null
-- StackTrace --
java.lang.ArrayIndexOutOfBoundsException
Any suggestions on how to fix or what to search for would be much appreciated.
Thanks,
Jason







  

Reply via email to