We have encountered a query inconsistency problem wherein the following query returns different results sporadically with invalid values for a timestamp field looking like the field is uninitialized (a zero timestamp) in the query results.
Attempts to repair and compact have not changed the results. select "subscriberId","sensorUnitId","sensorId","time" from "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND "sensorId"=0 ORDER BY "time" LIMIT 10; Invalid Query Results subscriberId sensorUnitId sensorId time JASKAN 0 0 2015-05-24 2:09 JASKAN 0 0 1969-12-31 19:00 JASKAN 0 0 2016-01-21 2:10 JASKAN 0 0 2016-01-21 2:10 JASKAN 0 0 2016-01-21 2:10 JASKAN 0 0 2016-01-21 2:11 JASKAN 0 0 2016-01-21 2:22 JASKAN 0 0 2016-01-21 2:22 JASKAN 0 0 2016-01-21 2:22 JASKAN 0 0 2016-01-21 2:22 Valid Query Results subscriberId sensorUnitId sensorId time JASKAN 0 0 2015-05-24 2:09 JASKAN 0 0 2015-05-24 2:09 JASKAN 0 0 2015-05-24 2:10 JASKAN 0 0 2015-05-24 2:10 JASKAN 0 0 2015-05-24 2:10 JASKAN 0 0 2015-05-24 2:10 JASKAN 0 0 2015-05-24 2:11 JASKAN 0 0 2015-05-24 2:13 JASKAN 0 0 2015-05-24 2:13 JASKAN 0 0 2015-05-24 2:14 We have confirmed that the 1969-12-31 timestamp is not within the data based on running and number of queries so it looks like the invalid timestamp value is generated by the query. The query below returns no row. select * from "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND "sensorId"=0 AND time='1969-12-31 19:00:00-0500'; No logs are coming out but the following was observed intermittently in the tracing output, but not correlated to the invalid query results: Digest mismatch: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-7563144029910940626, 00064a41534b414e000004000000000000040000000000) (be22d379c18f75c2f51dd6942d2f9356 vs da4e95d571b41303b908e0c5c3fff7ba) [ReadRepairStage:3179] | 2016-03-29 23:12:35.025000 | 192.168.10.10 | An error from the debug log that might be related is: org.apache.cassandra.service.DigestMismatchException: Mismatch for key DecoratedKey(-4908797801227889951, 4a41534b414e) (6a6c8ab013d7757e702af50cbdae045c vs 2ece61a01b2a640ac10509f4c49ae6fb) at org.apache.cassandra.service.DigestResolver.resolve(DigestResolver.java:85) ~[apache-cassandra-3.0.3.jar:3.0.3] at org.apache.cassandra.service.ReadCallback$AsyncRepairRunner.run(ReadCallback.java:225) ~[apache-cassandra-3.0.3.jar:3.0.3] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_74] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_74] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_74] The tracing files are attached and seem to show that in the failed case, content is skipped because of tombstones if we understand it correctly. This could be an inconsistency problem on 192.168.10.9 Unfortunately, attempts to compact on 192.168.10.9 only give the following error without any stack trace detail and are not fixed with repair. root@cutthroat:/usr/local/bin/analyzer/bin# nodetool compact error: null -- StackTrace -- java.lang.ArrayIndexOutOfBoundsException Any suggestions on how to fix or what to search for would be much appreciated. Thanks, Jason
Tracing session: 09e26410-f626-11e5-8b85-9b8e819c8182 activity | timestamp | source | source_elapsed -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+---------------- Execute CQL3 query | 2016-03-29 23:18:13.969000 | 192.168.10.10 | 0 Parsing select "subscriberId","sensorUnitId","sensorId","time" from "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND "sensorId"=0 ORDER BY "time" LIMIT 10; [SharedPool-Worker-2] | 2016-03-29 23:18:13.970000 | 192.168.10.10 | 181 READ message received from /192.168.10.10 [MessagingService-Incoming-/192.168.10.10] | 2016-03-29 23:18:13.970000 | 192.168.10.9 | 20 Preparing statement [SharedPool-Worker-2] | 2016-03-29 23:18:13.970000 | 192.168.10.10 | 400 Executing single-partition query on edgeTransitionIndex [SharedPool-Worker-3] | 2016-03-29 23:18:13.970000 | 192.168.10.9 | 75 reading data from /192.168.10.9 [SharedPool-Worker-2] | 2016-03-29 23:18:13.970000 | 192.168.10.10 | 681 Acquiring sstable references [SharedPool-Worker-3] | 2016-03-29 23:18:13.970000 | 192.168.10.9 | 85 Sending READ message to /192.168.10.9 [MessagingService-Outgoing-/192.168.10.9] | 2016-03-29 23:18:13.970001 | 192.168.10.10 | 806 Key cache hit for sstable 17164 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 114 Key cache hit for sstable 13046 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 130 Key cache hit for sstable 8003 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 142 Key cache hit for sstable 2836 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 152 Key cache hit for sstable 253 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 161 Key cache hit for sstable 27 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 232 Key cache hit for sstable 26 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 331 Key cache hit for sstable 25 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 408 Key cache hit for sstable 24 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 483 Skipped 0/9 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 492 Merging data from memtables and 9 sstables [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 500 Read 10 live and 0 tombstone cells [SharedPool-Worker-3] | 2016-03-29 23:18:13.970001 | 192.168.10.9 | 869 Enqueuing response to /192.168.10.10 [SharedPool-Worker-3] | 2016-03-29 23:18:13.970002 | 192.168.10.9 | 899 Sending REQUEST_RESPONSE message to /192.168.10.10 [MessagingService-Outgoing-/192.168.10.10] | 2016-03-29 23:18:13.971000 | 192.168.10.9 | 1009 REQUEST_RESPONSE message received from /192.168.10.9 [MessagingService-Incoming-/192.168.10.9] | 2016-03-29 23:18:13.972000 | 192.168.10.10 | 2214 Processing response from /192.168.10.9 [SharedPool-Worker-4] | 2016-03-29 23:18:13.972000 | 192.168.10.10 | 2285 Request complete | 2016-03-29 23:18:13.971415 | 192.168.10.10 | 2415
Tracing session: 854c7a00-f626-11e5-8b85-9b8e819c8182 activity | timestamp | source | source_elapsed -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+---------------+---------------- Execute CQL3 query | 2016-03-29 23:21:41.024000 | 192.168.10.10 | 0 Parsing select "subscriberId","sensorUnitId","sensorId","time" from "sensorReadingIndex" where "subscriberId"='JASKAN' AND "sensorUnitId"=0 AND "sensorId"=0 ORDER BY "time" LIMIT 10; [SharedPool-Worker-3] | 2016-03-29 23:21:41.024000 | 192.168.10.10 | 118 Preparing statement [SharedPool-Worker-3] | 2016-03-29 23:21:41.024000 | 192.168.10.10 | 265 Executing single-partition query on edgeTransitionIndex [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 602 Acquiring sstable references [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 631 Key cache hit for sstable 39 [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 654 Key cache hit for sstable 38 [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 666 Skipped 0/2 non-slice-intersecting sstables, included 0 due to tombstones [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 676 Merging data from memtables and 2 sstables [SharedPool-Worker-4] | 2016-03-29 23:21:41.025000 | 192.168.10.10 | 684 Read 10 live and 0 tombstone cells [SharedPool-Worker-4] | 2016-03-29 23:21:41.025001 | 192.168.10.10 | 907 Request complete | 2016-03-29 23:21:41.024947 | 192.168.10.10 | 947