Issue created: CASSANDRA-4670 <https://issues.apache.org/jira/browse/CASSANDRA-4670>. I will attach debug logs ASAP.

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, 17 September 2012 03:46
To: user@cassandra.apache.org
Subject: Re: secondery indexes TTL - strange issues

> Data gets inserted and is accessible via index query for some time. At some point in time the indexes are completely empty and start filling again (while new data enters the system).

If you can reproduce this, please create a ticket on https://issues.apache.org/jira/browse/CASSANDRA . If you can include DEBUG level logs, that would be helpful.

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 10:08 PM, Roland Gude <roland.g...@ez.no> wrote:

I am not sure it is compacting an old file: the same thing happens every time I rebuild the index. New files appear, get compacted and vanish.

We have set up a new, smaller cluster with fresh data. The same thing happens here as well. Data gets inserted and is accessible via index query for some time. At some point in time the indexes are completely empty and start filling again (while new data enters the system).

I am currently testing with SizeTiered on both the fresh set and the imported set. For the fresh set (which is significantly smaller), first results suggest that the issue does not happen with SizeTieredCompaction. I have not yet tested everything that comes to mind and will update if something new comes up.

As for the failing query, it is from the CLI:

    get EventsByItem where 00000003-0000-1000-0000-000000000000=utf8('someValue');

00000003-0000-1000-0000-000000000000 is a TUUID we use as a marker for a TimeSeries.
(and equivalent queries with astyanax and hector as well)

This is a CF with the issue:

    create column family EventsByItem
      with column_type = 'Standard'
      and comparator = 'TimeUUIDType'
      and default_validation_class = 'BytesType'
      and key_validation_class = 'BytesType'
      and read_repair_chance = 0.5
      and dclocal_read_repair_chance = 0.0
      and gc_grace = 864000
      and min_compaction_threshold = 4
      and max_compaction_threshold = 32
      and replicate_on_write = true
      and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
      and caching = 'NONE'
      and column_metadata = [
        {column_name : '00000000-0000-1000-0000-000000000000', validation_class : BytesType, index_name : 'ebi_mandatorIndex', index_type : 0},
        {column_name : '00000002-0000-1000-0000-000000000000', validation_class : BytesType, index_name : 'ebi_itemidIndex', index_type : 0},
        {column_name : '00000003-0000-1000-0000-000000000000', validation_class : BytesType, index_name : 'ebi_eventtypeIndex', index_type : 0}]
      and compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Friday, 14 September 2012 10:46
To: user@cassandra.apache.org
Subject: Re: secondery indexes TTL - strange issues

> INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,]. 78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s. Time: 1,272,883ms.

There are a lot of weird things here. It could be levelled compaction compacting an older file for the first time. But that would be a guess.

> Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.

Are you able to do a test with SizeTieredCompaction? Are you able to replicate the problem with a fresh testing CF and some test data?
If it's only a problem with imported data, can you provide a sample of the failing query? And maybe the CF definition?

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 2:46 AM, Roland Gude <roland.g...@ez.no> wrote:

Hi,

we have been running a system on Cassandra 0.7 that relies heavily on secondary indexes for columns with TTL. This has been working like a charm, but we are trying hard to move forward with Cassandra and are struggling at this point:

When we put our data into a new cluster (any 1.1.x version, currently 1.1.5), rebuild the indexes and run our system, everything seems to work well until, at some point in time, index queries no longer return any data at all (note that the TTL will not expire for several months yet). Rebuilding the index gives us back the data for a couple of minutes - then it vanishes again.

What seems strange is that compaction apparently is very aggressive:

    INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,]. 78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s. Time: 1,272,883ms.

Actually we have switched to LeveledCompaction. Could it be that leveled compaction does not play nice with indexes?
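
As a side check on the compaction log line quoted in the message above, here is a minimal Python sketch (not part of the original thread) that parses the line and recomputes the size ratio and throughput. The regular expression and the function name `parse_compaction_line` are assumptions made for illustration and are not a Cassandra API. The sketch also assumes that the reported rate is output bytes per second expressed in MB of 1024*1024 bytes, which is consistent with the numbers in this particular line.

```python
import re

# The log line as quoted in the thread (line-wrap breaks removed).
LOG_LINE = (
    "INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 "
    "CompactionTask.java (line 221) Compacted to "
    "[/var/lib/cassandra/data/Eventstore/EventsByItem/"
    "Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,]. "
    "78,623,000 to 373,348 (~0% of original) bytes for 83 keys "
    "at 0.000280MB/s. Time: 1,272,883ms."
)

# Hypothetical pattern written for this one message format only.
PATTERN = re.compile(
    r"(?P<before>[\d,]+) to (?P<after>[\d,]+) \(~\d+% of original\) bytes "
    r"for (?P<keys>[\d,]+) keys at (?P<rate>[\d.]+)MB/s\.\s+"
    r"Time: (?P<ms>[\d,]+)ms"
)


def _to_int(text):
    """Convert a number with thousands separators, e.g. '78,623,000'."""
    return int(text.replace(",", ""))


def parse_compaction_line(line):
    """Extract the sizes and timing from a compaction summary line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a recognised compaction summary line")
    before = _to_int(match["before"])
    after = _to_int(match["after"])
    seconds = _to_int(match["ms"]) / 1000.0
    return {
        "bytes_before": before,
        "bytes_after": after,
        "keys": _to_int(match["keys"]),
        "seconds": seconds,
        "reported_mb_per_s": float(match["rate"]),
        # Output size as a percentage of the input size.
        "percent_of_original": 100.0 * after / before,
        # Assumption: rate = output bytes / elapsed time, in MiB per second.
        "computed_mb_per_s": after / seconds / (1024 * 1024),
    }


stats = parse_compaction_line(LOG_LINE)
for name, value in stats.items():
    print(name, "=", value)
```

With these inputs the output file is about 0.47% of the input size, which the log rounds to "~0%", and 373,348 bytes written over roughly 1,273 seconds works out to about 0.00028 MB/s, in line with the reported rate. In other words, the figures in the log line are internally consistent: the index SSTables shrank by more than 99% in a compaction that took about 21 minutes.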