Re: secondary indexes TTL - strange issues

aaron morton Sun, 16 Sep 2012 18:46:23 -0700

> Data gets inserted and is accessible via index queries for some time. At some 
> point the indexes are completely empty and start filling again (while new 
> data enters the system).
If you can reproduce this, please create a ticket at 
https://issues.apache.org/jira/browse/CASSANDRA .


If you can include DEBUG level logs that would be helpful. 
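
When going through the logs, it may help to put numbers on compaction lines like the one quoted further down. What follows is only a rough Python sketch: the regular expression and the helper name `summarize_compaction` are assumptions made for illustration and are not part of any Cassandra tooling. It parses the quoted line (re-joined onto one line) and recomputes the size ratio and the throughput.

```python
import re

# Compaction log line from the quoted message, with the mail client's
# hard line wraps removed.
LINE = (
    "INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 "
    "CompactionTask.java (line 221) Compacted to "
    "[/var/lib/cassandra/data/Eventstore/EventsByItem/"
    "Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  "
    "78,623,000 to 373,348 (~0% of original) bytes for 83 keys "
    "at 0.000280MB/s.  Time: 1,272,883ms."
)

# Hypothetical pattern, written only against the line format shown above.
PATTERN = re.compile(
    r"(?P<before>[\d,]+) to (?P<after>[\d,]+) \(~\d+% of original\) bytes "
    r"for (?P<keys>[\d,]+) keys at (?P<rate>[\d.]+)MB/s\.\s+"
    r"Time: (?P<ms>[\d,]+)ms"
)


def _to_int(text):
    """Convert a number with thousands separators, e.g. '78,623,000'."""
    return int(text.replace(",", ""))


def summarize_compaction(line):
    """Return the sizes, ratio and throughput of one compaction log line.

    Returns None if the line does not match the assumed format.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    before = _to_int(match.group("before"))
    after = _to_int(match.group("after"))
    seconds = _to_int(match.group("ms")) / 1000.0
    return {
        "bytes_before": before,
        "bytes_after": after,
        "keys": _to_int(match.group("keys")),
        "ratio_percent": 100.0 * after / before,
        "seconds": seconds,
        # Output bytes per second, expressed in MB of 1024 * 1024 bytes.
        "mb_per_s_from_output": after / (1024.0 * 1024.0) / seconds,
    }


if __name__ == "__main__":
    for name, value in sorted(summarize_compaction(LINE).items()):
        print(name, value)
```

For this line the output is under half a percent of the input size, and the compaction ran for roughly 21 minutes. The logged 0.000280MB/s agrees with output bytes divided by elapsed time, so the low rate mostly reflects how little data survived rather than how much was read.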

Cheers

-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 14/09/2012, at 10:08 PM, Roland Gude <roland.g...@ez.no> wrote:

> I am not sure it is compacting an old file: the same thing happens every time 
> I rebuild the index. New files appear, get compacted and vanish.
>  
> We have set up a new, smaller cluster with fresh data. The same thing happens 
> there as well: data gets inserted and is accessible via index queries for some 
> time. At some point the indexes are completely empty and start filling again 
> (while new data enters the system).
>  
> I am currently testing with SizeTiered on both the fresh set and the imported 
> set.
>  
> For the fresh set (which is significantly smaller), first results suggest that 
> the issue does not happen with SizeTieredCompaction. I have not yet tested 
> everything that comes to mind and will update if something new comes up.
>  
> As for the failing query, it is from the CLI:
> get EventsByItem where 00000003-0000-1000-0000-000000000000=utf8('someValue');
> 00000003-0000-1000-0000-000000000000 is a TimeUUID we use as a marker for a 
> time series.
> (Equivalent queries with Astyanax and Hector fail as well.)
>  
> This is a cf with the issue:
>  
> create column family EventsByItem
>   with column_type = 'Standard'
>   and comparator = 'TimeUUIDType'
>   and default_validation_class = 'BytesType'
>   and key_validation_class = 'BytesType'
>   and read_repair_chance = 0.5
>   and dclocal_read_repair_chance = 0.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'
>   and caching = 'NONE'
>   and column_metadata = [
>     {column_name : '00000000-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_mandatorIndex',
>     index_type : 0},
>     {column_name : '00000002-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_itemidIndex',
>     index_type : 0},
>     {column_name : '00000003-0000-1000-0000-000000000000',
>     validation_class : BytesType,
>     index_name : 'ebi_eventtypeIndex',
>     index_type : 0}]
>   and compression_options={sstable_compression:SnappyCompressor, chunk_length_kb:64};
>  
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Friday, 14 September 2012 10:46
> To: user@cassandra.apache.org
> Subject: Re: secondary indexes TTL - strange issues
>  
>> INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
> There are a lot of weird things here. 
> It could be leveled compaction compacting an older file for the first time. 
> But that would be a guess. 
>  
>> Rebuilding the index gives us back the data for a couple of minutes - then it 
>> vanishes again.
> Are you able to do a test with SizeTieredCompaction?
>  
> Are you able to replicate the problem with a fresh testing CF and some test 
> data?
>  
> If it's only a problem with imported data, can you provide a sample of the 
> failing query? And maybe the CF definition?
>  
> Cheers
>  
>  
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 14/09/2012, at 2:46 AM, Roland Gude <roland.g...@ez.no> wrote:
> 
> 
> Hi,
>  
> we have been running a system on Cassandra 0.7 heavily relying on secondary 
> indexes for columns with TTL.
> This has been working like a charm, but we are trying hard to move forward 
> with Cassandra and are struggling at that point:
>  
> When we put our data into a new cluster (any 1.1.x version, currently 1.1.5), 
> rebuild the indexes and run our system, everything seems to work well, until 
> at some point index queries no longer return any data at all 
> (note that the TTL is still several months away from expiring).
> Rebuilding the index gives us back the data for a couple of minutes - then it 
> vanishes again.
>  
> What seems strange is that compaction apparently is very aggressive:
>  
> INFO [CompactionExecutor:181] 2012-09-13 12:58:37,443 CompactionTask.java (line 221) Compacted to [/var/lib/cassandra/data/Eventstore/EventsByItem/Eventstore-EventsByItem.ebi_eventtypeIndex-he-10-Data.db,].  78,623,000 to 373,348 (~0% of original) bytes for 83 keys at 0.000280MB/s.  Time: 1,272,883ms.
>  
>  
> Actually we have switched to LeveledCompaction. Could it be that leveled 
> compaction does not play nice with indexes?
>  
>
