Well that was fun https://issues.apache.org/jira/browse/CASSANDRA-5079
Just testing my idea of a fix now.

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 20/12/2012, at 10:33 AM, aaron morton <aa...@thelastpickle.com> wrote:

>> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
> Done and I now get your repro case…
>
> [default@ks123] get cf1 where 'indexedColumn'='65';
>
> 0 Row Returned.
> Elapsed time: 1.44 msec(s).
>
>
> [default@ks123] get cf1 where 'indexedColumn'='66';
> -------------------
> RowKey: 66
> => (column=1, value=val, timestamp=1355952222439049, ttl=7884000)
> => (column=10, value=val, timestamp=1355952222439269, ttl=7884000)
> ...
> => (column=indexedColumn, value=66, timestamp=1355952223881937, ttl=7887600)
>
> Looking into it now.
>
> Thanks
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/12/2012, at 9:56 PM, Roland Gude <roland.g...@ez.no> wrote:
>
>> I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
>> Unfortunately, apart from me no one has been able to reproduce it yet.
>>
>> Check whether the data is available before and after compaction.
>> If you have leveled compaction it is hard to test because you cannot trigger
>> compaction manually.
>>
>> -----Original Message-----
>> From: Alexei Bakanov [mailto:russ...@gmail.com]
>> Sent: Wednesday, 19 December 2012 09:35
>> To: user@cassandra.apache.org
>> Subject: Re: TTL on SecondaryIndex Columns. A bug?
>>
>> I'm running on a single node on my laptop.
>> It looks like the point at which rows disappear from the index depends on JVM
>> memory settings. With more memory, more data has to be fed in before
>> things start disappearing.
>> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
>>
>> To be sure, try to get rows for 'indexedColumn'='1':
>>
>> [default@ks123] get cf1 where 'indexedColumn'='1';
>>
>> 0 Row Returned.
>>
>> Thanks
>>
>>
>> On 19 December 2012 05:15, aaron morton <aa...@thelastpickle.com> wrote:
>>> Thanks for the nice steps to reproduce.
>>>
>>> I ran this on my MBP using C* 1.1.7 and got the expected results, both
>>> gets returned a row.
>>>
>>> Were you running against a single node or a cluster? If a cluster, did
>>> you change the CL? cassandra-cli defaults to ONE.
>>>
>>> Cheers
>>>
>>> -----------------
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> New Zealand
>>>
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 18/12/2012, at 9:44 PM, Alexei Bakanov <russ...@gmail.com> wrote:
>>>
>>> Hi,
>>>
>>> We are having an issue with TTL on Secondary index columns. We get 0
>>> rows in return when running queries on indexed columns that have TTL.
>>> Everything works fine with small amounts of data, but when we get over
>>> a certain threshold it looks like older rows disappear from the index.
>>> In the example below we create 70 rows with 45k columns each + one
>>> indexed column with just the rowkey as value, so we have one row per
>>> indexed value. When the script is finished the index contains rows
>>> 66-69. Rows 0-65 are gone from the index.
>>> Using 'indexedColumn' without TTL fixes the problem.
>>>
>>>
>>> ------------- SCHEMA START -----------------
>>> create keyspace ks123
>>>   with placement_strategy = 'NetworkTopologyStrategy'
>>>   and strategy_options = {datacenter1 : 1}
>>>   and durable_writes = true;
>>>
>>> use ks123;
>>>
>>> create column family cf1
>>>   with column_type = 'Standard'
>>>   and comparator = 'AsciiType'
>>>   and default_validation_class = 'AsciiType'
>>>   and key_validation_class = 'AsciiType'
>>>   and read_repair_chance = 0.1
>>>   and dclocal_read_repair_chance = 0.0
>>>   and gc_grace = 864000
>>>   and min_compaction_threshold = 4
>>>   and max_compaction_threshold = 32
>>>   and replicate_on_write = true
>>>   and compaction_strategy =
>>>     'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>>>   and caching = 'KEYS_ONLY'
>>>   and column_metadata = [
>>>     {column_name : 'indexedColumn',
>>>     validation_class : AsciiType,
>>>     index_name : 'INDEX1',
>>>     index_type : 0}]
>>>   and compression_options = {'sstable_compression' :
>>>     'org.apache.cassandra.io.compress.SnappyCompressor'};
>>> ------------- SCHEMA FINISH -----------------
>>>
>>> ------------- POPULATE START -----------------
>>> from pycassa.batch import Mutator
>>> import pycassa
>>>
>>> pool = pycassa.ConnectionPool('ks123')
>>> cf = pycassa.ColumnFamily(pool, 'cf1')
>>>
>>> for rowKey in xrange(70):
>>>     # One batch per row: 45,000 data columns plus the indexed column.
>>>     b = Mutator(pool)
>>>     for datapoint in xrange(1, 45001):
>>>         b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
>>>     # The indexed column holds the row key and has a TTL one hour longer.
>>>     b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
>>>     print 'row %d' % rowKey
>>>     b.send()
>>>
>>> pool.dispose()
>>> ------------- POPULATE FINISH -----------------
>>>
>>> ------------- QUERY START -----------------
>>> [default@ks123] get cf1 where 'indexedColumn'='65';
>>>
>>> 0 Row Returned.
>>> Elapsed time: 2.38 msec(s).
>>>
>>> [default@ks123] get cf1 where 'indexedColumn'='66';
>>> -------------------
>>> RowKey: 66
>>> => (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
>>> ...
>>> => (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
>>> => (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
>>>
>>> 1 Row Returned.
>>> Elapsed time: 31 msec(s).
>>> ------------- QUERY FINISH -----------------
>>>
>>> This is all using Cassandra 1.1.7 with default settings.
>>>
>>> Best regards,
>>>
>>> Alexei Bakanov
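
The thread checks index entries one value at a time in cassandra-cli ('65', '66', '1'). A rough sketch of how the same check could be run over all 70 values is below. The helper names find_missing_index_entries and pycassa_lookup are hypothetical and not part of the original script; the pycassa part assumes the ks123/cf1 schema from the report and a reachable node, and has not been run against the reporter's setup.

```python
def find_missing_index_entries(lookup, values):
    """Return the values for which lookup(value) yields no rows.

    `lookup` is any callable that takes an indexed value and returns an
    iterable of matching rows (hypothetical interface for this sketch).
    """
    missing = []
    for value in values:
        if not list(lookup(value)):
            missing.append(value)
    return missing


def pycassa_lookup(cf):
    """Build a lookup that queries the 'indexedColumn' secondary index.

    Assumes `cf` is a pycassa.ColumnFamily for 'cf1' as in the report.
    """
    from pycassa.index import create_index_clause, create_index_expression

    def lookup(value):
        # Equality match on the indexed column; one row is enough to
        # show that the index entry still exists.
        expr = create_index_expression('indexedColumn', value)
        clause = create_index_clause([expr], count=1)
        return cf.get_indexed_slices(clause)

    return lookup
```

With a live connection, something like find_missing_index_entries(pycassa_lookup(cf), [str(i) for i in range(70)]) would list the values whose index entries are gone. Running it before and after a compaction would also cover the check Roland suggested.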