Nice job, Aaron. AFAIU, your patch now sets gc_before to the current time for secondary indexes. Before the patch it was set to Integer.MAX_VALUE, so the removeDeletedStandard function was testing if (column.getLocalDeletionTime() < MAX_VALUE), which is always true, and so it was removing all rows from the secondary index. Am I right?
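To make the reasoning above concrete, here is a minimal sketch in Python. It models only the single comparison described in this message and is not the actual Cassandra code, whose removeDeletedStandard may apply further conditions. The function name surviving_columns and the sample entries are hypothetical; the timestamp and TTL are taken from the output and script quoted later in this thread.

```python
INT_MAX = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def surviving_columns(columns, gc_before):
    """Return the columns that the simplified check would keep.

    `columns` maps a column name to its local deletion time (epoch seconds).
    A column is purged when local_deletion_time < gc_before, which is the
    comparison described in the message above (a simplified model only).
    """
    return {name: t for name, t in columns.items() if not t < gc_before}


# Fixed "current time" in epoch seconds, taken from a timestamp in this thread.
now = 1355952222
ttl = 7887600  # TTL of 'indexedColumn' in the populate script (seconds)

# Hypothetical index entries: one already expired, one still live.
index_entries = {
    "expired": now - 60,
    "live": now + ttl,
}

# gc_before = Integer.MAX_VALUE: every entry compares as "older", so all go.
before_patch = surviving_columns(index_entries, INT_MAX)

# gc_before = current time: only the entry that has really expired goes.
after_patch = surviving_columns(index_entries, now)

print(before_patch)  # {}
print(after_patch)   # {'live': 1363839822}
```

Under these assumptions, the still-live entry survives only when gc_before is the current time, which matches the behaviour described for the patch.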
--
Cyril SCETBON

On Dec 20, 2012, at 9:28 PM, aaron morton <aa...@thelastpickle.com> wrote:

Yes, but they will get compacted away again unless the patch is there. It's a small patch, so you should be able to apply it easily enough if you need a fix ASAP.

Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 20/12/2012, at 5:27 PM, B. Todd Burruss <bto...@gmail.com> wrote:

I believe we have hit this as well. If you use nodetool to rebuild_index, does it work?

On Wed, Dec 19, 2012 at 8:10 PM, aaron morton <aa...@thelastpickle.com> wrote:

Well, that was fun: https://issues.apache.org/jira/browse/CASSANDRA-5079

Just testing my idea of a fix now.

Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 20/12/2012, at 10:33 AM, aaron morton <aa...@thelastpickle.com> wrote:

> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M

Done, and I now get your repro case…

[default@ks123] get cf1 where 'indexedColumn'='65';

0 Row Returned.
Elapsed time: 1.44 msec(s).

[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355952222439049, ttl=7884000)
=> (column=10, value=val, timestamp=1355952222439269, ttl=7884000)
...
=> (column=indexedColumn, value=66, timestamp=1355952223881937, ttl=7887600)

Looking into it now.

Thanks
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 19/12/2012, at 9:56 PM, Roland Gude <roland.g...@ez.no> wrote:

I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
Unfortunately, apart from me no one has yet been able to reproduce it.
Check whether the data is available before and after compaction.
If you use leveled compaction, it is hard to test because you cannot trigger compaction manually.

-----Original Message-----
From: Alexei Bakanov [mailto:russ...@gmail.com]
Sent: Wednesday, 19 December 2012 09:35
To: user@cassandra.apache.org
Subject: Re: TTL on SecondaryIndex Columns. A bug?

I'm running on a single node on my laptop.
It looks like the point at which rows disappear from the index depends on the JVM memory settings. With more memory, more data has to be fed in before things start disappearing.
Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M

To be sure, try to get rows for 'indexedColumn'='1':

[default@ks123] get cf1 where 'indexedColumn'='1';

0 Row Returned.

Thanks

On 19 December 2012 05:15, aaron morton <aa...@thelastpickle.com> wrote:

Thanks for the nice steps to reproduce.

I ran this on my MBP using C* 1.1.7 and got the expected results: both gets returned a row.

Were you running against a single node or a cluster? If a cluster, did you change the CL? cassandra-cli defaults to ONE.

Cheers
-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand
@aaronmorton
http://www.thelastpickle.com

On 18/12/2012, at 9:44 PM, Alexei Bakanov <russ...@gmail.com> wrote:

Hi,

We are having an issue with TTL on secondary index columns. We get 0 rows in return when running queries on indexed columns that have a TTL. Everything works fine with small amounts of data, but once we get over a certain threshold it looks like older rows disappear from the index.
In the example below we create 70 rows with 45k columns each, plus one indexed column with just the row key as its value, so we have one row per indexed value.
When the script is finished, the index contains rows 66-69. Rows 0-65 are gone from the index.
Using 'indexedColumn' without a TTL fixes the problem.

------------- SCHEMA START -----------------
create keyspace ks123
  with placement_strategy = 'NetworkTopologyStrategy'
  and strategy_options = {datacenter1 : 1}
  and durable_writes = true;

use ks123;

create column family cf1
  with column_type = 'Standard'
  and comparator = 'AsciiType'
  and default_validation_class = 'AsciiType'
  and key_validation_class = 'AsciiType'
  and read_repair_chance = 0.1
  and dclocal_read_repair_chance = 0.0
  and gc_grace = 864000
  and min_compaction_threshold = 4
  and max_compaction_threshold = 32
  and replicate_on_write = true
  and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
  and caching = 'KEYS_ONLY'
  and column_metadata = [
    {column_name : 'indexedColumn',
    validation_class : AsciiType,
    index_name : 'INDEX1',
    index_type : 0}]
  and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
------------- SCHEMA FINISH -----------------

------------- POPULATE START -----------------
from pycassa.batch import Mutator
import pycassa

pool = pycassa.ConnectionPool('ks123')
cf = pycassa.ColumnFamily(pool, 'cf1')

for rowKey in xrange(70):
    b = Mutator(pool)
    for datapoint in xrange(1, 45001):
        b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
    b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
    print 'row %d' % rowKey
    b.send()
    b = Mutator(pool)

pool.dispose()
------------- POPULATE FINISH -----------------

------------- QUERY START -----------------
[default@ks123] get cf1 where 'indexedColumn'='65';

0 Row Returned.
Elapsed time: 2.38 msec(s).

[default@ks123] get cf1 where 'indexedColumn'='66';
-------------------
RowKey: 66
=> (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
...
=> (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
=> (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)

1 Row Returned.
Elapsed time: 31 msec(s).
------------- QUERY FINISH -----------------

This is all using Cassandra 1.1.7 with default settings.

Best regards,
Alexei Bakanov
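
As a quick sanity check on the numbers in the populate script from the original report, the arithmetic below converts the two TTL values to days and counts the columns written. It is only a sketch using the constants from that script; the variable names are mine. It suggests that neither TTL (about three months each) could have genuinely expired during a test run of a few minutes, so the missing index rows are unlikely to be ordinary TTL expiry.

```python
SECONDS_PER_DAY = 24 * 60 * 60

data_ttl = 7884000    # ttl used for the data columns in the populate script
index_ttl = 7887600   # ttl used for 'indexedColumn'

rows = 70                     # xrange(70)
data_columns_per_row = 45000  # xrange(1, 45001)

# Both TTLs are roughly three months long.
data_ttl_days = data_ttl / SECONDS_PER_DAY
index_ttl_days = index_ttl / SECONDS_PER_DAY

# The indexed column is set to expire one hour after the data columns.
ttl_gap_seconds = index_ttl - data_ttl

# Total columns written: the data columns plus one indexed column per row.
total_columns = rows * (data_columns_per_row + 1)

print(data_ttl_days)    # 91.25
print(ttl_gap_seconds)  # 3600
print(total_columns)    # 3150070
```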