Re: TTL on SecondaryIndex Columns. A bug?

Alexei Bakanov Wed, 19 Dec 2012 23:19:32 -0800

Great stuff, Aaron. Thanks for your time


On 20 December 2012 05:10, aaron morton <[email protected]> wrote:
> Well that was fun https://issues.apache.org/jira/browse/CASSANDRA-5079
>
> Just testing my idea of a fix now.
>
> Cheers
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 20/12/2012, at 10:33 AM, aaron morton <[email protected]> wrote:
>
> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
>
> Done, and I now get your repro case…
>
> [default@ks123] get cf1 where 'indexedColumn'='65';
>
> 0 Row Returned.
> Elapsed time: 1.44 msec(s).
>
>
> [default@ks123] get cf1 where 'indexedColumn'='66';
> -------------------
> RowKey: 66
> => (column=1, value=val, timestamp=1355952222439049, ttl=7884000)
> => (column=10, value=val, timestamp=1355952222439269, ttl=7884000)
> ...
> => (column=indexedColumn, value=66, timestamp=1355952223881937, ttl=7887600)
>
> Looking into it now.
>
> Thanks
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 19/12/2012, at 9:56 PM, Roland Gude <[email protected]> wrote:
>
> I think this might be https://issues.apache.org/jira/browse/CASSANDRA-4670
> Unfortunately, apart from me no one has been able to reproduce it yet.
>
> Check whether the data is available before and after compaction.
> If you use leveled compaction, this is hard to test because you cannot trigger
> compaction manually.
>
> -----Original Message-----
> From: Alexei Bakanov [mailto:[email protected]]
> Sent: Wednesday, 19 December 2012 09:35
> To: [email protected]
> Subject: Re: TTL on SecondaryIndex Columns. A bug?
>
> I'm running on a single node on my laptop.
> It looks like the point at which rows disappear from the index depends on JVM
> memory settings. With more memory, more data has to be fed in before
> things start disappearing.
> Please try to run Cassandra with -Xms1927M -Xmx1927M -Xmn400M
>
> To be sure, try to get rows for 'indexedColumn'='1':
>
> [default@ks123] get cf1 where 'indexedColumn'='1';
>
> 0 Row Returned.
>
> Thanks
>
>
> On 19 December 2012 05:15, aaron morton <[email protected]> wrote:
>
> Thanks for the nice steps to reproduce.
>
> I ran this on my MBP using C* 1.1.7 and got the expected results: both
> gets returned a row.
>
> Were you running against a single node or a cluster? If a cluster, did
> you change the CL? cassandra-cli defaults to ONE.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Developer
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/12/2012, at 9:44 PM, Alexei Bakanov <[email protected]> wrote:
>
> Hi,
>
> We are having an issue with TTL on Secondary index columns. We get 0
> rows in return when running queries on indexed columns that have TTL.
> Everything works fine with small amounts of data, but when we get over
> a certain threshold it looks like older rows disappear from the index.
> In the example below we create 70 rows with 45k columns each + one
> indexed column with just the rowkey as value, so we have one row per
> indexed value. When the script is finished the index contains rows
> 66-69. Rows 0-65 are gone from the index.
> Using 'indexedColumn' without TTL fixes the problem.
>
>
> ------------- SCHEMA START -----------------
> create keyspace ks123
> with placement_strategy = 'NetworkTopologyStrategy'
> and strategy_options = {datacenter1 : 1}  and durable_writes = true;
>
> use ks123;
>
> create column family cf1
> with column_type = 'Standard'
> and comparator = 'AsciiType'
> and default_validation_class = 'AsciiType'
> and key_validation_class = 'AsciiType'
> and read_repair_chance = 0.1
> and dclocal_read_repair_chance = 0.0
> and gc_grace = 864000
> and min_compaction_threshold = 4
> and max_compaction_threshold = 32
> and replicate_on_write = true
> and compaction_strategy =
> 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
> and caching = 'KEYS_ONLY'
> and column_metadata = [
>   {column_name : 'indexedColumn',
>   validation_class : AsciiType,
>   index_name : 'INDEX1',
>   index_type : 0}]
> and compression_options = {'sstable_compression' :
> 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ------------- SCHEMA FINISH -----------------
>
> ------------- POPULATE START -----------------
> import pycassa
> from pycassa.batch import Mutator
>
> pool = pycassa.ConnectionPool('ks123')
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for rowKey in xrange(70):
>   # One Mutator per row: it queues 45,000 data columns plus the indexed column.
>   b = Mutator(pool)
>   for datapoint in xrange(1, 45001):
>       b.insert(cf, str(rowKey), {str(datapoint): 'val'}, ttl=7884000)
>   # The indexed column holds the row key; its TTL is 3600 seconds longer.
>   b.insert(cf, str(rowKey), {'indexedColumn': str(rowKey)}, ttl=7887600)
>   print 'row %d' % rowKey
>   b.send()
>
> pool.dispose()
> ------------- POPULATE FINISH -----------------
>
> ------------- QUERY START -----------------
> [default@ks123] get cf1 where 'indexedColumn'='65';
>
> 0 Row Returned.
> Elapsed time: 2.38 msec(s).
>
> [default@ks123] get cf1 where 'indexedColumn'='66';
> -------------------
> RowKey: 66
> => (column=1, value=val, timestamp=1355818765548964, ttl=7884000)
> ...
> => (column=10087, value=val, timestamp=1355818766075538, ttl=7884000)
> => (column=indexedColumn, value=66, timestamp=1355818768119334, ttl=7887600)
>
> 1 Row Returned.
> Elapsed time: 31 msec(s).
> ------------- QUERY FINISH -----------------
>
> This is all using Cassandra 1.1.7 with default settings.
>
> Best regards,
>
> Alexei Bakanov
>
>
>
>
>
>
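
The queries in this thread spot-check single values ('1', '65', '66'). To see exactly which of the 70 indexed values have dropped out of the index, one option is to look up every value and list those that return no rows. The following is only a sketch under stated assumptions: `missing_from_index` and `pycassa_lookup` are hypothetical helper names, not part of pycassa, and `pycassa_lookup` assumes a running node with the ks123/cf1 schema from the original report. It uses pycassa's `get_indexed_slices` together with `create_index_expression` and `create_index_clause`.

```python
def missing_from_index(lookup, values):
    """Return the values for which the index lookup yields no rows.

    `lookup` is any callable that takes an indexed value and returns an
    iterable of matching rows (hypothetical helper, for illustration only).
    """
    return [v for v in values if not list(lookup(v))]


def pycassa_lookup(cf, column='indexedColumn'):
    """Build a lookup backed by a pycassa ColumnFamily (needs a live node)."""
    # Imported here so the checker above stays usable without pycassa installed.
    from pycassa.index import create_index_clause, create_index_expression

    def lookup(value):
        expr = create_index_expression(column, value)
        clause = create_index_clause([expr], count=1)
        # Only the presence of a row matters, so fetch a single column.
        return cf.get_indexed_slices(clause, column_count=1)

    return lookup
```

With the `cf` object from the populate script, `missing_from_index(pycassa_lookup(cf), [str(i) for i in range(70)])` would be expected to list '0' through '65' if the behaviour described in the report is present.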
