I made a dtest for easier reproduction and filed https://issues.apache.org/jira/browse/CASSANDRA-5223

On 1 February 2013 15:14, Alexei Bakanov <russ...@gmail.com> wrote:
> Hi again,
>
> Once I started playing with CCM it was hard to stop; it's such a great tool.
> My issue with secondary indexes is the following: neither an explicit
> 'nodetool repair' nor the implicit hinted handoffs/read repairs resolve
> inconsistencies in the data I get from secondary indexes.
> I observe this for both one- and two-datacenter deployments, independent
> of caching settings. Rebuilding the index, dropping and recreating it, or
> restarting nodes doesn't help.
>
> In the following scenario I start up 2 nodes and insert some rows with
> CL.ONE. During this process I deliberately stop and start the nodes in
> order to trigger inconsistencies.
> I then query all data by its index with read CL.ONE and stop if I see
> that data is missing. I see that none of the C* repair mechanisms work for
> secondary indexes.
>
> $ ccm create --cassandra-version 1.2.1 --nodes 2 -p RandomPartitioner test2ndIndexRepair
> $ ccm start
> $ ccm node1 cli
>   -> create keyspace and column family (please find schemas attached)
> $ python populate_repair.py    (in first terminal)
> $ ccm node1 stop; sleep 10; ccm node1 start    (in second terminal, while populate_repair.py runs)
> $ ccm node2 stop; sleep 10; ccm node2 start    (in second terminal, while populate_repair.py runs. Hinted handoffs do the work, but unfortunately not on secondary indexes)
>
> $ python fetcher_repair.py
> ....
> 254
> 255
> 256
> Traceback (most recent call last):
>   File "fetcher_repair.py", line 19, in <module>
>     raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> Exception: missing rows for userId 256, data length is 0
>
> $ ccm cli
> [default@unknown] use testks;
> Authenticated to keyspace: testks
> [default@testks] get cf1 where 'indexedColumn'='userId_256';
>
> 0 Row Returned.
> Elapsed time: 47 msec(s).
>
> $ python fetcher_repair.py    (running one more time in hope that 'read repair' kicked in after the last query, but unfortunately no)
> ....
> 254
> 255
> 256
> Traceback (most recent call last):
>   File "fetcher_repair.py", line 19, in <module>
>     raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> Exception: missing rows for userId 256, data length is 0
>
> $ ccm node1 repair
> $ ccm node2 repair
> $ ccm cli
>
> [default@unknown] use testks;
> Authenticated to keyspace: testks
> [default@testks] get cf1 where 'indexedColumn'='userId_256';
>
> 0 Row Returned.
>
>
> Both cassandra instances run with -Xms1927M -Xmx1927M -Xmn400M
>
> Thanks for help.
>
> Best regards,
> Alexei
>
> ------START cassandra-cli schemas ------------
> create keyspace testks
>   with placement_strategy = 'NetworkTopologyStrategy'
>   and strategy_options = {datacenter1 : 2}
>   and durable_writes = true;
>
> use testks;
>
> create column family cf1
>   with column_type = 'Standard'
>   and comparator = 'AsciiType'
>   and default_validation_class = 'UTF8Type'
>   and key_validation_class = 'UTF8Type'
>   and read_repair_chance = 1.0
>   and dclocal_read_repair_chance = 1.0
>   and gc_grace = 864000
>   and min_compaction_threshold = 4
>   and max_compaction_threshold = 32
>   and replicate_on_write = true
>   and compaction_strategy = 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'
>   and caching = 'KEYS_ONLY'
>   and column_metadata = [
>     {column_name : 'indexedColumn',
>     validation_class : UTF8Type,
>     index_name : 'INDEX1',
>     index_type : 0}]
>   and compression_options = {'sstable_compression' : 'org.apache.cassandra.io.compress.SnappyCompressor'};
> ------FINISH cassandra-cli schemas ------------
>
> ------START populate_repair.py ----------
> import datetime
> from pycassa.batch import Mutator
>
> import pycassa
>
> pool = pycassa.ConnectionPool('testks', timeout=5,
>                               server_list=['127.0.0.1:9160', '127.0.0.2:9160'])
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for userId in xrange(0, 2000):
>     print userId
>     b = Mutator(pool, queue_size=200)
>     for itemId in xrange(20):
>         rowKey = 'userId_%s:itemId_%s'%(userId, itemId)
>         for message_number in xrange(10):
>             b.insert(cf, rowKey, {'indexedColumn': 'userId_%s'%userId,
>                                   str(message_number): str(message_number)})
>     b.send()
>
> pool.dispose()
> ------FINISH populate_repair.py ----------
>
> ------START fetcher_repair.py ----------
> import pycassa
> from pycassa.columnfamily import ColumnFamily
> from pycassa.pool import ConnectionPool
> from pycassa.index import *
>
> pool = pycassa.ConnectionPool('testks', server_list=['127.0.0.1:9160',
>                                                      '127.0.0.2:9160'])
> cf = pycassa.ColumnFamily(pool, 'cf1')
>
> for userId in xrange(2000):
>     print userId
>     index_expr = create_index_expression('indexedColumn', 'userId_%s'%userId)
>     index_clause = create_index_clause([index_expr], count=10000000)
>     data = list(cf.get_indexed_slices(index_clause=index_clause))
>     if len(data) != 20:
>         raise Exception('missing rows for userId %s, data length is %d'%(userId, len(data)))
> pool.dispose()
>
> ------FINISH fetcher_repair.py ----------
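
For anyone narrowing this down further: the fetcher only reports how many rows the index returned, not which row keys are absent. Here is a minimal sketch of that comparison, with no cluster needed. The helper names `expected_row_keys` and `missing_row_keys` are hypothetical (not part of pycassa or the scripts above), and it assumes the index query result is an iterable of (row_key, columns) pairs, which is my understanding of what `get_indexed_slices` yields.

```python
def expected_row_keys(user_id, items_per_user=20):
    """Row keys that populate_repair.py writes for one user.

    The key format and the default of 20 items per user are taken
    from the populate script quoted above.
    """
    return set('userId_%s:itemId_%s' % (user_id, item_id)
               for item_id in range(items_per_user))


def missing_row_keys(user_id, indexed_rows, items_per_user=20):
    """Return the expected row keys that an index query did not return.

    indexed_rows is assumed to be an iterable of (row_key, columns)
    pairs, e.g. list(cf.get_indexed_slices(index_clause)).
    """
    found = set(row_key for row_key, _columns in indexed_rows)
    return sorted(expected_row_keys(user_id, items_per_user) - found)


if __name__ == '__main__':
    # Simulate the failure above: the index returns nothing for userId 256.
    for row_key in missing_row_keys(256, []):
        print(row_key)
```

Reading each reported key directly by row key (for example with `cf.get(row_key)`) would then show whether the base row exists while only its index entry is missing.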