Re: Consistency Level

Kamal Bahadur Wed, 04 Jan 2012 13:53:45 -0800

Hi Aaron,

Thanks for your response!


I re-ran the test case # 5. (Node 1 & 2 running, Node 3 & 4 down, Node 1
contains the data, CL ONE and RF 2). I was connected to Node 1 while I ran
the test. I still did not get any data. See below logs:

-----------------------------------------------------------
nodetool ring:
-----------------------------------------------------------

Address         DC          Rack        Status State   Load
Owns    Token

127605887595351923798765477786913079296
xxx.16.200.130  datacenter1 rack1       Up     Normal  584.97 KB
25.00%  0
xxx.16.202.118  datacenter1 rack1       Up     Normal  570.91 KB
25.00%  42535295865117307932921825928971026432
xxx.16.200.190  datacenter1 rack1       Down   Normal  593.51 KB
25.00%  85070591730234615865843651857942052864
xxx.16.200.217  datacenter1 rack1       Down   Normal  617.23 KB
25.00%  127605887595351923798765477786913079296

-----------------------------------------------------
Logs on Node 1
-----------------------------------------------------
DEBUG [Thread-7] 2012-01-04 13:44:00,665 IncomingTcpConnection.java (line
148) Version is now 2
 INFO [HintedHandoff:1] 2012-01-04 13:44:00,850 HintedHandOffManager.java
(line 314) Endpoint /172.16.200.190 died before hint delivery, aborting
DEBUG [HintedHandoff:1] 2012-01-04 13:44:00,851 HintedHandOffManager.java
(line 303) Checking remote schema before delivering hints
DEBUG [HintedHandoff:1] 2012-01-04 13:44:00,851 HintedHandOffManager.java
(line 295) schema for /172.16.200.217 matches local schema
DEBUG [HintedHandoff:1] 2012-01-04 13:44:00,851 HintedHandOffManager.java
(line 309) Sleeping 17296ms to stagger hint delivery
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,857 CassandraServer.java (line
670) scan
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,869 StorageProxy.java (line
889) restricted ranges for query [-1,-1] are [[-1,0],
(0,42535295865117307932921825928971026432],
(42535295865117307932921825928971026432,85070591730234615865843651857942052864],
(85070591730234615865843651857942052864,127605887595351923798765477786913079296],
(127605887595351923798765477786913079296,-1]]
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,869 StorageProxy.java (line
976) scan ranges are
[-1,0],(0,42535295865117307932921825928971026432],(42535295865117307932921825928971026432,85070591730234615865843651857942052864],(85070591730234615865843651857942052864,127605887595351923798765477786913079296],(127605887595351923798765477786913079296,-1]
DEBUG [Thread-7] 2012-01-04 13:44:00,876 IncomingTcpConnection.java (line
148) Version is now 2
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,881 ReadCallback.java (line 76)
Blockfor/repair is 1/false; setting up requests to /172.16.200.130
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,884 StorageProxy.java (line
1003) reading org.apache.cassandra.db.IndexScanCommand@c9f997 from /
172.16.200.130
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,884 StorageProxy.java (line
1003) reading org.apache.cassandra.db.IndexScanCommand@c9f997 from /
172.16.202.118
DEBUG [WRITE-/172.16.202.118] 2012-01-04 13:44:00,884
OutboundTcpConnection.java (line 162) attempting to connect to /
172.16.202.118
DEBUG [ReadStage:2] 2012-01-04 13:44:00,887 ColumnFamilyStore.java (line
1550) Primary scan clause is member
DEBUG [ReadStage:2] 2012-01-04 13:44:00,887 ColumnFamilyStore.java (line
1563) Expanding slice filter to entire row to cover additional expressions
DEBUG [ReadStage:2] 2012-01-04 13:44:00,887 ColumnFamilyStore.java (line
1605) Scanning index 'Audit_Log.member EQ kamal' starting with
DEBUG [ReadStage:2] 2012-01-04 13:44:00,893 SliceQueryFilter.java (line
123) collecting 0 of 100:
7a7a32323636373030303438303031:false:0@1325704860925009
DEBUG [ReadStage:2] 2012-01-04 13:44:00,893 ColumnFamilyStore.java (line
1617) fetched ColumnFamily(Audit_Log.Audit_Log_member_idx
[7a7a32323636373030303438303031:false:0@1325704860925009,])
DEBUG [ReadStage:2] 2012-01-04 13:44:00,894 IndexScanVerbHandler.java (line
46) Sending RangeSliceReply{rows=} to 171@/172.16.200.130
DEBUG [RequestResponseStage:1] 2012-01-04 13:44:00,895
ResponseVerbHandler.java (line 48) Processing response on a callback from
171@/172.16.200.130
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,895 ReadCallback.java (line 76)
Blockfor/repair is 1/false; setting up requests to /172.16.202.118
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,896 StorageProxy.java (line
1003) reading org.apache.cassandra.db.IndexScanCommand@10eeb26 from /
172.16.202.118
DEBUG [Thread-7] 2012-01-04 13:44:00,912 IncomingTcpConnection.java (line
148) Version is now 2
DEBUG [Thread-7] 2012-01-04 13:44:00,913 IncomingTcpConnection.java (line
148) Version is now 2
DEBUG [RequestResponseStage:3] 2012-01-04 13:44:00,913
ResponseVerbHandler.java (line 48) Processing response on a callback from
172@/172.16.202.118
DEBUG [RequestResponseStage:2] 2012-01-04 13:44:00,913
ResponseVerbHandler.java (line 48) Processing response on a callback from
173@/172.16.202.118
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,913 ReadCallback.java (line 76)
Blockfor/repair is 1/false; setting up requests to
DEBUG [pool-2-thread-1] 2012-01-04 13:44:00,913 ReadCallback.java (line
203) Live nodes  do not satisfy ConsistencyLevel (1 required)
-----------------------------------------------------
Logs on Node 2
-----------------------------------------------------
DEBUG [Thread-6] 2012-01-04 13:44:00,886 Gossiper.java (line 197) Setting
version 2 for /172.16.200.130
DEBUG [Thread-6] 2012-01-04 13:44:00,886 IncomingTcpConnection.java (line
105) set version for /172.16.200.130 to 2
DEBUG [Thread-6] 2012-01-04 13:44:00,895 IncomingTcpConnection.java (line
148) Version is now 2
DEBUG [ReadStage:2] 2012-01-04 13:44:00,906 ColumnFamilyStore.java (line
1550) Primary scan clause is member
DEBUG [ReadStage:3] 2012-01-04 13:44:00,906 ColumnFamilyStore.java (line
1550) Primary scan clause is member
DEBUG [ReadStage:2] 2012-01-04 13:44:00,906 ColumnFamilyStore.java (line
1563) Expanding slice filter to entire row to cover additional expressions
DEBUG [ReadStage:3] 2012-01-04 13:44:00,906 ColumnFamilyStore.java (line
1563) Expanding slice filter to entire row to cover additional expressions
DEBUG [ReadStage:2] 2012-01-04 13:44:00,906 ColumnFamilyStore.java (line
1605) Scanning index 'Audit_Log.member EQ kamal' starting with
DEBUG [ReadStage:3] 2012-01-04 13:44:00,907 ColumnFamilyStore.java (line
1605) Scanning index 'Audit_Log.member EQ kamal' starting with
DEBUG [ReadStage:2] 2012-01-04 13:44:00,909 SliceQueryFilter.java (line
123) collecting 0 of 100:
7a7a32323636373030303438303031:false:0@1325615757889011
DEBUG [ReadStage:2] 2012-01-04 13:44:00,910 ColumnFamilyStore.java (line
1617) fetched ColumnFamily(Audit_Log.Audit_Log_member_idx
[7a7a32323636373030303438303031:false:0@1325615757889011,])
DEBUG [ReadStage:3] 2012-01-04 13:44:00,910 SliceQueryFilter.java (line
123) collecting 0 of 100:
7a7a32323636373030303438303031:false:0@1325615757889011
DEBUG [ReadStage:3] 2012-01-04 13:44:00,911 ColumnFamilyStore.java (line
1617) fetched ColumnFamily(Audit_Log.Audit_Log_member_idx
[7a7a32323636373030303438303031:false:0@1325615757889011,])
DEBUG [ReadStage:2] 2012-01-04 13:44:00,911 IndexScanVerbHandler.java (line
46) Sending RangeSliceReply{rows=} to 172@/172.16.200.130
DEBUG [ReadStage:3] 2012-01-04 13:44:00,911 IndexScanVerbHandler.java (line
46) Sending RangeSliceReply{rows=} to 173@/172.16.200.130
DEBUG [Thread-4] 2012-01-04 13:44:01,664 IncomingTcpConnection.java (line
148) Version is now 2



On Wed, Jan 4, 2012 at 1:15 AM, aaron morton <aa...@thelastpickle.com>wrote:

> I've not spent much time with the secondary indexes, so a couple of
> questions.
>
> Whats is the output of nodetool ring ?
> Which node were you connected to when you did the get ?
> If you enable DEBUG logging what do the log messages from StorageProxy say
> that contain the string "scan ranges are" and "reading .. from ..."
>
> Now for the wild guessing….It's working as designed for a CL ONE
> request. Looking at test case 5 and *assuming* you were connected to node 2
> this is what I think is happening: A get indexed slice query without a
> start key does not know which nodes will contain the data. Reading the code
> it will consider the replicas for the minimum token as nodes to send the
> query to, for a CL ONE query it will only use one. If you were connected to
> node 2 the query would have executed only on node 2.
>
> This is where I get confused. What happens if you have 50 nodes, with RF
> 3, and you execute a get_indexed_slice at QUOURM with no start_key and the
> only rows that satisfy the query exist on nodes 47, 48 and 49. So they are
> a long way away from the minimum token, assuming SimpleStrategy and well
> ordered token ring.
>
> I think I've missed something, anyone ?
> Cheers
>
>   -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/01/2012, at 9:44 AM, Kamal Bahadur wrote:
>
> Hi Peter,
>
> To test, I wiped out all the data from Cassandra and inserted just one
> record. The row key is "7a7a32323636373030303438303031". I used
> getendpoints to see where my data is and double checked the same using
> sstable2json command.
>
> Since the RF is 2, the data is currently on Node 1 and Node 4 of my 4
> nodes cluster. I used cassandra-cli to query the data by using one of the
> secondary index but following are my results:
>
>   Test Node 1 Node 2 Node 3 Node 4 Got data back?  1 Up Up Up Up Yes  2 Up
> Up Up
> Yes  3 Up Up
> Up Yes  4
> Up Up Up Yes  5 Up Up
>
> No  6
>
> Up Up No  7 Up
>
> Up No
>
> It turns out that even though my consistency level is ONE, since I am
> using secondary index to query the data, at least 3 nodes has to be
> running. And out of these 3 running nodes, it works even if one nodes
> contains the data.
>
> Somewhere in the mailing I read that "Iterating through all of the rows
> matching an index clause on your cluster is guaranteed to touch N/RF of the
> nodes in your cluster, because each node only knows about data that is
> indexed locally."
>
> I am not sure what N/RF means in my case. Does it mean 4/2 = 2? where 4 is
> the number of nodes and 2 is the RF. If it is 2, why is it not returning
> any data when the two nodes that contains the data is running (test #7)?
>
> For my use case, I have to have a RF of 2 and should be able to query
> using secondary index with a CL of ONE. Is this possible when 2 nodes are
> down in a 4 nodes cluster? Is there any limitations on using secondary
> index?
>
> Thanks in advance.
>
> Thanks,
> Kamal
>
> On Thu, Dec 29, 2011 at 6:40 PM, Peter Schuller <
> peter.schul...@infidyne.com> wrote:
>
>> > Thanks for the response Peter! I checked everything and it look good to
>> me.
>> >
>> > I am stuck with this for almost 2 days now. Has anyone had this issue?
>>
>> While it is certainly possible that you're running into a bug, it
>> seems unlikely to me since it is the kind of bug that would affect
>> almost anyone if it is failing with Unavailable due to unrelated (not
>> in replica sets) nodes being down.
>>
>> Can you please post back with (1) the ring layout ('nodetool ring'),
>> and (2) the exact row key that you're testing with?
>>
>> You might also want to run with DEBUG level (modify
>> log4j-server.properties at the top) and the strategy (assuming you are
>> using NetworkTopologyStrategy) will log selected endpoints, and
>> confirm that it's indeed picking endpoints that you think it should
>> based on getendpoints.
>>
>> --
>> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>>
>
>
>

Re: Consistency Level

Reply via email to