Cassandra 2.1.2 node stuck on joining the cluster

2014-12-08 Thread Krzysztof Zarzycki
Hi Cassandra users,

I'm trying, but failing, to join a new (well, old, but wiped
out/decommissioned) node to an existing cluster.

Currently I have a cluster that consists of 2 nodes and runs C* 2.1.2. I
start a third node with 2.1.2; it gets to the joining state and bootstraps,
i.e. streams some data as shown by nodetool netstats, but after some time
it gets stuck. From that point on nothing gets streamed and the new node
stays in the joining state. I restarted the node multiple times; each time
it streamed more data, but then got stuck again.

Other facts:

   - I don't see any errors in the log on any of the nodes.
   - The connectivity seems fine: I can ping and netcat to port 7000 in all
   directions.
   - I have ~200 GB load per running node, replication factor 2, 16 tokens.
   - The load of the new node has reached around 300 GB by now.
   - The bootstrapping process stops in the middle of streaming some table,
   *always* after sending exactly 10MB of some SSTable, e.g.:

   $ nodetool netstats | grep -P -v "bytes\(100"
   Mode: NORMAL
   Bootstrap e0abc160-7ca8-11e4-9bc2-cf6aed12690e
       /192.168.200.16
           Sending 516 files, 12493900 bytes total
               /home/data/cassandra/data/some_ks/page_view-2a2410103f4411e4a266db7096512b05/some_ks-page_view-ka-13890-Data.db 10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16
   Read Repair Statistics:
   Attempted: 2016371
   Mismatch (Blocking): 0
   Mismatch (Background): 168721
   Pool Name    Active   Pending   Completed
   Commands        n/a         0    55802918
   Responses       n/a         0      425963
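
A minimal sketch of how such a stalled transfer could be spotted automatically, assuming the progress lines have the "<path> <sent>/<total> bytes(<pct>%)" shape shown in the output above. The function names (parse_progress, stalled_files) are hypothetical helpers, not part of nodetool; the idea is simply to take two netstats snapshots some minutes apart and report files whose byte counter did not move.

```python
import re

# Matches netstats progress lines of the (assumed) form:
#   <path> <sent>/<total> bytes(<pct>%) sent to idx:0/<address>
PROGRESS_RE = re.compile(r"(?P<path>\S+)\s+(?P<done>\d+)/(?P<total>\d+) bytes\(\d+%\)")


def parse_progress(text):
    """Return (path, bytes_done, bytes_total) for every progress line in text."""
    return [
        (m.group("path"), int(m.group("done")), int(m.group("total")))
        for m in PROGRESS_RE.finditer(text)
    ]


def stalled_files(before, after):
    """Unfinished files whose byte counter is identical in both snapshots."""
    earlier = {path: done for path, done, _ in parse_progress(before)}
    return [
        (path, done, total)
        for path, done, total in parse_progress(after)
        if done < total and earlier.get(path) == done
    ]


# The progress line from the post, used here as both snapshots.
sample = (
    "/home/data/cassandra/data/some_ks/"
    "page_view-2a2410103f4411e4a266db7096512b05/"
    "some_ks-page_view-ka-13890-Data.db "
    "10485760/167797071 bytes(6%) sent to idx:0/192.168.200.16"
)

for path, done, total in stalled_files(sample, sample):
    # 10485760 bytes is exactly 10 * 1024 * 1024, i.e. 10 MiB.
    print(path, done, total, done == 10 * 1024 * 1024)
```

In practice the two arguments would be the captured text of two separate nodetool netstats runs; the sample above only reuses the single line quoted in this message.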


I've been trying to join this node for several days and I don't know what
to do with it... I'll be grateful for any help!


Cheers,

Krzysztof Zarzycki


cassandra [2.1.0-rc4] does not filter data out when using WHERE clause on cluster column

2014-07-28 Thread Krzysztof Zarzycki
Hi everyone,
I have a weird, invalid situation with my cluster. I have a table on which
I'm running some SELECTs with a WHERE clause filtering on clustering
columns, but the rows are not getting filtered out.
Look:
 select * from page_view where website_id = xxx and user_id = 'some_user'
and page_id = 0; -- also tried page_id < 0 and page_id > 0

 website_id | user_id   | page_id  | ...
------------+-----------+----------+-----
 xxx        | some_user | 21044533 | ...
...more rows here, none with page_id=0

The filtering on other (partition) columns runs fine. Only the clustering
column is somewhat malfunctioning.

It is important how I got to this table:
1. I collected data with Cassandra version 2.0.8.
2. I snapshotted the data and removed the main copy from the cluster's data.
3. I upgraded the cluster to version 2.1.0-rc4.
4. I recreated the table schemas in the new version.
5. I ran sstableloader to load the data into the newly upgraded cluster.
6. I spotted the problem with filtering.
7. I tried running nodetool repair and nodetool upgradesstables -a; neither
helped.

Do you have any ideas how to fix my data?
It might be cumbersome, but it is possible to just copy the data out (e.g.
to JSON) and back in. But anyway, I believe it might be a bug in 2.1.0-rc4.
Do you at least have ideas on how to investigate what the problem is?
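
One possible first step for the investigation, sketched in Python under the assumption that the query result has already been exported to JSON as a list of row objects (the export itself is not shown, and rows_violating is a hypothetical helper). It re-applies the WHERE predicate on the client side and lists the rows that should have been filtered out, which gives concrete examples to attach to a bug report.

```python
import json


def rows_violating(rows, column, expected):
    """Return the rows whose `column` value differs from `expected`."""
    return [row for row in rows if row.get(column) != expected]


# Hypothetical export of the SELECT result; only the first row mirrors
# the values quoted in this message, the second is made up for contrast.
exported = json.loads(
    '[{"website_id": "xxx", "user_id": "some_user", "page_id": 21044533},'
    ' {"website_id": "xxx", "user_id": "some_user", "page_id": 0}]'
)

# The query asked for page_id = 0, so every row listed here was wrongly returned.
for row in rows_violating(exported, "page_id", 0):
    print(row)
```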

Thank you for any help,
-- Krzysztof Zarzycki