Node decommission failed
Hi,

We are testing Cassandra and tried to remove a node from the cluster using nodetool decommission. The node transferred its data, then "died" for about 20 minutes without responding, then came back to life with a load of 50-100, stayed under heavy load for about 1 hour, and then returned to normal load.

It seems to have stopped receiving new data, but it is still in the cluster. The node we tried to remove is the third one:

root@dc-cassandra-03:~# nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address         DC           Rack   Status  State   Load     Owns    Token
                                                                     113427455640312821154458202477256070484
10.70.147.62    datacenter1  rack1  Up      Normal  7.14 GB  33.33%  0
10.208.51.64    datacenter1  rack1  Up      Normal  3.68 GB  33.33%  56713727820156410577229101238628035242
10.190.207.185  datacenter1  rack1  Up      Normal  3.54 GB  33.33%  113427455640312821154458202477256070484

It seems it is still part of the cluster. What should we do? Decommission again? How can we know the current state of the cluster?

Thanks!
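One way to check from a script whether a node is still listed is to parse the text that nodetool ring prints. The following is only a minimal sketch: it assumes the column layout shown in the output above (Address, DC, Rack, Status, State, Load, Owns, Token), which may differ between Cassandra versions. The names parse_ring and still_in_ring are hypothetical helpers, not part of nodetool, and the sample assumes that the third row is the node being removed.

```python
def parse_ring(output):
    """Parse `nodetool ring` text into a list of dicts, one per node row."""
    nodes = []
    for line in output.splitlines():
        parts = line.split()
        # Node rows start with an IPv4 address and have 9 whitespace-separated
        # fields (the load takes two fields, e.g. "7.14 GB"). The header,
        # the note line and the lone token line do not match and are skipped.
        if len(parts) != 9 or parts[0].count(".") != 3:
            continue
        address, dc, rack, status, state = parts[:5]
        nodes.append({
            "address": address,
            "dc": dc,
            "rack": rack,
            "status": status,
            "state": state,
            "load": " ".join(parts[5:7]),
            "owns": parts[7],
            "token": int(parts[8]),
        })
    return nodes


def still_in_ring(output, address):
    """Return True if the given address still appears as a ring member."""
    return any(node["address"] == address for node in parse_ring(output))


# Sample text copied from the output in the message above.
SAMPLE = """\
Address         DC           Rack   Status  State   Load     Owns    Token
                                                                     113427455640312821154458202477256070484
10.70.147.62    datacenter1  rack1  Up      Normal  7.14 GB  33.33%  0
10.208.51.64    datacenter1  rack1  Up      Normal  3.68 GB  33.33%  56713727820156410577229101238628035242
10.190.207.185  datacenter1  rack1  Up      Normal  3.54 GB  33.33%  113427455640312821154458202477256070484
"""

if __name__ == "__main__":
    for node in parse_ring(SAMPLE):
        print(node["address"], node["status"], node["state"], node["load"])
    # Assumption: 10.190.207.185 (third row) is the node being decommissioned.
    print(still_in_ring(SAMPLE, "10.190.207.185"))
```

A node that has finished decommissioning should no longer appear in this list, so a True result here matches the observation that the node is still part of the ring.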
0.7 live schema updates
Hi! I like the new feature of making live schema updates. You can add, drop and rename columns and keyspaces via thrift, but how do you modify column attributes like key_cache_size or rows_cached? Thank you.
Re: Best strategy for adding new nodes to the cluster
What do you mean by "running live"?

I am also planning to use Cassandra on EC2 using small nodes. Small nodes have 1/4 of the CPU of the large ones at 1/4 of the cost, but their I/O is more than 1/4 (Amazon does not give explicit I/O numbers...), so I think 4 small instances should perform better than 1 large one (and the cost is the same). Am I wrong?

On 27 September 2010 18:09:14 UTC+2, Jonathan Ellis <jbel...@gmail.com> wrote:

> I strongly recommend not running live on Small nodes. So in your case
> I would recommend starting up Large instances with raid0'd disks, shut
> down cassandra on the Small ones, rsync to the Large, and start up on
> Large.
>
> On Mon, Sep 27, 2010 at 6:46 AM, Utku Can Topçu wrote:
> > Hi All,
> >
> > We're currently running a cassandra cluster with Replication Factor 3,
> > consisting of 4 nodes.
> >
> > The current situation is:
> >
> > - The nodes are all identical (AWS small instances)
> > - Data directory is in the partition (/mnt) which has 150G capacity and each
> >   node has around 90 GB load, so 60 G free space per node is left.
> >
> > So adding a new node to the cluster will seem to cause problems for us. I
> > think the node which will stream the data to the new bootstrapping node,
> > will not have enough disk space for anticompacting its data.
> >
> > What should be the best practice for such scenarios?
> >
> > Regards,
> >
> > Utku
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
Re: New nodes won't bootstrap on .66
Hi,

Did you solve this problem? I'm having the same problem. I'm trying to bootstrap a third node in a 0.6.6 cluster. It has two keyspaces: Keyspace1 and KeyspaceLogs, both with replication factor 2.

It starts bootstrapping and receives some streams, but then it keeps waiting for streams. I enabled debug mode. These lines may be useful:

DEBUG [main] 2010-11-07 17:39:50,052 BootStrapper.java (line 70) Beginning bootstrap process
DEBUG [main] 2010-11-07 17:39:50,082 StorageService.java (line 160) Added /10.204.93.16/Keyspace1 as a bootstrap source
...
DEBUG [main] 2010-11-07 17:39:50,090 StorageService.java (line 160) Added /10.204.93.16/KeyspaceLogs as a bootstrap source
... (streaming messages)
DEBUG [Thread-56] 2010-11-07 17:45:51,706 StorageService.java (line 171) Removed /10.204.93.16/Keyspace1 as a bootstrap source; remaining is [/10.204.93.16]
...
(and never ends).

It seems it is waiting for [/10.204.93.16] when it should be waiting for /10.204.93.16/KeyspaceLogs.

The third node is 64 bits, while the two existing nodes are 32 bits. Can this be a problem?

Thank you.

2010/10/28 Dimitry Lvovsky

> Maybe your port 7000 is being blocked by iptables
> or some firewall or maybe you have it bound ( tag ) to
> localhost instead an ip address.
>
> Hope this helps,
> Dimitry.
>
> On Thu, Oct 28, 2010 at 5:35 PM, Thibaut Britz <thibaut.br...@trendiction.com> wrote:
>
>> Hi,
>>
>> I have the same problem with 0.6.5
>>
>> New nodes will hang forever in bootstrap mode (no streams are being
>> opened) and the receiver thread just waits for data forever:
>>
>> INFO [Thread-53] 2010-10-27 20:33:37,399 SSTableReader.java (line 120) Sampling index for /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db
>> INFO [Thread-53] 2010-10-27 20:33:37,444 StreamCompletionHandler.java (line 64) Streaming added /hd2/cassandra/data/table_xyz/table_xyz-3-Data.db
>>
>> Stacktrace:
>>
>> "pool-1-thread-53" prio=10 tid=0x412f2800 nid=0x215c runnable [0x7fd7cf217000]
>>    java.lang.Thread.State: RUNNABLE
>>     at java.net.SocketInputStream.socketRead0(Native Method)
>>     at java.net.SocketInputStream.read(SocketInputStream.java:129)
>>     at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
>>     at java.io.BufferedInputStream.read1(BufferedInputStream.java:258)
>>     at java.io.BufferedInputStream.read(BufferedInputStream.java:317)
>>     - locked <0x7fd7e77e0520> (a java.io.BufferedInputStream)
>>     at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:126)
>>     at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:314)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:262)
>>     at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:192)
>>     at org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1154)
>>     at org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>     at java.lang.Thread.run(Thread.java:662)
>>
>> On Thu, Oct 28, 2010 at 12:44 PM, aaron morton wrote:
>>
>>> The best approach is to manually select the tokens, see the Load
>>> Balancing section http://wiki.apache.org/cassandra/Operations
>>>
>>> Also, are there any log messages in the existing nodes or the new one which
>>> mention each other?
>>>
>>> Is this a production system? Is it still running ?
>>>
>>> Sorry there is not a lot to go on, it sounds like you've done the right
>>> thing. I'm assuming things like the Cluster Name, seed list and port numbers
>>> are set correct as the new node got some data.
>>>
>>> You'll need to dig through the logs a bit more to see that the boot
>>> strapping started and what was the last message it logged.
>>>
>>> Good Luck.
>>> Aaron
>>>
>>> On 27 Oct 2010, at 22:40, Dimitry Lvovsky wrote:
>>>
>>> Hi Aaron,
>>> Thanks for your reply.
>>>
>>> We still haven't solved this unfortunately.
>>>
>>> How did you start the bootstrap for the .18 node ?
>>>
>>> Standard way: we set "AutoBootstrap" to true and added all the servers
>>> from the working ring as seeds.
>>>
>>> Was it the .18 or the .17 node you tried to add
>>>
>>> We first tried adding .17, it streamed for a while, took on a 50GB of
>>> load, stopped streaming but then didn't enter into the ring. We left it for
>>> a few days to see if it would come in, but no luck. After that we did
>>> decommission and removeToken ( in that order) operations.
>>> Since we couldn't get .17 in we tried again with .18. Before doing so we
>>> increas
Re: New nodes won't bootstrap on .66
I have just solved the problem by removing the second keyspace (manually moving its column families to the first). So it seems the problem appears when there are multiple keyspaces.

2010/11/8 Thibaut Britz

> Hi,
>
> No I didn't solve the problem. I reinitialized the cluster and gave each
> node manually a token before adding data. There are a few messages in
> multiple threads related to this, so I suspect it's very common and I hope
> it's gone with 0.7.
>
> Thibaut
>
> On Sun, Nov 7, 2010 at 6:57 PM, Marc Canaleta wrote:
>
>> Hi,
>>
>> Did you solve this problem? I'm having the same problem. I'm trying to
>> bootstrap a third node in a 0.6.6 cluster. It has two keyspaces: Keyspace1
>> and KeyspaceLogs, both with replication factor 2.
>>
>> It starts bootstrapping, receives some streams but it keeps waiting for
>> streams. I enabled the debug mode. These lines may be useful:
>>
>> DEBUG [main] 2010-11-07 17:39:50,052 BootStrapper.java (line 70) Beginning bootstrap process
>> DEBUG [main] 2010-11-07 17:39:50,082 StorageService.java (line 160) Added /10.204.93.16/Keyspace1 as a bootstrap source
>> ...
>> DEBUG [main] 2010-11-07 17:39:50,090 StorageService.java (line 160) Added /10.204.93.16/KeyspaceLogs as a bootstrap source
>> ... (streaming messages)
>> DEBUG [Thread-56] 2010-11-07 17:45:51,706 StorageService.java (line 171) Removed /10.204.93.16/Keyspace1 as a bootstrap source; remaining is [/10.204.93.16]
>> ...
>> (and never ends).
>>
>> It seems it is waiting for [/10.204.93.16] when it should be waiting for
>> /10.204.93.16/KeyspaceLogs.
>>
>> The third node is 64 bits, while the two existing nodes are 32 bits. Can
>> this be a problem?
>>
>> Thank you.
>>
>> 2010/10/28 Dimitry Lvovsky
>>
>>> Maybe your port 7000 is being blocked by iptables
>>> or some firewall or maybe you have it bound ( tag ) to
>>> localhost instead an ip address.
>>>
>>> Hope this helps,
>>> Dimitry.
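Two replies in this thread recommend assigning tokens to the nodes manually rather than letting bootstrap pick them. The following is only a minimal sketch of how evenly spaced tokens could be computed. It assumes the RandomPartitioner, whose token space runs from 0 to 2**127, and balanced_tokens is a hypothetical helper name. For three nodes it gives the same three tokens that appear in the nodetool ring output of the "Node decommission failed" message.

```python
# Assumption: RandomPartitioner, with tokens in the range 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count):
    """Return node_count evenly spaced tokens, starting at 0."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    step = RING_SIZE // node_count  # integer division, no floating point
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    # One token per node; each value would be set as that node's initial token.
    for index, token in enumerate(balanced_tokens(3)):
        print(index, token)
```

Integer arithmetic is used on purpose: the tokens are larger than a float can represent exactly, so floating point division would produce slightly wrong values.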