RE: URGENT HELP PLEASE!

Jared Laprise Fri, 25 Mar 2011 13:06:15 -0700

No, what initially started it all was that I needed to increase my EC2 server 
instance size. So I removed said server from the load balancer, stopped 
Cassandra, and then shut down the server in order to change the instance type. I 
assumed the other node had all the data and everything should keep running 
without issue. Almost immediately I realized I was missing a bunch of data. Not 
fully understanding what had happened, I was hesitant to bring the other node 
back up for fear of data loss. I ended up bringing the other node back online 
anyway, and then everything seemed to snap back into the expected working 
order.

Although after all the help from the Cassandra community I have a much better 
understanding of why and how my situation happened, there was still one strange 
side effect I noticed. For context, I store user accounts and other account 
information in Cassandra. When the second node was offline and I tried to log 
into the site, I got an error saying invalid password. Out of curiosity I 
logged into the cassandra-cli tool and looked at what columns and values were 
present for my user account. My User CF seemed to have data stored from right 
before I added the second node. I found that really strange, since I assumed 
Cassandra doesn't keep any historical or versioned data. Again, once the second 
node was back online, both servers showed the expected, more current data.

Today I'm preparing to increase my replication factor to 2 and have been 
reading about the proper way to do that. Although I've found bits and pieces, I 
haven't found any definitive explanation on how to do it. Could someone please 
sanity check my intended approach?

1. Change the RF to 2 and restart Cassandra on both nodes
2. Run `nodetool repair` on both nodes, one at a time so as not to tie up both 
servers (will that sync data between the nodes?)
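
To make step 2 concrete, here is a minimal Python sketch of running the repair one node at a time. The host names are hypothetical placeholders, and I'm assuming `nodetool` is on the PATH and accepts `-h <host> repair`; treat it as an illustration of the ordering, not a tested procedure.

```python
import subprocess
from typing import Callable, List, Sequence


def repair_command(host: str) -> List[str]:
    """Build the nodetool invocation for one node (host name is a placeholder)."""
    return ["nodetool", "-h", host, "repair"]


def repair_one_at_a_time(
    hosts: Sequence[str],
    run: Callable[[List[str]], object] = subprocess.check_call,
) -> List[str]:
    """Repair each host in order, starting the next only after the previous returns.

    With the default runner, subprocess.check_call blocks until nodetool exits
    and raises CalledProcessError on a non-zero exit code, so a failed repair
    stops the loop before the next node is touched.
    """
    done = []
    for host in hosts:
        run(repair_command(host))
        done.append(host)
    return done
```

Calling `repair_one_at_a_time(["node1.example.com", "node2.example.com"])` would repair the first node, wait for it to finish, and only then start on the second.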

In a 2-node environment with RF=2, using a consistency level of ONE would still 
ensure data is replicated to both servers, correct?
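
As a rough sanity check on that question, here is the usual replica-overlap rule of thumb (W + R > RF) sketched in Python. The function names are made up for illustration and nothing here talks to Cassandra. As far as I understand it, a write at ONE is still sent to every replica but only waits for one acknowledgement, so the other replica can lag until hinted handoff, read repair, or `nodetool repair` catches it up.

```python
def quorum(rf: int) -> int:
    """Number of replicas that make up a quorum for a given replication factor."""
    return rf // 2 + 1


def read_overlaps_write(rf: int, write_acks: int, read_acks: int) -> bool:
    """True when every read replica set must intersect every write replica set.

    This is the W + R > RF rule of thumb: if it holds, a read is guaranteed to
    contact at least one replica that acknowledged the latest write.
    """
    return write_acks + read_acks > rf


RF = 2
ONE = 1

print("quorum for RF=2:", quorum(RF))
print("write ONE, read ONE overlaps:", read_overlaps_write(RF, ONE, ONE))
print("write QUORUM, read ONE overlaps:", read_overlaps_write(RF, quorum(RF), ONE))
```

So with RF=2, ONE/ONE gives no overlap guarantee on any single read, even though both nodes should end up holding the data. Note also that a quorum at RF=2 is both replicas, so QUORUM operations would fail whenever either node is down.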

-----Original Message-----
From: Sylvain Lebresne [mailto:sylv...@datastax.com] 
Sent: Friday, March 25, 2011 3:01 AM
To: user@cassandra.apache.org
Cc: Jared Laprise
Subject: Re: URGENT HELP PLEASE!

On Fri, Mar 25, 2011 at 1:49 AM, Jared Laprise <ja...@webonyx.com> wrote:
> Hello all, I'm running 2 Cassandra 0.6.5 nodes and I brought down the 
> secondary node and restarted the primary node. After Cassandra came 
> back up all data has been reverted to several months ago.

Out of curiosity, when you said 'brought down the secondary node', did that 
involve a decommission or removeToken? If so, I have an explanation for you.

--
Sylvain

> I could really use some insight here, this is a production website and 
> I need to act quickly. I have a cron job that takes a snapshot every 
> night, but even with that I tried to restore a snapshot on my local 
> development environment and it was also missing a ton of data.
>
> Any help will be so appreciated.
