I believe that no matter how I stop Cassandra, I should not lose data, even
if a compaction is in progress. As far as I can tell, during compaction
Cassandra reads from some files and writes a new temp file. I believe it
stores the data there and, once the new file is complete, renames it, maybe
does some internal bookkeeping about the changes, and then deletes the
original files. So if a node just crashes for whatever reason, either the
original data or the new data should still be there.
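
To make that reasoning concrete, here is a minimal sketch of the "write to a temp file, rename, then delete the inputs" pattern I am describing. It is only an illustration of the idea, not Cassandra's actual code, and every file name in it is made up.

```shell
#!/bin/sh
# Illustration only: the "write to a temp file, then rename" pattern.
# This is not Cassandra code; all file names here are hypothetical.
set -eu

workdir=$(mktemp -d)
cd "$workdir"

# Two made-up input files standing in for the files being compacted.
printf 'a=1\n' > old-1.db
printf 'b=2\n' > old-2.db

# 1. Write the merged result under a temporary name.
#    A crash here leaves both inputs untouched.
cat old-1.db old-2.db > merged.db.tmp

# 2. Publish the result with a rename. On POSIX, a rename within one
#    filesystem is atomic, so merged.db is either absent or complete.
mv merged.db.tmp merged.db

# 3. Only now delete the inputs. A crash before this step leaves
#    both the old files and the new file on disk.
rm old-1.db old-2.db

ls   # only merged.db remains
```

At every point in this sequence, either the old files or the complete new file exist, which is why I would not expect a crash alone to lose data.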
When I run nodetool drain, I cannot type anything in the Linux terminal for a
while. When the prompt is available again, I assume that nodetool has
finished executing. After that I stop Cassandra.
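
A more explicit check than waiting for the prompt could look like the sketch below. It assumes nodetool is on the PATH, that `nodetool netstats` prints a "Mode: DRAINED" line once the drain is done, and that `nodetool compactionstats` prints "pending tasks: 0" when nothing is queued. Those output formats should be checked against the version in use. The function names and the POLL_SECONDS variable are made up for this example. The compaction check is there because compaction can reportedly keep running after the drain has completed (see Duncan's message below).

```shell
#!/bin/sh
# Sketch: confirm a node reports itself drained and idle before stopping it.
# Assumptions (verify against your version's real output):
#   - "nodetool netstats" prints "Mode: DRAINED" after a completed drain
#   - "nodetool compactionstats" prints "pending tasks: 0" when idle

# Succeeds if netstats reports the DRAINED mode.
is_drained() {
    nodetool netstats | grep -q 'Mode: DRAINED'
}

# Succeeds if compactionstats reports no pending compaction tasks.
compactions_idle() {
    nodetool compactionstats | grep -Eq '^pending tasks: 0[[:space:]]*$'
}

# Hypothetical helper: poll until both checks pass or the attempts run out.
# $1 is the number of attempts (default 30); POLL_SECONDS is the delay
# between attempts (default 2).
safe_to_stop() {
    tries=${1:-30}
    while [ "$tries" -gt 0 ]; do
        if is_drained && compactions_idle; then
            return 0
        fi
        tries=$((tries - 1))
        sleep "${POLL_SECONDS:-2}"
    done
    return 1
}
```

With these functions loaded, the stop sequence would become:

    nodetool disablegossip
    nodetool disablethrift
    nodetool drain
    safe_to_stop 30 && /etc/init.d/cassandra stop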
These missing files are on the same node that I previously decommissioned (due
to the same problem), wiped of all data, and re-added to the cluster. So in
this case all of its data should have been recreated from the other nodes in
the cluster.



On Tuesday, March 25, 2014 9:53 PM, Duncan Sands <duncan.sa...@gmail.com> wrote:
 
Hi,

On 25/03/14 19:30, Robert Coli wrote:
> On Tue, Mar 25, 2014 at 5:36 AM, Batranut Bogdan <batra...@yahoo.com
> <mailto:batra...@yahoo.com>> wrote:

>
>     I am running 2.0.6 and I use /etc/init.d/cassandra start / stop . Also
>     before stopping I do :
>     nodetool disablegossip
>     nodetool disablethrift
>     nodetool drain
>     after that /etc/init.d/cassandra stop
>
>
> This seems reasonable/best practice. Do you verify the drain has completed
> before stopping?
>
> If I were you, I'd be looking for files with the name of the missing file in
> logs previous to the restart. It is pretty unusual for SSTables or their
> components to go missing.

I've noticed that compaction can continue to run after the drain has completed, 
and restarting while in the middle of the compaction can then cause similar 
problems.

Ciao, Duncan.
