I believe that no matter how I stop Cassandra, I should not be missing data, even if a compaction is in progress. As far as I can tell, during compaction Cassandra reads from some files and writes a new temporary file. I believe it stores the merged data there, and once the new file is complete it renames it, perhaps does some internal bookkeeping about the change, and then deletes the original files. So if a node crashes for whatever reason, either the original data or the new data should still be there.

When I run nodetool drain, I cannot type anything in the Linux terminal for a while. When the prompt returns, I assume that nodetool has finished executing. Only after that do I stop Cassandra.

The missing files are on the same node that I previously decommissioned (because of the same problem), wiped of all data, and re-added to the cluster. So in this case all of its data should have been recreated from the other nodes in the cluster.
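To make the assumption about compaction concrete, here is a minimal Python sketch of the general "write to a temp file, rename, then delete the originals" pattern. It is only an illustration of the idea, not Cassandra's actual code; the function name compact and the "tmp-" prefix are made up for the example, and the inputs are assumed to be sorted text files with newline-terminated lines.

```python
import heapq
import os
import tempfile


def compact(input_paths, final_path):
    """Merge sorted text files into final_path via a temp file, then delete the inputs.

    Hypothetical illustration only; not how Cassandra implements compaction.
    """
    directory = os.path.dirname(os.path.abspath(final_path))
    # Create the temp file in the same directory as the final file,
    # so the later rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix="tmp-")
    try:
        with os.fdopen(fd, "w") as out:
            files = [open(p) for p in input_paths]
            try:
                # heapq.merge lazily merges the already-sorted line streams.
                for line in heapq.merge(*files):
                    out.write(line)
            finally:
                for f in files:
                    f.close()
            # Push the merged data to disk before it becomes visible
            # under the final name.
            out.flush()
            os.fsync(out.fileno())
        # The rename is the commit point: before it only the inputs count,
        # after it the merged file is complete.
        os.replace(tmp_path, final_path)
    except BaseException:
        # A failure before the rename leaves the inputs untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    # Only now are the original files redundant.
    for p in input_paths:
        os.remove(p)
```

With this ordering, a crash at any point leaves either the untouched input files (plus perhaps a stray temp file) or the finished merged file, which is the behaviour I would expect.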
On Tuesday, March 25, 2014 9:53 PM, Duncan Sands <duncan.sa...@gmail.com> wrote:

Hi,

On 25/03/14 19:30, Robert Coli wrote:
> On Tue, Mar 25, 2014 at 5:36 AM, Batranut Bogdan <batra...@yahoo.com
> <mailto:batra...@yahoo.com>> wrote:
>
>     I am running 2.0.6 and I use /etc/init.d/cassandra start / stop . Also
>     before stopping I do :
>     nodetool disablegossip
>     nodetool disablethrift
>     nodetool drain
>     after that /etc/init.d/cassandra stop
>
>
> This seems reasonable/best practice. Do you verify the drain has completed
> before stopping?
>
> If I were you, I'd be looking for files with the name of the missing file in
> logs previous to the restart. It is pretty unusual for SSTables or their
> components to go missing.

I've noticed that compaction can continue to run after the drain has completed, and restarting while in the middle of the compaction can then cause similar problems.

Ciao, Duncan.
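
The shutdown sequence quoted above, combined with the observation that compaction can keep running after the drain, could be scripted roughly as follows. This is a sketch under assumptions: the helper names (run_nodetool, pending_compactions, drain_and_wait) are made up, and it assumes that nodetool compactionstats prints a "pending tasks: N" line and that a count of 0 is a good enough signal that compaction is idle. Check the output of your own version before relying on it.

```python
import re
import subprocess
import time


def run_nodetool(*args):
    """Run nodetool with the given arguments and return its stdout."""
    result = subprocess.run(
        ["nodetool", *args], check=True, capture_output=True, text=True
    )
    return result.stdout


def pending_compactions(stats_output):
    """Return N from a 'pending tasks: N' line, or None if no such line exists.

    The output format is an assumption; verify it for your Cassandra version.
    """
    match = re.search(r"pending tasks:\s*(\d+)", stats_output)
    return int(match.group(1)) if match else None


def drain_and_wait(run=run_nodetool, poll_seconds=5, max_polls=60, sleep=time.sleep):
    """Run the quoted pre-stop commands, then poll until compaction looks idle.

    Returns True once 0 pending tasks are reported, False if max_polls is
    exhausted first. Stopping the service itself is left to the caller.
    """
    for command in ("disablegossip", "disablethrift", "drain"):
        run(command)
    for _ in range(max_polls):
        if pending_compactions(run("compactionstats")) == 0:
            return True
        sleep(poll_seconds)
    return False
```

If drain_and_wait() returns True, the next step would be /etc/init.d/cassandra stop, as in the original sequence. If it returns False, it is probably safer to look at the compactionstats output by hand before stopping.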