On Wed, Feb 29, 2012 at 5:25 AM, Casey Deccio <ca...@deccio.net> wrote:
> I recently had to do some shuffling with one of my cassandra nodes because
> it was running out of disk space. I did a few things in the process, and
> I'm not sure in the end which caused my problem. First I added a second
> file path to the data directory in cassandra.yaml. Things still worked
> fine after this, as far as I could tell. Shortly after this, however, I
> took down the node and rsync'd the data from both data directories, as well
> as commitlogs, to an external drive. I then shut down the machine,
> replaced the hard drives with bigger drives, and re-installed the OS. I
> re-created the data directories, rsync'd the data and commitlogs back over
> from the external drive, and started up cassandra, re-adding it to the
> ring. When it came up, all of my rows were missing for one columnfamily
> and nearly all my rows were missing for another--or at least that's what it
> looks like, based on walking the rows. I tried scrubbing each of the
> nodes. One of them had insufficient disk space (yes, this seems to be a
> recurring problem) for scrub, so I did upgradesstables instead, and that
> one is still in progress. So far the scrub/upgradesstables hasn't seemed
> to help. But in the log messages created during scrub/upgradesstables it
> shows realistic numbers (i.e., in terms of the rows that existed before
> this ordeal) created in each new sstable. Also, the loads shown when I run
> nodetool ring still reflect the numbers with the complete set of rows.
> That's encouraging, but I can't seem to access these phantom rows. Please
> help!
>

I neglected to mention that I'm running cassandra 1.0.7.

Thanks,
Casey