After so many attempts, I am not sure which log entries correspond to what. However, there were many of this type:
WARN [CompactionExecutor:14] 2011-08-18 18:47:00,596 CompactionManager.java (line 730) Index file contained a different key or row size; using key from data file

And there were out-of-disk-space errors (because hundreds of gigabytes were used up). Anyhow, disk space usage is under control now, at least for 3 nodes so far; the repair on the last node is still running. Here is what I did that brought disk space usage under control:

* Shut down Cassandra.
* Restored data from the backup for 0.6.11.
* Started up Cassandra 0.6.11.
* Ran repair on all nodes, one at a time. After repair, the node that previously showed up in ring as having 40GB of data now had 120GB. All other nodes showed the same amount of data. Not much disk space usage increase compared to what was seen before, just some Compacted files.
* Ran compact on all nodes.
* Drained all nodes.
* Shut down Cassandra.
* Started up Cassandra 0.8.4.
* Applied the schema to 0.8.4.
* Ran scrub on all nodes.
* Ran repair on each of the 4 nodes, one at a time (repair on the last node is still running). Data size shows up in ring at the same size as it was in Cassandra 0.6.11. No disk space usage increase.

So it seems the data was in an inconsistent state before it was upgraded to 0.8.4, and somehow that triggered the 0.8.4 repair to drive disk usage out of control. Or maybe the data is already consistent across the nodes, and running repair now does not trigger any data transfer.

Once the repair on this last node is completed, I will start populating some data using our application. While using the app, I will randomly restart a few nodes, one at a time, to cause data to be missing on some nodes. Then I will run repair again to see if disk usage stays under control.
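For reference, the per-node sequence above can be sketched with nodetool. This is only a sketch: the host names are placeholders, and by default it just prints each command (set RUN=1 to actually execute them).

```shell
#!/bin/sh
# Dry-run sketch of the repair -> compact -> drain -> (upgrade) -> scrub
# sequence described above. NODES is a hypothetical host list.
NODES="node1 node2 node3 node4"

run() {
    # Execute the command only when RUN=1; otherwise just show it.
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

for host in $NODES; do
    run nodetool -h "$host" repair    # anti-entropy repair, one node at a time
done
for host in $NODES; do
    run nodetool -h "$host" compact   # major compaction
done
for host in $NODES; do
    run nodetool -h "$host" drain     # flush memtables before shutdown
done
# ...stop Cassandra, upgrade the binaries to 0.8.4, restart, apply schema...
for host in $NODES; do
    run nodetool -h "$host" scrub     # rebuild sstables under the new version
done
```

Running each step serially, one node at a time, matches what I did; repair in particular should not be run on all nodes concurrently.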
Huy

On Fri, Aug 19, 2011 at 7:22 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > Is there any chance that the entire file from the source node got streamed
> > to the destination node even though only a small amount of data in the file
> > from the source node is supposed to be streamed to the destination node?
>
> Yes, but the thing that's annoying me is that even if so - you should
> not be seeing a 40 gb -> hundreds of gig increase even if all
> neighbors sent all their data.
>
> Can you check system.log for references to these sstables to see when
> and under what circumstances they got written?
>
> --
> / Peter Schuller (@scode on twitter)

--
Huy Le
Spring Partners, Inc.
http://springpadit.com
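To follow up on Peter's suggestion, one way to see when a given sstable was written is to grep system.log for its name. A minimal sketch, demonstrated against a tiny synthetic log; the real log path would be something like /var/log/cassandra/system.log, and the sstable name here is hypothetical:

```shell
# find_sstable_refs LOGFILE SSTABLE_NAME
# Prints every log line mentioning the sstable, with line numbers, so you
# can see when and under what circumstances it got written (compaction,
# streaming, flush, ...).
find_sstable_refs() {
    grep -n "$2" "$1"
}

# Synthetic two-line log for demonstration only:
printf '%s\n' \
  'INFO [CompactionExecutor:1] 2011-08-18 18:40:01 CompactionManager.java Compacted to MyKS-MyCF-g-1234-Data.db' \
  'INFO [Streaming] 2011-08-18 18:45:12 StreamIn.java Finished streaming MyKS-MyCF-g-1235-Data.db' \
  > /tmp/system.log.sample

find_sstable_refs /tmp/system.log.sample MyKS-MyCF-g-1234
```

On a real cluster you would point this at each node's system.log and use the sstable file names of the unexpectedly large files.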