We also hit the same problem when using 0.8.0. As far as I know, a few issues related to 'repair' have been marked as resolved in 0.8.1. I hope this really solves our problem.

On Wed, Jul 20, 2011 at 8:47 PM, David Boxenhorn <da...@citypath.com> wrote:

> I have this problem too, and I don't understand why.
>
> I can repair my nodes very quickly by looping though all my data (when
> you read your data it does read-repair), but nodetool repair takes
> forever. I understand that nodetool repair builds merkle trees, etc.
> etc., so it's a different algorithm, but why can't nodetool repair be
> smart enough to choose the best algorithm? Also, I don't understand
> what's special about my data that makes nodetool repair so much slower
> than looping through all my data.
>
> On Wed, Jul 20, 2011 at 12:18 AM, Maxim Potekhin <potek...@bnl.gov> wrote:
>
>> Thanks Edward. I'm told by our IT that the switch connecting the nodes
>> is pretty fast.
>> Seriously, in my house I copy complete DVD images from my bedroom to
>> the living room downstairs via WiFi, and a dozen of GB does not seem
>> like a problem, on dirt cheap hardware (Patriot Box Office).
>>
>> I also have just _one_ column major family but caveat emptor -- 8
>> indexes attached to it (and there will be more). There is one
>> accounting CF which is small, can't possibly make a difference.
>>
>> By contrast, compaction (as in nodetool) performs quite well on this
>> cluster. I start suspecting some sort of malfunction.
>>
>> Looked at the system log during the "repair", there is some compaction
>> agent doing work that I'm not sure makes sense (and I didn't call for
>> it). Disk utilization all of a sudden goes up to 40% per Ganglia, and
>> stays there, this is pretty silly considering the cluster is IDLE and
>> we have SSDs. No external writes, no reads. There are occasional GC
>> stoppages, but these I can live with.
>>
>> This repair debacle happens 2nd time in a row. Cr@p. I need to go to
>> production soon and that doesn't look good at all. If I can't manage a
>> system that simple (and/or get help on this list) I may have to cut
>> losses i.e. stay with Oracle.
>>
>> Regards,
>>
>> Maxim
>>
>> On 7/19/2011 12:16 PM, Edward Capriolo wrote:
>>
>>> Well most SSD's are pretty fast. There is one more to consider. If
>>> Cassandra determines nodes are out of sync it has to transfer data
>>> across the network. If that is the case you have to look at 'nodetool
>>> streams' and determine how much data is being transferred between
>>> nodes. There are some open tickets where with larger tables repair is
>>> streaming more then it needs to. But even if the transfers are only
>>> 10% of your 200GB. Transferring 20 GB is not trivial.
>>>
>>> If you have multiple keyspaces and column families repair one at a
>>> time might make the process more manageable.
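
Edward's suggestion above, repairing one column family at a time, could be scripted roughly as follows. This is only a sketch under assumptions: the host, keyspace and column family names are placeholders that do not come from this thread, and it assumes your nodetool accepts `repair <keyspace> <cf>` (check `nodetool help` on your version first). It defaults to a dry run that only prints the commands.

```python
import subprocess


def repair_commands(host, keyspace, column_families):
    """Build one `nodetool repair` command per column family.

    Assumes the `nodetool -h <host> repair <keyspace> <cf>` form;
    all names passed in are caller-supplied placeholders.
    """
    return [
        ["nodetool", "-h", host, "repair", keyspace, cf]
        for cf in column_families
    ]


def run_repairs(commands, dry_run=True):
    """Run the commands one after another, or just print them."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failed repair
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Hypothetical keyspace and column family names, for illustration only.
    cmds = repair_commands("localhost", "MyKeyspace", ["Main", "Accounting"])
    run_repairs(cmds, dry_run=True)
```

Running the repairs sequentially keeps each one small enough to watch on its own, which may make it easier to see which column family is responsible for the long run time.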