It was the long time since the last repair that did it. We've scheduled regular repairs now, and this time the repair didn't increase the load very much. So that was it! :-)
/Henrik

On Thu, Nov 8, 2012 at 7:20 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:
> Nothing unusual. When you run repair, Cassandra streams inconsistent
> regions from all replicas. If you have wide rows or haven't run repair
> regularly, it is very easy to get 10-20% of extra data from each replica.
> That is probably what happened in your case. Theoretically Cassandra
> should compact the new sstables you get from other nodes, but by default
> Cassandra only compacts sstables in the same size tier. Because of the
> major compaction you ran before, you have one big sstable and a bunch of
> small ones, so there is nothing to compact right now. Eventually Cassandra
> will compact them, but nobody knows when that will happen. This is one of
> the problems caused by major compaction. For maintenance it is better to
> have a set of small sstables than one big one.
>
> Andrey
>
>
> On Thu, Nov 8, 2012 at 2:55 AM, Henrik Schröder <skro...@gmail.com> wrote:
>
>> Hi,
>>
>> We recently ran a major compaction across our cluster, which reduced the
>> storage used by about 50%. This is fine, since we do a lot of updates to
>> existing data, so that's the expected result.
>>
>> The day after, we ran a full repair -pr across the cluster, and when that
>> finished, each storage node was at about the same size as before the major
>> compaction. Why does that happen? What gets transferred to other nodes,
>> and why does it suddenly take up a lot of space again?
>>
>> We haven't run repair -pr regularly, so is this just something that
>> happens on the first weekly run, and can we expect a different result
>> next week? Or does repair always cause the data to grow on each node?
>> To me it just doesn't seem proportional.
>>
>>
>> /Henrik
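For anyone wondering why the sstables streamed in by repair just sit there, here is a minimal sketch of the size-tiered bucketing idea Andrey describes. It is not Cassandra's actual code; the parameters mimic the documented SizeTieredCompactionStrategy defaults (bucket_low=0.5, bucket_high=1.5, min_threshold=4), and the function names are made up for illustration.

    # Illustrative sketch only, not Cassandra's implementation.
    # Assumed STCS-like defaults: a bucket accepts an sstable whose size is
    # within [0.5x, 1.5x] of the bucket's average, and a bucket needs at
    # least 4 members before it becomes a compaction candidate.

    def bucket_sstables(sizes, bucket_low=0.5, bucket_high=1.5):
        """Group sstable sizes into tiers of roughly similar size."""
        buckets = []  # list of (average_size, [member sizes]) pairs
        for size in sorted(sizes):
            for i, (avg, members) in enumerate(buckets):
                if bucket_low * avg <= size <= bucket_high * avg:
                    members.append(size)
                    buckets[i] = (sum(members) / len(members), members)
                    break
            else:
                buckets.append((size, [size]))  # start a new tier
        return [members for _, members in buckets]

    # One big sstable left by major compaction, plus a few small sstables
    # streamed in by repair:
    sizes_gb = [100, 0.2, 0.3, 0.25]
    for bucket in bucket_sstables(sizes_gb):
        eligible = len(bucket) >= 4  # min_threshold
        print(bucket, "-> compaction candidate" if eligible else "-> left alone")

Running this prints two tiers: the 100 GB sstable alone in its bucket, and the small repair-streamed sstables in another bucket that is still below min_threshold. Neither tier qualifies, which matches what Andrey said: nothing gets compacted until enough similarly-sized sstables accumulate.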