It was the long time since the last repair that did it. We've scheduled
regular repairs now, and this time the repair didn't increase the load very
much. So that was it! :-)


/Henrik


On Thu, Nov 8, 2012 at 7:20 PM, Andrey Ilinykh <ailin...@gmail.com> wrote:

> Nothing unusual. When you run repair, Cassandra streams inconsistent
> regions from all replicas. If you have wide rows or haven't run repair
> regularly, it is very easy to get 10-20% of extra data from each replica.
> That is probably what happened in your case. In theory, Cassandra should
> compact the new sstables you receive from other nodes. But by default,
> Cassandra only compacts sstables in the same size tier. Because of the
> major compaction you ran before, you have one big sstable and a bunch of
> small ones, so there is nothing to compact right now. Eventually Cassandra
> will compact them, but nobody knows when that will happen. This is one of
> the problems caused by major compaction. For maintenance it is better to
> have a set of small sstables than one big one.
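>
> To make the tiering behavior concrete, here is a rough Python sketch of
> how size-tiered bucketing groups sstables. The 0.5/1.5 bounds and the
> threshold of 4 loosely mirror Cassandra's size-tiered defaults, but this
> is an illustration of the idea, not Cassandra's actual code:
>
>     # Illustrative sketch of size-tiered bucketing, not Cassandra's code.
>     # An sstable joins a bucket if its size is within ~50% of the bucket's
>     # average; a bucket is only eligible for compaction once it holds at
>     # least min_threshold tables.
>     def bucket_sstables(sizes, low=0.5, high=1.5, min_threshold=4):
>         buckets = []  # each bucket is one size tier
>         for size in sorted(sizes):
>             for bucket in buckets:
>                 avg = sum(bucket) / len(bucket)
>                 if low * avg <= size <= high * avg:
>                     bucket.append(size)
>                     break
>             else:
>                 buckets.append([size])
>         return [b for b in buckets if len(b) >= min_threshold]
>
>     # One huge post-major-compaction sstable plus small repair sstables
>     # (sizes in MB, hypothetical): the small ones form a compactable tier,
>     # but the big one sits alone in its tier and is never selected.
>     print(bucket_sstables([500_000, 40, 45, 50, 55]))  # [[40, 45, 50, 55]]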
>
> Andrey
>
>
> On Thu, Nov 8, 2012 at 2:55 AM, Henrik Schröder <skro...@gmail.com> wrote:
>
>> Hi,
>>
>> We recently ran a major compaction across our cluster, which reduced the
>> storage used by about 50%. This is fine, since we do a lot of updates to
>> existing data, so that's the expected result.
>>
>> The day after, we ran a full repair -pr across the cluster, and when that
>> finished, each storage node was at about the same size as before the major
>> compaction. Why does that happen? What gets transferred to other nodes, and
>> why does it suddenly take up a lot of space again?
>>
>> We haven't run repair -pr regularly, so is this just something that
>> happens on the first weekly run, and can we expect a different result next
>> week? Or does repair always cause the data to grow on each node? The
>> growth just doesn't seem proportional to me.
>>
>>
>> /Henrik
>>
>
>
