Follow up question: Is it safe to abort the compactions happening after
node repair?

On Mon, Oct 15, 2012 at 6:32 PM, Will Martin <w...@voodoolunchbox.com> wrote:

> +1. It doesn't make sense that the xfr compactions are heavy unless they
> are translating the file. This could be a protocol mismatch; however, I
> would expect the requirements for node-level compaction and wire
> compaction to be pretty different.
> On Oct 15, 2012, at 4:42 PM, Matthias Broecheler wrote:
>
> > Hey,
> >
> > We are writing a lot of data into a Cassandra cluster for a batch
> > loading use case. We cannot use the sstable batch loader, so in order to
> > speed up the loading process we are using RF=1 while the data is loading.
> > After the load is complete, we want to increase the RF. For that, we
> > update the RF in the schema and then run the node repair tool on each
> > Cassandra instance to stream the data over. However, we are noticing that
> > this process is slowed down by a lot of compactions (the actual streaming
> > of data only takes a couple of minutes).
> >
> > Cassandra is already running a major compaction after the data loading
> > process has completed. But then there appear to be two more compactions
> > (one on the sender and one on the receiver), and those take a very long
> > time even on the AWS high I/O instance with no compaction throttling.
> >
> > Question: These additional compactions seem redundant since there are no
> > reads or writes on the cluster after the first major compaction
> > (immediately after the data load). Is that right? And if so, what can we
> > do to avoid them? We are currently waiting multiple days.
> >
> > Thank you very much for your help,
> > Matthias
> >
>
>


-- 
Matthias Broecheler, PhD
http://www.matthiasb.com
E-Mail: m...@matthiasb.com
