I recently finished a practice expansion from 4 nodes to 5 nodes: a series of "nodetool move", "nodetool cleanup", and JMX GC steps. I found that during some of the steps, disk usage actually grew to 2.5x the base data size on one of the nodes. I'm using 0.6.4.
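For anyone repeating that move sequence: with the RandomPartitioner, a balanced N-node ring puts node i at token i * 2**127 / N, and those are the values you would pass to "nodetool move". A minimal sketch of the arithmetic; the helper name is mine, not a Cassandra API:

```python
# Hypothetical helper (not part of Cassandra): compute evenly spaced
# tokens for the RandomPartitioner, whose token space is [0, 2**127).
def balanced_tokens(node_count):
    """Return the target token for each of `node_count` nodes."""
    ring_size = 2 ** 127
    return [i * ring_size // node_count for i in range(node_count)]

# Going from 4 to 5 nodes, each node gets a new target token,
# e.g. `nodetool -h <host> move <token>` for each value below.
for token in balanced_tokens(5):
    print(token)
```

After each move completes, "nodetool cleanup" on the affected nodes drops the data they no longer own.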

-scott

On Thu, 9 Dec 2010, Rustam Aliyev wrote:

Are there any plans to improve this in the future?

For big data clusters this could be very expensive. Based on your comment, I 
will need 200TB of storage for 100TB of data to keep Cassandra running.

--
Rustam.

On 09/12/2010 17:56, Tyler Hobbs wrote:
      If you are on 0.6, repair is particularly dangerous with respect to disk
      space usage.  If your replica is sufficiently out of sync, you can
      triple your disk usage pretty easily.  This has been improved in 0.7, so
      repairs should use about half as much disk space, on average.

      In general, yes, keep your nodes under 50% disk usage at all times.  Any
      of: compaction, cleanup, snapshotting, repair, or bootstrapping (the
      latter two are improved in 0.7) can double your disk usage temporarily.

      You should plan to add more disk space or add nodes when you get close
      to this limit.  Once you go over 50%, it's more difficult to add nodes,
      at least in 0.6.

      - Tyler
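The 50% rule above can be checked mechanically: since a major compaction may temporarily need as much free space as the data it rewrites, a node has safe headroom only while used space does not exceed free space. A minimal monitoring sketch, assuming a local POSIX filesystem; the function names and path are mine, not Cassandra's:

```python
import shutil

def usage_fraction(used, total):
    """Fraction of the volume consumed; alert when this exceeds 0.5."""
    return used / total

def headroom_ok(path="/var/lib/cassandra/data"):
    """True while the data volume is roughly under 50% used, i.e. a
    full compaction could still double the on-disk data temporarily.
    (Ignores filesystem-reserved blocks, so treat it as approximate.)"""
    usage = shutil.disk_usage(path)
    return usage.used <= usage.free
```

Running a check like this from cron, and alerting before the 0.5 threshold, leaves time to add nodes while bootstrapping is still safe.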

      On Thu, Dec 9, 2010 at 11:19 AM, Mark <static.void....@gmail.com> wrote:
            I recently ran into a problem during a repair operation where my
            nodes completely ran out of space and my whole cluster was...
            well, clusterfucked.

            I want to make sure I know how to prevent this problem in the
            future.

            Should I make sure that at all times every node is under 50% of
            its disk space? Are there any normal day-to-day operations that
            would cause any one node to double in size that I should be
            aware of? If one or more nodes surpass the 50% mark, what should
            I plan to do?

            Thanks for any advice


