On Fri, Sep 21, 2012 at 2:05 AM, aaron morton <aa...@thelastpickle.com> wrote:
>> Would it help if I partitioned the computing resources of my physical
>> machines into VMs?
>
> No.
> Just like cutting a cake into smaller pieces does not mean you can eat more
> without getting fat.
>
> In the general case, regular HDD and 1 Gbe and 8 to 16 virtual cores and 8GB
> to 16GB ram, you can expect to comfortably run up 400GB of data (maybe
> 500GB). That is replicated storage, so 400 / 3 = 133GB if you replicate
> data 3 times.
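
For anyone who wants to plug in their own numbers, the arithmetic in the quote above can be sketched roughly as below. This is only an illustration: the function names and the 6-node cluster are made up for the example, and the only inputs taken from the quote are the 400GB per node and the replication factor of 3.

```python
def unique_data_per_node_gb(on_disk_gb: float, replication_factor: int) -> float:
    """Unique (pre-replication) data represented by one node's on-disk total."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return on_disk_gb / replication_factor


def cluster_unique_data_gb(nodes: int, on_disk_gb_per_node: float,
                           replication_factor: int) -> float:
    """Unique data a whole cluster holds, assuming data is spread evenly."""
    if replication_factor < 1:
        raise ValueError("replication_factor must be at least 1")
    return nodes * on_disk_gb_per_node / replication_factor


if __name__ == "__main__":
    # Figures from the quote: 400GB on disk per node, replicated 3 times.
    print(int(unique_data_per_node_gb(400, 3)))   # 133
    # Hypothetical 6-node cluster, chosen only for illustration.
    print(cluster_unique_data_gb(6, 400, 3))      # 800.0
```

The even-spread assumption is a simplification; a real ring will rarely be perfectly balanced.
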

Remember also that these numbers reflect the total size of your sstables.
This is both good and bad:

1. Good, because if you use compression you can store more data. I'm
storing time series data for network statistics and I'm seeing extremely
good compression ratios (better than 10:1).

2. Bad, because if you're doing a lot of deletes, the old data + tombstones
count against you until they're actually purged from disk. This can create
rather interesting disk usage situations: my "rolling 48 hours" CF of
current data takes significantly more disk space than my historical CF,
which currently stores over 4 months' worth of data.

I'm thinking about repairing the rolling 48 hours CF more often and
reducing the gc_grace time so that compaction has a better chance of
removing stale data from disk.

--
Aaron Turner
http://synfin.net/ Twitter: @synfinatic
http://tcpreplay.synfin.net/ - Pcap editing and replay tools for Unix & Windows
Those who would give up essential Liberty, to purchase a little temporary
Safety, deserve neither Liberty nor Safety.
    -- Benjamin Franklin
"carpe diem quam minimum credula postero"