Re: Cassandra disk usage

2014-04-14 Thread Yulian Oifa
Hello,
The load of data on the 3 nodes is:

Address      DC  Rack  Status  State   Load      Owns    Token
                                                         113427455640312821154458202477256070485
172.19.10.1  19  10    Up      Normal  22.16 GB  33.33%  0
172.19.10.2  19  10    Up      Normal  19.89 GB  33.33%

Re: Cassandra disk usage

2014-04-13 Thread Mark Reddy
> I will change the data I am storing to decrease the usage; for the value I
> will find some small value to store. Previously I used the same value, since
> this table is an index used only for search purposes and does not really
> have a value.

If you don't need a value, you don't have to store anything. You can

Re: Cassandra disk usage

2014-04-13 Thread Michal Michalski
> Each column has a name of 15 chars (digits) and the same 15 chars in the
> value (also digits).
> Each column should be 30 bytes.

Remember the standard Cassandra per-column overhead, which is, as far as I remember, 15 bytes, so it's 45 bytes in total - 50% more than you estimated, which kind of m
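The overhead argument above can be checked with quick arithmetic. This is a back-of-envelope sketch: the ~15-byte per-column overhead is the figure recalled in the post, not an exact number, and the real value varies by Cassandra version.

```python
# Sketch of the estimate from the thread: 100M columns, 15-byte names and
# 15-byte values, plus ~15 bytes of per-column overhead (assumed figure).
NUM_COLUMNS = 100_000_000
NAME_BYTES = 15       # 15 digit characters, 1 byte each as UTF-8
VALUE_BYTES = 15
OVERHEAD_BYTES = 15   # approximate per-column serialization overhead

naive = NUM_COLUMNS * (NAME_BYTES + VALUE_BYTES)
with_overhead = NUM_COLUMNS * (NAME_BYTES + VALUE_BYTES + OVERHEAD_BYTES)

print(naive / 1e9)          # 3.0 -> the original ~3 GB estimate
print(with_overhead / 1e9)  # 4.5 -> ~4.5 GB, 50% more, as noted above
```

Even the corrected 4.5 GB per copy is far below the ~20 GB per node reported earlier in the thread, so overhead alone does not explain the gap.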

Re: Cassandra disk usage

2014-04-13 Thread Yulian Oifa
Hello Mark, and thanks for your reply.
1) I store it as a UTF8String. All digits are from 0x30 to 0x39 and should take 1 byte each. Since all characters are digits, it should be 15 bytes.
2) I will change the data I am storing to decrease the usage; for the value I will find some small value to store

Re: Cassandra disk usage

2014-04-13 Thread Mark Reddy
What are you storing these 15 chars as: string, int, double, etc.? 15 chars does not necessarily translate to 15 bytes. You may be mixing up replication factor and quorum when you say "Cassandra cluster has 3 servers, and data is stored in quorum (2 servers)." You read and write at quorum, (N/2)+1, where N=to
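The quorum formula mentioned above is simple integer arithmetic; a minimal sketch:

```python
def quorum(replication_factor: int) -> int:
    # QUORUM = floor(N/2) + 1, where N is the replication factor
    return replication_factor // 2 + 1

# With 3 replicas, 2 must acknowledge a quorum read or write.
print(quorum(3))  # 2
print(quorum(5))  # 3
```

Note that quorum controls how many replicas must respond, not how many copies exist on disk; the replication factor alone determines disk usage.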

Cassandra disk usage

2014-04-13 Thread Yulian Oifa
I have a column family with 2 rows. The 2 rows have 100 million columns overall. Each column has a name of 15 chars (digits) and the same 15 chars in the value (also digits). Each column should be 30 bytes. Therefore all data should be approximately 3 GB. The Cassandra cluster has 3 servers, and data is s
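The 3 GB figure in the question comes from a naive multiplication, reproduced here as a sketch; it ignores per-column overhead, replication, and the on-disk SSTable format:

```python
# Naive estimate from the question: 100M columns of 15-byte name +
# 15-byte value, with no overhead or replication accounted for.
columns = 100_000_000
name_bytes = 15   # 15 digit characters stored as UTF-8, 1 byte each
value_bytes = 15

estimate = columns * (name_bytes + value_bytes)
print(estimate / 1e9)  # 3.0 -> ~3 GB, versus ~20 GB observed per node
```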

Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Peter Schuller
> That is correct. In 0.6, an anticompaction was performed and a temporary
> SSTable was written out to disk, then streamed to the recipient. The way
> this is now done in 0.7 requires no extra disk space on the source node.

Great. So that should at least mean that running out of disk space shoul

Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Tyler Hobbs
> Anti-compaction and streaming is done to move data from nodes that
> have it (that are in the replica set). This implies CPU and I/O and
> networking load on the source node, so it does have an impact. See
> http://wiki.apache.org/cassandra/Streaming among others.

(Here's where I'm not sure,

Re: Cassandra disk usage and failure recovery

2011-01-04 Thread Peter Schuller
This will be a very selective response, not at all as exhaustive as it should be to truly cover what you bring up. Sorry, but here are some random tidbits.

> On the cassandra user list, I noticed a thread on a user that literally
> wrote his cluster to death. Correct me if I'm wrong, but based o

Re: cassandra disk usage

2010-08-30 Thread Terje Marthinussen
On Mon, Aug 30, 2010 at 10:10 PM, Jonathan Ellis wrote:
> column names are stored per cell
>
> (moving to user@)

I think that is already accounted for in my numbers? What I listed was measured from the actual SSTable file (using the output from "strings ), so multiples of the supercolumn

Re: cassandra disk usage

2010-08-30 Thread Jonathan Ellis
column names are stored per cell

(moving to user@)

On Mon, Aug 30, 2010 at 6:58 AM, Terje Marthinussen wrote:
> Hi,
>
> Was just looking at an SSTable file after loading a dataset. The data load
> has no updates of data, but:
> - Columns can in some rare cases be added to existing super columns
>
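The point that "column names are stored per cell" has a direct disk-usage consequence: every cell in an SSTable carries its own copy of the column name. A sketch with hypothetical figures (row counts, name and value lengths are illustrative assumptions, not from the thread):

```python
# Illustration of per-cell column-name storage: long names are paid for
# once per cell, not once per column family. All figures are assumptions.
rows = 1_000_000
columns_per_row = 10
name_len = 20    # hypothetical average column-name length in bytes
value_len = 8    # hypothetical value size in bytes

cells = rows * columns_per_row
names_on_disk = cells * name_len
values_on_disk = cells * value_len
print(names_on_disk / 1e6)   # 200.0 -> 200 MB spent on repeated names
print(values_on_disk / 1e6)  # 80.0  -> 80 MB of actual values
```

This is why short column names (or a schema that avoids wide rows of long-named cells) can shrink SSTables substantially.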