On Wed, Jan 5, 2011 at 9:52 AM, Jonathan Ellis <jbel...@gmail.com> wrote:
> It's normal for Cassandra to use more disk space than MySQL.  It's
> part of what we trade for not having to rewrite every row when you add
> a new column.
>
> "SSTables that are obsoleted by a compaction are deleted
> asynchronously when the JVM performs a GC."
> http://wiki.apache.org/cassandra/MemtableSSTable
>
> On Wed, Jan 5, 2011 at 8:35 AM, nicolas lattuada
> <nicolaslattu...@hotmail.fr> wrote:
>> Hi
>>
>> i have some data size issues:
>>
>> i am storing super columns with the following content:
>>
>> {a=>1, b=>2, c=>3.......n=>14}
>>
>> i am storing it 300 000 times and i have a data size on the disk about 283Mo
>>
>> And in other side i have a mysql table which stores a bunch of data the
>> schema follows:
>> 6 varchars +100
>> 5 ints +6
>>
>> I put about 1 300 000 records on it and end up with 150Mo of data and 57Mo
>> of index.
>>
>> Then i think i am certainly doing something wrong...
>>
>> The other thing is when i run flush and then compact the size of my data
>> increases, then i imagine something is copied up on compaction
>> So is there a way to remove the unused data? (cleanup doesn t seem to do the
>> job).
>>
>> Any help to reduce the size of the data would be greatly apreciated!
>> Greetings
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>

Unlike datastores that use delimited or fixed-width columns, Cassandra
has no fixed on-disk layout for a row. Each row is a sorted map of
columns, and each column is a tuple of {column name, column value,
timestamp}. Because the name and timestamp are stored with every
column, the data is not stored as tersely as it is in MySQL.
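
To make the overhead concrete, here is a rough back-of-the-envelope
sketch in Python. The byte counts (2-byte name length, 1 flag byte,
8-byte timestamp, 4-byte value length) are my assumption about how a
column is serialized, and the function names are made up for
illustration. It ignores super column headers, row keys, indexes and
bloom filters, so treat it as a lower bound, not the real on-disk
format.

```python
# Rough model of per-column storage cost. The fixed byte counts below are
# assumptions for illustration, not an authoritative description of the
# SSTable format.
NAME_LEN_BYTES = 2    # assumed: length prefix for the column name
FLAGS_BYTES = 1       # assumed: per-column flag byte
TIMESTAMP_BYTES = 8   # assumed: 64-bit timestamp stored with every column
VALUE_LEN_BYTES = 4   # assumed: length prefix for the column value

FIXED_OVERHEAD = (NAME_LEN_BYTES + FLAGS_BYTES
                  + TIMESTAMP_BYTES + VALUE_LEN_BYTES)


def column_size(name: bytes, value: bytes) -> int:
    """Estimated bytes for one {name, value, timestamp} column."""
    return FIXED_OVERHEAD + len(name) + len(value)


def row_size(columns: dict) -> int:
    """Estimated bytes for a row, modelled as a sorted map of columns."""
    return sum(column_size(name, value)
               for name, value in sorted(columns.items()))


# The example from the original post: {a=>1, b=>2, ..., n=>14}
names = "abcdefghijklmn"
example = {name.encode(): str(i + 1).encode()
           for i, name in enumerate(names)}

payload = sum(len(n) + len(v) for n, v in example.items())
total = row_size(example)
print("payload bytes:", payload)
print("estimated stored bytes:", total)
```

Under these assumptions most of the bytes written for such small
columns are names, timestamps and length prefixes rather than values,
which is the kind of gap you are seeing against MySQL.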
