Also note that an improved and compressible file format has been in the works for a while now.
https://issues.apache.org/jira/browse/CASSANDRA-674

I am endlessly optimistic that it will make it into the 'next' version; in particular, the current hope is 0.8.

On Jan 20, 2011 6:34 AM, "Terje Marthinussen" <tmarthinus...@gmail.com> wrote:
> Perfectly normal with a 3-7x increase in data size, depending on your data schema.
>
> Regards,
> Terje
>
> On 20 Jan 2011, at 23:17, "akshatbakli...@gmail.com" <akshatbakli...@gmail.com> wrote:
>
>> I just did a du -h on DataDump, which showed 40G, and a du -h on CassandraDataDump, which showed 170G.
>>
>> Am I doing something wrong? Have you observed any compression in it?
>>
>> On Thu, Jan 20, 2011 at 6:57 PM, Javier Canillas <javier.canil...@gmail.com> wrote:
>> How do you calculate your 40 GB of data? When you insert it into Cassandra, you need to convert the data into a Byte[]; maybe your problem is there.
>>
>> On Thu, Jan 20, 2011 at 10:02 AM, akshatbakli...@gmail.com <akshatbakli...@gmail.com> wrote:
>> Hi all,
>>
>> I am experiencing a unique situation. I loaded some data into Cassandra. My data was about 40 GB, but when loaded into Cassandra the data directory size is almost 170 GB.
>>
>> This means the **data got inflated**.
>>
>> Is this the case just for me, is someone else also seeing this inflation, or is it the general behavior of Cassandra?
>>
>> I am using Cassandra 0.6.8 on Ubuntu 10.10.
>>
>> --
>> Akshat Bakliwal
>> Search Information and Extraction Lab
>> IIIT-Hyderabad
>> 09963885762
>> WebPage
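
To make Javier's and Terje's points a bit more concrete, here is a rough back-of-the-envelope sketch (plain Java, nothing Cassandra-specific) of where the extra bytes can come from. The figures are assumptions for illustration only: roughly 15 bytes of fixed per-column overhead (length fields, flags, an 8-byte timestamp) plus the column name stored again alongside every value. This is not the exact 0.6 SSTable layout, and the workload numbers are made up.

/**
 * Back-of-the-envelope estimate of why ~40 GB of raw values can take
 * considerably more space on disk in Cassandra. All constants below are
 * assumptions for illustration, not the exact on-disk format.
 */
public class StorageOverheadEstimate {

    // Hypothetical workload: adjust these to match your own schema.
    static final long ROWS = 100_000_000L;
    static final int COLUMNS_PER_ROW = 10;
    static final int AVG_VALUE_BYTES = 40;       // raw payload per column
    static final int AVG_COLUMN_NAME_BYTES = 20; // name is stored with every column
    static final int FIXED_COLUMN_OVERHEAD = 15; // assumed: length fields + flags + timestamp

    public static void main(String[] args) {
        long rawBytes = ROWS * COLUMNS_PER_ROW * (long) AVG_VALUE_BYTES;

        long perColumnOnDisk = AVG_VALUE_BYTES + AVG_COLUMN_NAME_BYTES + FIXED_COLUMN_OVERHEAD;
        long sstableBytes = ROWS * COLUMNS_PER_ROW * perColumnOnDisk;

        System.out.printf("raw payload:       %.1f GB%n", gb(rawBytes));
        System.out.printf("sstable estimate:  %.1f GB (%.1fx)%n",
                gb(sstableBytes), (double) sstableBytes / rawBytes);
    }

    static double gb(long bytes) {
        return bytes / (1024.0 * 1024 * 1024);
    }
}

With these made-up numbers the column data alone roughly doubles; the remaining gap up to something like 170 GB would come from row indexes and bloom filters, the commit log, and duplicate data in SSTables that have not yet been compacted, which is why a 3-7x blow-up as Terje describes is plausible.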