Hello Robin,

You have several options for compression in C*:
1) Serialize the data to bytes instead of JSON, which can save a lot of space because of the String encoding overhead. Of course, the data will then be opaque and not human-readable.

2) Activate client-node data compression. In this case, do not forget to ship the LZ4 or Snappy dependency on the client side.

On the server side, data compression is active by default using LZ4 when you create a new table, so there is pretty much nothing to do.

It's up to you to decide whether, given the difference in compression ratio between Gzip and LZ4, it is worth relying on C* compression.

Regards

On Mon, Nov 3, 2014 at 3:51 PM, Robin Verlangen <ro...@us2.nl> wrote:

> Hi there,
>
> We're working on a project which is going to store a lot of JSON objects
> in Cassandra. A large piece of this (90%) consists of an array of integers,
> of which in a lot of cases there are a bunch of zeroes.
>
> The average JSON is 4KB in size, and once gzipped (default compression) it
> is just under 100 bytes.
>
> My question is, should we compress client-side (literally converting JSON
> string to compressed gzip bytes), let Cassandra do the work, or do both?
>
> From my point of view I think Cassandra would be better, as it could
> compress beyond a single value, using large blocks within a row / SSTable.
>
> Thank you in advance for your help.
>
> Best regards,
>
> Robin Verlangen
> *Chief Data Architect*
>
> W http://www.robinverlangen.nl
> E ro...@us2.nl
>
> <http://goo.gl/Lt7BC>
> *What is CloudPelican? <http://goo.gl/HkB3D>*
>
> Disclaimer: The information contained in this message and attachments is
> intended solely for the attention and use of the named addressee and may be
> confidential. If you are not the intended recipient, you are reminded that
> the information remains the property of the sender. You must not use,
> disclose, distribute, copy, print or rely on this e-mail. If you have
> received this message in error, please contact the sender immediately and
> irrevocably delete this message and any copies.
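
If it helps to put numbers on the options discussed in this thread before deciding, here is a rough sketch in Python (standard library only) that compares the size of an integer array stored as JSON versus fixed-width packed bytes, each with and without gzip. Everything in it is an assumption for illustration: the sample data is synthetic (mostly zeroes, as in the description), the function names are made up, and the 32-bit packing is just one possible binary layout. It does not measure LZ4, which is not in the standard library, so it says nothing about what C* itself would achieve.

```python
import gzip
import json
import random
import struct


def sample_values(n=1000, zero_ratio=0.8, seed=42):
    """Synthetic stand-in for the real data: mostly zeroes, some small integers."""
    rng = random.Random(seed)
    return [0 if rng.random() < zero_ratio else rng.randint(1, 10_000)
            for _ in range(n)]


def as_json_bytes(values):
    """Encode the array as compact UTF-8 JSON."""
    return json.dumps({"values": values}, separators=(",", ":")).encode("utf-8")


def as_packed_bytes(values):
    """Encode the array as little-endian unsigned 32-bit integers (hypothetical layout)."""
    return struct.pack("<%dI" % len(values), *values)


def from_packed_bytes(blob):
    """Inverse of as_packed_bytes."""
    return list(struct.unpack("<%dI" % (len(blob) // 4), blob))


def size_report(values):
    """Return the byte size of each encoding, raw and gzip-compressed."""
    raw_json = as_json_bytes(values)
    packed = as_packed_bytes(values)
    return {
        "json": len(raw_json),
        "json+gzip": len(gzip.compress(raw_json)),
        "packed": len(packed),
        "packed+gzip": len(gzip.compress(packed)),
    }


if __name__ == "__main__":
    for name, size in size_report(sample_values()).items():
        print("%-12s %6d bytes" % (name, size))
```

Running the same comparison on a few real payloads should give a better idea than any general rule, since the result depends heavily on how many zeroes there are and how large the non-zero values get.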