Depending on the compression rate, I think it would generate less garbage on
the Cassandra side if you compressed it client side. Something to test out.
> On Apr 4, 2018, at 7:19 AM, Jeff Jirsa wrote:
Compressing server side and validating checksums is hugely important in the
more frequently used versions of Cassandra, so since you probably want to run
compression on the server anyway, I'm not sure why you'd compress it twice.
--
Jeff Jirsa
> On Apr 4, 2018, at 6:23 AM, DuyHai Doan wrote:
Compressing client-side is better because it will save:
1) a lot of bandwidth on the network
2) a lot of Cassandra CPU, because there is no decompression server-side
3) a lot of Cassandra HEAP, because the compressed blob should be relatively
small (text data compresses very well) compared to the raw size
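
A minimal sketch of that approach with the Python driver; the table ("docs")
and its columns are made up for the example, only the idea of compressing
before the INSERT comes from the thread:

import uuid
import zlib
from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('myks')   # placeholder keyspace

insert = session.prepare("INSERT INTO docs (id, body) VALUES (?, ?)")

doc_id = uuid.uuid4()
raw_text = "x" * 55000                        # stand-in for the ~55,000-character row
compressed = zlib.compress(raw_text.encode('utf-8'), 6)

# Cassandra only ever sees the (much smaller) compressed bytes.
session.execute(insert, (doc_id, compressed))

# Reading it back: decompress client side, the server never touches raw text.
row = session.execute("SELECT body FROM docs WHERE id = %s", (doc_id,)).one()
original = zlib.decompress(row.body).decode('utf-8')
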
Hi,
We use a pseudo file-system table where the chunks are 64 KB blobs, and we
have never had any performance issues.
The primary-key structure is ((file-uuid), chunk-id).
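
In case it helps, here is roughly what that layout looks like; the DDL and
column names are my guess, only the 64 KB chunks and the ((file-uuid),
chunk-id) key shape come from the description above:

import uuid
from cassandra.cluster import Cluster

CHUNK_SIZE = 64 * 1024   # 64 KB per chunk, as described above

cluster = Cluster(['127.0.0.1'])
session = cluster.connect('myks')   # placeholder keyspace

session.execute("""
    CREATE TABLE IF NOT EXISTS file_chunks (
        file_uuid uuid,
        chunk_id  int,
        data      blob,
        PRIMARY KEY ((file_uuid), chunk_id)
    )
""")

insert = session.prepare(
    "INSERT INTO file_chunks (file_uuid, chunk_id, data) VALUES (?, ?, ?)")

def store(payload: bytes) -> uuid.UUID:
    # Split the payload into 64 KB chunks under a single partition.
    file_id = uuid.uuid4()
    for i in range(0, len(payload), CHUNK_SIZE):
        session.execute(insert, (file_id, i // CHUNK_SIZE, payload[i:i + CHUNK_SIZE]))
    return file_id

def load(file_id: uuid.UUID) -> bytes:
    # Rows come back ordered by chunk_id, the clustering column.
    rows = session.execute(
        "SELECT data FROM file_chunks WHERE file_uuid = %s", (file_id,))
    return b''.join(row.data for row in rows)
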
Jero
Hi Shalom,
You might want to compress on the application side before inserting into
Cassandra, using the algorithm of your choice, based on whatever compression
ratio and speed you find acceptable for your use case.
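
One quick way to compare ratio versus speed on your own data before picking
an algorithm; zlib and lzma are just stdlib examples, and "sample_row.txt"
stands in for a representative ~55 KB row:

import lzma
import time
import zlib

def measure(name, compress, payload):
    # Time one compression pass and report size reduction and elapsed time.
    start = time.perf_counter()
    out = compress(payload)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{name:>4}: {len(payload) / len(out):5.1f}x smaller in {elapsed_ms:6.1f} ms")

sample = open("sample_row.txt", "rb").read()   # a representative payload
measure("zlib", lambda d: zlib.compress(d, 6), sample)
measure("lzma", lambda d: lzma.compress(d), sample)
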
On 4 April 2018 at 14:38, shalom sagges wrote:
Thanks DuyHai!
I'm using the default table compression. Is there anything else I should
look into?
Regarding the table compression, I understand that for write-heavy tables
it's best to keep the default and not compress further. Have I understood
correctly?
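
Not sure about the write-heavy part, but you can at least confirm what the
table is currently using (defaults included) straight from system_schema; the
keyspace and table names below are placeholders:

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

row = session.execute("""
    SELECT compression, crc_check_chance
    FROM system_schema.tables
    WHERE keyspace_name = 'myks' AND table_name = 'docs'
""").one()

# e.g. {'chunk_length_in_kb': '64', 'class': '...LZ4Compressor'} with the defaults
print(row.compression)
print(row.crc_check_chance)
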
On Wed, Apr 4, 2018 at 3:28 PM, DuyHai Doan wrote:
Compress it and store it as a blob.
Unless you ever need to index it, but I guess even with SASI indexing such a
huge text block is not a good idea.
On Wed, Apr 4, 2018 at 2:25 PM, shalom sagges wrote:
Hi All,
A certain application is writing ~55,000 characters for a single row. Most
of these characters are entered into one column with the "text" data type.
This looks insanely large for one row.
Would you suggest changing the data type from "text" to BLOB, or is there any
other option that might fit this scenario?
Hey.
I'm considering migrating my DB from using multiple columns to just two
columns, with the second one being a JSON object. Is there going to be any
real difference between TEXT and a UTF-8 encoded BLOB?
I guess it would probably be easier to get tools like Spark to parse the
object as JSON if it's stored as TEXT.