I concur with Elliott's view. The only way to reduce partition size is to
change your partition key. Here, with userid as the partition key, partition
size depends on how active the user is. For a very active user it can
become large in no time. After changing the key, you will also need to
migrate the old data to the new table, so please keep that in mind as well.
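
As a rough sketch of the time-based split Elliott suggested, something like
the following could work. The table name user_act_log_by_month, the bucket
column and both helper functions are hypothetical, and I am assuming the
Datetime column holds epoch milliseconds in UTC, which the original post
does not state. Please adapt it to your actual query pattern.

```python
from datetime import datetime, timezone
from typing import List

# Hypothetical schema: same columns as user_act_log, but with a month
# bucket added to the partition key so one user's rows are spread over
# one partition per month instead of a single ever-growing partition.
CREATE_TABLE_CQL = """
CREATE TABLE user_act_log_by_month (
    userid   bigint,
    bucket   int,       -- e.g. 202107, derived from datetime
    datetime bigint,    -- assumed to be epoch milliseconds
    sno      uuid,
    PRIMARY KEY ((userid, bucket), datetime, sno)
) WITH default_time_to_live = 15552000;  -- 180 days in seconds, about 6 months
"""


def month_bucket(epoch_ms: int) -> int:
    """Return the YYYYMM bucket (UTC) for an epoch-millisecond timestamp."""
    dt = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    return dt.year * 100 + dt.month


def buckets_between(start_ms: int, end_ms: int) -> List[int]:
    """List every YYYYMM bucket a read over [start_ms, end_ms] must touch."""
    last = month_bucket(end_ms)
    year, month = divmod(month_bucket(start_ms), 100)
    buckets = []
    while year * 100 + month <= last:
        buckets.append(year * 100 + month)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return buckets
```

On write, the application would set bucket = month_bucket(datetime). On read,
it would issue one query per value returned by buckets_between() for the
requested time range. The trade-off is that a read spanning several months
touches several partitions instead of one.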

Regards
Manish

On Tue, Jul 20, 2021 at 2:54 AM Elliott Sims <elli...@backblaze.com> wrote:

> Your partition key determines your partition size.  Reducing retention
> sounds like it would help some in your case, but really you'd have to split
> it up somehow.  If it fits your query pattern, you could potentially have a
> compound key of userid+datetime, or some other time-based split.  You could
> also just split each user's rows into subsets with some sort of indirect
> mapping, though that can get messy pretty fast.
>
> On Mon, Jul 19, 2021 at 9:01 AM MyWorld <timeplus.1...@gmail.com> wrote:
>
>> Hi all,
>>
>> We are currently storing our user activity log in Cassandra with the
>> schema below.
>>
>> CREATE TABLE user_act_log (
>>     userid bigint,
>>     datetime bigint,
>>     sno uuid,
>>     ....some more columns,
>>     -- partition key: userid; clustering key: datetime, sno
>>     PRIMARY KEY ((userid), datetime, sno)
>> );
>> The table has a TTL of 6 months.
>>
>> Over time our table data has grown to around 500 GB, and the table
>> histogram shows that our max partition size has also grown very large
>> (nearly 1 GB).
>>
>> So, what would be the right architecture for this use case? Any help is
>> appreciated.
>>
>> I am currently thinking of changing the compaction strategy from size
>> tiered to time window, with a 30-day window. But will this improve the
>> partition size?
>>
>> Should we use a different database for this use case?