Your partition key determines your partition size, so changing the
compaction strategy won't shrink your partitions.  Reducing retention
would probably help some in your case, but to really fix it you'd have to
split each user's data across more than one partition.  If it fits your
query pattern, you could use a compound partition key of userid plus a
time component derived from datetime (a month or day bucket, for
example), or some other time-based split.  You could also split each
user's rows into subsets with some sort of indirect mapping, though that
can get messy pretty fast.
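
As a rough sketch of the time-based split, here is one way the client
could derive a bucket value to store alongside userid in the partition
key.  Everything here is an assumption for illustration: that Datetime
holds epoch milliseconds, that a monthly bucket is the right granularity
for your write volume, and the names month_bucket, buckets_between and
the "bucket" column itself.

```python
from datetime import datetime, timezone

# Hypothetical primary key, adapted from the quoted schema; the "bucket"
# column is an assumption and does not exist in the original table:
#   PRIMARY KEY ((userid, bucket), datetime, sno)


def month_bucket(epoch_ms: int) -> int:
    """Map an epoch-millisecond timestamp to a YYYYMM bucket (UTC)."""
    dt = datetime.fromtimestamp(epoch_ms // 1000, tz=timezone.utc)
    return dt.year * 100 + dt.month


def buckets_between(start_ms: int, end_ms: int) -> list:
    """List every YYYYMM bucket a read over [start_ms, end_ms] must touch."""
    year, month = divmod(month_bucket(start_ms), 100)
    last = month_bucket(end_ms)
    buckets = []
    while year * 100 + month <= last:
        buckets.append(year * 100 + month)
        month += 1
        if month > 12:  # roll over into January of the next year
            year, month = year + 1, 1
    return buckets
```

On write you would store month_bucket(datetime) in the bucket column; on
read you would issue one query per value returned by buckets_between for
the requested range.  With a 6 month TTL that is at most about seven
partitions per user, each holding roughly a month of activity.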

On Mon, Jul 19, 2021 at 9:01 AM MyWorld <timeplus.1...@gmail.com> wrote:

> Hi all,
>
> We are currently storing our user activity log in Cassandra with the
> schema below.
>
> Create table user_act_log(
> Userid bigint,
> Datetime bigint,
> Sno UUID,
> ....some more columns)
> With partition key - userid
> Clustering key - datetime, sno
> And TTL of 6 months
>
> Over time our table data has grown to around 500 GB, and we notice from
> the table histograms that our max partition size has also grown very
> large (nearly 1 GB).
>
> So, what would be the right architecture for this use case?
>
> I am currently thinking of changing the compaction strategy from size
> tiered to time window with a 30 day window. But will this improve the
> partition size?
>
> Should we use a different database for this use case?
>
