It seems to me you are on the right track. Finding the right balance between
the number of rows and row width is the part that will take the most
experimentation.
- Original Message -
From: "Trevor Francis" <trevor.fran...@tgrahamcapital.com>
Regarding rotating, I was thinking about the concept of log rotation, where you
write to a file for a specific period of time, then create a new file and
write to that one for the next period. So yes, it closes a row and opens
another row.
Since I will be generating analytics every 15 minu
Yes, in this Cassandra model, time wouldn't be a column value; it would be part
of the column name. Depending on how you want to access your data (give me all
data points for time X) and how many separate datapoints you have for time X,
you might consider packing all the data for a time in one co
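A minimal sketch of that idea, assuming the datapoints for one timestamp are serialized as JSON into a single column whose name is the timestamp. The helper names (column_name, pack_column), the JSON encoding, and the sample field names are illustrative assumptions, not something the thread specifies, and the row is modeled as a plain dict rather than through a real Cassandra client.

```python
import json
from datetime import datetime, timezone


def column_name(ts: datetime) -> str:
    """Encode the timestamp as the column name (hypothetical format).

    A fixed-width UTC string sorts lexicographically in time order,
    so columns in a row stay ordered by time.
    """
    return ts.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")


def pack_column(ts: datetime, datapoints: dict) -> tuple:
    """Pack every datapoint for one timestamp into a single column value."""
    return column_name(ts), json.dumps(datapoints, sort_keys=True)


# Model one row as a dict of column name -> column value.
row = {}
name, value = pack_column(
    datetime(2012, 4, 18, 15, 48, 0, tzinfo=timezone.utc),
    {"bytes": 512, "status": 200},  # made-up sample fields
)
row[name] = value
```

Reading "all data points for time X" then becomes a lookup of one column by name, at the cost of not being able to fetch an individual field without decoding the whole value.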
can perform background jobs against that row and store
> summary information for that time period.
>
>
> - Original Message -
> From: "Trevor Francis"
> Sent: Wed, April 18, 2012 15:48
> Subject: Re: Column Family per User
>
> Janne,
>
>
Your design should be around how you want to query. If you are only querying
by user, then having a user as part of the row key makes sense. To manage row
size, you should think of a row as being a bucket of time. Cassandra supports a
large (but not without bounds) row size. To manage row size
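One way to picture a row as a bucket of time is to derive the row key from the user plus the start of the time bucket. The sketch below assumes a one-hour bucket and a "user:bucket_start" key layout. Both are illustrative choices, as are the names row_key and BUCKET_SECONDS, and the right bucket width depends on write volume.

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # hypothetical bucket width: one row per user per hour


def row_key(user_id: str, ts: datetime, bucket_seconds: int = BUCKET_SECONDS) -> str:
    """Build a row key of the form '<user>:<bucket start in epoch seconds>'.

    All writes for the same user within the same bucket land in the same
    row; once the bucket ends, writes roll over to a new row.
    """
    epoch = int(ts.timestamp())
    bucket_start = epoch - (epoch % bucket_seconds)
    return f"{user_id}:{bucket_start}"


# Two events in the same hour share a row; the next hour starts a new one.
k1 = row_key("alice", datetime(2012, 4, 18, 15, 48, tzinfo=timezone.utc))
k2 = row_key("alice", datetime(2012, 4, 18, 15, 59, tzinfo=timezone.utc))
k3 = row_key("alice", datetime(2012, 4, 18, 16, 0, tzinfo=timezone.utc))
```

A narrower bucket gives more, smaller rows; a wider one gives fewer, wider rows.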
Janne,
Of course, I am new to the Cassandra world, so it is taking some getting used
to as I work out how everything translates into my MySQL head.
We are building an enterprise application that will ingest log information and
provide metrics and trending based upon the data contained in the logs
Each CF takes a fair chunk of memory regardless of how much data it has, so
this is probably not a good idea if you have lots of users. Also, using a
single CF means that compression is likely to work better (more redundant data).
However, Cassandra distributes the load across different nodes b
Our application has users who can write upwards of 50 million records per
day. However, they all write the same format of records (20 fields…columns).
Should I put each user in their own column family, even though the column
family schema will be the same per user?
Would this help with dime