Re: data distribution along column family partitions

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
From: clohfin...@gmail.com Subject: Re: data distribution along column family partitions > not ok :) don't let a single partition get to 1gb, 100's of mb should be when > flares are going up. The main reasoning is compactions would be horrifically > slow and there will

Re: data distribution along column family partitions

2015-02-04 Thread Chris Lohfink
page the query > across multiple partitions if Y-X > bucket size. > > If I use paging, Cassandra won't try to allocate the whole partition on > the server node, it will just allocate memory in the heap for that page. > Check? > > Marcelo Valle > > From: user@c

Re: data distribution along column family partitions

2015-02-04 Thread Marcelo Valle (BLOOMBERG/ LONDON)
won't try to allocate the whole partition on the server node, it will just allocate memory in the heap for that page. Check? Marcelo Valle From: user@cassandra.apache.org Subject: Re: data distribution along column family partitions The data model lgtm. You may need to balance the size of the ti

Re: data distribution along column family partitions

2015-02-04 Thread Chris Lohfink
The data model lgtm. You may need to balance the size of the time buckets against the number of alarms to prevent partitions from getting too large. 1 month may be a little large; I would aim to keep the partitions below 25 MB or so in size (you can check with nodetool cfstats) to keep everything happy.
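
To make the bucket-size trade-off above concrete, here is a minimal Python sketch. It assumes monthly buckets written as 'YYYYMM' strings, which is only one possible encoding of the "year + month" idea from the thread. The function names and the per-alert byte size are hypothetical, and the size figure is a rough back-of-the-envelope estimate, not what nodetool cfstats would report. The first function lists the buckets (and so the partitions) that a query over a time range has to visit; the second estimates whether a bucket stays near the 25 MB target.

```python
from datetime import datetime


def month_buckets(start, end):
    """Return the 'YYYYMM' bucket of every month touched by [start, end].

    The 'YYYYMM' format is an assumption made for this sketch.
    """
    if end < start:
        raise ValueError("end must not precede start")
    buckets = []
    year, month = start.year, start.month
    # Walk month by month until the month containing `end` has been added.
    while (year, month) <= (end.year, end.month):
        buckets.append(f"{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return buckets


def estimated_partition_mb(alerts_per_bucket, avg_alert_bytes):
    """Rough partition size in MB; ignores storage overhead and compression."""
    return alerts_per_bucket * avg_alert_bytes / (1024 * 1024)


if __name__ == "__main__":
    # A range wider than one bucket has to be queried partition by partition.
    print(month_buckets(datetime(2014, 11, 15), datetime(2015, 2, 4)))
    # Hypothetical load: 100,000 alerts per user per month at 512 bytes each.
    print(estimated_partition_mb(100_000, 512))
```

If the estimate comes out well above the target, a smaller bucket (for example a week or a day instead of a month) reduces the partition size, at the cost of more partitions to visit per range query.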

data distribution along column family partitions

2015-02-04 Thread Marcelo Elias Del Valle
Hello, I am designing a model to store alerts users receive over time. I will want to store probably the last two years of alerts for each user. The first thought I had was having a column family partitioned by user + timebucket, where time bucket could be something like year + month. For instanc