There have been a couple of proposals to split the sstables a node
maintains into several pieces, one for each of a set of sub-ranges
that the node divides its token range into.  (This could be done with
or without explicitly giving each node multiple Tokens, IMO.)
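
Very roughly, the idea looks something like the sketch below
(hypothetical names, not actual Cassandra code): each node cuts its
range into a fixed number of sub-ranges and keeps a separate group of
sstables per sub-range, so compaction or repair can work on one small
piece at a time instead of the whole node.

import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class SubRangeSSTables
{
    // one bucket of sstable names per sub-range (names only, to keep the sketch small)
    private final List<List<String>> buckets = new ArrayList<List<String>>();
    private final BigInteger left;
    private final BigInteger width;
    private final int pieces;

    // assumes left <= token < right for every token this node owns
    public SubRangeSSTables(BigInteger left, BigInteger right, int pieces)
    {
        this.left = left;
        this.pieces = pieces;
        this.width = right.subtract(left).divide(BigInteger.valueOf(pieces)).max(BigInteger.ONE);
        for (int i = 0; i < pieces; i++)
            buckets.add(new ArrayList<String>());
    }

    // which sub-range a token falls into
    private int indexOf(BigInteger token)
    {
        int i = token.subtract(left).divide(width).intValue();
        return Math.min(i, pieces - 1);
    }

    public void add(BigInteger token, String sstable)
    {
        buckets.get(indexOf(token)).add(sstable);
    }

    // compaction or repair candidates for a single sub-range, not the whole node
    public List<String> candidatesFor(int subRange)
    {
        return buckets.get(subRange);
    }
}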

This would be a substantial change, and one that's not without
downsides (e.g., making indexed lookups from #749 less efficient in
some cases).

I think we can get a long ways with tickets like
https://issues.apache.org/jira/browse/CASSANDRA-579 and
https://issues.apache.org/jira/browse/CASSANDRA-1181 instead.
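
To make the "proportion of resources" idea from the mail below a bit
more concrete: the simplest form is a throughput cap on the background
work itself.  A minimal sketch (hypothetical, and not a claim about
what those tickets actually implement) that a compaction loop could
call after each chunk it copies:

// Hypothetical sketch, not Cassandra's actual implementation: cap the bytes a
// background task (e.g. compaction) may process per second so it makes steady,
// fine-grained progress instead of saturating disk and starving clients.
public class BackgroundThrottle
{
    private final long bytesPerSecond; // the operator-chosen budget for background work
    private long bytesThisWindow = 0;
    private long windowStart = System.currentTimeMillis();

    public BackgroundThrottle(long bytesPerSecond)
    {
        this.bytesPerSecond = bytesPerSecond;
    }

    // the task calls this after processing each chunk of n bytes
    public synchronized void acquire(long n) throws InterruptedException
    {
        long now = System.currentTimeMillis();
        if (now - windowStart >= 1000)
        {
            // start a new one-second window
            windowStart = now;
            bytesThisWindow = 0;
        }
        bytesThisWindow += n;
        if (bytesThisWindow >= bytesPerSecond)
        {
            // budget for this window is spent; sleep until the window ends
            Thread.sleep(1000 - (now - windowStart));
            windowStart = System.currentTimeMillis();
            bytesThisWindow = 0;
        }
    }
}

The only user-visible knob there is bytesPerSecond; everything else is
internal bookkeeping.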

On Tue, Jun 8, 2010 at 9:35 AM, Jonathan Shook <jsh...@gmail.com> wrote:
> I'm curious whether there are any ongoing efforts to amortize
> Cassandra's background tasks over time.
> Specifically, the cost of compaction, AE, rebalancing, etc. seems to
> be a problem for some users when they are expecting more steady-state
> performance. While this may sometimes be the result of a cluster that
> is at its marginal capacity, users are still surprised by the
> performance hit or downtime required for common operations. Making the
> cluster able to make finer-grained and measurable progress towards the
> ideal state may help other users, too.
>
> Is there a feasible design or enhancement which may allow these types
> of background tasks to be broken apart into smaller pieces without
> compromising overall consistency?
> It would be excellent if the user could see the overall state of the
> storage cluster and choose the proportion of resources allocated
> to recovering backlog vs. servicing clients, etc.
> Even better would be some basic heuristics that work well for the
> general case, so that users would only have to look at the scheduling
> plan in special situations.
>
> How would you go about doing that? Does the current architecture lend
> itself to this type of optimization, or otherwise?
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com
