Not sure how feasible it is or if it's planned. But it would probably require that the nodes are able to share the state of their row cache so as to know which parts to warm. Otherwise it sounds like you're assuming the node can hold the entire data set in memory.

If you know in your application when you would like data to be in the cache, you can send a query such as get_range_slices to the cluster and ask for 0 columns. That will warm the row cache for the keys it hits.

I have heard it mentioned that the coordinator node will take action when one node is considered to be running slow, so it may be able to work around the new node until it gets warmed up.

Are you adding nodes often?

Aaron

On 7 Aug 2010, at 11:17, Artie Copeland wrote:

> the way i understand how row caches work is that each node has an independent
> cache, in that they do not push there cache contents with other nodes. if
> that the case is it also true that when a new node is added to the cluster it
> has to build up its own cache. if thats the case i see that as a possible
> performance bottle neck once the node starts to accept requests. since there
> is no way i know of to warm the cache without adding the node to the cluster.
> would it be infeasible to have part of the bootstrap process not only stream
> data from nodes but also cached rows that are associated with those same
> keys? that would allow the new nodes to be able to provide the best
> performance once the bootstrap process finishes.
>
> --
> http://yeslinux.org
> http://yestech.org
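
For reference, the warming approach suggested in the reply (a get_range_slices query that asks for 0 columns) could be driven by a small paging loop like the one below. This is only a sketch under assumptions: fetch_keys is a hypothetical callable, not part of any Cassandra client, standing in for whatever wrapper your client library provides around get_range_slices with a 0-column predicate. It is assumed to return row keys in range order, with the start key inclusive, which is why each page after the first skips its first key.

```python
from typing import Callable, List

# Hypothetical signature: fetch_keys(start_key, page_size) -> list of row keys.
# In a real application this would wrap a get_range_slices call that asks
# for 0 columns; it is an assumption here, not an actual client API.
FetchKeys = Callable[[str, int], List[str]]


def warm_row_cache(fetch_keys: FetchKeys, page_size: int = 100) -> int:
    """Page through the key range once and return the number of rows touched."""
    if page_size < 2:
        # With an inclusive start key, a page of 1 would never advance.
        raise ValueError("page_size must be at least 2")

    start_key = ""       # empty start key: begin at the start of the range
    first_page = True
    touched = 0

    while True:
        keys = fetch_keys(start_key, page_size)

        # The start key is inclusive, so later pages repeat the last key
        # of the previous page; do not count it twice.
        if not first_page and keys and keys[0] == start_key:
            new_keys = keys[1:]
        else:
            new_keys = keys
        touched += len(new_keys)

        # A short page means the end of the range has been reached.
        if len(keys) < page_size:
            return touched

        start_key = keys[-1]
        first_page = False
```

Whether such a pass actually helps depends on the row cache being large enough to hold the rows it touches, which is the same concern raised at the top of the reply.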