We are currently running a three-node cluster where we assigned the initial
tokens using the Python script from the Wiki. We're using the Random
Partitioner, RF=1, and Cassandra 0.8 from the Riptano RPM. However, we're
seeing one node take on over 60% of the data as we load it.
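
For reference, the initial tokens were generated along these lines (a
minimal sketch of the evenly spaced token calculation for the Random
Partitioner's 0..2^127 space; the exact wiki script may differ):

    num_nodes = 3  # assumption: our current cluster size
    for i in range(num_nodes):
        # evenly spaced tokens across the 2^127 token space
        token = i * (2 ** 127) // num_nodes
        print("node %d initial_token: %d" % (i, token))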

Our keys are sequential and can range from 0 to 2^64, though in practice
we're between 1 and 2,000,000,000, with the current max around 50,000. In
order to balance out the load, would we be best served by changing our
tokens so that the top and bottom third of the overloaded node's range go
to the previous and next nodes respectively, then running nodetool move?
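
Something along these lines is what I had in mind, i.e. recomputing evenly
spaced target tokens and moving each node to its target (hostnames below
are placeholders, not our actual nodes):

    hosts = ["node1", "node2", "node3"]  # placeholder hostnames
    for i, host in enumerate(hosts):
        # target token for each node, evenly spaced around the ring
        token = i * (2 ** 127) // len(hosts)
        print("nodetool -h %s move %d" % (host, token))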

Even if we do that, it seems we'd likely keep running into this sort of
issue as we add additional data. Would we be better served by a different
partitioner strategy, or will we need to very actively manage our tokens to
avoid getting into an unbalanced situation?
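
For what it's worth, my rough understanding of how the Random Partitioner
places keys is below (a simplified sketch, not the exact token calculation
Cassandra uses): it MD5-hashes the key into the token space, so sequential
keys should scatter across the ring rather than clustering on one node,
which is part of why the imbalance surprises me.

    import hashlib

    def random_partitioner_token(key):
        # simplified: hash the key into the 0..2^127 token space
        return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % (2 ** 127)

    for key in ("1", "2", "3", "50000"):
        print("key %s -> token %d" % (key, random_partitioner_token(key)))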

-- 
*David McNelis*
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395
c: 219.384.5143

*A Smart Grid technology company focused on helping consumers of energy
control an often under-managed resource.*
