You can use a prefix split policy. Put the same prefix on the row keys of the data you need in the same region; this gives you locality for that data, still spreads the overall load well, and keeps a split from separating it. I'm not sure you really need the requirement you described below, unless I didn't follow your business requirements very well.
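To make the prefix idea concrete, here is a minimal sketch of how such row keys might be built. It is only an illustration: the 4-byte batch id, the key layout, and the `make_row_key` / `prefix_of` helpers are assumptions, not anything taken from your schema. HBase ships a `KeyPrefixRegionSplitPolicy` that groups rows by a fixed-length key prefix when it picks split points, which is presumably the kind of policy meant here.

```python
import struct

PREFIX_LEN = 4  # assumed fixed prefix width in bytes (hypothetical choice)


def make_row_key(batch_id: int, timestamp_ms: int) -> bytes:
    """Build a row key: 4-byte big-endian batch id + 8-byte big-endian timestamp.

    Big-endian unsigned encoding keeps byte-wise (lexicographic) order
    identical to numeric order, which is how HBase sorts row keys.
    """
    return struct.pack(">IQ", batch_id, timestamp_ms)


def prefix_of(row_key: bytes) -> bytes:
    """Return the part of the key a prefix-based split policy would group on."""
    return row_key[:PREFIX_LEN]


# Three writes from batch 7 share one prefix; a write from batch 8 does not.
batch_keys = [make_row_key(7, ts) for ts in (1000, 1001, 1002)]
other_key = make_row_key(8, 1003)
```

All keys in `batch_keys` start with the same 4 bytes, so a prefix-aware split policy would not place a split point in the middle of that batch, while `other_key` sorts after them and may land in a different region.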
On Thursday, June 20, 2013, yun peng wrote:

> It is our requirement that one batch of data writes (say, of Memstore size)
> should be in one RS. A salting prefix, while it evens out the load, may not
> have this property.
>
> Our problem is really how to manipulate/customise the mapping of row keys
> (or row key ranges) to the region servers, so that after one region
> overflows and starts to flush, the write stream can be automatically
> redirected to the next region server, in a round-robin way.
>
> Is it possible to customize such a policy on the HMaster? Or is there a
> similar way to what coprocessors do on region servers?
>
> On Wed, Jun 19, 2013 at 4:58 PM, Asaf Mesika <asaf.mes...@gmail.com> wrote:
>
> > The newly split region might be moved due to load balancing. Aren't you
> > experiencing the classic hot spotting, with only 1 RS getting all the
> > write traffic? Just place a preceding byte before the timestamp and
> > round-robin each put over the values 1 to the number of region servers.
> >
> > On Wednesday, June 19, 2013, yun peng wrote:
> >
> > > Hi, All,
> > > Our use case requires persisting a stream into a system like HBase. The
> > > stream data is in the format <timestamp, value>. In other words, the
> > > timestamp is used as the row key. We want to explore whether HBase is
> > > suitable for this kind of data.
> > >
> > > The problem is that the domain of the row key (or timestamp) grows
> > > constantly. For example, given 3 nodes, n1, n2, n3, they respectively
> > > host row key partitions [0,4], [5,9], [10,12]. Currently it is the last
> > > node, n3, that is busy receiving upcoming writes (of row keys 13 and
> > > 14). This continues until the region reaches the max size of 5 (that
> > > is, the partition grows to [10,14]) and potentially splits.
> > >
> > > I am not an expert on HBase splits, but I am wondering: after a split,
> > > will the new writes still go to node n3 (for [10,14]), or can the write
> > > stream be intelligently redirected to another, less busy node, like n1?
> > >
> > > In case HBase can't do things like this, how easy is it to extend HBase
> > > for such functionality? Thanks...
> > > Yun
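
For reference, the salting suggestion quoted above can be sketched as follows. This is a rough illustration under stated assumptions, not a tested HBase client: the one-byte salt, the `salt_for` and `salted_key` helpers, and the `batch_size` parameter are all hypothetical, and the sketch assumes the table is pre-split so that each salt value falls in its own region. Salts here run from 0 to n-1 rather than 1 to n. With `batch_size=1` every put goes to the next bucket in turn; a larger `batch_size` keeps a whole batch in one bucket before moving on, which is closer to the requirement described in the thread.

```python
import struct
from collections import Counter

NUM_BUCKETS = 3  # assumed: one salt bucket per region server (n1, n2, n3)


def salt_for(write_index: int, num_buckets: int, batch_size: int = 1) -> int:
    """Pick the salt bucket for the write_index-th write.

    Writes are grouped into runs of batch_size; each run gets the next
    bucket in round-robin order.
    """
    return (write_index // batch_size) % num_buckets


def salted_key(salt: int, timestamp: int) -> bytes:
    """Row key = 1-byte salt + 8-byte big-endian timestamp."""
    return struct.pack(">BQ", salt, timestamp)


timestamps = range(13, 25)  # 12 consecutive writes

# Per-put round robin: consecutive timestamps land in different buckets.
per_put = [salted_key(salt_for(i, NUM_BUCKETS), ts)
           for i, ts in enumerate(timestamps)]

# Per-batch round robin: 4 consecutive writes share a bucket, then rotate.
per_batch = [salted_key(salt_for(i, NUM_BUCKETS, batch_size=4), ts)
             for i, ts in enumerate(timestamps)]

# Count how many keys start with each salt byte.
load_per_put = Counter(key[0] for key in per_put)
load_per_batch = Counter(key[0] for key in per_batch)
```

Both variants spread the 12 writes evenly over the three buckets. The trade-off is on the read side: a scan over a time range has to be issued once per salt bucket and the results merged, because the timestamp is no longer the leading part of the key.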