You can pre-split the table along those key ranges and use a custom load balancer to keep the regions on the required nodes (?). It seems you have to co-locate the regions of the 2 tables on these nodes (to do the join)... So I hope you are already working with the LB.
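
To illustrate the pre-split idea, here is a minimal Python sketch (not HBase client code) of how a fixed list of split keys maps each rowkey to one region, and so to one node. The split keys, the node names and the helper functions are assumptions made up for this example. The same split list would have to be given to SPLITS for both tables, and a custom balancer would have to enforce the region-to-node placement; nothing here does that for you.

```python
from bisect import bisect_right

# Hypothetical split keys matching the ranges in the question.
# A region covers [start_key, end_key): the start key is inclusive,
# so "node1: <=row10000" means the first split key is 'row10001'.
SPLIT_KEYS = [b"row10001", b"row20001", b"row30001"]

# Hypothetical placement that a custom balancer would need to enforce
# for the matching regions of BOTH tables.
REGION_TO_NODE = ["node1", "node2", "node3", "node4"]


def region_index(row_key: bytes, split_keys=SPLIT_KEYS) -> int:
    """Return the index of the pre-split region that holds row_key."""
    # Row keys sort as raw bytes; Python compares bytes objects the same way.
    return bisect_right(split_keys, row_key)


def node_for(row_key: bytes) -> str:
    """Return the node expected to serve row_key under the assumed placement."""
    return REGION_TO_NODE[region_index(row_key)]
```

Because both tables share the same split keys, a given join key lands at the same region index in each table, which is what makes a local join possible once those two regions sit on the same region server.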

-Anoop-

On Wed, Apr 8, 2015 at 8:17 AM, Alok Singh <[email protected]> wrote:

> > I looked at presplit: { SPLITS => ['row100','row200','row300'] } , but
> > don't think it serves this purpose.
>
> Why doesn't this work for you? Is it because regions are not evenly
> distributed across the cluster after the split? You can move regions
> manually and spread them out evenly.
>
> Alok
>
> On Tue, Apr 7, 2015 at 5:05 PM, Demai Ni <[email protected]> wrote:
>
> > hi, folks,
> >
> > I have a question about region assignment and like to clarify some
> > through.
> >
> > Let's say I have a table with rowkey as "row00000 ~ row30000" on a 4 node
> > hbase cluster, is there a way to keep data partitioned by range on each
> > node? for example:
> >
> > node1: <=row10000
> > node2: row10001~row20000
> > node3: row20001~row30000
> > node4: >row30000
> >
> > And even when one of the node become hotspot, the boundary won't be
> > crossed unless manually doing a load balancing?
> >
> > I looked at presplit: { SPLITS => ['row100','row200','row300'] } , but
> > don't think it serves this purpose.
> >
> > BTW, a bit background. I am thinking to do a local join between two
> > tables if both have same rowkey, and partitioned by range (or same hash
> > algorithm). If I can keep the join-key on the same node(aka
> > regionServer), the join can be handled locally instead of broadcast to
> > all other nodes.
> >
> > Thanks for your input. A couple pointers to blog/presentation would be
> > appreciated.
> >
> > Demai
