You can pre-split the table along the key ranges and (I think) use a custom
load balancer to pin the regions to the required nodes. It seems you need to
colocate the regions of both tables on those nodes (to do the join), so I
hope you are already prepared to work with the LB.
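
A rough sketch of the idea, in plain Python rather than HBase API code. It
assumes fixed-width ASCII rowkeys, split points picked to match the ranges in
the question, and hypothetical node names. If both tables are pre-split with
the same split points, a given rowkey lands in the same region index in each
table, so a balancer that pins region i of both tables to the same node would
keep the join local:

```python
from bisect import bisect_right

# Split points matching the ranges in the question. A region's start key is
# inclusive, so 'row10001' is the first key of the second region.
SPLITS = ["row10001", "row20001", "row30001"]

# Hypothetical node names, one per region.
NODES = ["node1", "node2", "node3", "node4"]


def region_index(rowkey, splits=SPLITS):
    """Return the 0-based index of the [start, end) range that holds rowkey."""
    # Keys are compared lexicographically; for fixed-width ASCII keys this
    # matches byte-wise ordering.
    return bisect_right(splits, rowkey)


def node_for(rowkey):
    """Map a rowkey to the node that would host its region under this layout."""
    return NODES[region_index(rowkey)]


if __name__ == "__main__":
    for key in ["row00000", "row10000", "row10001", "row30000", "row30001"]:
        print(key, "->", node_for(key))
```

The same function applies to both tables, which is the point: identical split
points give identical key-to-region mapping, and the remaining work is keeping
the matching regions on the same server.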

-Anoop-

On Wed, Apr 8, 2015 at 8:17 AM, Alok Singh <[email protected]> wrote:

> >I looked at presplit: { SPLITS => ['row100','row200','row300'] } , but
> >don't think it serves this purpose.
>
> Why doesn't this work for you? Is it because regions are not evenly
> distributed across the cluster after the split? You can move regions
> manually and spread them out evenly.
>
> Alok
>
> On Tue, Apr 7, 2015 at 5:05 PM, Demai Ni <[email protected]> wrote:
> > hi, folks,
> >
> > I have a question about region assignment and would like to clarify some
> > thoughts.
> >
> > Let's say I have a table with rowkeys "row00000 ~ row30000" on a 4-node
> > HBase cluster. Is there a way to keep the data partitioned by range on
> > each node? For example:
> >
> > node1:  <=row10000
> > node2:  row10001~row20000
> > node3:  row20001~row30000
> > node4:  >row30000
> >
> > And even when one of the nodes becomes a hotspot, the boundaries won't be
> > crossed unless load balancing is done manually?
> >
> > I looked at presplit: { SPLITS => ['row100','row200','row300'] } , but
> > don't think it serves this purpose.
> >
> > BTW, a bit of background: I am thinking of doing a local join between two
> > tables if both have the same rowkey and are partitioned by range (or by
> > the same hash algorithm). If I can keep the join key on the same node (aka
> > RegionServer), the join can be handled locally instead of being broadcast
> > to all other nodes.
> >
> > Thanks for your input. A couple of pointers to blogs/presentations would
> > be appreciated.
> >
> > Demai
>
