Maybe the OrderPreservingPartitioner should let users define a custom
comparator.
In fact, a user can implement his/her own XXXOrderPreservingPartitioner.
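
To make the ordering issue concrete, here is a minimal sketch in plain Python
(not Cassandra code). It takes the six tokens from the ring output quoted
below and sorts them once as UTF-8 strings and once as integers. The
zero-padding at the end is only an assumed workaround for numeric key
prefixes, not something OPP does for you.

```python
# Tokens from the ring output quoted below, written as prefix + 36 zeros.
ZEROS = "0" * 36
tokens = ["0"] + [p + ZEROS for p in ("28", "56", "84", "112", "142")]

# String comparison on the UTF-8 bytes: the ordering the ring output shows.
string_order = sorted(tokens, key=lambda t: t.encode("utf-8"))

# Numeric comparison: the ordering one gets when tokens are numbers.
numeric_order = sorted(tokens, key=int)

# Assumed workaround: left-pad with zeros to a fixed width, so that
# string order and numeric order agree.
width = max(len(t) for t in tokens)
padded_order = sorted(t.zfill(width) for t in tokens)

print("string order: ", [t[:3] for t in string_order])
print("numeric order:", [t[:3] for t in numeric_order])
```

The string ordering puts 112... and 142... before 28..., because "1" sorts
before "2" regardless of length.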

On Tue, Jun 22, 2010 at 8:34 PM, Sylvain Lebresne <sylv...@yakaz.com> wrote:

> 2010/6/22 Maxim Kramarenko <maxi...@trackstudio.com>:
> > Hello!
> >
> > I use OrderPreservingPartitioner and assign tokens manually.
> >
> > Questions are:
> >
> > 1) Why is the range sorted in alphabetical order, not numeric order?
> > It was OK with RandomPartitioner.
>
> With RandomPartitioner, tokens are md5 hashes, thus numbers, and the
> comparison between two tokens is the numeric one.
>
> With OrderPreservingPartitioner, tokens are the keys themselves, that is
> to say Strings, and the comparison is (utf8) String comparison (hence the
> alphabetic sorting). Note that as such, when switching from RP to OPP,
> you most certainly don't want to keep the same tokens (as they represent
> very different things (md5 hashes vs string keys)).
>
> >
> > Address       Status     Load          Range                                      Ring
> >                                        84000000000000000000000000000000000000
> > 172.19.0.35   Up         2.47 GB       0                                          |<--|
> > 172.19.0.31   Up         1.85 GB       112000000000000000000000000000000000000    |   ^
> > 172.19.0.33   Up         1.46 GB       142000000000000000000000000000000000000    v   |
> > 172.19.0.30   Up         1.44 GB       28000000000000000000000000000000000000     |   ^
> > 172.19.0.32   Up         2.63 GB       56000000000000000000000000000000000000     v   |
> > 172.19.0.34   Up         3.29 GB       84000000000000000000000000000000000000     |-->|
> >
> > 2) What is the token range? For example, all our keys start with a
> > customer number (a few digits), but digits are only a small part of the
> > ASCII table.
> >
> > What is the best way to assign tokens manually when using
> > OrderPreservingPartitioner ?
>
> The first thing is to find (most probably estimate) the domain and
> distribution of the keys you will use. Note that this is really the hard
> part: most of the time you can only guess what the distribution will be,
> and most of the time you will be wrong anyway and get bad load balancing.
> But once you know that, you just assign as tokens the particular keys that
> split this distribution as evenly as possible (and "split" here is with
> respect to (utf8) string comparison).
>
> --
> Sylvain
>
> >
> > --
> > Best regards,
> >  Maxim                            mailto:maxi...@trackstudio.com
> >
> > LinkedIn Profile: http://www.linkedin.com/in/maximkr
> > Google Talk/Jabber: maxi...@gmail.com
> > ICQ number: 307863079
> > Skype Chat: maxim.kramarenko
> > Yahoo! Messenger: maxim_kramarenko
> >
>
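
The advice quoted above about splitting the key distribution evenly could be
sketched roughly as follows. This is plain Python, not a Cassandra tool:
pick_tokens, the sample keys, and the "customer:order" key layout are all
hypothetical, and the result is only as good as the sample is representative
of the real keys.

```python
def pick_tokens(sample_keys, node_count):
    """Pick node_count tokens that split a key sample into equal slices.

    Hypothetical helper: keys are ordered as UTF-8 strings, matching the
    comparison described in the thread, and the last key of each slice
    is returned as a token.
    """
    # Distinct keys only, ordered by their UTF-8 bytes.
    ordered = sorted(set(sample_keys), key=lambda k: k.encode("utf-8"))
    if node_count < 1 or len(ordered) < node_count:
        raise ValueError("need at least one distinct sample key per node")
    total = len(ordered)
    return [ordered[(i + 1) * total // node_count - 1] for i in range(node_count)]


# Made-up sample: keys that start with a customer number.
sample = [f"{customer}:order{n}" for customer in (7, 12, 105) for n in range(4)]
tokens = pick_tokens(sample, 3)
print(tokens)
```

With this sample the slices follow string order, so customer 105 comes before
customer 12, which comes before customer 7.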
