Maybe the OrderPreservingPartitioner should let users define a custom comparator. In fact, a user can already implement his or her own XXXOrderPreservingPartitioner.
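
To make the ordering issue concrete, here is a rough Python sketch (not Cassandra code) of the two comparisons discussed in the quoted thread below. The token values are rebuilt from the quoted ring output as a leading number followed by zeros; the exact zero count is my assumption and does not affect the ordering. `pad_key` and `pick_tokens` are hypothetical helpers that only illustrate the idea; they are not part of any Cassandra API. For ASCII digit strings, Python's string comparison agrees with the UTF-8 string comparison described below.

```python
# Tokens as they appear in the quoted ring output (zero count assumed).
tokens = ["0"] + [prefix + "0" * 36 for prefix in ("112", "142", "28", "56", "84")]

# String comparison, as with OrderPreservingPartitioner: "112..." sorts before "28...".
string_order = sorted(tokens)

# Numeric comparison, as with RandomPartitioner.
numeric_order = sorted(tokens, key=int)


def pad_key(customer_number, width=10):
    """Hypothetical helper: zero-pad a customer number to a fixed width
    so that string order matches numeric order."""
    return str(customer_number).zfill(width)


def pick_tokens(sample_keys, node_count):
    """Hypothetical helper: pick node_count tokens that split a sample of
    keys into roughly equal slices under string comparison. Each token is
    the last key of its slice."""
    ordered = sorted(sample_keys)
    step = len(ordered) / node_count
    return [ordered[int((i + 1) * step) - 1] for i in range(node_count)]


if __name__ == "__main__":
    print([t[:3] for t in string_order])
    print([t[:3] for t in numeric_order])
    sample = [pad_key(n) for n in range(100)]
    print(pick_tokens(sample, 4))
```

The quality of the split depends entirely on how well the sample reflects the real key distribution, which is the hard part Sylvain mentions below.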

On Tue, Jun 22, 2010 at 8:34 PM, Sylvain Lebresne <sylv...@yakaz.com> wrote:

> 2010/6/22 Maxim Kramarenko <maxi...@trackstudio.com>:
> > Hello!
> >
> > I use OrderPreservingPartitioner and assign tokens manually.
> >
> > Questions are:
> >
> > 1) Why is the range sorted in alphabetical order, not numeric order?
> > It was OK with RandomPartitioner.
>
> With RandomPartitioner, tokens are md5 hashes, thus numbers, and the
> comparison between two tokens is the numeric one.
>
> With OrderPreservingPartitioner, tokens are the keys themselves, that is
> to say Strings, and the comparison is (utf8) String comparison (hence the
> alphabetic sorting). Note that as such, when switching from RP to OPP,
> you most certainly don't want to keep the same tokens (as they represent
> very different things: md5 hashes vs. string keys).
>
> > Address       Status  Load     Range                                     Ring
> >                                84000000000000000000000000000000000000
> > 172.19.0.35   Up      2.47 GB  0                                         |<--|
> > 172.19.0.31   Up      1.85 GB  112000000000000000000000000000000000000   |   ^
> > 172.19.0.33   Up      1.46 GB  142000000000000000000000000000000000000   v   |
> > 172.19.0.30   Up      1.44 GB  28000000000000000000000000000000000000    |   ^
> > 172.19.0.32   Up      2.63 GB  56000000000000000000000000000000000000    v   |
> > 172.19.0.34   Up      3.29 GB  84000000000000000000000000000000000000    |-->|
> >
> > 2) What is the token range? For example, all our keys start with a
> > customer number (a few digits), but digits are only a small part of
> > the ASCII table.
> >
> > What is the best way to assign tokens manually when using
> > OrderPreservingPartitioner?
>
> The first thing is to find (most probably estimate) the domain and
> distribution of the keys you will use. Note that this is really the hard
> part: most of the time you can only guess what the distribution will be,
> and most of the time you will be wrong anyway and get bad load balancing.
>
> But once you know that, you just assign as tokens the particular keys
> that split this distribution as evenly as possible (and "split" here is
> with respect to (utf8) string comparison).
>
> --
> Sylvain
>
> > --
> > Best regards,
> > Maxim mailto:maxi...@trackstudio.com
> >
> > LinkedIn Profile: http://www.linkedin.com/in/maximkr
> > Google Talk/Jabber: maxi...@gmail.com
> > ICQ number: 307863079
> > Skype Chat: maxim.kramarenko
> > Yahoo! Messenger: maxim_kramarenko