kew .. hadoop is on my list, has been on my list, will probably still be on the list tomorrow ;)
On Tue, 2010-08-24 at 11:53 -0700, Jonathan Ellis wrote:
> in other words you're reinventing hadoop. not really recommended, but
> knock yourself out if that's what you want to do. :)
>
> On Tue, Aug 24, 2010 at 1:28 PM, B. Todd Burruss <bburr...@real.com> wrote:
> > i just came across this and i use tokens in range queries because it is
> > an easy straightforward way to divide the keyspace and operate on it
> > using multiple threads and throttle the processing. maybe this is what
> > hadoop does, i don't know much about hadoop.
> >
> > so i don't really agree that i'm doing it wrong. why is this?
> >
> >
> > On Wed, 2010-08-18 at 11:18 -0700, Ran Tavory wrote:
> >>
> >>
> >> On Wed, Aug 18, 2010 at 4:30 PM, Jonathan Ellis <jbel...@gmail.com>
> >> wrote:
> >> (a) if you're using token queries and you're not hadoop,
> >> you're doing it wrong
> >> ah, didn't know that, so I guess I'll remove support for it from
> >> hector...
> >>
> >> (b) they are expected to be of the form generated by
> >> TokenFactory.toString and fromString. You should not be
> >> generating
> >> them yourself.
> >>
> >>
> >> On Wed, Aug 18, 2010 at 7:56 AM, Ran Tavory <ran...@gmail.com>
> >> wrote:
> >> > I'm a bit confused WRT KeyRange's tokens in 0.7.0
> >> > When making a range query you can either use KeyRange.key or
> >> KeyRange.token.
> >> > In 0.7.0 key was typed as byte[]. tokens remain strings.
> >> > What does this string represent in case of a RP and in case
> >> of an OPP? Did
> >> > this change in 0.7.0?
> >> > AFAIK in 0.6.0 if the partitioner is OPP then the tokens are
> >> actual strings
> >> > and they might just be actual subset of the keys. When using
> >> a RP tokens are
> >> > BigIntegers (keys are still strings) and I'm not actually
> >> sure if you're
> >> > allowed to shoot a range query using tokens...
> >> > In 0.7.0 since keys are now bytes, when using an OPP, how do
> >> > those bytes
> >> > translate to strings? I'd assume it'd just be byte[] -> UTF8
> >> conversion,
> >> > only that this may result in illegal UTF8 chars when keys
> >> are just random
> >> > bytes, so I guess not... Perhaps md5 hashing? But then if
> >> using an OPP and
> >> > keys are actual strings, I want to have the same 0.6.0
> >> functionality in
> >> > place, meaning tokens are strings like the keys. I actually
> >> tested this
> >> > scenario and it looks working, so it seems like the String
> >> keys are
> >> > translated to UTF8, but what happens when they are invalid
> >> UTF8?
> >> > Another question is what's the story with RP in 0.7.0?
> >> Should range query
> >> > even be supported with tokens? If so, then are the tokens
> >> expected to be
> >> > string of integers? (e.g. "1234567890")
> >> > Thanks.
> >>
> >>
> >>
> >>
> >> --
> >> Jonathan Ellis
> >> Project Chair, Apache Cassandra
> >> co-founder of Riptano, the source for professional Cassandra
> >> support
> >> http://riptano.com
> >>
> >>
> >
> >
> >
> >
>
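
For anyone reading the archive: the approach described in the quoted thread (dividing the keyspace by token and handing each slice to a worker thread) might look roughly like the sketch below. It assumes RandomPartitioner, with tokens taken to be integers in [0, 2**127] and written as decimal strings, as the original question guessed. The names split_token_range and RP_MAX_TOKEN are hypothetical and not part of Cassandra or hector. Per point (b) in the thread, token strings are expected to come from TokenFactory.toString, so hand-built strings like these are for illustration only.

```python
# Assumption: RandomPartitioner tokens are integers in [0, 2**127].
RP_MAX_TOKEN = 2 ** 127


def split_token_range(parts, min_token=0, max_token=RP_MAX_TOKEN):
    """Split [min_token, max_token] into `parts` contiguous sub-ranges.

    Returns a list of (start_token, end_token) pairs as decimal strings.
    The end of each pair equals the start of the next, so the slices
    cover the whole range without gaps.
    """
    if parts < 1:
        raise ValueError("parts must be >= 1")
    span = max_token - min_token
    # Integer arithmetic only; Python ints are arbitrary precision.
    bounds = [min_token + (span * i) // parts for i in range(parts + 1)]
    return [(str(bounds[i]), str(bounds[i + 1])) for i in range(parts)]


if __name__ == "__main__":
    # One slice per worker thread; each pair would be the start and end
    # token of one range query.
    for start, end in split_token_range(4):
        print(start, end)
```

Each pair would serve as the start and end token of one KeyRange, with throttling handled by how many slices are processed at once.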