Re: TimeUUID Order Partitioner

Carlos Pérez Miguel Thu, 28 Mar 2013 01:39:56 -0700

Apparently Memtable.writeSortedContents has the same problem: I can see
that it iterates over the stored keys in byte order, so something is wrong
in my classes. For the curious, these are my classes so far:


https://gist.github.com/anonymous/5261611
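
To make the two orderings concrete, here is a minimal sketch (not the code
from the gist) that compares two of the user_ids from my example both ways.
The class and method names are hypothetical, and it only uses java.util.UUID;
it does not show how Cassandra itself calls the token comparison. It assumes
both keys are version 1 UUIDs, since UUID.timestamp() throws for any other
version.

```java
import java.util.UUID;

// Hypothetical helper, for illustration only.
final class TimeUuidOrder {

    // Time order: compare the 60-bit timestamps embedded in two version 1 UUIDs.
    public static int compareByTime(UUID a, UUID b) {
        // timestamp() throws UnsupportedOperationException for non-version-1 UUIDs.
        int c = Long.compare(a.timestamp(), b.timestamp());
        if (c != 0) {
            return c;
        }
        // Tie-break on the clock sequence and node bits so that two distinct
        // UUIDs with the same timestamp do not compare as equal.
        return Long.compareUnsigned(a.getLeastSignificantBits(),
                                    b.getLeastSignificantBits());
    }

    // Byte order: unsigned comparison of the 16 raw bytes, most significant first.
    public static int compareByBytes(UUID a, UUID b) {
        int c = Long.compareUnsigned(a.getMostSignificantBits(),
                                     b.getMostSignificantBits());
        if (c != 0) {
            return c;
        }
        return Long.compareUnsigned(a.getLeastSignificantBits(),
                                    b.getLeastSignificantBits());
    }

    public static void main(String[] args) {
        UUID userA = UUID.fromString("eac850fa-96f4-11e2-9f22-72ad6af0e500");
        UUID userE = UUID.fromString("058ec180-96f5-11e2-8c88-4aaf94e4f04e");

        // User a was created before user e, so it sorts first by time.
        System.out.println(compareByTime(userA, userE) < 0);   // prints true
        // In byte order 0xea... sorts after 0x05..., so user e comes first.
        System.out.println(compareByBytes(userA, userE) > 0);  // prints true
    }
}
```

The byte comparison reproduces the order I get back when reading the CF
(e, f, g, h, a, b, c, d), because the low 32 bits of the timestamp are the
first bytes of the UUID.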


Carlos Pérez Miguel


2013/3/28 aaron morton <aa...@thelastpickle.com>

> That is the order I would expect to find if I read the CF, but if I do, I
> obtain (with any client or library I've tried):
>
>
> What happens if you export sstables with sstable2json ?
>
> Put some logging in Memtable.FlushRunnable.writeSortedContents to see the
> order the rows are written
>
> Cheers
>
>    -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 28/03/2013, at 5:05 AM, Carlos Pérez Miguel <cperez...@gmail.com>
> wrote:
>
> Thanks, Lanny. That is what I am doing.
>
> Actually I'm having another problem. My UUIDOrderedPartitioner doesn't
> order by time. Instead, it orders by byte order and I cannot find why.
> Which are the functions that control ordering between tokens? I have
> implemented time ordering in the "compareTo" function of my UUID token
> class, but it seems that Cassandra is ignoring it. For example:
>
> Let's suppose that I have a Users CF where each row represents a user in
> a cluster of 1 node. Rows are ordered by TimeUUID. I create some users in
> the following order:
>
> user a created with user_id: eac850fa-96f4-11e2-9f22-72ad6af0e500
> user b created with user_id: f17f9ae8-96f4-11e2-98aa-421151417092
> user c created with user_id: f82fccfa-96f4-11e2-8d99-26f8461d074c
> user d created with user_id: fee21cec-96f4-11e2-945b-f9a2a2e32308
> user e created with user_id: 058ec180-96f5-11e2-8c88-4aaf94e4f04e
> user f created with user_id: 0c5032ba-96f5-11e2-95a5-60a128c0b3f4
> user g created with user_id: 13036b86-96f5-11e2-80dd-566654c686cb
> user h created with user_id: 19b245f6-96f5-11e2-9c8f-b315f455e5e0
>
> That is the order I would expect to find if I read the CF, but if I do, I
> obtain (with any client or library I've tried):
>
> user_id: 058ec180-96f5-11e2-8c88-4aaf94e4f04e name:"e"
> user_id: 0c5032ba-96f5-11e2-95a5-60a128c0b3f4 name:"f"
> user_id: 13036b86-96f5-11e2-80dd-566654c686cb name:"g"
> user_id: 19b245f6-96f5-11e2-9c8f-b315f455e5e0 name:"h"
> user_id: eac850fa-96f4-11e2-9f22-72ad6af0e500 name:"a"
> user_id: f17f9ae8-96f4-11e2-98aa-421151417092 name:"b"
> user_id: f82fccfa-96f4-11e2-8d99-26f8461d074c name:"c"
> user_id: fee21cec-96f4-11e2-945b-f9a2a2e32308 name:"d"
>
> Any idea what's happening?
>
>
> Carlos Pérez Miguel
>
>
> 2013/3/27 Lanny Ripple <la...@spotright.com>
>
>> Ah. TimeUUID.  Not as useful for you then but still something for the
>> toolbox.
>>
>> On Mar 27, 2013, at 8:42 AM, Lanny Ripple <la...@spotright.com> wrote:
>>
>> > A UUID can be created from two Longs.  You could MD5 your strings,
>> > giving you 128 hashed bits, and then make UUIDs out of that.  Using
>> > Scala:
>> >
>> >   import java.nio.ByteBuffer
>> >   import java.security.MessageDigest
>> >   import java.util.UUID
>> >
>> >   val key = "Hello, World!"
>> >
>> >   val md = MessageDigest.getInstance("MD5")
>> >   val dig = md.digest(key.getBytes("UTF-8"))
>> >   val bb = ByteBuffer.wrap(dig)
>> >
>> >   val msb = bb.getLong
>> >   val lsb = bb.getLong
>> >
>> >   val uuid = new UUID(msb, lsb)
>> >
>> >
>> > On Mar 26, 2013, at 3:22 PM, aaron morton <aa...@thelastpickle.com>
>> wrote:
>> >
>> >>> Any idea?
>> >> Not off the top of my head.
>> >>
>> >> Cheers
>> >>
>> >> -----------------
>> >> Aaron Morton
>> >> Freelance Cassandra Consultant
>> >> New Zealand
>> >>
>> >> @aaronmorton
>> >> http://www.thelastpickle.com
>> >>
>> >> On 26/03/2013, at 2:13 AM, Carlos Pérez Miguel <cperez...@gmail.com>
>> wrote:
>> >>
>> >>> Yes it does. Thank you Aaron.
>> >>>
>> >>> Now I realize that the system keyspace uses strings as keys, like
>> >>> "Ring" or "ClusterName", and I don't know how to convert these types
>> >>> of keys into UUIDs. Any idea?
>> >>>
>> >>>
>> >>> Carlos Pérez Miguel
>> >>>
>> >>>
>> >>> 2013/3/25 aaron morton <aa...@thelastpickle.com>
>> >>> The best thing to do is start with a look at ByteOrderedPartitioner
>> >>> and AbstractByteOrderedPartitioner.
>> >>>
>> >>> You'll want to create a new TimeUUIDToken that extends Token<UUID> and
>> >>> a new UUIDPartitioner that extends AbstractPartitioner<>
>> >>>
>> >>> Usual disclaimer that ordered partitioners cause problems with load
>> >>> balancing.
>> >>>
>> >>> Hope that helps.
>> >>>
>> >>> -----------------
>> >>> Aaron Morton
>> >>> Freelance Cassandra Consultant
>> >>> New Zealand
>> >>>
>> >>> @aaronmorton
>> >>> http://www.thelastpickle.com
>> >>>
>> >>> On 25/03/2013, at 1:12 AM, Carlos Pérez Miguel <cperez...@gmail.com>
>> wrote:
>> >>>
>> >>>> Hi,
>> >>>>
>> >>>> In my system I store rows whose key is a version 1 UUID (TimeUUID).
>> >>>> I would like to keep the rows ordered by time. I know that in this
>> >>>> case it is recommended to use an external CF whose column names are
>> >>>> UUIDs ordered by time, but in my use case this is not possible, so I
>> >>>> would like to use a custom Partitioner instead. If I use
>> >>>> ByteOrderedPartitioner, rows are not correctly ordered because of the
>> >>>> way a UUID stores the timestamp. What is needed to implement my own
>> >>>> Partitioner?
>> >>>>
>> >>>> Thank you.
>> >>>>
>> >>>> Carlos Pérez Miguel
>> >>>
>> >>>
>> >>
>> >
>>
>>
>
>
