Re: smallest/largest UUIDs for LexicalUUIDType

Sylvain Lebresne Thu, 06 Jun 2013 03:09:55 -0700

> I'm trying to use composite column names to organize 10**8 records.  Each
> record has a unique pair of UUIDs.  The first UUID is often repeated, so I
> want to use column_start and column_finish to find all the records that
> have a given UUID as the first UUID in the pair.
>
> I thought a simple way to get *all* of the columns would be to use
>
>  start  = uuid.UUID(int=0)        -> 00000000-0000-0000-0000-000000000000
>  finish = uuid.UUID(int=2**128-1) -> ffffffff-ffff-ffff-ffff-ffffffffffff
>
> But strangely, this fails to find *any* of the columns, and it requires
> that column_reversed=True -- otherwise it raises an error about range
> finish not coming after start.  If I use ints that are much larger/smaller
> than these extremes, then reversed is not required!
>
> Can anyone explain why LexicalUUIDType() does not treat these extremal
> UUIDs like other UUIDs?
>


LexicalUUIDType compares UUIDs using the Java UUID.compareTo method (
http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#compareTo(java.util.UUID)
).

As it happens, this method treats a UUID as 2 longs (the most significant
64 bits, then the least significant 64 bits) and, when comparing 2 UUIDs,
compares those longs lexicographically. But Java longs are signed. So for
that method,

  00000000-0000-0000-0000-000000000000 > ffffffff-ffff-ffff-ffff-ffffffffffff

because the first long of the second UUID is negative, whereas, for instance,

  00000000-0000-0000-0000-000000000000 < 7fffffff-ffff-ffff-ffff-ffffffffffff

because the first long of that second UUID is now positive.
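
To make the signed comparison concrete, here is a minimal Python sketch that mimics the ordering described above. It is not Cassandra's or Java's code; the helper names to_signed64 and java_style_key are made up for illustration, and the "smallest" and "largest" values at the end are simply what follows from comparing two signed 64-bit halves.

```python
import uuid


def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement long."""
    return value - (1 << 64) if value >= (1 << 63) else value


def java_style_key(u):
    """Hypothetical sort key: (most significant long, least significant long), both signed."""
    msb = u.int >> 64
    lsb = u.int & ((1 << 64) - 1)
    return (to_signed64(msb), to_signed64(lsb))


zero = uuid.UUID(int=0)
ones = uuid.UUID(int=2**128 - 1)
half = uuid.UUID("7fffffff-ffff-ffff-ffff-ffffffffffff")

# All-zeros sorts AFTER all-ones, because both halves of all-ones are -1.
print(java_style_key(zero) > java_style_key(ones))  # True
# All-zeros sorts BEFORE 7fff..., whose first half is a large positive long.
print(java_style_key(zero) < java_style_key(half))  # True

# Under this ordering the extremes have both halves at Long.MIN_VALUE / Long.MAX_VALUE.
smallest = uuid.UUID("80000000-0000-0000-8000-000000000000")
largest = uuid.UUID("7fffffff-ffff-ffff-7fff-ffffffffffff")
```

If the sketch matches the real comparator, a range meant to cover every LexicalUUIDType value would start at smallest and finish at largest rather than at the all-zeros and all-ones UUIDs.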

That's a historical accident: LexicalUUIDType should probably not have used
that comparison, as it's arguably not very intuitive. However, it's too late
to change it (changing it now would basically corrupt all data for
people using LexicalUUIDType today).

I'll note that if you have the choice, you can use UUIDType rather than
LexicalUUIDType. UUIDType fixes that behavior and uses a proper lexical
comparison for non-type-1 UUIDs (the other behavior of UUIDType is that for
type 1 UUIDs, it compares them by time first, i.e. it is equivalent to
TimeUUIDType for type 1 UUIDs).
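
For contrast, an unsigned byte-wise comparison orders the same values the way one would expect. The sketch below only illustrates that idea for non-type-1 UUIDs; it does not reproduce everything UUIDType does (in particular the time-based handling of type 1 UUIDs), and unsigned_lexical_key is a hypothetical helper name.

```python
import uuid


def unsigned_lexical_key(u):
    """Hypothetical sort key: the 16 raw bytes, which Python orders as unsigned values."""
    return u.bytes


candidates = [
    uuid.UUID(int=2**128 - 1),
    uuid.UUID("7fffffff-ffff-ffff-ffff-ffffffffffff"),
    uuid.UUID(int=0),
]

# Sorting by raw bytes puts all-zeros first and all-ones last.
ordered = sorted(candidates, key=unsigned_lexical_key)
for u in ordered:
    print(u)
```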

--
Sylvain
