> I'm trying to use composite column names to organize 10**8 records. Each
> record has a unique pair of UUIDs. The first UUID is often repeated, so I
> want to use column_start and column_finish to find all the records that
> have a given UUID as the first UUID in the pair.
>
> I thought a simple way to get *all* of the columns would be to use
>
>   start = uuid.UUID(int=0) -> 00000000-0000-0000-0000-000000000000
>   finish = uuid.UUID(int=2**128-1) -> ffffffff-ffff-ffff-ffff-ffffffffffff
>
> But strangely, this fails to find *any* of the columns, and it requires
> that column_reversed=True -- otherwise it raises an error about range
> finish not coming after start. If I use ints that are much larger/smaller
> than these extremes, then reversed is not required!
>
> Can anyone explain why LexicalUUIDType() does not treat these extremal
> UUIDs like other UUIDs?

LexicalUUIDType compares UUIDs using Java's UUID.compareTo method
(http://docs.oracle.com/javase/6/docs/api/java/util/UUID.html#compareTo(java.util.UUID)).
As it happens, this method treats a UUID as two longs, and when comparing
two UUIDs it compares those longs lexicographically: the most significant
long first, then the least significant. But Java longs are signed. So for
that method, 00000000-0000-0000-0000-000000000000 >
ffffffff-ffff-ffff-ffff-ffffffffffff, because the first long of the second
UUID is negative. (By contrast, 00000000-0000-0000-0000-000000000000 <
7fffffff-ffff-ffff-ffff-ffffffffffff, because the first long of that second
UUID is now positive.)

That's a historical accident. LexicalUUIDType should probably not have used
that comparison, as it's arguably not very intuitive. However, it's too late
to change it (changing it now would basically corrupt all data for people
using LexicalUUIDType today).

I'll note that if you have the choice, you can use UUIDType rather than
LexicalUUIDType. UUIDType fixes that behavior and uses a proper lexical
comparison for non-type-1 UUIDs (the other behavior of UUIDType is that for
type 1 UUIDs it compares them by time first, i.e. it is equivalent to
TimeUUIDType for type 1 UUIDs).

-- Sylvain
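
For anyone who wants to see the signed-long comparison in action, here is a minimal Python sketch that emulates the ordering described above. It is not Cassandra or pycassa code: the helper names (to_signed64, signed_halves, java_style_compare) are made up for illustration, and the sketch assumes the comparison is exactly "most significant signed long first, then least significant signed long".

```python
import uuid

def to_signed64(value):
    """Reinterpret an unsigned 64-bit integer as a signed two's-complement one."""
    return value - (1 << 64) if value >= (1 << 63) else value

def signed_halves(u):
    """Split a UUID into (most significant, least significant) signed longs."""
    msb = to_signed64(u.int >> 64)
    lsb = to_signed64(u.int & ((1 << 64) - 1))
    return (msb, lsb)

def java_style_compare(a, b):
    """Return -1, 0 or 1, comparing the signed halves in order (hypothetical helper)."""
    ka, kb = signed_halves(a), signed_halves(b)
    return (ka > kb) - (ka < kb)

zero = uuid.UUID(int=0)
ones = uuid.UUID(int=2**128 - 1)
half = uuid.UUID("7fffffff-ffff-ffff-ffff-ffffffffffff")

# All-zero sorts AFTER all-ones, because 0xffffffffffffffff is -1 as a signed long.
print(java_style_compare(zero, ones))  # 1
# All-zero sorts BEFORE 7fff..., whose first long is positive.
print(java_style_compare(zero, half))  # -1

# Extremes under this emulated ordering: both halves at the signed min / max.
lowest = uuid.UUID(int=((1 << 63) << 64) | (1 << 63))
highest = uuid.UUID(int=(((1 << 63) - 1) << 64) | ((1 << 63) - 1))
print(lowest)   # 80000000-0000-0000-8000-000000000000
print(highest)  # 7fffffff-ffff-ffff-7fff-ffffffffffff
```

If the comparator really does behave this way, the smallest and largest values would be the last two UUIDs printed rather than the all-zero and all-ones ones, but please check that against your own cluster before relying on it.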