I'll note that if you have the choice, you can use UUIDType rather than LexicalUUIDType. UUIDType fixes that behavior and uses a proper lexical comparison for non-type-1 UUIDs (the other behavior of UUIDType is that, for type 1 UUIDs, it compares them by time first, i.e. it is equivalent to TimeUUIDType for type 1 UUIDs).
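
To make that ordering concrete, here is a rough pure-Python model of it. It is only a sketch: `uuidtype_sort_key` is a hypothetical helper, not part of pycassa or Cassandra, and it assumes the comparator looks at the UUID version field first, then the timestamp for type 1 UUIDs, then the raw bytes.

```python
import uuid

def uuidtype_sort_key(u):
    """Hypothetical model of the ordering described above (an assumption,
    not the actual Cassandra code): version, then time for type 1, then bytes."""
    version = (u.int >> 76) & 0xF              # high nibble of byte 6
    timestamp = u.time if version == 1 else 0  # only type 1 UUIDs sort by time
    return (version, timestamp, u.bytes)

# Two hand-built type 1 UUIDs: `a` is earlier in time but later lexically,
# because the low 32 bits of the timestamp come first in the byte layout.
a = uuid.UUID(fields=(0xffffffff, 0x0000, 0x1000, 0x80, 0x00, 0))
b = uuid.UUID(fields=(0x00000000, 0x0001, 0x1000, 0x80, 0x00, 0))
# A type 4 UUID whose bytes are lexically smaller than both of the above.
v4 = uuid.UUID(int=0, version=4)

lexical_order = sorted([a, b, v4], key=lambda u: u.bytes)
modelled_order = sorted([a, b, v4], key=uuidtype_sort_key)

print([str(u) for u in lexical_order])
print([str(u) for u in modelled_order])
```

Under this model the purely lexical order is `v4, b, a`, while the modelled order is `a, b, v4`: the type 1 UUIDs are ordered by time, and the type 4 UUID sorts after them despite its smaller bytes.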

That's super helpful, thanks!

Follow-up question: it seems that range queries on the *second* field of a CompositeType(UUIDType(), UUIDType()) do not work.

What am I missing?

I thought that creating 1000 column names with the *same* UUID in the first field of the CompositeType and a random UUID in the second field would make it so I can get out four subranges of the second UUID by constructing four evenly spaced start/finish keys -- see below.

Instead, it appears that all of the columns end up in the last subrange. See the assert below marked "this fails".

Advice on this much appreciated -- jrf


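    import uuid

    import pycassa
    from pycassa.pool import ConnectionPool
    from pycassa.system_manager import (SystemManager, SIMPLE_STRATEGY,
                                        ASCII_TYPE, BYTES_TYPE)
    from pycassa.types import CompositeType, UUIDType

    ## chosen_server, namespace and config are defined elsewhere in the test
    sm = SystemManager(chosen_server)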
    sm.create_keyspace(namespace, SIMPLE_STRATEGY, {'replication_factor': '1'})

    family = 'test'
    sm.create_column_family(
        namespace, family, super=False,
        key_validation_class = ASCII_TYPE,
        default_validation_class = BYTES_TYPE,
        comparator_type=CompositeType(UUIDType(), UUIDType()),
        )
    pool = ConnectionPool(namespace, config['storage_addresses'],
                          max_retries=1000, pool_timeout=10, pool_size=2,
                          timeout=120)

    cf = pycassa.ColumnFamily(pool, family)
    u1, u2, u3, u4 = uuid.uuid1(), uuid.uuid1(), uuid.uuid1(), uuid.uuid1()

    cf.insert('inbound', {(u1, u2): b''})
    cf.insert('inbound', {(u1, u3): b''})
    cf.insert('inbound', {(u1, u4): b''})

    ## test range searching
    start  = uuid.UUID(int=u3.int - 1)
    finish = uuid.UUID(int=u3.int + 1)
    assert start.int < u3.int < finish.int
    rec3 = cf.get('inbound',
                  column_start =(u1, start),
                  column_finish=(u1, finish)).items()
    assert len(rec3) == 1
    assert rec3[0][0][1] == u3
    ####  This assert above passes!

    ####  This next part fails :-/
    ## now insert many columns -- enough that some should fall in each
    ## subrange below
    for i in xrange(1000):
        cf.insert('inbound', {(u1, uuid.uuid4()): b''})

    ## do four ranges, and expect more than zero in each
    step_size = 2**(128 - 2)

    ## go in reverse order to illustrate that they are all at the end!!!?
    for i in range(2**2, 0, -1):
        start =  uuid.UUID(int=(i-1) * step_size)
        finish = uuid.UUID(int=min(i * step_size, 2**128 - 1))
        recs = cf.get('inbound',
                      column_start =(u1, start),
                      column_finish=(u1, finish)).items()
        for key, val in recs:
            assert val == b''
            assert start < key[1] < finish              ## this fails!!

        assert len(recs) > 0
        print len(recs), ' for ', start, finish

    sm.close()
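
One possible explanation, not confirmed in this thread: if UUIDType compares the version field before it compares the raw bytes, then evenly spaced integers are not evenly spaced in comparator order. The sketch below only inspects the version nibble of the range bounds used in the test above and of some random UUIDs; `version_nibble` is a hypothetical helper, and the compare-by-version-first behavior is an assumption about the comparator.

```python
import uuid

def version_nibble(u):
    """Return the high nibble of byte 6, i.e. the UUID version field."""
    return (u.int >> 76) & 0xF

# The same five boundaries the test builds for its four subranges.
step_size = 2 ** (128 - 2)
bounds = [uuid.UUID(int=min(i * step_size, 2 ** 128 - 1)) for i in range(5)]
bound_versions = [version_nibble(b) for b in bounds]

# Version fields of the kind of column names the test inserts.
random_versions = set(version_nibble(uuid.uuid4()) for _ in range(1000))

print(bound_versions)
print(random_versions)
```

The first four bounds carry version 0 and the last carries version 15, while every `uuid.uuid4()` carries version 4. If the version is compared first, all of the random columns sort after the first four bounds and before the last one, so they would all land in the final subrange, which matches what the failing assert shows. Bounds that keep the version nibble at 4 and vary only the remaining bits might split the columns as intended.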
