> I don't get what you're saying. If you want to loop over your entire range
> of keys, you can do it with a range query, and start and finish will both be
> "". Is there any scenario where you would want to do a range query where
> start and/or finish do not equal "", if you use random partitioning?

If you have 1 million rows and each of these rows is ~1kB (and you request
the rows in full), I guarantee you that your range query with start="" and
finish="" will not work.

More generally, in any non-toy cluster, a range query with start="" and
finish="" and a count large enough to retrieve all the keys will fail (it
will time out). To loop over your entire range of keys in any such cluster,
you start with a range query with start="" and finish="" but with a
reasonable value for count. Then you do the next range query with a start
equal to the last key retrieved by the previous range query, and so on,
until you have seen all the keys.
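
The paging loop described above can be sketched as follows. This is a
minimal, self-contained Java simulation, not Cassandra client code:
RangePagingSketch, rangeSlice and scanAll are hypothetical names, the store
is an in-memory map, and an MD5 hash of the key is assumed to stand in for
the random partitioner's token. Because the start key of each query is
inclusive, every page after the first repeats the last key of the previous
page, so the loop drops that first element (as suggested further down the
thread).

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a column family under a random
// partitioner: rows are ordered by a hash of the key, not by the key itself.
class RangePagingSketch {
    private final TreeMap<BigInteger, String> keysByToken = new TreeMap<>();

    // Assumption for this sketch: the token is the MD5 digest of the key.
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md5.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    void insert(String key) {
        keysByToken.put(token(key), key);
    }

    // Hypothetical range query with finish="": returns up to `count` keys in
    // token order, beginning at `start` (inclusive). An empty start means
    // "from the beginning".
    List<String> rangeSlice(String start, int count) {
        Iterable<String> tail = start.isEmpty()
                ? keysByToken.values()
                : keysByToken.tailMap(token(start), true).values();
        List<String> page = new ArrayList<>();
        for (String key : tail) {
            if (page.size() == count) break;
            page.add(key);
        }
        return page;
    }

    // Visits every key once by issuing bounded range queries, each one
    // starting at the last key returned by the previous query.
    List<String> scanAll(int pageSize) {
        if (pageSize < 2) {
            throw new IllegalArgumentException("pageSize must be at least 2");
        }
        List<String> seen = new ArrayList<>();
        String start = "";
        while (true) {
            List<String> page = rangeSlice(start, pageSize);
            // After the first page, element 0 is the start key we already saw.
            int from = start.isEmpty() ? 0 : 1;
            if (page.size() <= from) break;          // nothing new: done
            seen.addAll(page.subList(from, page.size()));
            if (page.size() < pageSize) break;       // short page: end reached
            start = page.get(page.size() - 1);
        }
        return seen;
    }

    public static void main(String[] args) {
        RangePagingSketch store = new RangePagingSketch();
        for (int i = 0; i < 25; i++) store.insert("CATEGORY." + i);
        System.out.println(store.scanAll(10).size() + " keys seen");
    }
}
```

In this sketch the keys come back in token order, so "CATEGORY.1" and
"CATEGORY.2" are generally not adjacent. Selecting a key prefix under a
random partitioner therefore means scanning everything and filtering on the
client side.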

--
Sylvain

>
> 2010/6/9 Philip Stanhope <pstanh...@wimba.com>
>>
>> I feel that there is a significant bit of confusion here.
>> You CAN use start/finish when using get_range_slices with random
>> partitioner. But you can't make any assumptions about what key will be next
>> in the range which is the whole point of "random". If you do know a specific
>> key that you care about, you can use that as a start, but again, you don't
>> know what will come next.
>> If you have a CF with 1M keys ... you can effectively do a full row scan
>> ... it is expensive and you'd have to ask yourself why you'd be wanting to
>> do this in the first place.
>> Ordering of columns within a particular key depends entirely on the
>> CompareWith choice you made when you defined the column family. For example,
>> you can make assumptions about the sequencing of columns returned from
>> get_slice (NOT get_range_slices).
>> -phil
>> On Jun 9, 2010, at 7:29 AM, David Boxenhorn wrote:
>>
>> To use start and finish parameters at all, you need to use OPP. Start and
>> finish parameters don't work if you don't use OPP, i.e. the result set won't
>> be: start <= resultSet < finish
>>
>> 2010/6/9 Ben Browning <ben...@gmail.com>
>>>
>>> OPP stands for Order-Preserving Partitioner. For more information on
>>> partitioners, look here:
>>>
>>> http://wiki.apache.org/cassandra/StorageConfiguration#Partitioner
>>>
>>> To do key range slices that use both start and finish parameters and
>>> retrieve keys in-order, you need to use an ordered partitioner -
>>> either the built-in OPP or your own custom one.
>>>
>>> Ben
>>>
>>> On Tue, Jun 8, 2010 at 10:26 PM, sina <ywf2...@sina.com> wrote:
>>> > What does OPP mean? And how can I make the "start" and "finish"
>>> > parameters useful and meaningful?
>>> >
>>> >
>>> > 2010-06-09
>>> > ________________________________
>>> > 9527
>>> > ________________________________
>>> > From: Ben Browning
>>> > Sent: 2010-06-02  21:08:57
>>> > To: user
>>> > Cc:
>>> > Subject: Re: Range search on keys not working?
>>> > They exist because when using OPP they are useful and make sense.
>>> > On Wed, Jun 2, 2010 at 8:59 AM, David Boxenhorn <da...@lookin2.com>
>>> > wrote:
>>> >> So why do the "start" and "finish" range parameters exist?
>>> >>
>>> >> On Wed, Jun 2, 2010 at 3:53 PM, Ben Browning <ben...@gmail.com> wrote:
>>> >>>
>>> >>> Martin,
>>> >>>
>>> >>> On Wed, Jun 2, 2010 at 8:34 AM, Dr. Martin Grabmüller
>>> >>> <martin.grabmuel...@eleven.de> wrote:
>>> >>> > I think you can specify an end key, but it should be a key which
>>> >>> > does exist in your column family.
>>> >>>
>>> >>>
>>> >>> Logically, it doesn't make sense to ever specify an end key with
>>> >>> random partitioner. If you specified a start key of "aaa" and an end
>>> >>> key of "aac" you might get back as results "aaa", "zfc", "hik", etc.
>>> >>> And, even if you have a key of "aab" it might not show up. Key ranges
>>> >>> only make sense with order-preserving partitioner. The only time to
>>> >>> ever use a key range with random partitioner is when you want to
>>> >>> iterate over all keys in the CF.
>>> >>>
>>> >>> Ben
>>> >>>
>>> >>>
>>> >>> > But maybe I'm off the track here and someone else here knows more
>>> >>> > about this key range stuff.
>>> >>> >
>>> >>> > Martin
>>> >>> >
>>> >>> > ________________________________
>>> >>> > From: David Boxenhorn [mailto:da...@lookin2.com]
>>> >>> > Sent: Wednesday, June 02, 2010 2:30 PM
>>> >>> > To: user@cassandra.apache.org
>>> >>> > Subject: Re: Range search on keys not working?
>>> >>> >
>>> >>> > In other words, I should check the values as I iterate, and stop
>>> >>> > iterating when I get out of range?
>>> >>> >
>>> >>> > I'll try that!
>>> >>> >
>>> >>> > On Wed, Jun 2, 2010 at 3:15 PM, Dr. Martin Grabmüller
>>> >>> > <martin.grabmuel...@eleven.de> wrote:
>>> >>> >>
>>> >>> >> When not using OPP, you should not use something like 'CATEGORY/'
>>> >>> >> as the end key. Use the empty string as the end key and limit the
>>> >>> >> number of returned keys, as you did with the 'max' value.
>>> >>> >>
>>> >>> >> If I understand correctly, the end key is used to generate an end
>>> >>> >> token by hashing it, and the ordering between 'CATEGORY' and
>>> >>> >> 'CATEGORY/' does not carry over to hash('CATEGORY') and
>>> >>> >> hash('CATEGORY/').
>>> >>> >>
>>> >>> >> At least, this was the explanation I gave myself when I had the
>>> >>> >> same problem.
>>> >>> >>
>>> >>> >> The solution is to iterate through the keys by always using the
>>> >>> >> last key returned as the start key for the next call to
>>> >>> >> get_range_slices, and then to drop the first element from the
>>> >>> >> result.
>>> >>> >>
>>> >>> >> HTH,
>>> >>> >>   Martin
>>> >>> >>
>>> >>> >> ________________________________
>>> >>> >> From: David Boxenhorn [mailto:da...@lookin2.com]
>>> >>> >> Sent: Wednesday, June 02, 2010 2:01 PM
>>> >>> >> To: user@cassandra.apache.org
>>> >>> >> Subject: Re: Range search on keys not working?
>>> >>> >>
>>> >>> >> The previous thread where we discussed this is called, "key is
>>> >>> >> sorted?"
>>> >>> >>
>>> >>> >>
>>> >>> >> On Wed, Jun 2, 2010 at 2:56 PM, David Boxenhorn
>>> >>> >> <da...@lookin2.com>
>>> >>> >> wrote:
>>> >>> >>>
>>> >>> >>> I'm not using OPP. But I was assured on earlier threads (I asked
>>> >>> >>> several times to be sure) that it would work as stated below: the
>>> >>> >>> results would not be ordered, but they would be correct.
>>> >>> >>>
>>> >>> >>> On Wed, Jun 2, 2010 at 2:51 PM, Torsten Curdt <tcu...@vafer.org>
>>> >>> >>> wrote:
>>> >>> >>>>
>>> >>> >>>> Sounds like you are not using an order preserving partitioner?
>>> >>> >>>>
>>> >>> >>>> On Wed, Jun 2, 2010 at 13:48, David Boxenhorn
>>> >>> >>>> <da...@lookin2.com>
>>> >>> >>>> wrote:
>>> >>> >>>> > Range search on keys is not working for me. I was assured in
>>> >>> >>>> > earlier threads that range search would work, but the results
>>> >>> >>>> > would not be ordered.
>>> >>> >>>> >
>>> >>> >>>> > I'm trying to get all the rows that start with "CATEGORY."
>>> >>> >>>> >
>>> >>> >>>> > I'm doing:
>>> >>> >>>> >
>>> >>> >>>> > String start = "CATEGORY.";
>>> >>> >>>> > .
>>> >>> >>>> > .
>>> >>> >>>> > .
>>> >>> >>>> > keyspace.getSuperRangeSlice(columnParent, slicePredicate,
>>> >>> >>>> >     start, "CATEGORY/", max)
>>> >>> >>>> > .
>>> >>> >>>> > .
>>> >>> >>>> > .
>>> >>> >>>> >
>>> >>> >>>> > in a loop, setting start to the last key each time - but I'm
>>> >>> >>>> > getting rows that don't start with "CATEGORY."!!
>>> >>> >>>> >
>>> >>> >>>> > How do I get all rows that start with "CATEGORY."?
>>> >>> >>>
>>> >>> >>
>>> >>> >
>>> >>> >
>>> >>
>>> >>
>>
>>
>
>
