Same answer as on the other thread about how to index:

http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/
http://www.slideshare.net/benjaminblack/cassandra-basics-indexing
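
For what it's worth, here is a rough sketch of the index ColumnFamily idea from the quoted message below, modelled with plain Python dicts rather than a real client. The names `logs`, `log_index`, `index_row_key`, `insert_log` and `logs_for` are made up for illustration, and I am assuming the main log CF is keyed by day, which the thread does not actually say.

```python
import uuid
from datetime import datetime

# In-memory stand-ins for two column families (hypothetical, not a client API).
logs = {}       # assumed layout: "YYYYMMDD" -> {TimeUUID: JSON string}
log_index = {}  # "addr:YYYYMMDD" -> {TimeUUID: ""}  (the "index" CF)

def index_row_key(remote_addr, when):
    """Concatenate address and day, e.g. '127.0.0.1:20100806'."""
    return "%s:%s" % (remote_addr, when.strftime("%Y%m%d"))

def insert_log(remote_addr, when, payload):
    """Write the record once, then add an empty-valued index column for it."""
    # uuid1() stands in for a TimeUUID; a real one would encode `when`.
    col = uuid.uuid1()
    day = when.strftime("%Y%m%d")
    logs.setdefault(day, {})[col] = payload
    log_index.setdefault(index_row_key(remote_addr, when), {})[col] = ""
    return col

def logs_for(remote_addr, when):
    """Read the index row's column names, then look up the real values."""
    day = when.strftime("%Y%m%d")
    names = log_index.get(index_row_key(remote_addr, when), {})
    return [logs[day][name] for name in names]
```

The second step in `logs_for` is the part that would map to a get_slice with column_names against the main CF.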

On Fri, Aug 6, 2010 at 6:18 PM, Mark <static.void....@gmail.com> wrote:
> On 8/6/10 4:50 PM, Thomas Heller wrote:
>>>
>>> Thanks for the suggestion.
>>>
>>> I somewhat understand all that; the point where my head begins to explode
>>> is when I want to figure out something like
>>>
>>> Continuing with your example: "Over the last X amount of days give me all
>>> the logs for remote_addr:XXX".
>>> I'm guessing I would need to create a separate index ColumnFamily???
>>>
>>>
>>
>> Depending on your needs you can either insert them directly or pull
>> them out later in some map/reduce fashion. What you want is another
>> ColumnFamily with a similar structure.
>>
>> ColumnFamily Standard "LogByRemoteAddrAndDate" CompareWith: TimeUUID
>>
>> Row: "127.0.0.1:20100806" Column TimeUUID/JSON as usual. If you want
>> to "link" to the actual log record (to avoid writing it multiple
>> times) just insert the same timeuuid you inserted into the other CF
>> and leave the value empty. So you have your "Index", aka list of
>> column names, and you can look up the actual values using get_slice
>> with column_names.
>>
>> Confusing at first, but really quite simple once you get used to the
>> idea. Just a lot more work than letting SQL do it for you. ;)
>>
>> HTH,
>> /thomas
>>
>
> Ok, I think the part I was missing was the concatenation of the key and
> partition to do the look ups. Is this the preferred way of accomplishing
> needs such as this? Are there alternative ways?
>
> How would one then "query" over multiple days? Same question for all days.
> Should I use range_slice or multiget_slice? And if it's range_slice, does that
> mean I need OrderPreservingPartitioner?
>
>
>
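
On the multiple-days question, one option (a sketch, not the only way) is to compute the index row key for each day in the window on the client and pass that list to multiget_slice. Because the keys are listed explicitly rather than scanned as a range, this approach should not depend on which partitioner is in use. The helper name `index_row_keys` is hypothetical; the key format follows the "127.0.0.1:20100806" example quoted above.

```python
from datetime import date, timedelta

def index_row_keys(remote_addr, end_day, days):
    """Return one 'addr:YYYYMMDD' row key per day, oldest first, ending at end_day."""
    return [
        "%s:%s" % (remote_addr, (end_day - timedelta(days=n)).strftime("%Y%m%d"))
        for n in range(days - 1, -1, -1)
    ]

# Example: the three row keys covering 2010-08-04 through 2010-08-06.
keys = index_row_keys("127.0.0.1", date(2010, 8, 6), 3)
```

The resulting list would be the keys argument of a single multiget_slice call. This does not cover the "all days" case, since there the keys cannot be enumerated from a date window.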
