On 8/6/10 4:50 PM, Thomas Heller wrote:
Thanks for the suggestion.
I somewhat understand all that; the point where my head begins to explode
is when I want to figure out something like
Continuing with your example: "Over the last X amount of days give me all
the logs for remote_addr:XXX".
I'm guessing I would need to create a separate index ColumnFamily???
Depending on your needs you can either insert them directly or pull
them out later in some map/reduce fashion. What you want is another
ColumnFamily with a similar structure.
ColumnFamily Standard "LogByRemoteAddrAndDate" CompareWith: TimeUUID
Row: "127.0.0.1:20100806" Column TimeUUID/JSON as usual. If you want
to "link" to the actual log record (to avoid writing it multiple
times) just insert the same timeuuid you inserted into the other CF
and leave the value empty. So you have your "Index", aka list of
column names, and you can look up the actual values using get_slice
with column_names.
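
To make the layout above concrete, here is a minimal in-memory sketch in
Python. It is not a Cassandra client: the two dicts only stand in for the
column families, and the names "logs", "insert_log" and "logs_for" are made
up for illustration. It also assumes the main log CF is keyed by day, which
may not match your actual schema.

```python
import uuid

# Hypothetical stand-ins for the two column families: row key -> {column name: value}.
logs = {}                 # main CF, assumed here to be keyed by day ("20100806")
log_by_addr_date = {}     # "LogByRemoteAddrAndDate", keyed by "addr:day"


def index_row_key(remote_addr, day):
    """Concatenate address and day into the index row key."""
    return f"{remote_addr}:{day}"


def insert_log(day, remote_addr, payload):
    """Write the record once, then 'link' to it from the index row."""
    column_name = uuid.uuid1()  # time-based UUID, playing the role of the TimeUUID
    logs.setdefault(day, {})[column_name] = payload
    # Same column name in the index CF, with an empty value.
    log_by_addr_date.setdefault(index_row_key(remote_addr, day), {})[column_name] = ""
    return column_name


def logs_for(remote_addr, day):
    """Read the index row, then fetch those columns from the main CF
    (the in-memory analogue of get_slice with column_names)."""
    column_names = log_by_addr_date.get(index_row_key(remote_addr, day), {})
    day_row = logs.get(day, {})
    return [day_row[name] for name in column_names]
```

Calling insert_log("20100806", "127.0.0.1", '{"path": "/a"}') and then
logs_for("127.0.0.1", "20100806") returns just the records for that address
on that day, without scanning the rest of the day's row.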
Confusing at first, but really quite simple once you get used to the
idea. Just a lot more work than letting SQL do it for you. ;)
HTH,
/thomas
Ok, I think the part I was missing was the concatenation of the key and
partition to do the look ups. Is this the preferred way of accomplishing
needs such as this? Are there alternative ways?
How would one then "query" over multiple days? Same question for all
days. Should I use range_slice or multiget_slice? And if it's range_slice
does that mean I need OrderPreservingPartitioner?
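
For the "last X days" case, one possible approach (a sketch, not
necessarily the preferred one) is to enumerate the index row keys up front,
one per day, and pass that list as the keys to multiget_slice. Since the
keys are listed explicitly, this does not rely on rows being stored in key
order. The helper name "index_row_keys" is made up for illustration, and it
assumes the "addr:YYYYMMDD" key format from the example above.

```python
from datetime import date, timedelta


def index_row_keys(remote_addr, last_day, days):
    """Build one "addr:YYYYMMDD" row key per day, newest first,
    covering `days` days ending at `last_day` (inclusive)."""
    return [
        f"{remote_addr}:{(last_day - timedelta(days=offset)).strftime('%Y%m%d')}"
        for offset in range(days)
    ]


if __name__ == "__main__":
    for key in index_row_keys("127.0.0.1", date(2010, 8, 6), 3):
        print(key)
```

"All days" has no such bounded key list, so it would need either a known
start date to enumerate from or some other row that tracks which days exist.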