I'm trying to better understand how using the RandomPartitioner will affect my ability to select ranges of keys. Consider a simple example where we have many online games across different game genres (GameType). Each game needs to store data for each of its users. With that in mind, consider the following data model:
    enum GameType {'RPG', 'FPS', 'ARCADE'}

    {
      "GameData": {                        // Super column family
        GameType+"1234": {                 // Row (game type concatenated with a game id, for example)
          "user-data:5678": {              // Super column (user data)
            "user_prop_name": "value",     // Subcolumn (arbitrary user properties and values)
            "another_prop_name": "value",
            ...
          },
          "user-data:9012": {
            "user_prop_name": "value",
            ...
          }
        },
        GameType+"3456": {...},
        GameType+"7890": {...},
        ...
      }
    }

Assume we have a multi-node cluster running Cassandra 0.6.1. In that scenario, could someone help me understand what the result would be in the following cases?

1. We use a range slice to grab the keys for all 'RPG' games (a range slice at the ROW level). Would we be able to get all games back in a single query, or would that not be guaranteed?
2. For a given game, we use a range slice to grab all user-data keys whose ID starts with '5' (a range slice at the COLUMN level). Again, would we get all keys in one call (assuming the number of keys in the result is not an issue)?
3. Finally, for a given game and a given user, we do a range slice to grab all user properties that start with 'a' (a range slice at the SUBCOLUMN level of a SUPERCOLUMN). Is that possible in one call?

I'm trying to understand at what level the RandomPartitioner affects my example data model. Does it apply at a fixed level, such as just ROWS (so that a row's sub-data stays together on the same node), or is data at every level *randomized* across all nodes? Are there any tricks to doing these sorts of range slices using RP? For example, if I set my consistency level to ALL when doing a range slice, would that effectively compile a complete result set for me?

Thanks for the help!

larry
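P.S. In case it helps to make the three cases concrete, here is roughly what I have in mind, written against the raw 0.6 Thrift interface. Treat it as a sketch only: the keyspace name ("GameKS"), the host/port, the row key "RPG1234", and the slice bounds are placeholders, and I may have some of the generated class or method names slightly off.

```java
import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.KeyRange;
import org.apache.cassandra.thrift.KeySlice;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;

public class RangeSliceSketch {
    public static void main(String[] args) throws Exception {
        TSocket socket = new TSocket("localhost", 9160);          // placeholder host/port
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(socket));
        socket.open();
        String keyspace = "GameKS";                               // placeholder keyspace

        // Case 1: ROW-level range slice -- all row keys from "RPG" up to "RPG~".
        ColumnParent gameData = new ColumnParent();
        gameData.setColumn_family("GameData");

        // We only care about the row keys here, so ask for at most one column per row.
        SliceRange oneColumn = new SliceRange();
        oneColumn.setStart(new byte[0]);
        oneColumn.setFinish(new byte[0]);
        oneColumn.setReversed(false);
        oneColumn.setCount(1);
        SlicePredicate keysOnly = new SlicePredicate();
        keysOnly.setSlice_range(oneColumn);

        KeyRange rpgRange = new KeyRange();
        rpgRange.setStart_key("RPG");
        rpgRange.setEnd_key("RPG~");    // key-order bounds like this only make sense if keys are stored in order
        rpgRange.setCount(1000);

        List<KeySlice> rpgGames = client.get_range_slices(
                keyspace, gameData, keysOnly, rpgRange, ConsistencyLevel.ONE);

        // Case 2: COLUMN-level slice within one row -- super columns "user-data:5" .. "user-data:6".
        SliceRange idsStartingWith5 = new SliceRange();
        idsStartingWith5.setStart("user-data:5".getBytes("UTF-8"));
        idsStartingWith5.setFinish("user-data:6".getBytes("UTF-8"));
        idsStartingWith5.setReversed(false);
        idsStartingWith5.setCount(1000);
        SlicePredicate userPredicate = new SlicePredicate();
        userPredicate.setSlice_range(idsStartingWith5);

        List<ColumnOrSuperColumn> users = client.get_slice(
                keyspace, "RPG1234", gameData, userPredicate, ConsistencyLevel.ONE);

        // Case 3: SUBCOLUMN-level slice -- properties "a" .. "b" inside one super column.
        ColumnParent oneUser = new ColumnParent();
        oneUser.setColumn_family("GameData");
        oneUser.setSuper_column("user-data:5678".getBytes("UTF-8"));

        SliceRange propsStartingWithA = new SliceRange();
        propsStartingWithA.setStart("a".getBytes("UTF-8"));
        propsStartingWithA.setFinish("b".getBytes("UTF-8"));
        propsStartingWithA.setReversed(false);
        propsStartingWithA.setCount(1000);
        SlicePredicate propPredicate = new SlicePredicate();
        propPredicate.setSlice_range(propsStartingWithA);

        List<ColumnOrSuperColumn> props = client.get_slice(
                keyspace, "RPG1234", oneUser, propPredicate, ConsistencyLevel.ONE);

        socket.close();
    }
}
```

The row-level KeyRange in case 1 is the part I'm unsure about: as far as I can tell it only really means anything if the partitioner keeps keys in order, which is exactly what I'm asking about with RandomPartitioner.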