On Mon, Sep 12, 2011 at 1:54 PM, Tharindu Mathew <mcclou...@gmail.com> wrote:
> Thanks Brandon for the clarification.
>
> I'd like to support a use case where an index is built in a row in a CF.

If you're just _building_ the row, the current state of things will work just fine. The trouble starts when you need to read it via hadoop.

> So, as a starting point for a query, a known row with a larger number of
> columns will have to be selected. The split to the hadoop nodes should start
> at that level.

The other problem here is if you want 10 nodes to operate on the row and have RF=3, you're losing locality for 7 of the nodes. If the task is heavily CPU-bound this is probably ok, otherwise it may be that only using 3 nodes is better (since they will have a local replica.)

> Is this a common use case?

I'm not entirely sure what it is you want to do yet, but maybe I answered it above.

-Brandon
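
To make the locality arithmetic above concrete, here is a minimal sketch in Python. It is illustrative only: the function name is hypothetical, and it assumes one worker per node and that a single row is stored on exactly RF nodes. It does not call any Cassandra or Hadoop API.

```python
def local_and_remote_workers(workers: int, replication_factor: int) -> tuple[int, int]:
    """Split workers into those that can read a local replica and those that cannot.

    Assumption (not from the thread): one worker per node, and the row is
    stored on exactly `replication_factor` distinct nodes.
    """
    # At most `replication_factor` nodes hold a copy of the row.
    local = min(workers, replication_factor)
    # Every remaining worker has to fetch the row over the network.
    return local, workers - local


local, remote = local_and_remote_workers(workers=10, replication_factor=3)
print(f"{local} local, {remote} remote")  # prints "3 local, 7 remote"
```

With 10 workers and RF=3 this gives 3 local and 7 remote readers, which is the trade-off described above: the extra 7 workers only pay off if the job is CPU-bound enough to outweigh the remote reads.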