So I can have one PagedIndex CF that holds a row for each data file I am
processing.
That row (in my example) would have X columns, and I can make those
columns' values 100 strings that represent keys in another PagedData
CF.
This other PagedData CF for each row would have 10,000
If you need to parallelize (and scale) you need to distribute across
multiple rows. One Big Row means all your 100 workers are hammering
the same 3 (for instance) replicas at the same time.
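
A rough sketch of that layout, using plain Python dicts in place of the two CFs (no Cassandra client involved). The key format and the function names here are made up for illustration; the 1,000,000 / 10,000 / 100 numbers are the ones from this thread.

```python
def page_key(file_id, page):
    # Hypothetical row-key format for one row in the PagedData CF.
    return f"{file_id}:page:{page:04d}"


def build_pages(file_id, columns, page_size):
    """Split (name, value) columns into PagedData rows plus one PagedIndex row."""
    paged_data = {}
    for start in range(0, len(columns), page_size):
        key = page_key(file_id, start // page_size)
        # Each PagedData row holds at most page_size columns.
        paged_data[key] = dict(columns[start:start + page_size])
    # The index row for this file lists the PagedData row keys, in page order.
    paged_index = {file_id: list(paged_data)}
    return paged_index, paged_data


# 1,000,000 columns split into pages of 10,000 gives 100 PagedData rows.
columns = [(f"col{i:07d}", i) for i in range(1_000_000)]
paged_index, paged_data = build_pages("datafile-1", columns, 10_000)
```

Each worker would then take one key from the index row and read only that PagedData row, so the reads are spread over many row keys instead of all landing on the replicas of one big row (assuming the keys hash to different nodes).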
On Sun, Jun 5, 2011 at 1:43 PM, Joseph Stein wrote:
What are the best practices here for paging and slicing columns from a row?
So let's say I have 1,000,000 columns in a row.
I read the row but want to have 1 thread read columns 0 - , second
thread (actor in my case) 1 - 1 ... and so on so I can have 100
workers processing 10,000 columns for