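You can use the output of describe_ring along with partitioner information to determine which nodes a given row key lives on: the partitioner turns the key into a token, and describe_ring lists each token range together with the endpoints that hold it.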
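
As a rough sketch of that lookup, assuming RandomPartitioner (token = absolute value of the MD5 digest read as a signed big integer) and assuming the describe_ring result has already been copied into plain (start_token, end_token, endpoints) tuples. The function names and the tuple layout here are made up for illustration and are not part of any client library:

```python
import hashlib
from collections import defaultdict


def random_partitioner_token(key: bytes) -> int:
    """Token for a row key under RandomPartitioner (assumed partitioner):
    the MD5 digest read as a signed big-endian integer, absolute value."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def in_range(token: int, start: int, end: int) -> bool:
    """True if token falls in the ring range (start, end].
    A range with start >= end wraps around the end of the ring."""
    if start < end:
        return start < token <= end
    return token > start or token <= end


def replicas_for_key(key: bytes, ring):
    """Return the endpoints of the token range that owns `key`.
    `ring` is a list of (start_token, end_token, endpoints) tuples, a
    hypothetical flattening of what describe_ring returns (tokens as strings)."""
    token = random_partitioner_token(key)
    for start, end, endpoints in ring:
        if in_range(token, int(start), int(end)):
            return endpoints
    raise LookupError("no token range covers key %r" % (key,))


def group_keys_by_first_replica(keys, ring):
    """Bucket row keys by the first endpoint listed for their range, so that
    each bucket can be sent as one multiget_slice to that node."""
    groups = defaultdict(list)
    for key in keys:
        groups[replicas_for_key(key, ring)[0]].append(key)
    return dict(groups)


if __name__ == "__main__":
    # Made-up two-node ring, for illustration only.
    ring = [
        ("0", str(2 ** 126), ["10.0.0.2", "10.0.0.1"]),
        (str(2 ** 126), "0", ["10.0.0.1", "10.0.0.2"]),
    ]
    print(group_keys_by_first_replica([b"1", b"2", b"3", b"4"], ring))
```

With the keys grouped this way, you could open a connection to each node and send it only the keys it holds. For any other partitioner the token function would need to be swapped out, and the keys must be the same bytes that were used when the rows were written.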

On Fri, Mar 29, 2013 at 12:33 PM, Alicia Leong <lccali...@gmail.com> wrote:

> Hi All
>
> I’m thinking to do in this way.
>
> 1) get_slice ( YYYYMMDDHH ) from Index Table.
>
> 2) With the returned list of ROWKEYs
>
> 3) Pass it to multiget_slice ( keys …)
>
> But my questions is how to ensure ‘Data Locality’ ??
>
>
> On Tue, Mar 19, 2013 at 3:33 PM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> I would be looking at Hive or Pig, rather than writing the MapReduce.
>>
>> There is an example in the source cassandra distribution, or you can look
>> at Data Stax Enterprise to start playing with Hive.
>>
>> Typically with hadoop queries you want to query a lot of data, if you are
>> only querying a few rows consider writing the code in your favourite
>> language.
>>
>> Cheers
>>
>> -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 18/03/2013, at 1:29 PM, Alicia Leong <lccali...@gmail.com> wrote:
>>
>> Hi All
>>
>> I have 2 tables
>>
>> Data Table
>> -----------------
>> RowKey: 1
>> => (column=name, value=apple)
>> RowKey: 2
>> => (column=name, value=orange)
>> RowKey: 3
>> => (column=name, value=banana)
>> RowKey: 4
>> => (column=name, value=mango)
>>
>>
>> Index Table (YYYYMMDDHH)
>> ------------------------------------------------
>> RowKey: 2013030114
>> => (column=1, value=)
>> => (column=2, value=)
>> => (column=3, value=)
>> RowKey: 2013030115
>> => (column=4, value=)
>>
>>
>> I would like to know, how to implement below in MapReduce
>> 1) first query the Index Table by RowKey: 2013030114
>> 2) then pass the Index Table column names (1,2,3) to query the Data
>> Table
>>
>> Thanks in advance.