Re: MultiInput/MultiGet CF in MapReduce

Edward Capriolo Fri, 29 Mar 2013 16:54:15 -0700

You can combine the output of describe_ring (the token ranges and the
endpoints that own them) with the partitioner (which maps each row key to
a token) to determine which nodes a given row lives on.
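
A rough sketch of that lookup, not tested against a real cluster. It
assumes RandomPartitioner, taking the token to be the absolute value of
the key's MD5 digest read as a signed big-endian integer, and it assumes
each range covers (start, end] with the last range wrapping around the
ring. The tuples stand in for the start_token, end_token and endpoints
fields that describe_ring returns. The ring, the IP addresses and all
function names are made up for illustration.

```python
import hashlib
from collections import defaultdict


def random_partitioner_token(key: bytes) -> int:
    """Token for a row key, ASSUMING RandomPartitioner semantics:
    abs(MD5 digest interpreted as a signed big-endian integer)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, byteorder="big", signed=True))


def endpoints_for_token(token, ring):
    """Return the endpoints owning `token`.

    `ring` is a list of (start_token, end_token, endpoints) tuples,
    each treated as the half-open range (start, end].
    """
    for start, end, endpoints in ring:
        if start < end:
            if start < token <= end:
                return endpoints
        elif token > start or token <= end:
            # start >= end: this range wraps around the end of the ring
            return endpoints
    return []


def group_keys_by_primary_endpoint(keys, ring):
    """Bucket row keys by the first endpoint listed for their range."""
    groups = defaultdict(list)
    for key in keys:
        owners = endpoints_for_token(random_partitioner_token(key), ring)
        if owners:
            groups[owners[0]].append(key)
    return dict(groups)


# Hypothetical three-node ring; real values come from describe_ring.
THIRD = 2**127 // 3
ring = [
    (0, THIRD, ["10.0.0.1"]),
    (THIRD, 2 * THIRD, ["10.0.0.2"]),
    (2 * THIRD, 0, ["10.0.0.3"]),
]

# Row keys as they might come back from the index row.
print(group_keys_by_primary_endpoint([b"1", b"2", b"3"], ring))
```

Each bucket could then be sent as its own multiget_slice to a coordinator
on (or near) the node that owns those keys. With a different partitioner
the token function has to change accordingly.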



On Fri, Mar 29, 2013 at 12:33 PM, Alicia Leong <lccali...@gmail.com> wrote:

> Hi All
>
> I’m thinking of doing it this way:
>
> 1) get_slice ( YYYYMMDDHH )  from Index Table.
>
> 2) With the returned list of ROWKEYs
>
> 3) Pass it to multiget_slice ( keys …)
>
>
>
> But my question is: how do I ensure ‘Data Locality’?
>
>
> On Tue, Mar 19, 2013 at 3:33 PM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> I would be looking at Hive or Pig, rather than writing the MapReduce
>> job by hand.
>>
>> There is an example in the Cassandra source distribution, or you can look
>> at DataStax Enterprise to start playing with Hive.
>>
>> Typically with Hadoop queries you want to query a lot of data. If you are
>> only querying a few rows, consider writing the code in your favourite
>> language.
>>
>> Cheers
>>
>>    -----------------
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>>
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 18/03/2013, at 1:29 PM, Alicia Leong <lccali...@gmail.com> wrote:
>>
>> Hi All
>>
>> I have 2 tables
>>
>> Data Table
>> -----------------
>> RowKey: 1
>> => (column=name, value=apple)
>> RowKey: 2
>> => (column=name, value=orange)
>> RowKey: 3
>> => (column=name, value=banana)
>> RowKey: 4
>> => (column=name, value=mango)
>>
>>
>> Index Table (YYYYMMDDHH)
>> ------------------------------------------------
>> RowKey: 2013030114
>> => (column=1, value=)
>> => (column=2, value=)
>> => (column=3, value=)
>> RowKey: 2013030115
>> => (column=4, value=)
>>
>>
>> I would like to know how to implement the following in MapReduce:
>> 1) first query the Index Table by RowKey: 2013030114
>> 2) then pass the Index Table column names (1,2,3) to query the Data
>> Table
>>
>> Thanks in advance.
>>
>>
>>
>
