Re: MultiInput/MultiGet CF in MapReduce

Alicia Leong Fri, 29 Mar 2013 09:33:31 -0700

Hi All

I’m thinking of doing it this way:


1) get_slice ( YYYYMMDDHH ) from the Index Table.

2) Take the returned list of ROWKEYs.

3) Pass them to multiget_slice ( keys … )
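
To make the three steps above concrete, here is a rough sketch against in-memory dicts. It is not the Thrift client: the `get_slice` and `multiget_slice` helpers, the `lookup_by_hour` name and the two dicts are hypothetical stand-ins that only mimic the shape of the real calls, using the sample rows from my earlier mail.

```python
# In-memory stand-ins for the two column families in this thread.
# Everything here is hypothetical: the dicts and helpers only mimic the
# shape of the Thrift get_slice / multiget_slice calls, not the real client.
DATA_TABLE = {
    "1": {"name": "apple"},
    "2": {"name": "orange"},
    "3": {"name": "banana"},
    "4": {"name": "mango"},
}

INDEX_TABLE = {
    "2013030114": {"1": "", "2": "", "3": ""},
    "2013030115": {"4": ""},
}


def get_slice(table, row_key):
    """Return all columns of one row (empty dict if the row is missing)."""
    return dict(table.get(row_key, {}))


def multiget_slice(table, row_keys):
    """Return {row_key: columns} for every requested key that exists."""
    return {key: dict(table[key]) for key in row_keys if key in table}


def lookup_by_hour(hour_bucket):
    """Two-step lookup: read the index row, then fetch the data rows it names."""
    index_columns = get_slice(INDEX_TABLE, hour_bucket)
    row_keys = sorted(index_columns)  # index column names are the data row keys
    return multiget_slice(DATA_TABLE, row_keys)


if __name__ == "__main__":
    print(lookup_by_hour("2013030114"))
```

Note that this only shows the lookup logic. It says nothing about which node each key lives on, so it does not address locality.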



But my question is: how do I ensure ‘Data Locality’?


On Tue, Mar 19, 2013 at 3:33 PM, aaron morton <aa...@thelastpickle.com> wrote:

> I would be looking at Hive or Pig, rather than writing the MapReduce job yourself.
>
> There is an example in the Cassandra source distribution, or you can look
> at DataStax Enterprise to start playing with Hive.
>
> Typically with Hadoop queries you want to query a lot of data. If you are
> only querying a few rows, consider writing the code in your favourite
> language.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 18/03/2013, at 1:29 PM, Alicia Leong <lccali...@gmail.com> wrote:
>
> Hi All
>
> I have 2 tables
>
> Data Table
> -----------------
> RowKey: 1
> => (column=name, value=apple)
> RowKey: 2
> => (column=name, value=orange)
> RowKey: 3
> => (column=name, value=banana)
> RowKey: 4
> => (column=name, value=mango)
>
>
> Index Table (YYYYMMDDHH)
> ------------------------------------------------
> RowKey: 2013030114
> => (column=1, value=)
> => (column=2, value=)
> => (column=3, value=)
> RowKey: 2013030115
> => (column=4, value=)
>
>
> I would like to know how to implement the following in MapReduce:
> 1) First query the Index Table by RowKey: 2013030114
> 2) Then pass the Index Table column names (1,2,3) to query the Data Table
>
> Thanks in advance.
>
>
>
