Yeah, it works for me.
Thanks
On Fri, Nov 18, 2016 at 3:08 AM, ayan guha wrote:
> Hi
>
> I think you can use the map-reduce paradigm here. Create a key from the user ID
> and date, with the record as the value. Then you can express your operation (do
> something) part as a function. If the function meets cer
Hi
I think you can use the map-reduce paradigm here. Create a key from the user ID
and date, with the record as the value. Then you can express the "do something"
part of your operation as a function. If the function meets certain criteria, such
as being associative and commutative (like, say, addition or multiplication), you can use
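
A minimal sketch of the keying idea above, in plain Python rather than Spark so it runs anywhere. The sample rows, the choice of the transaction amount as the value, and the helper name reduce_by_key are all assumptions made for illustration; Spark's pair-RDD reduceByKey has the same shape, but this is not taken from the thread.

```python
from collections import defaultdict
from functools import reduce
from operator import add

# Hypothetical sample rows in the thread's layout: userid, date, transaction.
records = [
    ("u001", "2016-11-17", 10.0),
    ("u001", "2016-11-17", 5.0),
    ("u002", "2016-11-17", 7.5),
    ("u001", "2016-11-18", 2.0),
]


def reduce_by_key(pairs, func):
    """Combine all values that share a key using a binary function.

    func should be associative and commutative, so that the result does
    not depend on the order in which values are combined.
    """
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return {key: reduce(func, values) for key, values in grouped.items()}


# Map step: the key is (userid, date), the value is the transaction amount.
keyed = [((user, date), amount) for user, date, amount in records]

# Reduce step: addition satisfies both criteria, so it is safe to use here.
totals = reduce_by_key(keyed, add)
```

Because the function is applied per key, there is no explicit loop over users inside a loop over dates; the grouping does that work.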
That would help, but within a particular partition I would still need to
iterate over the customers whose user IDs share the first n letters in that
partition. I want to get rid of nested iterations.
Thanks
On Thu, Nov 17, 2016 at 10:28 PM, Xiaomeng Wan wrote:
> You can partition on the first n letter
You can partition on the first n letters of userid.
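
A rough sketch of what prefix partitioning could look like, again in plain Python. PREFIX_LEN, the sample rows, and the helper names are hypothetical. Within each partition, a dictionary keyed by user ID groups the records in a single pass, which is one way to avoid a nested loop over customers.

```python
from collections import defaultdict

PREFIX_LEN = 2  # hypothetical value for "n"


def partition_key(user_id, n=PREFIX_LEN):
    """Return the first n letters of the user ID."""
    return user_id[:n]


def partition_by_prefix(records, n=PREFIX_LEN):
    """Bucket (userid, date, transaction) rows by user ID prefix."""
    partitions = defaultdict(list)
    for record in records:
        partitions[partition_key(record[0], n)].append(record)
    return partitions


def group_users(partition):
    """Single pass over one partition: collect each user's rows in a dict."""
    per_user = defaultdict(list)
    for user_id, date, transaction in partition:
        per_user[user_id].append((date, transaction))
    return per_user


# Hypothetical sample rows.
records = [
    ("ab01", "2016-11-17", 10.0),
    ("ab02", "2016-11-17", 3.0),
    ("cd01", "2016-11-17", 7.5),
    ("ab01", "2016-11-18", 2.0),
]

partitions = partition_by_prefix(records)
users_in_ab = group_users(partitions["ab"])
```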
On 17 November 2016 at 08:25, titli batali wrote:
> Hi,
>
> I have a use case where we have 1000 CSV files with a column user_Id,
> covering 8 million unique users. The data contains: userid,date,transaction,
> on which we run some queries.
>
> We