Re: Incremental Data Processing With Hive UDAF

buddhika chamith Mon, 14 Jan 2013 06:46:38 -0800

Any suggestions on this are greatly appreciated. Any one see major road
blocks on this?


Regards
Buddhika

On Sat, Jan 12, 2013 at 10:31 AM, buddhika chamith
<chamibuddh...@gmail.com>wrote:

> Hi All,
>
> In order to achieve above I am researching on the feasibility of using a
> set of custom UADFs for distributive aggregate operations (e.g: sum, count
> etc..). Idea is to incorporate some state persisted from earlier
> aggregations to the current aggregation value inside merge of the UDAF. For
> distributing state data I was thinking of utilizing Hadoop distributed
> cache. But I am not sure about how exactly UDAF's are executed at runtime.
> Would including the logic to add the persisted state to the current result
> at terminate() ensure that it would be added only once? (Assuming all the
> aggregations fan in at terminate. I may gotten it all wrong here. :)). Or
> is there better way of achieving the same?
>
> Regards
> Buddhika
>

Re: Incremental Data Processing With Hive UDAF

Reply via email to