Indeed, +1 to Gopal on that explanation! That was huge.

On Wed, Aug 17, 2016 at 12:58 AM, 明浩 冯 <qiuff...@hotmail.com> wrote:

> Hi Gopal,
>
>
> It worked after I disabled dfs.namenode.acls.enabled.
>
> The data loss doesn't affect me much at the moment, but I will
> track the issue in Kylin.
>
> Thank you very much for your detailed explanation and solution.  You saved me!
>
>
> Best Regards,
>
> Minghao Feng
> ------------------------------
> *From:* Gopal Vijayaraghavan <go...@hortonworks.com> on behalf of Gopal
> Vijayaraghavan <gop...@apache.org>
> *Sent:* Wednesday, August 17, 2016 1:18:54 PM
> *To:* user@hive.apache.org
> *Subject:* Re: hive throws ConcurrentModificationException when executing
> insert overwrite table
>
>
> > Yes, Kylin generated the query. I'm using Kylin 1.5.3.
>
> I would report a bug to Kylin about DISTRIBUTE BY RAND().
>
> This is what happens when a node which ran a Map task fails and the whole
> task is retried.
>
> Assume that the first attempt of the Map task0 wrote value1 into
> reducer-99, because RAND() returned 99.
>
> Now the task succeeds and the reducers start: reducer-0 runs
> successfully and writes 0000_0.
>
> But before reducer-99 runs, the node which ran Map task0 crashes.
>
> So, the engine re-runs Map task0 on another node. But because RAND() is
> not repeatable, the retry may return 0 for "value1" instead of 99.
>
> The reducer-0 partition of Map task0's new output now holds "value1", but
> reducer-0 has already finished, so no task will ever read it or write it out.
>
> In short, the output of the table will not contain "value1", despite the
> input and the shuffle outputs containing "value1".
>
> I would replace the DISTRIBUTE BY RAND() with SORT BY 0, for a random
> distribution without data loss.
>
> > But I'm still not sure how to fix the problem. I'm a beginner with Hive
> > and Kylin. Can the problem be fixed just by changing the Hive or Kylin
> > settings?
>
> If you're just experimenting with Kylin right now, I recommend just
> disabling the ACL settings in HDFS (this is not permissions btw, ACLs are
> permissions++).
>
> Set dfs.namenode.acls.enabled=false in hdfs-site.xml, core-site.xml, or
> wherever else in your /etc/hadoop/conf it shows up, and you should be good
> to avoid the race condition.
>
> Cheers,
> Gopal
>
>
>
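
The failure sequence Gopal describes above can be sketched as a small
simulation. This is a toy model, not Hive code: NUM_REDUCERS, ROWS,
run_map_attempt, simulate, rand_like and repeatable are all hypothetical
names, and the "repeatable" case assumes only that a re-run map attempt
routes every row to the same reducer as the first attempt (modelled here
with a fixed seed).

```python
import random

NUM_REDUCERS = 100
ROWS = [f"value{i}" for i in range(1000)]  # unique input rows of Map task0


def run_map_attempt(rows, rng):
    """One attempt of Map task0: route each row to a reducer chosen by rng."""
    partitions = {r: [] for r in range(NUM_REDUCERS)}
    for row in rows:
        partitions[rng.randrange(NUM_REDUCERS)].append(row)
    return partitions


def simulate(make_rng, finished_before_crash):
    """Return (lost, duplicated) row counts after a map-side retry.

    Reducers in finished_before_crash read the first attempt's output;
    every other reducer runs after the crash and reads the retry's output.
    """
    first = run_map_attempt(ROWS, make_rng(1))
    retry = run_map_attempt(ROWS, make_rng(2))
    output = []
    for r in range(NUM_REDUCERS):
        source = first if r in finished_before_crash else retry
        output.extend(source[r])
    lost = len(set(ROWS) - set(output))
    duplicated = len(output) - len(set(output))
    return lost, duplicated


def rand_like(attempt):
    # Assumption: like RAND(), each attempt draws a different random sequence.
    return random.Random(attempt)


def repeatable(attempt):
    # Assumption: a retry-safe distribution gives the same routing every attempt.
    return random.Random(12345)


if __name__ == "__main__":
    finished = set(range(50))  # reducers 0..49 completed before the node crashed
    print("non-repeatable routing (lost, duplicated):", simulate(rand_like, finished))
    print("repeatable routing     (lost, duplicated):", simulate(repeatable, finished))
```

With non-repeatable routing, a row is lost whenever the first attempt sent
it to a reducer that had not yet run and the retry sends it to one that
had already finished; the opposite case duplicates the row. With repeatable
routing both counts are zero, because every reducer sees the same partition
contents regardless of which attempt it reads.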
