Hi guys,
I spent today investigating this issue. It seems the differences occur when many tasks are killed.

We are using the fair scheduler. I ran the queries on large data and with low priority, which caused the tasks of this job to be preempted (killed) many times.

After I began suspecting this, I gave the query the highest priority, which reduced the number of killed tasks. That seemed to solve the problem.


It is not that differences appear whenever a task is killed; they appear when many tasks are killed because of preemption.
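
To narrow down where the runs diverge, the two outputs can be compared key by key. Below is a minimal Python sketch, assuming each run's result has been exported as a mapping from group key to count; the function names and this export format are hypothetical and not part of Hive.

```python
def diff_counts(run_a, run_b):
    """Return {key: (count_a, count_b)} for keys whose counts differ.

    run_a and run_b are assumed to be dicts mapping a group key to the
    count reported by each run (a hypothetical export format).
    """
    keys = set(run_a) | set(run_b)
    return {
        k: (run_a.get(k, 0), run_b.get(k, 0))
        for k in sorted(keys)
        if run_a.get(k, 0) != run_b.get(k, 0)
    }


def relative_difference(run_a, run_b):
    """Relative difference between the total counts, using run_a as the baseline."""
    total_a = sum(run_a.values())
    total_b = sum(run_b.values())
    if total_a == 0:
        return 0.0 if total_b == 0 else float("inf")
    return abs(total_a - total_b) / total_a
```

Running this on the output of a low-priority run and a high-priority run would show whether the missing or extra rows cluster around particular keys.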

What do you say?
On Tue 10 Jan 2012 11:49:35 AM IST, Guy Doulberg wrote:
Hi,
Sorry for the late answer.
I ran the query on small data but couldn't reproduce the issue.
At the moment I can reproduce it only on data that takes about 1.5 hours
to process.
I am trying to narrow down the amount of data as much as I can while
still reproducing it...

But I think it is clear that the scale of the data is the reason for
the differences.

What do you think?



On Mon 09 Jan 2012 08:14:10 PM IST, Edward Capriolo wrote:
The create table statements, the query, and a small data set that reproduces the issue.

On Monday, January 9, 2012, Guy Doulberg <guy.doulb...@conduit.com> wrote:
Thanks, I am trying to reproduce it again,

But what should I send to the ML?




On Mon 09 Jan 2012 07:54:24 PM IST, Edward Capriolo wrote:

Can you reproduce the issue, possibly with smaller tables, and
send that to the ML?

Edward

On Mon, Jan 9, 2012 at 12:46 PM, Guy Doulberg
<guy.doulb...@conduit.com> wrote:

    Hey Dave,
    I didn't understand your question,

    The results are only slightly different, by about 2%.

    Thanks

    Guy

    On 01/09/2012 07:05 PM, David Houston wrote:

    Hi Guy,

    Inconsistent in the sense that the results are totally off, or that
    the order is different?

    Thanks

    Dave

    On Jan 9, 2012 5:03 PM, "Guy Doulberg"
    <guy.doulb...@conduit.com> wrote:

        Hi guys,

        We have been using Hive for a while now, and recently we
        encountered an issue we just can't understand.

        We are running a select (which includes count(*)) over a join
        of two big tables.

        We ran the same query twice consecutively over the same two
        tables, and each time the results were slightly different.

        We don't know how we should debug this issue or where we
        should look. Any ideas?

        Thanks

        Guy Doulberg,
        Data infrastructure engineer,
        Conduit


