Hi Guy,

The easiest way to nail down the root cause is divide and conquer. You can try the following:
- Ensure the results are consistent on the individual tables, without joins.
- Try to narrow down the input to your join with a few ON conditions.

That way you can tell whether it is an issue with the code or a data quality issue. Most likely it is a data quality issue.

Regards,
Bejoy.K.S

________________________________
From: Guy Doulberg <guy.doulb...@conduit.com>
To: user@hive.apache.org
Sent: Monday, January 9, 2012 11:16 PM
Subject: Re: inconsistent results when doing a select over a join

Hey Dave,

I didn't understand your question. The results are only slightly inconsistent, with a difference of about 2%.

Thanks,
Guy

On 01/09/2012 07:05 PM, David Houston wrote:

>Hi Guy,
>Inconsistent in what way: are the results totally off, or is the order different?
>Thanks
>Dave
>On Jan 9, 2012 5:03 PM, "Guy Doulberg" <guy.doulb...@conduit.com> wrote:
>
>>Hi guys,
>>
>>We have been using Hive for a while now, and recently we encountered an issue we just can't understand.
>>
>>We are selecting (the select includes count(*)) over a join of two big tables.
>>
>>We ran the same query twice consecutively over the same two tables, and each time the results were slightly different.
>>
>>We don't know how we should debug this issue or where we should look. Any ideas?
>>
>>Thanks
>>
>>Guy Doulberg,
>>Data infrastructure engineer,
>>Conduit
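
The divide-and-conquer checks suggested at the top of this thread can be sketched roughly as follows. This is only an illustration under stated assumptions: SQLite stands in for Hive so that the snippet runs anywhere, and the table names (left_t, right_t), the join column (k), and the sample rows are all hypothetical, not taken from the original query.

```python
import sqlite3

# SQLite is an assumed stand-in for Hive; tables, column names and rows
# below are hypothetical sample data for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE left_t  (k INTEGER, v TEXT);
    CREATE TABLE right_t (k INTEGER, w TEXT);
    INSERT INTO left_t  VALUES (1, 'a'), (2, 'b'), (2, 'b2'), (NULL, 'c');
    INSERT INTO right_t VALUES (1, 'x'), (2, 'y'), (3, 'z');
""")


def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]


def table_report(table, key):
    """Step 1: check one table on its own, with no join involved."""
    # Names are interpolated directly, which is acceptable only because
    # they are trusted constants in this sketch.
    return {
        "rows": scalar(f"SELECT COUNT(*) FROM {table}"),
        "null_keys": scalar(
            f"SELECT COUNT(*) FROM {table} WHERE {key} IS NULL"
        ),
        "duplicate_keys": scalar(
            f"SELECT COUNT(*) FROM (SELECT {key} FROM {table} "
            f"WHERE {key} IS NOT NULL GROUP BY {key} HAVING COUNT(*) > 1)"
        ),
    }


def join_count(extra_condition="1=1"):
    """Step 2: count the join output, optionally on a narrowed input."""
    return scalar(
        "SELECT COUNT(*) FROM left_t l JOIN right_t r "
        f"ON l.k = r.k AND {extra_condition}"
    )


left_report = table_report("left_t", "k")
right_report = table_report("right_t", "k")
full = join_count()                   # the whole join
narrowed = join_count("l.k = 2")      # the join restricted to one key
repeat_ok = join_count() == join_count()  # same query, run twice

print(left_report, right_report, full, narrowed, repeat_ok)
```

Against the real Hive tables the same idea applies: run each per-table query twice and compare. If a single-table count already differs between runs, the problem lies upstream of the join; if only the join differs, narrowing the ON conditions to a small slice of keys should help isolate the rows responsible.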