:( Upgrading is not really an option at this point. However, do you know of any bug like this that has already been fixed? Maybe I could port the patch to the 0.6 version.
On 2 November 2011 04:16, Steven Wong <sw...@netflix.com> wrote:

> I suspect very few people are still using Hive 0.6 or older. Try upgrading.
>
> *From:* Florin Diaconeasa [mailto:florin.diacone...@gmail.com]
> *Sent:* Monday, October 31, 2011 6:37 AM
> *To:* user@hive.apache.org
> *Subject:* High number of input files problems
>
> Hello,
>
> Lately our user base has increased, so the input files have increased
> considerably in size and number.
>
> One of our processing steps runs a query of the form found at the end
> of this email. My problem is that, apparently, the processing sometimes
> misses some of the input files (for the 2nd select in most cases).
>
> I'm using Hive 0.6 and Hadoop 0.20.2 on a 64-bit Debian 5, and we are
> connecting to a Hive server instance using JDBC. Any idea what
> parameters I could tune, or whether any tickets have been opened on this
> problem? I have searched the Hive JIRA and found nothing so far. The only
> thing that I think might be related is
> https://issues.apache.org/jira/browse/HIVE-1884
>
> SELECT
>     t.a,
>     sum(t.b),
>     sum(t.c),
>     sum(t.d)
> FROM
> (
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T1
>     WHERE ...
>     GROUP BY ...
>
>     UNION ALL
>
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T2
>     WHERE ...
>     GROUP BY ...
>
>     UNION ALL
>
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T3
>     WHERE ...
>     GROUP BY ...
> ) t
>
> GROUP BY ...
>
> --
> Florin

--
Florin