Re: High number of input files problems

Florin Diaconeasa Thu, 03 Nov 2011 01:13:32 -0700

:( Upgrading is not really an option at this point. However, do you know
whether a bug like this has been fixed in a later release? Maybe I could
backport the patch to version 0.6.


On 2 November 2011 04:16, Steven Wong <sw...@netflix.com> wrote:

> I suspect very few people are still using Hive 0.6 or older. Try upgrading.
>
> *From:* Florin Diaconeasa [mailto:florin.diacone...@gmail.com]
> *Sent:* Monday, October 31, 2011 6:37 AM
> *To:* user@hive.apache.org
> *Subject:* High number of input files problems
>
> Hello,
>
> Lately our user base has increased, so the input files have increased
> considerably in size and number.
>
> One of our processing steps runs a query of the form found at the end
> of this email. My problem is that, apparently, the processing sometimes
> misses some of the input files (in most cases for the 2nd select).
>
> I'm using Hive 0.6 and Hadoop 0.20.2 on 64-bit Debian 5, and we are
> connecting to a Hive server instance using JDBC. Any idea which
> parameters I could tune, or whether any tickets have been opened for this
> problem? I have searched the Hive JIRA but found nothing so far. The only
> thing that I think might be related is
> https://issues.apache.org/jira/browse/HIVE-1884
>
> SELECT
>     t.a,
>     sum(t.b),
>     sum(t.c),
>     sum(t.d)
> FROM
> (
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T1
>     WHERE ...
>     GROUP BY ...
>
>     UNION ALL
>
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T2
>     WHERE ...
>     GROUP BY ...
>
>     UNION ALL
>
>     SELECT
>         a,
>         sum(x) as b,
>         sum(y) as c,
>         sum(z) as d
>     FROM T3
>     WHERE ...
>     GROUP BY ...
> ) t
> GROUP BY ...
>
> --
>
>
> Florin
>



-- 


Florin
