The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a
bit early for it to fail, though.
What is your heap size?
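If you need to raise it, a minimal sketch via hive-env.sh (HADOOP_HEAPSIZE also
sizes the Hive client JVM; the 4096 MB value below is only illustrative, adjust
for your setup):

    # conf/hive-env.sh -- value in MB, pick what fits your machine
    export HADOOP_HEAPSIZE=4096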
Regards,
Terje
> On 22 Feb 2014, at 12:05, Norbert Burger wrote:
>
> Hi folks,
>
> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>
Hi,
I was running on a 0.7 trunk build from February 2011 until last Friday, when I
upgraded to trunk again.
Things work OK, except that memory usage on queries touching a large number of
partitions is up quite dramatically.
I could previously query 12 months of data in one table with ~72k partitions.
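As a rough sanity check of the partition count, something like this works (the
table name here is made up):

    hive -e "SHOW PARTITIONS events;" | wc -l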
I never had this problem again after setting:
>   <property>
>     <name>hive.fileformat.check</name>
>     <value>false</value>
>   </property>
> *From:* Terje Marthinussen [mailto:tmarthinus...@gmail.com]
> *Sent:* Friday, January 07, 2011 4:14
file handlers.
> >
> > I would appreciate some feedback. (trying to find my earlier email)
> >
> > Thanks,
> > Viral
> >
> > On Thu, Jan 6, 2011 at 4:57 PM, Terje Marthinussen <tmarthinus...@gmail.com> wrote:
> >>
> >> Hi,
Hi,
While loading some 10k+ .gz files through HiveServer with LOAD DATA and the
like, I get:
11/01/06 22:12:42 INFO exec.CopyTask: Copying data from file:XXX.gz to
hdfs://YYY
11/01/06 22:12:42 INFO hdfs.DFSClient: Exception in createBlockOutputStream
java.net.SocketException: Too many open files
11/01/06 22
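For anyone hitting the same thing: a quick sketch of checking and raising the
open-file limit for the user running HiveServer (65536 is just an illustrative
value; edit /etc/security/limits.conf to make it permanent):

    # check the current per-process limit
    ulimit -n
    # raise it in the shell that starts HiveServer
    ulimit -n 65536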
Hi,
I filed HIVE-1850 a week ago, but I just realized that this is of course a bit
more generic: any ALTER TABLE operation may put Hive in a state which you
cannot get out of today.
For instance, I just added an INT column to a table. As a result, I now get:
FAILED: Hive Internal Error: ...
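For the record, a sketch of the kind of statement that triggered it (the table
and column names here are made up):

    # hypothetical names, for illustration only
    hive -e "ALTER TABLE events ADD COLUMNS (retry_count INT);"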
Hi,
Are there any good scheduling tools out there suitable for the dependencies
you may get in Hive?
A specific example I have right now:
- 2 tables with event logs from different sources
- 1 table with some additional data from a different source, but this data is
a daily summary
None of this data