Re: Metastore performance on HDFS-backed table with 15000+ partitions

2014-02-22 Thread Terje Marthinussen
The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a bit early for it to fail, though. What is your heap size?
Regards, Terje
> On 22 Feb 2014, at 12:05, Norbert Burger wrote:
>
> Hi folks,
>
> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.

OOM after upgrade to last week's 0.7

2011-05-17 Thread Terje Marthinussen
Hi, I was running on a 0.7 trunk build from February 2011 until last Friday and upgraded to trunk again then. Things work OK, except that memory usage for queries over a large number of partitions is up quite dramatically. I could query 12 months of data in one table with ~72k partitions with

Re: Too many open files

2011-01-07 Thread Terje Marthinussen
never had this problem again:
hive.fileformat.check = false
--
*From:* Terje Marthinussen [mailto:tmarthinus...@gmail.com]
*Sent:* Friday, January 07, 2011 4:14

Re: Too many open files

2011-01-06 Thread Terje Marthinussen
file handlers.
> I would appreciate some feedback. (trying to find my earlier email)
>
> Thanks,
> Viral
>
> On Thu, Jan 6, 2011 at 4:57 PM, Terje Marthinussen <tmarthinus...@gmail.com> wrote:
>> Hi,

Too many open files

2011-01-06 Thread Terje Marthinussen
Hi, While loading some 10k+ .gz files through HiveServer with LOAD FILE etc. etc.
11/01/06 22:12:42 INFO exec.CopyTask: Copying data from file:XXX.gz to hdfs://YYY
11/01/06 22:12:42 INFO hdfs.DFSClient: Exception in createBlockOutputStream java.net.SocketException: Too many open files
11/01/06 22

alter table bypasses regexSerDe checks

2010-12-21 Thread Terje Marthinussen
Hi, I filed HIVE-1850 a week ago, but I just realized that of course, this is a bit more generic. Any alter table operation may put Hive in a state which you cannot get out of today. For instance, I just added a column of type INT to a table. As a result, I now get: FAILED: Hive Internal E

Scheduling jobs in hive

2010-10-27 Thread Terje Marthinussen
Hi, Are there any good scheduling tools out there suitable for the dependencies you may get in Hive? Specific example I have right now:
- 2 tables with event logs from different sources
- 1 table with some additional data from a different source, but this data is a daily summary
None of this data