Thanks, everyone, for the feedback. Just to follow up in case someone else runs into this: I can confirm that the local client works around the OOMEs, but it's still very slow.

It does seem like we were hitting some combination of HIVE-4051 and HIVE-5158. We'll try reducing the partition count first, and then switch to 0.12.0 if that doesn't improve things significantly. FWIW, http://www.slideshare.net/oom65/optimize-hivequeriespptx also has some good rules of thumb.
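In case it helps anyone searching the archives later, here's a sketch of the kind of consolidation we're considering -- collapsing our 3-level scheme into a single daily partition, with the finer levels demoted to plain columns. Table and column names (events_daily, events_hourly, dt/hr/src) are placeholders, not our actual schema:

    -- New table partitioned on one level instead of three
    CREATE EXTERNAL TABLE events_daily (
      event_time STRING,
      hr         STRING,   -- former partition levels, now ordinary columns
      src        STRING,
      payload    STRING
    )
    PARTITIONED BY (dt STRING)
    LOCATION '/data/events_daily';

    -- Repopulate with dynamic partitioning
    SET hive.exec.dynamic.partition=true;
    SET hive.exec.dynamic.partition.mode=nonstrict;

    -- The dynamic partition column (dt) goes last in the SELECT
    INSERT OVERWRITE TABLE events_daily PARTITION (dt)
    SELECT event_time, hr, src, payload, dt
    FROM events_hourly;

That would take us from 15k+ partitions down to one per day, which should keep the planner's metastore fetch small.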
Norbert

On Sat, Feb 22, 2014 at 1:27 PM, Stephen Sprague <sprag...@gmail.com> wrote:

> Yeah, that traceback pretty much spells it out - it's metastore related, and that's where the partitions are stored.
>
> I'm with the others on this. HiveServer2 is still a little janky on memory management. I bounce mine once a day at midnight just to play it safe (and because I can).
>
> Again, for me, I use the Hive local client for production jobs and the remote client for ad-hoc stuff.
>
> You may wish to confirm the local Hive client has no problem with your query.
>
> Other than that, you either increase the heap size on the HS2 process and hope for the best, and/or file a bug report.
>
> Bottom line: HiveServer2 isn't production bullet-proof just yet, IMHO. Others may disagree.
>
> Regards,
> Stephen.
>
>
> On Sat, Feb 22, 2014 at 9:50 AM, Norbert Burger <norbert.bur...@gmail.com> wrote:
>
>> Thanks all for the quick feedback.
>>
>> I'm a bit surprised to learn 15k is considered too much, but we can work around it. I guess I'm also curious why the query planner needs to know about all partitions even in the case of simple select/limit queries, where the query might target only a single partition.
>>
>> Here's the client-side OOME with HADOOP_HEAPSIZE=2048:
>>
>> https://gist.githubusercontent.com/nburger/3286d2052060e2efe161/raw/dc30231086803c1d33b9137b5844d2d0e20e350d/gistfile1.txt
>>
>> This was from a CDH 4.3.0 client hitting HiveServer2. Any idea what's consuming the heap?
>>
>> Norbert
>>
>>
>> On Sat, Feb 22, 2014 at 10:32 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
>>
>>> Don't make tables with that many partitions. It is an anti-pattern. I have tables with 2,000 partitions a day, and that is really too many. Hive needs to load that information into memory to plan the query.
>>>
>>>
>>> On Saturday, February 22, 2014, Terje Marthinussen <tmarthinus...@gmail.com> wrote:
>>> > The query optimizer in Hive is awful on memory consumption. 15k partitions sounds a bit early for it to fail, though.
>>> >
>>> > What is your heap size?
>>> >
>>> > Regards,
>>> > Terje
>>> >
>>> >> On 22 Feb 2014, at 12:05, Norbert Burger <norbert.bur...@gmail.com> wrote:
>>> >>
>>> >> Hi folks,
>>> >>
>>> >> We are running CDH 4.3.0 Hive (0.10.0+121) with a MySQL metastore.
>>> >>
>>> >> In Hive, we have an external table backed by HDFS with a 3-level partitioning scheme that currently has 15,000+ partitions.
>>> >>
>>> >> Within the last day or so, queries against this table have started failing. A simple query which shouldn't take very long at all (select * from ... limit 10) fails after several minutes with a client OOME. I get the same outcome on count(*) queries (which I thought wouldn't send any data back to the client). Increasing the heap on both client and server JVMs (via HADOOP_HEAPSIZE) doesn't have any impact.
>>> >>
>>> >> We were only able to work around the client OOMEs by reducing the number of partitions in the table.
>>> >>
>>> >> Looking at the MySQL query log, my thought is that the Hive client is quite busy making requests for partitions that don't contribute to the query.
>>> >> Has anyone else had a similar experience with tables this size?
>>> >>
>>> >> Thanks,
>>> >> Norbert
>>> >
>>>
>>> --
>>> Sorry this was sent from mobile. Will do less grammar and spell check than usual.
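P.S. For the archives, here's roughly what the two operational workarounds Stephen mentions look like for us. The heap value is a guess on our part (2048 wasn't enough), and the config path and service name assume CDH-style packaging, so adjust to taste:

    # /etc/hive/conf/hive-env.sh -- raise the heap (in MB) for processes
    # launched through the hive script, HiveServer2 included
    export HADOOP_HEAPSIZE=4096

    # root's crontab -- Stephen's nightly bounce of HiveServer2 at midnight
    0 0 * * * /sbin/service hive-server2 restart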