Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Matt Olson
Yes, I've reverted to the default there. Thanks! On Mon, May 2, 2016 at 7:47 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote: > Hi Matt > > As Gopal mentioned below you might have to unset > hive.exec.orc.memory.pool=1.0; > since some memory is required for sorting. Try running

Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Prasanth Jayachandran
Hi Matt, As Gopal mentioned below, you might have to unset hive.exec.orc.memory.pool=1.0; since some memory is required for sorting. Try running with the defaults for hive.exec.orc.memory.pool. Thanks, Prasanth On May 2, 2016, at 9:41 PM, Matt Olson <maolso...@gmail.com> wrote: Hi Prasant
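
A minimal sketch of what "running with the defaults" might look like in a Hive session. The value 0.5 is an assumption here: it is the documented default for hive.exec.orc.memory.pool as far as I know, but check it against your Hive version before relying on it.

```sql
-- Print the value currently in effect for this session.
SET hive.exec.orc.memory.pool;

-- Assumption: 0.5 is the shipped default (fraction of heap the ORC writers may use).
-- Setting it back explicitly undoes an earlier "SET hive.exec.orc.memory.pool=1.0;".
SET hive.exec.orc.memory.pool=0.5;
```

Leaving part of the heap outside the ORC pool is the point of the advice above: the sort needs memory too.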

Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Matt Olson
Hi Prasanth, Thank you for the helpful information. I have been using the default ORC stripe size, which I believe is 67,108,864 bytes. I was able to remove the constant value for dt as you suggested, and set hive.optimize.sort.dynamic.partition=true. I saw in the new explain plan that the partit

Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Prasanth Jayachandran
Hi Matt, So it looks like you are hitting the issue that I had mentioned previously. You might need to apply the patch from HIVE-12893. Alternatively, if dt has only one possible value, then it's better to remove the constant value for dt and the where condition. This will enable sorted dynamic par
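
A rough sketch of the query shape being suggested, with dt selected as an ordinary column instead of being pinned to a constant. The table and column names (target_table, source_table, title_id, col_a, col_b) are hypothetical and do not come from the thread. Only the SET properties are real Hive settings.

```sql
-- Allow every partition column to be dynamic (no static partition value required).
SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

-- Ask Hive to sort rows by partition key so each reducer keeps one writer open at a time.
SET hive.optimize.sort.dynamic.partition=true;

-- Hypothetical tables and columns: dt comes from the data, with no
-- "dt = '<constant>'" in the PARTITION clause or the WHERE clause.
INSERT OVERWRITE TABLE target_table PARTITION (dt, title_id)
SELECT col_a, col_b, dt, title_id
FROM source_table;
```

If the optimization applies, the explain plan should show a reduce stage for the insert. If no reducer appears, the optimization was probably not used.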

Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Matt Olson
Hi Prasanth, Here is the explain plan for the insert query:
OK
STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
  Stage-4
  Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
  Stage-2 depends on stages: Stage-0
  Stage-3

Re: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Prasanth Jayachandran
Hi, Can you please post the explain plan for your insert query? I suspect sorted dynamic partition optimization is bailing out because of the constant value for the 'dt' column. If you are not seeing a reducer, then it's likely not using the sorted dynamic partition optimization. You are probably hitting t

RE: Container out of memory: ORC format with many dynamic partitions

2016-05-02 Thread Ryan Harris
Reading this: "but when I add 2000 new titles with 300 rows each", I'm thinking that you are over-partitioning your data. I'm not sure exactly how that relates to the OOM error you are getting (it may not). I'd test things out partitioning by date only, maybe date + title_type, but adding

Re: Container out of memory: ORC format with many dynamic partitions

2016-04-30 Thread Gopal Vijayaraghavan
> SET hive.exec.orc.memory.pool=1.0; Might be a bad idea in general, this causes more OOMs than less. > SET mapred.map.child.java.opts=-Xmx2048M; > SET mapred.child.java.opts=-Xmx2048M; ... > Container >[pid=6278,containerID=container_e26_1460661845156_49295_01_000244] is >running beyond physic

Re: Container out of memory: ORC format with many dynamic partitions

2016-04-29 Thread Jörn Franke
I would still need some time to dig deeper into this. Are you using a specific distribution? Would it be possible to upgrade to a more recent Hive version? However, having so many small partitions is bad practice and seriously affects performance. Each partition should at least contain several