That's awesome. Thanks Suresh Sethuramaswamy
On Thu, Jan 9, 2020 at 2:00 PM Patrick Duin <patd...@gmail.com> wrote:

> Thanks Suresh, changing the heap was our first guess as well, actually. I
> think we were on the right track there. The weird thing is that our jobs now
> seem to run fine (all partitions are added) despite still giving this error.
> Weird, but it seems to be OK now.
>
> Thanks for the help.
>
> On Wed, Jan 8, 2020 at 19:54, Suresh Kumar Sethuramaswamy <rock...@gmail.com> wrote:
>
>> Thanks for the query and the Hive options.
>>
>> It looks like the JVM heap space for the Hive CLI is running out of memory,
>> as per the EMR documentation:
>> https://aws.amazon.com/premiumsupport/knowledge-center/emr-hive-outofmemoryerror-heap-space/
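
The fix described in that EMR article comes down to giving the Hive CLI JVM itself a larger heap before launching the script. A minimal sketch of the two usual ways to do that, assuming the stock EMR layout with hive-env.sh under /etc/hive/conf and an illustrative 8 GB heap (check both the path and the size against your own cluster):

  # Option 1: one-off, before re-running the hive -f command quoted further down.
  # The hive/hadoop launcher scripts pass HADOOP_CLIENT_OPTS through to the client JVM.
  export HADOOP_CLIENT_OPTS="-Xmx8g"

  # Option 2: persist it on the master node; HADOOP_HEAPSIZE is in MB.
  echo 'export HADOOP_HEAPSIZE=8192' | sudo tee -a /etc/hive/conf/hive-env.sh

Either way, it is the client-side heap that matters here; the mapreduce.*.java.opts settings in the job only size the map and reduce task JVMs.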
>>
>> On Wed, Jan 8, 2020 at 11:38 AM Patrick Duin <patd...@gmail.com> wrote:
>>
>>> The query is rather large and won't tell you much (it's generated).
>>>
>>> It comes down to this:
>>>
>>> WITH gold AS (SELECT * FROM table1),
>>>      delta AS (SELECT * FROM table2)
>>> INSERT OVERWRITE TABLE my_db.temp__v1_2019_12_03_182627
>>> PARTITION (`c_date`,`c_hour`,`c_b`,`c_p`)
>>> SELECT * FROM gold
>>> UNION DISTINCT
>>> SELECT * FROM delta
>>> DISTRIBUTE BY c_date, c_hour, c_b, c_p
>>>
>>> We run it with this:
>>>
>>> hive -f /tmp/populateTempTable6388054392973078671.hql --verbose \
>>>   --hiveconf hive.exec.dynamic.partition='true' \
>>>   --hiveconf hive.exec.dynamic.partition.mode='nonstrict' \
>>>   --hiveconf hive.exec.max.dynamic.partitions.pernode='5000' \
>>>   --hiveconf hive.exec.max.dynamic.partitions='50000' \
>>>   --hiveconf parquet.compression='SNAPPY' \
>>>   --hiveconf hive.execution.engine='mr' \
>>>   --hiveconf mapreduce.map.java.opts='-Xmx4608m' \
>>>   --hiveconf mapreduce.map.memory.mb='5760' \
>>>   --hiveconf mapreduce.reduce.java.opts='-Xmx10400m' \
>>>   --hiveconf mapreduce.reduce.memory.mb='13000' \
>>>   --hiveconf hive.optimize.sort.dynamic.partition='false' \
>>>   --hiveconf hive.blobstore.optimizations.enabled='false' \
>>>   --hiveconf hive.map.aggr='false' \
>>>   --hiveconf yarn.app.mapreduce.am.resource.mb='15000'
>>>
>>> We run on EMR m5.2xlarge nodes (32 GB of memory). As I said, the M/R part
>>> runs fine and the job is listed as succeeded in the ResourceManager; we get
>>> the error afterwards, somehow.
>>>
>>> On Wed, Jan 8, 2020 at 17:22, Suresh Kumar Sethuramaswamy <rock...@gmail.com> wrote:
>>>
>>>> Could you please post your insert query snippet along with the SET
>>>> statements?
>>>>
>>>> On Wed, Jan 8, 2020 at 11:17 AM Patrick Duin <patd...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> I have a query that produces about 3000 partitions, which we load
>>>>> dynamically (on Hive 2.3.5).
>>>>> At the end of this query (running on M/R, which runs fine) the M/R job
>>>>> finishes and we see this on the Hive CLI:
>>>>>
>>>>> Loading data to table my_db.temp__v1_2019_12_03_182627 partition
>>>>> (c_date=null, c_hour=null, c_b=null, c_p=null)
>>>>>
>>>>> Time taken to load dynamic partitions: 540.025 seconds
>>>>> Time taken for adding to write entity : 0.329 seconds
>>>>> #
>>>>> # java.lang.OutOfMemoryError: Java heap space
>>>>> # -XX:OnOutOfMemoryError="kill -9 %p"
>>>>> #   Executing /bin/sh -c "kill -9 19644"...
>>>>> os::fork_and_exec failed: Cannot allocate memory (12)
>>>>> MapReduce Jobs Launched:
>>>>> Stage-Stage-1: Map: 387  Reduce: 486  Cumulative CPU: 110521.05 sec
>>>>>   HDFS Read: 533411354  HDFS Write: 262054898296  SUCCESS
>>>>> Stage-Stage-2: Map: 973  Reduce: 1009  Cumulative CPU: 48710.45 sec
>>>>>   HDFS Read: 262126094987  HDFS Write: 70666472011  SUCCESS
>>>>> Total MapReduce CPU Time Spent: 1 days 20 hours 13 minutes 51 seconds 500 msec
>>>>> OK
>>>>>
>>>>> Where is this OutOfMemoryError coming from, and which heap space am I
>>>>> supposed to increase? We've tried increasing
>>>>> 'yarn.app.mapreduce.am.resource.mb', but that didn't seem to help.
>>>>> I know we probably shouldn't have this many partitions, but this is a
>>>>> one-off and I'd just like it to work.
>>>>>
>>>>> Thanks for any pointers,
>>>>> Patrick
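
One closing observation for anyone who hits the same trace: the OnOutOfMemoryError / kill -9 lines come from the local Hive CLI process (pid 19644 above), not from a YARN container, which is why raising yarn.app.mapreduce.am.resource.mb made no difference. A quick way to confirm which heap the CLI is actually running with, sketched with standard JDK tools (<pid> is a placeholder for the pid reported by jps):

  # List client-side JVMs with their arguments; the Hive CLI is the one whose
  # command line mentions the hive-cli jar / CliDriver main class.
  jps -lvm | grep -i hive

  # Print that process's effective maximum heap (reported in bytes).
  jinfo -flag MaxHeapSize <pid>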