Sahil Takiar created HIVE-17684:
-----------------------------------

             Summary: HoS memory issues with MapJoinMemoryExhaustionHandler
                 Key: HIVE-17684
                 URL: https://issues.apache.org/jira/browse/HIVE-17684
             Project: Hive
          Issue Type: Bug
          Components: Spark
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar


We have seen a number of memory issues due to the {{HashSinkOperator}}'s use of the 
{{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect scenarios 
where the small table is taking up too much space in memory, in which case a 
{{MapJoinMemoryExhaustionError}} is thrown.

The configs to control this logic are:

* {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90)
* {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55)
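
As a rough, purely illustrative sketch (the helper name below is hypothetical, and the real code presumably reads these values through {{HiveConf}} rather than raw Hadoop {{Configuration}} keys), the threshold selection looks roughly like this:

{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical helper showing how the two thresholds above would be chosen.
// This is only a sketch, not the actual Hive code.
public class MapJoinMemoryThresholds {
  public static double maxMemoryUsage(Configuration conf, boolean followedByGroupBy) {
    if (followedByGroupBy) {
      // Lower threshold when the map join is followed by a group by,
      // presumably because the group by needs heap space of its own.
      return conf.getFloat("hive.mapjoin.followby.gby.localtask.max.memory.usage", 0.55f);
    }
    return conf.getFloat("hive.mapjoin.localtask.max.memory.usage", 0.90f);
  }
}
{code}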

The handler uses the {{MemoryMXBean}} and estimates how much memory the 
{{HashMap}} is consuming with the following ratio: 
{{MemoryMXBean#getHeapMemoryUsage().getUsed() / 
MemoryMXBean#getHeapMemoryUsage().getMax()}}
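
A minimal sketch of that check (class and method names below are hypothetical and do not mirror the real {{MapJoinMemoryExhaustionHandler}} implementation):

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

// Sketch of the heap-usage-percentage check described above; not the actual handler code.
public class MemoryExhaustionCheckSketch {
  private final MemoryMXBean memoryMXBean = ManagementFactory.getMemoryMXBean();
  private final double maxMemoryUsage; // e.g. 0.90 or 0.55 from the configs above

  public MemoryExhaustionCheckSketch(double maxMemoryUsage) {
    this.maxMemoryUsage = maxMemoryUsage;
  }

  // Called periodically while rows are loaded into the small-table HashMap.
  public void checkMemoryStatus() {
    long used = memoryMXBean.getHeapMemoryUsage().getUsed();
    long max = memoryMXBean.getHeapMemoryUsage().getMax();
    double percentage = (double) used / max;
    if (percentage > maxMemoryUsage) {
      // The real handler throws MapJoinMemoryExhaustionError at this point.
      throw new Error(String.format(
          "Hash table loading exceeded heap usage threshold: %.2f > %.2f",
          percentage, maxMemoryUsage));
    }
  }
}
{code}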

The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be 
inaccurate. The value it returns covers both reachable and unreachable objects 
on the heap, so it may count a large amount of garbage that the JVM simply 
hasn't gotten around to reclaiming yet. This can cause the check to fail 
intermittently even though a simple GC would have reclaimed enough space for 
the process to continue working.
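
For example, a standalone snippet like the following (purely illustrative) will typically report a much larger used-heap figure before an explicit GC than after it, even though none of the allocated data is still reachable:

{code:java}
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.ArrayList;
import java.util.List;

// Illustration only: getHeapMemoryUsage().getUsed() counts unreachable garbage
// until the JVM actually runs a collection.
public class HeapUsedDemo {
  public static void main(String[] args) {
    MemoryMXBean bean = ManagementFactory.getMemoryMXBean();

    // Allocate ~100 MB and then drop all references to it.
    List<byte[]> data = new ArrayList<>();
    for (int i = 0; i < 100; i++) {
      data.add(new byte[1024 * 1024]);
    }
    data = null; // everything above is now garbage, but not yet collected

    long usedBeforeGc = bean.getHeapMemoryUsage().getUsed();
    System.gc(); // only a hint, but usually triggers a full collection in practice
    long usedAfterGc = bean.getHeapMemoryUsage().getUsed();

    // usedBeforeGc is typically far larger than usedAfterGc, so a threshold
    // check based on getUsed() can trip even when most of the heap is reclaimable.
    System.out.printf("used before GC: %d MB, after GC: %d MB%n",
        usedBeforeGc / (1024 * 1024), usedAfterGc / (1024 * 1024));
  }
}
{code}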

We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. In 
Hive-on-MR this approach probably made sense because every Hive task ran in a 
dedicated container, so a Hive task could assume it had created most of the data 
on the heap. However, in Hive-on-Spark multiple Hive tasks can run in a single 
executor, each doing different things.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
