[ https://issues.apache.org/jira/browse/HIVE-17684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622731#comment-16622731 ]
Sahil Takiar commented on HIVE-17684: ------------------------------------- [~mi...@cloudera.com] I'm not entirely sure whats going on with the compilation failures. They don't seem related to the patch, so I think its safe to ignore them for now. I will look into them separately. Hive still seems to be able to compile the code successfully given that it is able to run unit tests. I'm looking it why the HoS tests are failing. > HoS memory issues with MapJoinMemoryExhaustionHandler > ----------------------------------------------------- > > Key: HIVE-17684 > URL: https://issues.apache.org/jira/browse/HIVE-17684 > Project: Hive > Issue Type: Bug > Components: Spark > Reporter: Sahil Takiar > Assignee: Misha Dmitriev > Priority: Major > Attachments: HIVE-17684.01.patch, HIVE-17684.02.patch, > HIVE-17684.03.patch, HIVE-17684.04.patch, HIVE-17684.05.patch, > HIVE-17684.06.patch, HIVE-17684.07.patch, HIVE-17684.08.patch, > HIVE-17684.09.patch, HIVE-17684.10.patch > > > We have seen a number of memory issues due the {{HashSinkOperator}} use of > the {{MapJoinMemoryExhaustionHandler}}. This handler is meant to detect > scenarios where the small table is taking too much space in memory, in which > case a {{MapJoinMemoryExhaustionError}} is thrown. > The configs to control this logic are: > {{hive.mapjoin.localtask.max.memory.usage}} (default 0.90) > {{hive.mapjoin.followby.gby.localtask.max.memory.usage}} (default 0.55) > The handler works by using the {{MemoryMXBean}} and uses the following logic > to estimate how much memory the {{HashMap}} is consuming: > {{MemoryMXBean#getHeapMemoryUsage().getUsed() / > MemoryMXBean#getHeapMemoryUsage().getMax()}} > The issue is that {{MemoryMXBean#getHeapMemoryUsage().getUsed()}} can be > inaccurate. The value returned by this method returns all reachable and > unreachable memory on the heap, so there may be a bunch of garbage data, and > the JVM just hasn't taken the time to reclaim it all. This can lead to > intermittent failures of this check even though a simple GC would have > reclaimed enough space for the process to continue working. > We should re-think the usage of {{MapJoinMemoryExhaustionHandler}} for HoS. > In Hive-on-MR this probably made sense to use because every Hive task was run > in a dedicated container, so a Hive Task could assume it created most of the > data on the heap. However, in Hive-on-Spark there can be multiple Hive Tasks > running in a single executor, each doing different things. -- This message was sent by Atlassian JIRA (v7.6.3#76005)