Wei Zheng created HIVE-12837:
--------------------------------
Summary: Better memory estimation/allocation for hybrid grace hash
join during hash table loading
Key: HIVE-12837
URL: https://issues.apache.org/jira/browse/HIVE-12837
Project: Hive
Issue Type: Bug
Components: Hive
Affects Versions: 2.1.0
Reporter: Wei Zheng
Assignee: Wei Zheng
This is to avoid an edge case when the memory available is very little (less
than a single write buffer size), and we start loading the hash table. Since
the write buffer is lazily allocated, we will easily run out of memory before
even checking if we should spill any hash partition.
e.g.
Total memory available: 210 MB
Size of ref array of BytesBytesMultiHashMap for each hash partition: ~16 MB
Size of write buffer: 8 MB (lazy allocation)
# hash partitions: 16
# hash partitions created in memory: 13
# hash partitions created on disk: 3
Available memory left after HybridHashTableContainer initialization:
210-16*13=2MB
Now let's say a row is to be loaded into a hash partition in memory, it will
try to allocate an 8MB write buffer for it, but we only have 2MB, thus OOM.
Solution is to perform the check for possible spilling earlier so we can spill
partitions if memory is about to be full, to avoid OOM.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)