[ https://issues.apache.org/jira/browse/HIVE-15221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Fei Hui updated HIVE-15221:
---------------------------
Description:
In the current master version:
{code:title=MapJoinMemoryExhaustionHandler.java|borderStyle=solid}
public void checkMemoryStatus(long tableContainerSize, long numRows)
    throws MapJoinMemoryExhaustionException {
  long usedMemory = memoryMXBean.getHeapMemoryUsage().getUsed();
  double percentage = (double) usedMemory / (double) maxHeapSize;
  String msg = Utilities.now() + "\tProcessing rows:\t" + numRows + "\tHashtable size:\t"
      + tableContainerSize + "\tMemory usage:\t" + usedMemory + "\tpercentage:\t"
      + percentageNumberFormat.format(percentage);
  console.printInfo(msg);
  if (percentage > maxMemoryUsage) {
    throw new MapJoinMemoryExhaustionException(msg);
  }
}
{code}
If {{percentage > maxMemoryUsage}}, the method throws MapJoinMemoryExhaustionException.

In my opinion, continuing to run is better than failing. It may be better to call System.gc() first and throw MapJoinMemoryExhaustionException only if {{percentage > maxMemoryUsage}} still holds afterwards.

The original check also has a problem:
1) Heavy memory consumption triggers a GC (e.g. a young GC); the check performed after adding a row then passes.
2) Heavy memory consumption does not trigger a GC; the check performed after adding rows then throws the exception.
Sometimes case 2) occurs even though it contains fewer rows than case 1).

was:
In the current master version:
percentage = (double) usedMemory / (double) maxHeapSize;
If percentage > maxMemoryUsage, the method throws MapJoinMemoryExhaustionException.

In my opinion, continuing to run is better than failing. It may be better to call System.gc() first and throw MapJoinMemoryExhaustionException only if percentage > maxMemoryUsage still holds afterwards.

The original check also has a problem:
1) Heavy memory consumption triggers a GC (e.g. a young GC); the check performed after adding a row then passes.
2) Heavy memory consumption does not trigger a GC; the check performed after adding rows then throws the exception.
Sometimes case 2) occurs even though it contains fewer rows than case 1).
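The proposal in the description (request a GC and re-check before throwing) could look roughly like the sketch below. This is not the content of HIVE-15221.1.patch. The class name {{GcAwareMemoryCheck}}, the injected {{usedHeap}} and {{gc}} hooks, and the use of IllegalStateException in place of Hive's MapJoinMemoryExhaustionException are all assumptions made to keep the example self-contained.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.util.function.LongSupplier;

/** Hypothetical sketch of "GC before throwing"; not the actual Hive patch. */
class GcAwareMemoryCheck {
    private final LongSupplier usedHeap;   // reports currently used heap bytes
    private final Runnable gc;             // asks the JVM to collect garbage
    private final long maxHeapSize;        // heap capacity in bytes
    private final double maxMemoryUsage;   // allowed used/max ratio, e.g. 0.9

    GcAwareMemoryCheck(LongSupplier usedHeap, Runnable gc,
                       long maxHeapSize, double maxMemoryUsage) {
        this.usedHeap = usedHeap;
        this.gc = gc;
        this.maxHeapSize = maxHeapSize;
        this.maxMemoryUsage = maxMemoryUsage;
    }

    /** Wires the check to the real JVM heap and System.gc(). */
    static GcAwareMemoryCheck forJvm(double maxMemoryUsage) {
        MemoryMXBean bean = ManagementFactory.getMemoryMXBean();
        long max = bean.getHeapMemoryUsage().getMax();
        if (max <= 0) {
            // getMax() returns -1 when the maximum is undefined.
            max = Runtime.getRuntime().maxMemory();
        }
        return new GcAwareMemoryCheck(
            () -> bean.getHeapMemoryUsage().getUsed(), System::gc, max, maxMemoryUsage);
    }

    /**
     * Returns the heap usage ratio. Throws only if the ratio is still above
     * the threshold after a GC has been requested and usage re-measured.
     */
    double checkMemoryStatus(long tableContainerSize, long numRows) {
        double percentage = (double) usedHeap.getAsLong() / (double) maxHeapSize;
        if (percentage > maxMemoryUsage) {
            gc.run();  // give the collector a chance to reclaim garbage first
            percentage = (double) usedHeap.getAsLong() / (double) maxHeapSize;
            if (percentage > maxMemoryUsage) {
                // Stand-in for MapJoinMemoryExhaustionException.
                throw new IllegalStateException("Processing rows: " + numRows
                    + " Hashtable size: " + tableContainerSize
                    + " percentage: " + percentage);
            }
        }
        return percentage;
    }
}
```

Two caveats apply. System.gc() is only a request: the JVM may ignore it, for example when started with -XX:+DisableExplicitGC. A full collection can also be expensive, which is why the sketch requests one only after the threshold has already been exceeded rather than on every check.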
> Improvement for MapJoin checkMemoryStatus, adding gc before throwing Exception
> ------------------------------------------------------------------------------
>
>                 Key: HIVE-15221
>                 URL: https://issues.apache.org/jira/browse/HIVE-15221
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Fei Hui
>            Assignee: Fei Hui
>         Attachments: HIVE-15221.1.patch, stat_gc.png