Mostafa Mokhtar created HIVE-7574:
-------------------------------------

             Summary: CommonJoinOperator.checkAndGenObject calls LOG.info per row from probe side in a HashMap join consuming 4% of the CPU
                 Key: HIVE-7574
                 URL: https://issues.apache.org/jira/browse/HIVE-7574
             Project: Hive
          Issue Type: Bug
          Components: HiveServer2
    Affects Versions: 0.13.1
            Reporter: Mostafa Mokhtar
             Fix For: 0.14.0

In a map join, Log4JLogger.trace takes 4% of the CPU time because it is called once per probe-side row by CommonJoinOperator.genAllOneUniqueJoinObject.

The fix is to remove the logging code below from CommonJoinOperator.genAllOneUniqueJoinObject:

{code}
if (allOne) {
  LOG.info("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.info("called genAllOneUniqueJoinObject");
} else {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
}
{code}

And

{code}
if (!hasEmpty && !mayHasMoreThanOne) {
  LOG.trace("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.trace("called genAllOneUniqueJoinObject");
} else if (!hasEmpty && !hasLeftSemiJoin) {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
} else {
  LOG.trace("calling genObject");
  genJoinObject();
  LOG.trace("called genObject");
}
{code}

This is the call stack:

{code}
Stack Trace                                                           Sample Count  Percentage(%)
hadoop.hive.ql.exec.MapJoinOperator.processOp(Object, int)            388           75.486
hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject()            121           23.541
hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject()    92            17.899
commons.logging.impl.Log4JLogger.trace(Object)                        20            3.891
log4j.Category.log(String, Priority, Object, Throwable)               20            3.891
log4j.Category.getEffectiveLevel()                                    10            1.946
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
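
For illustration only: the ticket proposes deleting the log statements outright. A common alternative, not what the ticket proposes, is to check the log level once outside the per-row loop so that a disabled level costs no logger calls per row. The sketch below uses a hypothetical CountingLog stand-in rather than the real commons-logging Log, so it runs without dependencies. The names processRowsUnguarded and processRowsGuarded are likewise made up for this example and are not Hive methods.

```java
public class TraceGuardSketch {

    /** Hypothetical stand-in for a logger; it only counts how often trace() is entered. */
    static final class CountingLog {
        private final boolean traceEnabled;
        private long traceCalls = 0;

        CountingLog(boolean traceEnabled) {
            this.traceEnabled = traceEnabled;
        }

        boolean isTraceEnabled() {
            return traceEnabled;
        }

        void trace(String msg) {
            // A real logger would resolve its effective level here on every call,
            // which is the per-row cost visible in the profile above.
            traceCalls++;
        }

        long traceCalls() {
            return traceCalls;
        }
    }

    /** Mirrors the reported pattern: two trace calls per row, even when TRACE is off. */
    static long processRowsUnguarded(CountingLog log, int rows) {
        for (int i = 0; i < rows; i++) {
            log.trace("calling genAllOneUniqueJoinObject");
            // per-row join work would happen here
            log.trace("called genAllOneUniqueJoinObject");
        }
        return log.traceCalls();
    }

    /** Checks the level once, outside the loop; the logger is never entered when TRACE is off. */
    static long processRowsGuarded(CountingLog log, int rows) {
        final boolean trace = log.isTraceEnabled();
        for (int i = 0; i < rows; i++) {
            if (trace) {
                log.trace("calling genAllOneUniqueJoinObject");
            }
            // per-row join work would happen here
            if (trace) {
                log.trace("called genAllOneUniqueJoinObject");
            }
        }
        return log.traceCalls();
    }

    public static void main(String[] args) {
        int rows = 1_000_000;
        System.out.println("unguarded logger calls: "
                + processRowsUnguarded(new CountingLog(false), rows));
        System.out.println("guarded logger calls:   "
                + processRowsGuarded(new CountingLog(false), rows));
    }
}
```

Caching the level check outside the loop means a log-level change made while the loop is running is not picked up until the next invocation. Removing the statements, as the ticket proposes, avoids that trade-off entirely.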