Mostafa Mokhtar created HIVE-7574:
-------------------------------------
Summary: CommonJoinOperator.checkAndGenObject calls LOG.info per
row from probe side in a HashMap join consuming 4% of the CPU
Key: HIVE-7574
URL: https://issues.apache.org/jira/browse/HIVE-7574
Project: Hive
Issue Type: Bug
Components: HiveServer2
Affects Versions: 0.13.1
Reporter: Mostafa Mokhtar
Fix For: 0.14.0
In a map join, Log4JLogger.trace takes about 4% of the CPU time because it is
called once per probe-side row, around the call to
CommonJoinOperator.genAllOneUniqueJoinObject.
The fix is to remove the logging calls shown below, which wrap
genAllOneUniqueJoinObject and the other gen* calls in
CommonJoinOperator.checkAndGenObject:
{code}
if (allOne) {
  LOG.info("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.info("called genAllOneUniqueJoinObject");
} else {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
}
{code}
And
{code}
if (!hasEmpty && !mayHasMoreThanOne) {
  LOG.trace("calling genAllOneUniqueJoinObject");
  genAllOneUniqueJoinObject();
  LOG.trace("called genAllOneUniqueJoinObject");
} else if (!hasEmpty && !hasLeftSemiJoin) {
  LOG.trace("calling genUniqueJoinObject");
  genUniqueJoinObject(0, 0);
  LOG.trace("called genUniqueJoinObject");
} else {
  LOG.trace("calling genObject");
  genJoinObject();
  LOG.trace("called genObject");
}
{code}
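If some of this tracing needs to stay available for debugging, one alternative to deleting the calls is to check the log level once and guard each per-row call with the cached result. The sketch below is only an illustration and is not the Hive code: the class GuardedJoinLogging, the field traceEnabled and the counter rowsEmitted are hypothetical, and java.util.logging (Level.FINEST) stands in for the commons-logging trace level so that the example is self-contained.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for a join operator; not the real CommonJoinOperator.
public class GuardedJoinLogging {
    private static final Logger LOG =
        Logger.getLogger(GuardedJoinLogging.class.getName());

    // Assumption: the level is read once per operator instance, so the
    // per-row cost is a single boolean test instead of a logger call.
    private final boolean traceEnabled = LOG.isLoggable(Level.FINEST);

    // Counts processed rows so the example has something observable.
    private long rowsEmitted = 0;

    // Placeholder for the real join output logic.
    public void genAllOneUniqueJoinObject() {
        rowsEmitted++;
    }

    // Called once per probe-side row.
    public void checkAndGenObject() {
        if (traceEnabled) {
            LOG.finest("calling genAllOneUniqueJoinObject");
        }
        genAllOneUniqueJoinObject();
        if (traceEnabled) {
            LOG.finest("called genAllOneUniqueJoinObject");
        }
    }

    public long rowsEmitted() {
        return rowsEmitted;
    }

    public static void main(String[] args) {
        GuardedJoinLogging op = new GuardedJoinLogging();
        for (int i = 0; i < 1_000_000; i++) {
            op.checkAndGenObject();
        }
        System.out.println("rows emitted: " + op.rowsEmitted());
    }
}
```

The trade-off is that a cached flag does not pick up a log-level change made while the operator is running. Removing the calls entirely, as proposed here, avoids even the boolean test.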
This is the profiled call stack (sample counts and percentage of total samples):
{code}
Stack Trace                                                         Sample Count  Percentage(%)
hadoop.hive.ql.exec.MapJoinOperator.processOp(Object, int)          388           75.486
hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject()          121           23.541
hadoop.hive.ql.exec.CommonJoinOperator.genAllOneUniqueJoinObject()  92            17.899
commons.logging.impl.Log4JLogger.trace(Object)                      20            3.891
log4j.Category.log(String, Priority, Object, Throwable)             20            3.891
log4j.Category.getEffectiveLevel()                                  10            1.946
{code}