[ https://issues.apache.org/jira/browse/HIVE-18445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325380#comment-16325380 ]
Laszlo Bodor edited comment on HIVE-18445 at 1/19/18 1:02 PM:
--------------------------------------------------------------

The issue here is that the test is about exhausting the local mapper task's memory, and to achieve this, it sets a parameter at the beginning...
{code:java}
set hive.mapjoin.localtask.max.memory.usage = 0.0001;
{code}
...so the task can use only 0.01% of the process' memory. That is fine for testing memory exhaustion, but the problem is that the setting affects every query in the test, not only the one that is meant to fail.

Checking the q.out file, it seems we expect the exhaustion to happen while running the 2nd query:
{code:java}
FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key)
INSERT OVERWRITE TABLE dest_j2 SELECT src1.key, src3.value;
{code}
But when the test fails, it fails on the first statement (which is not supposed to fail):
{code:java}
FROM srcpart src1 JOIN src src2 ON (src1.key = src2.key)
INSERT OVERWRITE TABLE dest1 SELECT src1.key, src2.value
where (src1.ds = '2008-04-08' or src1.ds = '2008-04-09') and (src1.hr = '12' or src1.hr = '11');
{code}
I think the best practice would be to set the parameter right before the target query, and reset it to the default (or a higher value) afterwards, like:
{code:java}
set hive.mapjoin.localtask.max.memory.usage = 0.0001;

FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key)
INSERT OVERWRITE TABLE dest_j2 SELECT src1.key, src3.value;

set hive.mapjoin.localtask.max.memory.usage = 0.9;
{code}

was (Author: abstractdog):
The issue here is, that the test is about exhausting the local mapper task's memory, and to achieve this, it sets a parameter at the beginning...
{code}
set hive.mapjoin.localtask.max.memory.usage = 0.0001;
{code}
...so the task can use the 0.0001% percent of the process' memory. It seems to be ok for testing memory exhaustion, but the problem is that it affects all queries.
Checking the q.out file, it seems like we expect an exhaustion by running the 2nd query:
{code}
FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key)
INSERT OVERWRITE TABLE dest_j2 SELECT src1.key, src3.value;
{code}
But when the test fails, it fails on the first statement (which is not supposed to fail):
{code}
FROM srcpart src1 JOIN src src2 ON (src1.key = src2.key)
INSERT OVERWRITE TABLE dest1 SELECT src1.key, src2.value
where (src1.ds = '2008-04-08' or src1.ds = '2008-04-09' )and (src1.hr = '12' or src1.hr = '11');
{code}
I think the best practise would be to set the parameter before the target query, and reset it to default (or a higher value) after, like:
{code}
set hive.mapjoin.localtask.max.memory.usage = 0.0001;

FROM src src1 JOIN src src2 ON (src1.key = src2.key) JOIN src src3 ON (src1.key + src2.key = src3.key)
INSERT OVERWRITE TABLE dest_j2 SELECT src1.key, src3.value;

set hive.mapjoin.localtask.max.memory.usage = 0.9;
{code}

> qtests: auto_join25.q fails permanently
> ---------------------------------------
>
>                 Key: HIVE-18445
>                 URL: https://issues.apache.org/jira/browse/HIVE-18445
>             Project: Hive
>          Issue Type: Bug
>          Components: Tests
>            Reporter: Laszlo Bodor
>            Assignee: Laszlo Bodor
>            Priority: Major
>
> org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[auto_join25] (batchId=72)

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)