[ https://issues.apache.org/jira/browse/HIVE-23802?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17835718#comment-17835718 ]
kongxianghe commented on HIVE-23802:
------------------------------------

Have you tried setting hive.server2.tez.default.queues?

> “merge files” job was submited to default queue when set hive.merge.tezfiles to true
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-23802
>                 URL: https://issues.apache.org/jira/browse/HIVE-23802
>             Project: Hive
>          Issue Type: Bug
>          Components: HiveServer2
>    Affects Versions: 3.1.0
>            Reporter: gaozhan ding
>            Assignee: gaozhan ding
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: 15940042679272.png, HIVE-23802.patch
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> We use Tez as the query engine. When hive.merge.tezfiles is set to true, the
> merge-files task, which follows the original task, is submitted to the default
> queue rather than to the same queue as the original task.
> I studied this issue for days and found that every time a container is started,
> "tez.queue.name" is unset in the current session. The code is as below:
> {code:java}
> // TezSessionState.startSessionAndContainers()
> // sessionState.getQueueName() comes from cluster wide configured queue names.
> // sessionState.getConf().get("tez.queue.name") is explicitly set by user in a session.
> // TezSessionPoolManager sets tez.queue.name if user has specified one or use the one from
> // cluster wide queue names.
> // There is no way to differentiate how this was set (user vs system).
> // Unset this after opening the session so that reopening of session uses the correct queue
> // names i.e, if client has not died and if the user has explicitly set a queue name
> // then reopened session will use user specified queue name else default cluster queue names.
> conf.unset(TezConfiguration.TEZ_QUEUE_NAME);
> {code}
> So after the original task has been submitted to YARN, "tez.queue.name" is unset.
> When the merge-files task starts, it tries to reuse the session of the original
> job, but the check below returns false because tez.queue.name was unset.
> It seems we should not unset this property.
> {code:java}
> // TezSessionPoolManager.canWorkWithSameSession()
> if (!session.isDefault()) {
>   String queueName = session.getQueueName();
>   String confQueueName = conf.get(TezConfiguration.TEZ_QUEUE_NAME);
>   LOG.info("Current queue name is " + queueName + " incoming queue name is " + confQueueName);
>   return (queueName == null) ? confQueueName == null : queueName.equals(confQueueName);
> } else {
>   // this session should never be a default session unless something has messed up.
>   throw new HiveException("The pool session " + session + " should have been returned to the pool");
> }
> {code}
> !15940042679272.png!
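
For anyone trying to reason about the mismatch described in the quoted issue, here is a minimal, self-contained sketch of it. It does not use the Hive or Tez classes: `QueueReuseSketch`, `Session`, and `openSession` are hypothetical stand-ins, a plain `Map` plays the role of the configuration, and the queue name "etl" is made up. Only the comparison logic and the "unset after opening" step are taken from the snippets quoted in the issue.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class QueueReuseSketch {
    static final String TEZ_QUEUE_NAME = "tez.queue.name";

    // Hypothetical stand-in for a session that remembers the queue it was opened on.
    static final class Session {
        private final String queueName;

        Session(String queueName) {
            this.queueName = queueName;
        }

        String getQueueName() {
            return queueName;
        }
    }

    // Models the quoted behaviour: open the session on the configured queue,
    // then unset tez.queue.name in the configuration.
    static Session openSession(Map<String, String> conf) {
        Session session = new Session(conf.get(TEZ_QUEUE_NAME));
        conf.remove(TEZ_QUEUE_NAME);
        return session;
    }

    // Same null-safe comparison as the quoted canWorkWithSameSession() check.
    static boolean canWorkWithSameSession(Session session, Map<String, String> conf) {
        return Objects.equals(session.getQueueName(), conf.get(TEZ_QUEUE_NAME));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(TEZ_QUEUE_NAME, "etl"); // the user picks a queue for the original task

        Session session = openSession(conf); // the key is unset as a side effect

        // The follow-up (merge) task sees no queue name, so the session is not reused.
        System.out.println(canWorkWithSameSession(session, conf)); // prints "false"

        // If the key were still set, the same session would match.
        conf.put(TEZ_QUEUE_NAME, "etl");
        System.out.println(canWorkWithSameSession(session, conf)); // prints "true"
    }
}
```

In this model the first check fails only because the key was removed between the two tasks, which matches the reporter's reading of the code. Whether leaving the key set is safe in the real code depends on the session-reopen case that the quoted comment mentions, which this sketch does not cover.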