[ https://issues.apache.org/jira/browse/HIVE-17941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Janaki Lahorani reassigned HIVE-17941:
--------------------------------------

    Assignee: Janaki Lahorani

> Don't Re-Create RunningJob Client During Status Checks
> ------------------------------------------------------
>
>                 Key: HIVE-17941
>                 URL: https://issues.apache.org/jira/browse/HIVE-17941
>             Project: Hive
>          Issue Type: Improvement
>          Components: HiveServer2
>    Affects Versions: 3.0.0, 2.3.1
>            Reporter: BELUGA BEHR
>            Assignee: Janaki Lahorani
>            Priority: Major
>
> {code:java|title=org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper}
>   while (!rj.isComplete()) {
>     ...
>     RunningJob newRj = jc.getJob(rj.getID());
>     if (newRj == null) {
>       // under exceptional load, hadoop may not be able to look up status
>       // of finished jobs (because it has purged them from memory). From
>       // hive's perspective - it's equivalent to the job having failed.
>       // So raise a meaningful exception
>       throw new IOException("Could not find status of job:" + rj.getID());
>     } else {
>       th.setRunningJob(newRj);
>       rj = newRj;
>     }
>   }
>   ...
> }
> {code}
> https://github.com/apache/hive/blob/a9f25c0e7ad3f81a9f00f601947a161516e33f1b/ql/src/java/org/apache/hadoop/hive/ql/exec/mr/HadoopJobExecHelper.java#L295-L306
> Every time we loop here for a status update, we are rebuilding the RunningJob
> object to test if the Job information is still loaded in YARN. Rebuilding
> this RunningJob object is not trivial because it requires that we re-load and
> parse the Job Configuration XML file every time.
> {code:java|title=Outdated Stacktrace But Same Idea Holds}
> at java.io.FileInputStream.open(Native Method)
> at java.io.FileInputStream.<init>(FileInputStream.java:120)
> at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:1924)
> at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:1877)
> at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:1785)
> at org.apache.hadoop.conf.Configuration.get(Configuration.java:712)
> at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:1951)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:398)
> at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:388)
> at org.apache.hadoop.mapred.JobClient$NetworkedJob.<init>(JobClient.java:174)
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:655)
> at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:668)
> at org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:282)
> at org.apache.hadoop.hive.ql.exec.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:532)
> {code}
> Maybe we can use {{isRetired()}} instead for this particular check. We
> also probably need to be better about checking the return value of every
> {{RunningJob}} method, since any of them can fail or stop returning data
> once YARN purges the job information. It seems that this lookup was
> perhaps an attempt to detect a purged job before exercising the
> {{RunningJob}} object... even though the object can go bad at any point.
> https://hadoop.apache.org/docs/r2.7.1/api/org/apache/hadoop/mapred/RunningJob.html

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
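
The {{isRetired()}} idea from the description could look roughly like the sketch below. It polls a single job handle and never re-creates it through the client. Everything here is illustrative: JobHandle is a hypothetical stand-in for the few RunningJob methods involved, FakeJob and pollUntilComplete are made-up names, and whether the real {{isRetired()}} actually reports a job that YARN has purged is an assumption this sketch does not verify.

```java
import java.io.IOException;

public class JobPollSketch {

    /** Hypothetical stand-in for the subset of RunningJob used by the loop. */
    interface JobHandle {
        boolean isComplete() throws IOException;
        boolean isRetired() throws IOException;
        String getId();
    }

    /**
     * Polls the same handle until the job completes. Unlike the loop quoted in
     * the issue, no new handle is built per iteration, so no job configuration
     * is re-parsed. Returns the number of status checks made before completion.
     */
    static int pollUntilComplete(JobHandle rj, long pollIntervalMs)
            throws IOException, InterruptedException {
        int checks = 0;
        while (!rj.isComplete()) {
            checks++;
            // Assumption: a retired job means its status is no longer available.
            if (rj.isRetired()) {
                throw new IOException("Could not find status of job:" + rj.getId());
            }
            Thread.sleep(pollIntervalMs);
        }
        return checks;
    }

    /** Fake job that reports completion after a fixed number of polls. */
    static final class FakeJob implements JobHandle {
        private int remainingPolls;
        private final boolean retired;

        FakeJob(int pollsUntilDone, boolean retired) {
            this.remainingPolls = pollsUntilDone;
            this.retired = retired;
        }

        @Override
        public boolean isComplete() {
            // Not complete while polls remain; each call consumes one.
            return remainingPolls-- <= 0;
        }

        @Override
        public boolean isRetired() {
            return retired;
        }

        @Override
        public String getId() {
            return "job_fake_0001";
        }
    }

    public static void main(String[] args) throws Exception {
        int checks = pollUntilComplete(new FakeJob(3, false), 1L);
        System.out.println("status checks before completion: " + checks);
    }
}
```

In the real helper, the per-iteration progress reporting and th.setRunningJob(...) bookkeeping would still be needed; the sketch only shows the shape of the status check itself.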