zhihai xu created HADOOP-12404:
----------------------------------

             Summary: Disable caching for JarURLConnection to avoid sharing JarFile with other users when loading resource from URL in Configuration class.
                 Key: HADOOP-12404
                 URL: https://issues.apache.org/jira/browse/HADOOP-12404
             Project: Hadoop Common
          Issue Type: Improvement
          Components: conf
            Reporter: zhihai xu
            Assignee: zhihai xu
            Priority: Minor
Disable caching for JarURLConnection to avoid sharing the JarFile with other users when loading a resource from a URL in the Configuration class.

Currently {{Configuration#parse}} calls {{url.openStream}} to get the InputStream for {{DocumentBuilder}} to parse. Based on the JDK source code, the calling sequence is:
{{url.openStream}} => [handler.openConnection.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/Handler.java] => [new JarURLConnection|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarURLConnection.java#JarURLConnection] => {{JarURLConnection.connect}} => [factory.get(getJarFileURL(), getUseCaches())|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/JarFileFactory.java] => [URLJarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/sun/net/www/protocol/jar/URLJarFile.java#URLJarFile.getJarFile%28java.net.URL%2Csun.net.www.protocol.jar.URLJarFile.URLJarFileCloseController%29] => [JarFile.getInputStream|http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/jar/JarFile.java#JarFile.getInputStream%28java.util.zip.ZipEntry%29] => {{ZipFile.getInputStream}}

If {{URLConnection#getUseCaches}} is true (the default), the URLJarFile is shared for the same URL. If the shared URLJarFile is closed by another user, every InputStream returned by {{URLJarFile#getInputStream}} is closed as well, per the documentation: http://docs.oracle.com/javase/7/docs/api/java/util/zip/ZipFile.html#getInputStream(java.util.zip.ZipEntry)

So in rare situations we saw the following exception, which caused a Hive job to fail on a heavily loaded system; a sketch of the proposed change follows the stack trace.
{code}
2014-10-21 23:44:41,856 ERROR org.apache.hadoop.hive.ql.exec.Task: Ended Job = job_1413909398487_3696 with exception 'java.lang.RuntimeException(java.io.IOException: Stream closed)'
java.lang.RuntimeException: java.io.IOException: Stream closed
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2484)
    at org.apache.hadoop.conf.Configuration.loadResources(Configuration.java:2337)
    at org.apache.hadoop.conf.Configuration.getProps(Configuration.java:2254)
    at org.apache.hadoop.conf.Configuration.get(Configuration.java:861)
    at org.apache.hadoop.mapred.JobConf.checkAndWarnDeprecation(JobConf.java:2030)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:479)
    at org.apache.hadoop.mapred.JobConf.<init>(JobConf.java:469)
    at org.apache.hadoop.mapreduce.Cluster.getJob(Cluster.java:187)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:582)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:580)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.mapred.JobClient.getJobUsingCluster(JobClient.java:580)
    at org.apache.hadoop.mapred.JobClient.getJob(JobClient.java:598)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:288)
    at org.apache.hadoop.hive.ql.exec.mr.HadoopJobExecHelper.progress(HadoopJobExecHelper.java:547)
    at org.apache.hadoop.hive.ql.exec.mr.ExecDriver.execute(ExecDriver.java:426)
    at org.apache.hadoop.hive.ql.exec.mr.MapRedTask.execute(MapRedTask.java:136)
    at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:153)
    at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:85)
    at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1516)
    at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1283)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1101)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:924)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:919)
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:145)
    at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Stream closed
    at java.util.zip.InflaterInputStream.ensureOpen(InflaterInputStream.java:67)
    at java.util.zip.InflaterInputStream.read(InflaterInputStream.java:142)
    at java.io.FilterInputStream.read(FilterInputStream.java:133)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityManager$RewindableInputStream.read(XMLEntityManager.java:2902)
    at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:302)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1753)
    at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.skipChar(XMLEntityScanner.java:1426)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDriver.next(XMLDocumentFragmentScannerImpl.java:2807)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentScannerImpl.next(XMLDocumentScannerImpl.java:606)
    at com.sun.org.apache.xerces.internal.impl.XMLNSDocumentScannerImpl.next(XMLNSDocumentScannerImpl.java:117)
    at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:510)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:848)
    at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:777)
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.DOMParser.parse(DOMParser.java:243)
    at com.sun.org.apache.xerces.internal.jaxp.DocumentBuilderImpl.parse(DocumentBuilderImpl.java:347)
    at javax.xml.parsers.DocumentBuilder.parse(DocumentBuilder.java:150)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2325)
    at org.apache.hadoop.conf.Configuration.parse(Configuration.java:2313)
    at org.apache.hadoop.conf.Configuration.loadResource(Configuration.java:2384)
{code}
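As a reference point, here is a minimal sketch of the direction described in the summary, not the actual patch: open the connection explicitly and call {{URLConnection#setUseCaches(false)}} for jar: URLs before requesting the stream, so the parse reads from a private JarFile rather than the shared cached one. The helper name {{openUncached}} is hypothetical; the real change would go in the stream-opening step of {{Configuration#parse}}.
{code}
import java.io.IOException;
import java.io.InputStream;
import java.net.JarURLConnection;
import java.net.URL;
import java.net.URLConnection;

public class UncachedUrlOpen {
  /**
   * Hypothetical stand-in for the stream-opening step of Configuration#parse.
   * For jar: URLs it disables the JDK JarFile cache so that a close() issued
   * by another user of the same cached URLJarFile cannot invalidate this stream.
   */
  public static InputStream openUncached(URL url) throws IOException {
    URLConnection connection = url.openConnection();
    if (connection instanceof JarURLConnection) {
      // getUseCaches() defaults to true, in which case the URLJarFile is
      // shared for the same URL; disabling caching gives this caller its own
      // JarFile and avoids the "Stream closed" failure shown above.
      connection.setUseCaches(false);
    }
    return connection.getInputStream();
  }
}
{code}
The trade-off is that each load of a jar-backed resource re-opens the jar instead of hitting the JDK cache, which should be an acceptable cost for configuration parsing.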
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)