Hi Cody, I ran into this issue a few days ago and posted a PR for it (https://github.com/apache/spark/pull/1385). Strangely, synchronizing on conf deadlocks, but synchronizing on initLocalJobConfFuncOpt works fine.
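A possible explanation, sketched in Java under stated assumptions: in the dump quoted below, worker-1 holds the FileSystem class monitor and waits for the Configuration instance, while worker-0 waits for the FileSystem class monitor. If worker-0 is holding that Configuration through the conf-synchronized block in HadoopRDD (an assumption, since those frames are elided), the two threads take the same two locks in opposite order. Guarding with an object that Hadoop never locks would break the cycle. The class and method names here are hypothetical, and ReentrantLock with a timed tryLock stands in for the real monitors so the demo terminates instead of hanging.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the two executor threads in the dump.
// sparkGuard: what the Spark-side code synchronizes on (assumed).
// confLock:   stands in for the shared Configuration instance monitor.
// classLock:  stands in for the FileSystem class monitor.
final class LockOrderSketch {

    // Returns true if the "Hadoop side" could take confLock while holding classLock.
    // With real (untimed) monitors, a false result here would be a deadlock.
    static boolean hadoopSideGetsConf(ReentrantLock sparkGuard,
                                      ReentrantLock confLock,
                                      ReentrantLock classLock) {
        CountDownLatch bothHoldFirstLock = new CountDownLatch(2);
        AtomicBoolean gotConf = new AtomicBoolean(false);

        // Like worker-0: holds the guard, then needs the class lock.
        Thread worker0 = new Thread(() -> {
            sparkGuard.lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                classLock.lock();   // blocks until worker1 releases it
                classLock.unlock();
            } finally {
                sparkGuard.unlock();
            }
        });

        // Like worker-1: holds the class lock, then needs the conf lock.
        Thread worker1 = new Thread(() -> {
            classLock.lock();
            try {
                bothHoldFirstLock.countDown();
                awaitQuietly(bothHoldFirstLock);
                // Timed attempt so the sketch reports the cycle instead of hanging.
                if (confLock.tryLock(200, TimeUnit.MILLISECONDS)) {
                    gotConf.set(true);
                    confLock.unlock();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                classLock.unlock();
            }
        });

        worker0.start();
        worker1.start();
        try {
            worker0.join();
            worker1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return gotConf.get();
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        ReentrantLock conf = new ReentrantLock();
        ReentrantLock fsClass = new ReentrantLock();
        // Guard is the conf itself: opposite lock order, the timed attempt fails.
        System.out.println(hadoopSideGetsConf(conf, conf, fsClass));
        // Guard is a private object: no cycle, the attempt succeeds.
        System.out.println(hadoopSideGetsConf(new ReentrantLock(), conf, fsClass));
    }
}
```

If this reading is right, it would also explain why synchronizing on initLocalJobConfFuncOpt avoids the hang: nothing inside Hadoop tries to lock that object, so no cycle can form. It would not by itself make concurrent access to the shared conf safe.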
Here's the entire jstack output.

On Mon, Jul 14, 2014 at 4:44 PM, Patrick Wendell <pwend...@gmail.com> wrote:

Hey Cody,

This Jstack seems truncated, would you mind giving the entire stack trace? For the second thread, for instance, we can't see where the lock is being acquired.

- Patrick

On Mon, Jul 14, 2014 at 1:42 PM, Cody Koeninger <cody.koenin...@mediacrossing.com> wrote:
> Hi all, just wanted to give a heads up that we're seeing a reproducible
> deadlock with spark 1.0.1 with 2.3.0-mr1-cdh5.0.2
>
> If jira is a better place for this, apologies in advance - figured talking
> about it on the mailing list was friendlier than randomly (re)opening jira
> tickets.
>
> I know Gary had mentioned some issues with 1.0.1 on the mailing list, once
> we got a thread dump I wanted to follow up.
>
> The thread dump shows the deadlock occurs in the synchronized block of code
> that was changed in HadoopRDD.scala, for the Spark-1097 issue
>
> Relevant portions of the thread dump are summarized below, we can provide
> the whole dump if it's useful.
>
> Found one Java-level deadlock:
> =============================
> "Executor task launch worker-1":
>   waiting to lock monitor 0x00007f250400c520 (object 0x00000000fae7dc30, a org.apache.hadoop.conf.Configuration),
>   which is held by "Executor task launch worker-0"
> "Executor task launch worker-0":
>   waiting to lock monitor 0x00007f2520495620 (object 0x00000000faeb4fc8, a java.lang.Class),
>   which is held by "Executor task launch worker-1"
>
>
> "Executor task launch worker-1":
>         at org.apache.hadoop.conf.Configuration.reloadConfiguration(Configuration.java:791)
>         - waiting to lock <0x00000000fae7dc30> (a org.apache.hadoop.conf.Configuration)
>         at org.apache.hadoop.conf.Configuration.addDefaultResource(Configuration.java:690)
>         - locked <0x00000000faca6ff8> (a java.lang.Class for org.apache.hadoop.conf.Configuration)
>         at org.apache.hadoop.hdfs.HdfsConfiguration.<clinit>(HdfsConfiguration.java:34)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.<clinit>(DistributedFileSystem.java:110)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>         at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
>         at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>         at java.lang.reflect.Constructor.newInstance(Constructor.java:525)
>         at java.lang.Class.newInstance0(Class.java:374)
>         at java.lang.Class.newInstance(Class.java:327)
>         at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:373)
>         at java.util.ServiceLoader$1.next(ServiceLoader.java:445)
>         at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2364)
>         - locked <0x00000000faeb4fc8> (a java.lang.Class for org.apache.hadoop.fs.FileSystem)
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>         at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:587)
>         at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:315)
>         at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:288)
>         at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>         at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>         at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
>
>
>
> ...elided...
>
>
> "Executor task launch worker-0" daemon prio=10 tid=0x0000000001e71800 nid=0x2d97 waiting for monitor entry [0x00007f24d2bf1000]
>    java.lang.Thread.State: BLOCKED (on object monitor)
>         at org.apache.hadoop.fs.FileSystem.loadFileSystems(FileSystem.java:2362)
>         - waiting to lock <0x00000000faeb4fc8> (a java.lang.Class for org.apache.hadoop.fs.FileSystem)
>         at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2375)
>         at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2392)
>         at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:89)
>         at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2431)
>         at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2413)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:368)
>         at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:167)
>         at org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:587)
>         at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:315)
>         at org.apache.hadoop.mapred.FileInputFormat.setInputPaths(FileInputFormat.java:288)
>         at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>         at org.apache.spark.SparkContext$$anonfun$22.apply(SparkContext.scala:546)
>         at org.apache.spark.rdd.HadoopRDD$$anonfun$getJobConf$1.apply(HadoopRDD.scala:145)
--
Best Regards
Fei Wang