Hi Marcelo,

Thanks for your response.

I have dumped the threads on the server where I submitted the Spark
application:

'''
...
"dispatcher-event-loop-2" #28 daemon prio=5 os_prio=0 tid=0x00007f56cee0e000 nid=0x1cb6 waiting on condition [0x00007f5699811000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000006400161b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-1" #27 daemon prio=5 os_prio=0 tid=0x00007f56cee0c800 nid=0x1cb5 waiting on condition [0x00007f5699912000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000006400161b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"dispatcher-event-loop-0" #26 daemon prio=5 os_prio=0 tid=0x00007f56cee0c000 nid=0x1cb4 waiting on condition [0x00007f569a120000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x00000006400161b8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
    at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

"Service Thread" #20 daemon prio=9 os_prio=0 tid=0x00007f56cc12d800 nid=0x1ca5 runnable [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"C1 CompilerThread14" #19 daemon prio=9 os_prio=0 tid=0x00007f56cc12a000 nid=0x1ca4 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE
...
"Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f56cc0ce000 nid=0x1c93 in Object.wait() [0x00007f56ab3f2000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:143)
    - locked <0x00000006400cd498> (a java.lang.ref.ReferenceQueue$Lock)
    at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:164)
    at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:209)

"Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f56cc0c9800 nid=0x1c92 in Object.wait() [0x00007f55cfffe000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:502)
    at java.lang.ref.Reference.tryHandlePending(Reference.java:191)
    - locked <0x00000006400a2660> (a java.lang.ref.Reference$Lock)
    at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)

"main" #1 prio=5 os_prio=0 tid=0x00007f56cc021000 nid=0x1c74 in Object.wait() [0x00007f56d344c000]
   java.lang.Thread.State: WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Thread.join(Thread.java:1249)
    - locked <0x000000064056f6a0> (a org.apache.hadoop.util.ShutdownHookManager$1)
    at java.lang.Thread.join(Thread.java:1323)
    at java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:106)
    at java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
    at java.lang.Shutdown.runHooks(Shutdown.java:123)
    at java.lang.Shutdown.sequence(Shutdown.java:167)
    at java.lang.Shutdown.exit(Shutdown.java:212)
    - locked <0x00000006404e65b8> (a java.lang.Class for java.lang.Shutdown)
    at java.lang.Runtime.exit(Runtime.java:109)
    at java.lang.System.exit(System.java:971)
    at scala.sys.package$.exit(package.scala:40)
    at scala.sys.package$.exit(package.scala:33)
    at actionmodel.ParallelAdvertiserBeaconModel$.main(ParallelAdvertiserBeaconModel.scala:252)
    at actionmodel.ParallelAdvertiserBeaconModel.main(ParallelAdvertiserBeaconModel.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:879)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

"VM Thread" os_prio=0 tid=0x00007f56cc0c1800 nid=0x1c91 runnable
...
'''
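As a cross-check on dumps like the one above, the same question ("which threads can still keep the JVM alive?") can be asked from inside the application. This is a small sketch (the object name is mine, and it uses `scala.collection.JavaConverters`, i.e. pre-2.13 Scala) that lists the live non-daemon threads, since daemon threads never block JVM exit:

```scala
import scala.collection.JavaConverters._

// Only non-daemon threads can keep a JVM alive, so listing the live
// non-daemon threads shows exactly what is blocking exit.
object NonDaemonThreads {
  def main(args: Array[String]): Unit = {
    val blocking = Thread.getAllStackTraces.keySet.asScala
      .filter(t => t.isAlive && !t.isDaemon)
    blocking.foreach(t => println(s"${t.getName} state=${t.getState}"))
  }
}
```

Printing this just before `spark.stop()` returns would show whether anything besides "main" is still non-daemon at that point.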

I have no clear idea what went wrong. I did call awaitTermination to
terminate the thread pool. Or is there any way to force-close all those
'WAITING' threads associated with my Spark application?
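In case a sketch helps: one way to guarantee that pool threads can never hold the JVM open is to build the pool with a daemon `ThreadFactory`, and to use `awaitTermination` after `shutdown()` so the tasks actually drain. This is only a sketch against the snippet from earlier in the thread; the object name and the trivial `Future(21 * 2)` task are mine:

```scala
import java.util.concurrent.{Executors, ThreadFactory, TimeUnit}
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object DaemonPool {
  // Threads marked as daemon can never keep the JVM alive on their own,
  // so a leaked or parked pool thread no longer blocks shutdown.
  val daemonFactory: ThreadFactory = new ThreadFactory {
    def newThread(r: Runnable): Thread = {
      val t = new Thread(r)
      t.setDaemon(true)
      t
    }
  }

  def main(args: Array[String]): Unit = {
    val pool = Executors.newFixedThreadPool(3, daemonFactory)
    implicit val xc: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(pool)

    // Stand-in for the real train1/train2/train3 futures.
    val result = Await.result(Future(21 * 2), 1.minute)
    println(s"result=$result")

    pool.shutdown()
    // Unlike shutdown(), awaitTermination blocks until the queued
    // tasks have drained (or the timeout fires).
    pool.awaitTermination(60, TimeUnit.SECONDS)
  }
}
```

Note, though, that the dump above shows the dispatcher threads are already daemon, and "main" is stuck inside the shutdown-hook sequence itself, so the daemon factory is a safety net rather than a certain fix here.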

On Wed, Jan 16, 2019 at 8:31 AM Marcelo Vanzin <van...@cloudera.com> wrote:

> If System.exit() doesn't work, you may have a bigger problem
> somewhere. Check your threads (using e.g. jstack) to see what's going
> on.
>
> On Wed, Jan 16, 2019 at 8:09 AM Pola Yao <pola....@gmail.com> wrote:
> >
> > Hi Marcelo,
> >
> > Thanks for your reply! It made sense to me. However, I've tried many
> ways to exit Spark (e.g., System.exit()), but they all failed. Is there an
> explicit way to shut down all the live threads in the Spark application and
> then quit afterwards?
> >
> >
> > On Tue, Jan 15, 2019 at 2:38 PM Marcelo Vanzin <van...@cloudera.com>
> wrote:
> >>
> >> You should check the active threads in your app. Since your pool uses
> >> non-daemon threads, that will prevent the app from exiting.
> >>
> >> spark.stop() should have stopped the Spark jobs in other threads, at
> >> least. But if something is blocking one of those threads, or if
> >> something is creating a non-daemon thread that stays alive somewhere,
> >> you'll see that.
> >>
> >> Or you can force quit with sys.exit.
> >>
> >> On Tue, Jan 15, 2019 at 1:30 PM Pola Yao <pola....@gmail.com> wrote:
> >> >
> >> > I submitted a Spark job through the ./spark-submit command. The code
> executed successfully; however, the application got stuck when trying to
> quit Spark.
> >> >
> >> > My code snippet:
> >> > '''
> >> > {
> >> >
> >> > val spark = SparkSession.builder.master(...).getOrCreate
> >> >
> >> > val pool = Executors.newFixedThreadPool(3)
> >> > implicit val xc = ExecutionContext.fromExecutorService(pool)
> >> > val taskList = List(train1, train2, train3)  // where each train* is a
> Future-returning function that wraps up the data reading, feature
> engineering, and machine learning steps
> >> > val results = Await.result(Future.sequence(taskList), 20 minutes)
> >> >
> >> > println("Shutting down pool and executor service")
> >> > pool.shutdown()
> >> > xc.shutdown()
> >> >
> >> > println("Exiting spark")
> >> > spark.stop()
> >> >
> >> > }
> >> > '''
> >> >
> >> > After I submitted the job, I could see from the terminal that the
> code executed and printed "Exiting spark"; however, after printing that
> line it never exited Spark, it just got stuck.
> >> >
> >> > Does anybody know what the reason is? Or how to force quit?
> >> >
> >> > Thanks!
> >> >
> >> >
> >>
> >>
> >> --
> >> Marcelo
>
>
>
> --
> Marcelo
>
