toninis opened a new issue, #5249: URL: https://github.com/apache/hudi/issues/5249
**Describe the problem you faced** We 've noticed that when you enable hoodie JMX metrics Shutdown hook is never called. We took a thread dump to check with threads remain in running state and keep the JVM up . ``` "RMI TCP Accept-9889": running at [java.net.PlainSocketImpl.socketAccept(Native Method)](https://spotify.github.io/threaddump-analyzer/#java.net.PlainSocketImpl.socketAccept(Native%20Method)) at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409) at java.net.ServerSocket.implAccept(ServerSocket.java:560) at java.net.ServerSocket.accept(ServerSocket.java:528) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:405) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:377) at java.lang.Thread.run(Thread.java:750) ``` **To Reproduce** Steps to reproduce the behavior: 1. Enable JMX hoodie metrics ``` hoodie.metrics.on=true hoodie.metrics.reporter.type=JMX ``` 2. Run a Deltastreamer job on Kubernetes **Expected behavior** We expect when Spark Context is stopped also the JVM should stop. **Environment Description** * Hudi version : 0.9.0 * Spark version : 3.1.2 * Hive version : 2.3.7 * Hadoop version : N/A * Storage (HDFS/S3/GCS..) : S3 * Running on Docker? (yes/no) : yes * Java Version : ``` openjdk version "1.8.0_322" OpenJDK Runtime Environment (build 1.8.0_322-b06) OpenJDK 64-Bit Server VM (build 25.322-b06, mixed mode) ``` & ``` openjdk version "11.0.14.1" 2022-02-08 OpenJDK Runtime Environment 18.9 (build 11.0.14.1+1) OpenJDK 64-Bit Server VM 18.9 (build 11.0.14.1+1, mixed mode, sharing) ``` **Additional context** We tested this with both Java 11 and Java 8 and got the same results . You can find below the full thread dump output once the Spark Context is succefully stoppped . We got it with `kill -3 <pid>` . <details><summary>Full Thread Dump Output</summary> ``` Full thread dump OpenJDK 64-Bit Server VM (25.322-b06 mixed mode): "DestroyJavaVM" #73 prio=5 os_prio=0 tid=0x00007f8a7c011000 nid=0x1a waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "RMI TCP Connection(idle)" #67 daemon prio=5 os_prio=0 tid=0x00007f8974003800 nid=0x5d waiting on condition [0x00007f897b8f9000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000c10bc948> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "RMI RenewClean-[100.97.64.2:9889]" #66 daemon prio=5 os_prio=0 tid=0x00007f896c006800 nid=0x5c in Object.wait() [0x00007f897b9fa000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x00000000c1064a20> (a java.lang.ref.ReferenceQueue$Lock) at sun.rmi.transport.DGCClient$EndpointEntry$RenewCleanThread.run(DGCClient.java:563) at java.lang.Thread.run(Thread.java:750) "RMI Scheduler(0)" #65 daemon prio=5 os_prio=0 tid=0x00007f8a7ddf6000 nid=0x5b waiting on condition [0x00007f897bafb000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000c1064c28> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "RMI TCP Connection(idle)" #64 daemon prio=5 os_prio=0 tid=0x00007f8974001800 nid=0x5a waiting on condition [0x00007f897bbfc000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000c10bc948> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "GC Daemon" #63 daemon prio=2 os_prio=0 tid=0x00007f8a7dde9800 nid=0x59 in Object.wait() [0x00007f897befd000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at sun.misc.GC$Daemon.run(GC.java:117) - locked <0x00000000c1067260> (a sun.misc.GC$LatencyLock) "RMI Reaper" #62 prio=5 os_prio=0 tid=0x00007f8a7dde7800 nid=0x58 in Object.wait() [0x00007f897bffe000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x00000000c10673e8> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at sun.rmi.transport.ObjectTable$Reaper.run(ObjectTable.java:351) at java.lang.Thread.run(Thread.java:750) "RMI TCP Accept-9889" #61 daemon prio=5 os_prio=0 tid=0x00007f8a7c275800 nid=0x57 runnable [0x00007f89d81e5000] java.lang.Thread.State: RUNNABLE at java.net.PlainSocketImpl.socketAccept(Native Method) at java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409) at java.net.ServerSocket.implAccept(ServerSocket.java:560) at java.net.ServerSocket.accept(ServerSocket.java:528) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:405) at sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:377) at java.lang.Thread.run(Thread.java:750) "java-sdk-http-connection-reaper" #58 daemon prio=5 os_prio=0 tid=0x00007f8a7da85800 nid=0x54 waiting on condition [0x00007f89d8ae8000] java.lang.Thread.State: TIMED_WAITING (sleeping) at java.lang.Thread.sleep(Native Method) at com.amazonaws.http.IdleConnectionReaper.run(IdleConnectionReaper.java:181) "Timer for 's3a-file-system' metrics system" #57 daemon prio=5 os_prio=0 tid=0x00007f8a7e0fa000 nid=0x53 in Object.wait() [0x00007f89d8fe9000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.util.TimerThread.mainLoop(Timer.java:552) - locked <0x00000000d6720a88> (a java.util.TaskQueue) at java.util.TimerThread.run(Timer.java:505) "MutableQuantiles-0" #56 daemon prio=5 os_prio=0 tid=0x00007f8a7e0c2000 nid=0x52 waiting on condition [0x00007f89d90ea000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000d66e87c8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2078) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093) at java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner" #55 daemon prio=5 os_prio=0 tid=0x00007f8a7e0bd000 nid=0x51 in Object.wait() [0x00007f89d91eb000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x00000000d66e8a08> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at org.apache.hadoop.fs.FileSystem$Statistics$StatisticsDataReferenceCleaner.run(FileSystem.java:3762) at java.lang.Thread.run(Thread.java:750) "Okio Watchdog" #41 daemon prio=5 os_prio=0 tid=0x00007f8a7d1ea800 nid=0x41 in Object.wait() [0x00007f89dbbfe000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at okio.AsyncTimeout.awaitTimeout(AsyncTimeout.java:347) at okio.AsyncTimeout$Watchdog.run(AsyncTimeout.java:312) - locked <0x00000000d62d18c8> (a java.lang.Class for okio.AsyncTimeout) "OkHttp ConnectionPool" #40 daemon prio=5 os_prio=0 tid=0x00007f8a7da40000 nid=0x40 in Object.wait() [0x00007f8a4c69b000] java.lang.Thread.State: TIMED_WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:460) at okhttp3.ConnectionPool$1.run(ConnectionPool.java:67) - locked <0x00000000d5ccb0e8> (a okhttp3.ConnectionPool) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "process reaper" #10 daemon prio=10 os_prio=0 tid=0x00007f8a7ec2d800 nid=0x24 waiting on condition [0x00007f8a6c082000] java.lang.Thread.State: TIMED_WAITING (parking) at sun.misc.Unsafe.park(Native Method) - parking to wait for <0x00000000d5888a60> (a java.util.concurrent.SynchronousQueue$TransferStack) at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215) at java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460) at java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362) at java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941) at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:750) "Service Thread" #7 daemon prio=9 os_prio=0 tid=0x00007f8a7c0c1800 nid=0x21 runnable [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C1 CompilerThread1" #6 daemon prio=9 os_prio=0 tid=0x00007f8a7c0bc800 nid=0x20 waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "C2 CompilerThread0" #5 daemon prio=9 os_prio=0 tid=0x00007f8a7c0ba800 nid=0x1f waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Signal Dispatcher" #4 daemon prio=9 os_prio=0 tid=0x00007f8a7c0a6800 nid=0x1e waiting on condition [0x0000000000000000] java.lang.Thread.State: RUNNABLE "Finalizer" #3 daemon prio=8 os_prio=0 tid=0x00007f8a7c083000 nid=0x1d in Object.wait() [0x00007f8a810e3000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144) - locked <0x00000000d55feb78> (a java.lang.ref.ReferenceQueue$Lock) at java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165) at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:216) "Reference Handler" #2 daemon prio=10 os_prio=0 tid=0x00007f8a7c07e800 nid=0x1c in Object.wait() [0x00007f8a811e4000] java.lang.Thread.State: WAITING (on object monitor) at java.lang.Object.wait(Native Method) at java.lang.Object.wait(Object.java:502) at java.lang.ref.Reference.tryHandlePending(Reference.java:191) - locked <0x00000000d55fed30> (a java.lang.ref.Reference$Lock) at java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153) "VM Thread" os_prio=0 tid=0x00007f8a7c075000 nid=0x1b runnable "VM Periodic Task Thread" os_prio=0 tid=0x00007f8a7c0c4800 nid=0x22 waiting on condition ``` </details> -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@hudi.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org