[ https://issues.apache.org/jira/browse/SOLR-17220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17831887#comment-17831887 ]
Aman Raj commented on SOLR-17220:
---------------------------------

Link to the Kyuubi community discussion about this issue: https://github.com/apache/kyuubi/issues/6212

SolrZkClient should be a daemon-thread
--------------------------------------

                Key: SOLR-17220
                URL: https://issues.apache.org/jira/browse/SOLR-17220
            Project: Solr
         Issue Type: Bug
     Security Level: Public (Default Security Level. Issues are Public)
           Reporter: Aman Raj
           Priority: Major

I am submitting a Spark SQL job through spark-submit, with Spark 3.3.1 and Kyuubi 1.8.0. I am using the open-source Spark engine with the Kyuubi Authz module running on top of the Spark Driver in client mode. The Spark job succeeds, but the Spark Driver does not stop; it keeps running, and I can see the PolicyRefresher continuing to poll policies from Ranger.

!https://private-user-images.githubusercontent.com/104416558/317133070-5bf7e9af-24d3-4ffb-8239-53eae0bd88fc.png! (screenshot: PolicyRefresher log output after the job has finished)

As the logs show, the PolicyRefresher is still running even after the Spark Context has stopped. As a result the Spark Driver never exits, and after some time I have to kill the job manually.

The root cause is that the SolrZkClient used for logging to Solr currently starts a non-daemon thread in the Spark Driver. The sketch below illustrates why that keeps the JVM alive; the jstack that follows shows the actual stuck thread.
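For context only, here is a minimal, self-contained sketch of the behavior (not Solr's actual SolrZkClient code): an executor built with the default thread factory produces non-daemon worker threads that keep the JVM alive after main() returns, while an executor built with a daemon ThreadFactory does not. The class and thread names below are made up for illustration.

{code:java}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;

// Illustrative sketch only, not Solr's SolrZkClient code.
public class NonDaemonExecutorDemo {

    public static void main(String[] args) {
        // Default thread factory: pool threads are NON-daemon. If this pool is
        // never shut down, its idle worker (compare
        // "zkConnectionManagerCallback-5-thread-1" in the jstack below) keeps
        // the JVM alive even after main() returns.
        ExecutorService nonDaemonPool = Executors.newSingleThreadExecutor();
        nonDaemonPool.submit(() -> System.out.println("task on non-daemon thread"));

        // Daemon thread factory: the JVM can exit even if this pool is never
        // shut down, which is the behavior being requested for SolrZkClient.
        ThreadFactory daemonFactory = runnable -> {
            Thread t = new Thread(runnable, "demo-daemon-thread");
            t.setDaemon(true);
            return t;
        };
        ExecutorService daemonPool = Executors.newSingleThreadExecutor(daemonFactory);
        daemonPool.submit(() -> System.out.println("task on daemon thread"));

        // Shut down the non-daemon pool explicitly so this demo terminates.
        // Comment out the next line to reproduce the "driver never exits" hang.
        nonDaemonPool.shutdown();
        // daemonPool is intentionally left running; its daemon thread does not
        // prevent JVM exit.
    }
}
{code}

In the real job, the executor backing the zkConnectionManagerCallback threads is the one that stays alive; the stack trace below shows it parked, waiting for work.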
While the Spark Driver was stuck, I collected a jstack of the remaining non-daemon threads:

{noformat}
"zkConnectionManagerCallback-5-thread-1" #202 prio=5 os_prio=0 tid=0x00007f1cbc003000 nid=0x4ca7 waiting on condition [0x00007f1bc3a01000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x00000007b218dad8> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:442)
        at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)

"DestroyJavaVM" #453 prio=5 os_prio=0 tid=0x00007f1dfc019000 nid=0x4ab5 waiting on condition [0x0000000000000000]
   java.lang.Thread.State: RUNNABLE

"VM Thread" os_prio=0 tid=0x00007f1dfc09a000 nid=0x4ac3 runnable

"VM Periodic Task Thread" os_prio=0 tid=0x00007f1dfc10e800 nid=0x4ad4 waiting on condition
{noformat}

So when the Spark Context stops, SolrZkClient does not let the Spark Driver exit, because the thread it creates is currently non-daemon, as shown below:

!https://private-user-images.githubusercontent.com/104416558/317615386-a0c14b53-4725-46c7-b762-658a838606c8.png! (screenshot: the SolrZkClient code that creates the non-daemon thread)

The fix is to make SolrZkClient use a daemon thread.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@solr.apache.org
For additional commands, e-mail: issues-h...@solr.apache.org