We are using SparkLauncher and SparkAppHandle.Listener to launch Spark applications from a Java web application and listen for state changes. We have observed that as the number of concurrent jobs grows, some state changes are occasionally not reported (e.g. some applications never report a final state, even though the corresponding Spark job is marked FINISHED in the YARN UI). I'm wondering whether there are any guidelines or limits on launching a potentially large number of long-running, concurrent Spark jobs this way?
Thanks,