Hi all,

We have a Spark job (Spark 2.4.4, Hadoop 2.7, Scala 2.11.12) in which we use semaphores / parallel collections. We definitely see a large speedup from doing this, but we are wondering whether it could cause any unintended side effects. In particular, I'm worried about deadlocks, and about whether it could interfere with the fixes for issues such as https://issues.apache.org/jira/plugins/servlet/mobile#issue/SPARK-26961
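
For concreteness, the pattern in question might look roughly like the sketch below. This is an illustrative assumption, not the actual job code: `BoundedJobs`, `runJob`, and `runAll` are hypothetical names, and `runJob` is a plain function standing in for a Spark action launched from the driver. It uses only the Scala standard library, with a semaphore capping how many calls are in flight at once.

```scala
import java.util.concurrent.Semaphore
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object BoundedJobs {
  // Hypothetical stand-in for a Spark action (e.g. a count or a write).
  def runJob(input: Int): Int = input * 2

  // Runs runJob over all inputs with at most maxConcurrent calls in flight.
  def runAll(inputs: Seq[Int], maxConcurrent: Int): Seq[Int] = {
    val permits = new Semaphore(maxConcurrent)
    val futures = inputs.map { in =>
      Future {
        // Mark the body as blocking so the global pool can compensate
        // for threads parked on the semaphore.
        blocking {
          permits.acquire()
          // Release in finally so a failed job never leaks a permit.
          try runJob(in)
          finally permits.release()
        }
      }
    }
    // Results come back in the same order as the inputs.
    Await.result(Future.sequence(futures), 1.minute)
  }

  def main(args: Array[String]): Unit =
    println(runAll(Seq(1, 2, 3, 4), maxConcurrent = 2))
}
```

In this sketch the permit is acquired outside the `try` and released in `finally`, so a permit is only returned if it was actually taken.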
We do run with multiple cores.

Thanks!

--
Cheers,
Ruijing Li