LuciferYang commented on code in PR #50545: URL: https://github.com/apache/spark/pull/50545#discussion_r2035025586
########## core/src/test/scala/org/apache/spark/executor/ProcfsMetricsGetterSuite.scala: ########## @@ -86,7 +86,7 @@ class ProcfsMetricsGetterSuite extends SparkFunSuite { val child = process.toHandle.pid() eventually(timeout(10.seconds), interval(100.milliseconds)) { val pids = p.computeProcessTree() - assert(pids.size === 3) + assert(pids.size === 4) assert(pids.contains(currentPid)) assert(pids.contains(child)) Review Comment: I added some logs to the code. ``` eventually(timeout(10.seconds), interval(100.milliseconds)) { val pids = p.computeProcessTree() println(s"pids size = ${pids.size}") println(s"currentPid = $currentPid, currentHandle = ${currentHandle.info()}") currentHandle.descendants().forEachOrdered((t: ProcessHandle) => println(s"child = ${t.pid()}, info = ${t.info()}")) println("=============================") assert(pids.size === 3) assert(pids.contains(currentPid)) assert(pids.contains(child)) } ``` Then I found that the above `eventually` block doesn't always capture the stage where `pids.size` is 3: My test environment is: ``` M3 macOS 15.4 zulu 17.0.14 ``` If the test command is `build/sbt clean "core/testOnly org.apache.spark.ui.UISeleniumSuite org.apache.spark.executor.ProcfsMetricsGetterSuite` The stage where `pids.size` is 2 ``` pids size = 2 currentPid = 36278, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args7389474327143147310.tmp], startTime: Optional[2025-04-09T09:35:13.084Z], totalTime: Optional[PT24.80429S]] child = 36959, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 36278], startTime: Optional[2025-04-09T09:36:12.360Z]] ``` The stage where `pids.size` is 4 ``` pids size = 4 currentPid = 36278, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args7389474327143147310.tmp], startTime: Optional[2025-04-09T09:35:13.084Z], totalTime: Optional[PT24.860472S]] child = 36959, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 36278], startTime: Optional[2025-04-09T09:36:12.360Z]] child = 36969, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 36278], startTime: Optional[2025-04-09T09:36:12.380Z]] child = 36970, info = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [-Xmx128m, -cp, /Users/yangjie01/SourceCode/git/spark-sbt/launcher/target/scala-2.13/classes:/Users/yangjie01/SourceCode/git/spark-sbt/assembly/target/scala-2.13/jars/*, org.apache.spark.launcher.Main, org.apache.spark.executor.ProcfsMetricsGetterSuite, 36278], startTime: Optional[2025-04-09T09:36:12.380Z]] ``` The stage where `pids.size` switches back to 2 ``` pids size = 2 currentPid = 36278, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args7389474327143147310.tmp], startTime: Optional[2025-04-09T09:35:13.084Z], totalTime: Optional[PT24.910265S]] child = 36959, info = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [-cp, /Users/yangjie01/SourceCode/git/spark-sbt/conf/:/Users/yangjie01/SourceCode/git/spark-sbt/common/kvstore/target/scala-2.13/classes/:/Users/yangjie01/SourceCode/git/spark-sbt/common/network-common/target/..., -Xmx1g, org.apache.spark.executor.ProcfsMetricsGetterSuite, 36278], startTime: Optional[2025-04-09T09:36:12.360Z]] ``` The stage where `pids.size` is 1 ``` pids size = 1 currentPid = 36278, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args7389474327143147310.tmp], startTime: Optional[2025-04-09T09:35:13.084Z], totalTime: Optional[PT26.330861S]] ``` If the test command is `build/sbt clean "core/testOnly org.apache.spark.executor.ProcfsMetricsGetterSuite`, although there is indeed a moment when the size of `pids` is 3, if we re-acquire the children from `currentHandle.descendants()`, there are some different changes observed. ``` pids size = 3 currentPid = 43845, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args11250934242956975594.tmp], startTime: Optional[2025-04-09T09:46:18.914Z], totalTime: Optional[PT2.16209S]] child = 43849, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 43845], startTime: Optional[2025-04-09T09:46:20.078Z]] child = 43854, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 43845], startTime: Optional[2025-04-09T09:46:20.098Z]] child = 43855, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 43845], startTime: Optional[2025-04-09T09:46:20.098Z]] child = 43856, info = [user: Optional[yangjie01], cmd: /usr/bin/dirname, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class], startTime: Optional[2025-04-09T09:46:20.098Z]] ``` But if we modify `assert(pids.size === 4)`, then ``` pids size = 4 currentPid = 45974, currentHandle = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [@/var/folders/j2/cfn7w6795538n_416_27rkqm0000gn/T/sbt-args656143949227362337.tmp], startTime: Optional[2025-04-09T09:49:03.895Z], totalTime: Optional[PT3.006415S]] child = 45977, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 45974], startTime: Optional[2025-04-09T09:49:05.616Z]] child = 45987, info = [user: Optional[yangjie01], cmd: /bin/bash, args: [/Users/yangjie01/SourceCode/git/spark-sbt/bin/spark-class, org.apache.spark.executor.ProcfsMetricsGetterSuite, 45974], startTime: Optional[2025-04-09T09:49:05.635Z]] child = 45988, info = [user: Optional[yangjie01], cmd: /Users/yangjie01/Tools/zulu17/bin/java, args: [-Xmx128m, -cp, /Users/yangjie01/SourceCode/git/spark-sbt/launcher/target/scala-2.13/classes:/Users/yangjie01/SourceCode/git/spark-sbt/assembly/target/scala-2.13/jars/*, org.apache.spark.launcher.Main, org.apache.spark.executor.ProcfsMetricsGetterSuite, 45974], startTime: Optional[2025-04-09T09:49:05.636Z]] ``` It seems that `assert(pids.size === 4)` might be more reasonable. However, the next two assertions are more critical. Perhaps we could change it to `assert(pids.size === 3 || pids.size === 4)`, or even remove the assertion related to size altogether. What do you think about this? @yaooqinn -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org