philwalk opened a new pull request, #48937:
URL: https://github.com/apache/spark/pull/48937

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: 
https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: 
https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., 
'[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a 
faster review.
     7. If you want to add a new configuration, please read the guideline first 
for naming configurations in
        
'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the 
guideline first in
        'common/utils/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   The last action in 
[bin/load-spark-env.sh](https://github.com/apache/spark/blob/d5da49d56d7dec5f8a96c5252384d865f7efd4d9/bin/load-spark-env.sh#L68)
 performs a test to determine whether running in a terminal or not, and whether 
`stdin` is reading from a pipe.   A more portable test is needed.
   
   ### Why are the changes needed?
   The current approach relies on `ps` with options that vary significantly 
between different Unix-like systems.  Specifically, it prints an error message 
in both `cygwin` and `msys2` (and by extension, in all of the variations of 
`git-for-windows`).   It doesn't print an error message, but fails to detect a 
terminal session in `Linux` and `Osx/Darwin homebrew` (always thinks STDIN is a 
pipe).
   
   Here's what the problem looks like in a `cygwin64` session (with `set -x` 
just ahead of the section of interest):
   
   If called directly:
   ```bash
   $ bin/load-spark-env.sh
   ++ ps -o stat= -p 1947
   ps: unknown option -- o
   Try `ps --help' for more information.
   + [[ ! '' =~ \+ ]]
   + [[ -p /dev/stdin ]]
   + export 'SPARK_BEELINE_OPTS= -Djline.terminal=jline.UnsupportedTerminal'
   + SPARK_BEELINE_OPTS=' -Djline.terminal=jline.UnsupportedTerminal'
   ```
   Interestingly, due to the 2-part test, it does the right thing w.r.t. the 
Terminal test, the main problem being the error message.
   If called downstream from a pipe:
   ```bash
   $ echo "yo" | bin/load-spark-env.sh
   ++ ps -o stat= -p 1955
   ps: unknown option -- o
   Try `ps --help' for more information.
   + [[ ! '' =~ \+ ]]
   + [[ -p /dev/stdin ]]
   ```
   Again, it correctly detects the pipe environment, but with an error message.
   
   In WSL2 Ubuntu, the test doesn't correctly detect a non-pipe terminal 
session:
   ```bash
   # /opt/spark$ bin/load-spark-env.sh
   ++ ps -o stat= -p 1423
   + [[ ! S+ =~ \+ ]]
   # echo "yo!" | bin/load-spark-env.sh
   ++ ps -o stat= -p 1416
   + [[ ! S+ =~ \+ ]]
   ```
   In `#134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 2024`, the same failure occurs 
(it doesn't recognize terminal environments).
   
   ### Does this PR introduce _any_ user-facing change?
   This is a proposed bug fix, and, other than fixing the bug,  should be 
invisible to users.
   
   ### How was this patch tested?
   The patch was verified to behave as intended in terminal sessions, both 
interactive and piped, in the following 5 environments.
   ```
   
   - Linux quadd 5.15.0-124-generic #134-Ubuntu SMP Fri Sep 27 20:20:17 UTC 
2024 x86_64 x86_64 x86_64 GNU/Linux
   - Linux d5 5.15.153.1-microsoft-standard-WSL2 #1 SMP Fri Mar 29 23:14:13 UTC 
2024 x86_64 x86_64 x86_64 GNU/Linux
   - MINGW64_NT-10.0-22631 d5 3.5.4-0bc1222b.x86_64 2024-09-04 18:28 UTC x86_64 
Msys
   - CYGWIN_NT-10.0-22631 d5 3.5.3-1.x86_64 2024-04-03 17:25 UTC x86_64 Cygwin
   - Darwin suemac.local 23.6.0 Darwin Kernel Version 23.6.0: Mon Jul 29 
21:14:21 PDT 2024; root:xnu-10063.141.2~1/RELEASE_ARM64_T8103 arm64
   
   ```
   The test was to manually run the following script, verifying the expected 
response to both pipe and terminal sessions.
   ```bash
   #!/bin/bash
   if [ -e /usr/bin/tty -a "`tty`" != "not a tty" -a ! -p /dev/stdin ]; then
     echo "not a pipe"
   else
     echo "is a pipe"
   fi
   ```
   The output of the manual test in all 5 tested environments.
   ```
   philwalk@quadd:/opt/spark
   $ isPipe
   not a pipe
   #
   $ echo "yo" | isPipe
   is a pipe
   #
   ```
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org

Reply via email to