allisonwang-db commented on code in PR #49535:
URL: https://github.com/apache/spark/pull/49535#discussion_r1924245423


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3459,6 +3459,16 @@ object SQLConf {
       .checkValues(Set("legacy", "row", "dict"))
       .createWithDefaultString("legacy")
 
+  val PYSPARK_HIDE_TRACEBACK =
+    buildConf("spark.sql.execution.pyspark.udf.hideTraceback.enabled")
+      .doc(
+        "When true, only show the message of the exception from Python UDFs, " +
+        "hiding stack trace and exception type."
+      )

Review Comment:
   nit
   ```suggestion
           "hiding stack trace and exception type.")
   ```



##########
core/src/main/scala/org/apache/spark/api/python/PythonRunner.scala:
##########
@@ -122,6 +122,7 @@ private[spark] abstract class BasePythonRunner[IN, OUT](
   protected val authSocketTimeout = conf.get(PYTHON_AUTH_SOCKET_TIMEOUT)
   private val reuseWorker = conf.get(PYTHON_WORKER_REUSE)
  protected val faultHandlerEnabled: Boolean = conf.get(PYTHON_WORKER_FAULTHANLDER_ENABLED)
+  protected val hideTraceback: Boolean = false

Review Comment:
   Can we use `conf.get(PYSPARK_HIDE_TRACEBACK)` here so that we don't need to override every subclass?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3459,6 +3459,16 @@ object SQLConf {
       .checkValues(Set("legacy", "row", "dict"))
       .createWithDefaultString("legacy")
 
+  val PYSPARK_HIDE_TRACEBACK =
+    buildConf("spark.sql.execution.pyspark.udf.hideTraceback.enabled")
+      .doc(
+        "When true, only show the message of the exception from Python UDFs, " +
+        "hiding stack trace and exception type."
+      )

Review Comment:
   I think we should only hide stack trace but not the Python error class 
(exception type)?
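
   To illustrate the distinction the reviewer is drawing, here is a small standalone sketch (not PySpark code) using the standard library's `traceback` module: `format_exception_only` drops the stack frames but still keeps the exception class name, whereas showing only the message would lose the type as well.

   ```python
   import traceback

   try:
       raise ValueError("column 'x' does not exist")
   except ValueError as e:
       # Full form: stack frames plus the final "Type: message" line.
       full = "".join(traceback.format_exception(type(e), e, e.__traceback__))
       # Trace-free form: no frames, but the exception type is preserved.
       no_trace = "".join(traceback.format_exception_only(type(e), e))

   assert "Traceback (most recent call last)" in full
   assert no_trace == "ValueError: column 'x' does not exist\n"
   ```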



##########
python/pyspark/util.py:
##########
@@ -468,16 +468,19 @@ def handle_worker_exception(e: BaseException, outfile: IO) -> None:
     and exception traceback info to outfile. JVM could then read from the outfile and perform
     exception handling there.
     """
-    try:
-        exc_info = None
+
+    def format_exception() -> str:
+        if os.environ.get("SPARK_HIDE_TRACEBACK", False):
+            return "".join(traceback.format_exception_only(type(e), e))

Review Comment:
   Instead of checking this environment variable, can we add a parameter to 
this `handle_worker_exception` method?
   Something like `handle_worker_exception(..., hide_traceback)`.
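
   A minimal sketch of what that suggestion could look like. All names and the plain-write behavior here are illustrative, not the actual PySpark implementation (the real helper writes through PySpark's serialization protocol); the point is only that the caller passes `hide_traceback` explicitly instead of the helper reading an environment variable.

   ```python
   import io
   import traceback
   from typing import IO

   def handle_worker_exception(e: BaseException, outfile: IO, hide_traceback: bool = False) -> None:
       # Hypothetical signature: the flag is an explicit parameter, so the
       # decision is made once at the call site rather than via os.environ.
       if hide_traceback:
           # Only the final "ExceptionType: message" line, no stack frames.
           msg = "".join(traceback.format_exception_only(type(e), e))
       else:
           msg = "".join(traceback.format_exception(type(e), e, e.__traceback__))
       outfile.write(msg)

   # Usage with an in-memory buffer standing in for the worker's outfile:
   buf = io.StringIO()
   try:
       raise RuntimeError("boom")
   except RuntimeError as err:
       handle_worker_exception(err, buf, hide_traceback=True)
   assert buf.getvalue() == "RuntimeError: boom\n"
   ```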



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

