HyukjinKwon commented on code in PR #50313:
URL: https://github.com/apache/spark/pull/50313#discussion_r2009406920


##########
python/pyspark/errors/exceptions/base.py:
##########
@@ -449,3 +449,35 @@ def summary(self) -> str:
         Summary of the exception cause.
         """
         ...
+
+
+T = TypeVar("T", bound=PySparkException)
+
+
+def recover_python_exception(e: T) -> T:
+    """
+    Recover Python exception stack trace.
+
+    Many JVM exceptions types may wrap Python exceptions. For example:
+    - UDFs can cause PythonException
+    - UDTFs and Data Sources can cause AnalysisException
+    """
+    python_exception_header = "Traceback (most recent call last):"
+    try:
+        from pyspark.errors.exceptions.tblib import Traceback
+
+        message = str(e)
+        start = message.find(python_exception_header)
+        if start == -1:
+            # No Python exception found
+            return e
+
+        # The message contains a Python exception. Parse it to use it as the exception's traceback.
+        # This allows richer error messages, for example showing line content in Python UDF.
+        python_exception_string = message[start:]
+        tb = Traceback.from_string(python_exception_string)
+        tb.populate_linecache()
+        return e.with_traceback(tb.as_traceback())
+    except Exception:

Review Comment:
   I think we need a configuration, or at least an environment variable if a 
configuration is difficult to add here (for cases where exceptions are thrown 
without a JVM). Parsing exceptions can add significant performance overhead, 
e.g., if users rely heavily on exceptions.
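
A minimal sketch of the kind of opt-in gate I have in mind. The environment variable name `PYSPARK_RECOVER_PYTHON_TRACEBACK` and both helper functions are hypothetical, made up for illustration; they are not existing Spark settings or APIs. The point is only that the string search and parsing are skipped entirely unless the user opts in:

```python
import os
from typing import Optional

# Hypothetical environment variable name, not an existing Spark setting.
_ENV_FLAG = "PYSPARK_RECOVER_PYTHON_TRACEBACK"
_HEADER = "Traceback (most recent call last):"


def traceback_recovery_enabled() -> bool:
    """Return True only when the user has opted in via the environment."""
    return os.environ.get(_ENV_FLAG, "0").lower() in ("1", "true")


def extract_python_traceback(message: str) -> Optional[str]:
    """Return the embedded Python traceback text, or None if disabled or absent."""
    if not traceback_recovery_enabled():
        # Skip the search and any later parsing when the feature is off.
        return None
    start = message.find(_HEADER)
    return None if start == -1 else message[start:]
```

Whether the default should be on or off is a separate question; the sketch assumes off.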
   
   In addition, it would have to be `BaseException` if you absolutely want to 
catch all the exceptions.
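
To illustrate the difference with a small standalone sketch (the function names are made up for this example): `except Exception` lets `SystemExit`, `KeyboardInterrupt`, and `GeneratorExit` propagate, because they derive directly from `BaseException`.

```python
def run_catching_exception(fn):
    """Run fn, swallowing only Exception subclasses."""
    try:
        fn()
    except Exception:
        return "swallowed"
    return "completed"


def run_catching_base_exception(fn):
    """Run fn, swallowing everything, including SystemExit and KeyboardInterrupt."""
    try:
        fn()
    except BaseException:
        return "swallowed"
    return "completed"


def raise_value_error():
    raise ValueError("ordinary error")


def raise_system_exit():
    raise SystemExit(1)
```

Here `run_catching_exception(raise_system_exit)` lets the `SystemExit` escape, while `run_catching_base_exception(raise_system_exit)` swallows it. Catching `BaseException` also hides Ctrl-C, so it is only appropriate if that is really the intent.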
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

