Kontinuation commented on issue #1771: URL: https://github.com/apache/sedona/issues/1771#issuecomment-2616864626
The way we obtain the Spark session in [dataframe_api.py](https://github.com/apache/sedona/blob/sedona-1.7.0/python/sedona/sql/dataframe_api.py#L60-L78) is problematic in a multi-threaded environment. The "active session" is thread-local, and `SparkSession.getActiveSession` will only return a valid session in the thread that started the Spark session. I believe the Python backend is handling requests in a different thread, so that thread has no active session. What we need for calling Sedona functions is a JVMView object. We can obtain this object from `SparkContext._jvm` instead of `spark._jvm`. This does not rely on any thread-local state and will work correctly whenever there is an active Spark context in the current process.
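To illustrate why a thread-local lookup fails in a worker thread while a class-level attribute does not, here is a minimal standard-library sketch. It does not use PySpark: `FakeSession`, `FakeContext`, `start_session`, and `lookup_in_worker` are hypothetical stand-ins that only mimic the two lookup styles described above (a thread-local "active session" versus a process-wide attribute such as `SparkContext._jvm`).

```python
import threading


class FakeSession:
    """Hypothetical stand-in for a session whose 'active' instance is thread-local."""

    _active = threading.local()  # each thread sees its own slot

    @classmethod
    def get_active(cls):
        # Returns None in any thread that never registered a session.
        return getattr(cls._active, "session", None)


class FakeContext:
    """Hypothetical stand-in for a context that exposes a class-level handle."""

    _jvm = None  # shared by every thread in the process


def start_session():
    session = FakeSession()
    FakeSession._active.session = session  # visible only in the calling thread
    FakeContext._jvm = object()            # visible in all threads
    return session


def lookup_in_worker():
    """Perform both lookups from a separate thread and report what it sees."""
    result = {}

    def worker():
        result["active_session"] = FakeSession.get_active()
        result["jvm"] = FakeContext._jvm

    thread = threading.Thread(target=worker)
    thread.start()
    thread.join()
    return result


start_session()
seen = lookup_in_worker()
```

In this sketch the worker thread finds no active session but can still read the class-level handle, which is the same reasoning behind preferring `SparkContext._jvm` when the calling thread may differ from the one that created the session.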
