paleolimbot opened a new pull request, #558: URL: https://github.com/apache/sedona-db/pull/558
I had intended to post a reprex to GeoPandas regarding threading but was caught by this issue, where the way we collected results into Python caused many attempts to acquire the GIL, which interfered with UDF execution.

Briefly, before this PR, the Python bindings always collected via a special `RecordBatchReader` that called `block_on()`, waiting for the next batch in the output `SendableRecordBatchStream`. To ensure cancellation requests worked, we acquired the GIL every 1 second to check for signals. This constant `block_on()` + GIL acquisition caused a deadlock when Python UDFs were also trying to acquire the GIL.

The workaround here is not a full solution but covers the most common case, where a user wants to collect the entire result (e.g., `.to_pandas()`). Collecting everything in one step is simpler to orchestrate.
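To make the difference in control flow concrete, here is a rough pure-Python analogy, not the code in this PR. All names (`produce`, `collect_batchwise`, `collect_all_at_once`) are hypothetical, a `queue.Queue` stands in for the output stream, and the sketch cannot reproduce the deadlock itself. It only contrasts one blocking wait per batch (with a periodic wake-up where a signal check would go) against a single wait for the whole result.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the hypothetical batch stream


def produce(batches, out):
    """Hypothetical producer standing in for the engine's output stream."""
    for batch in batches:
        out.put(batch)
    out.put(SENTINEL)


def collect_batchwise(out, poll_seconds=1.0):
    """Batch-at-a-time collection: one blocking wait per batch.

    Each timeout is a wake-up point where a signal check would run.
    """
    collected, waits = [], 0
    while True:
        waits += 1
        try:
            batch = out.get(timeout=poll_seconds)
        except queue.Empty:
            continue  # a real binding would check for cancellation here
        if batch is SENTINEL:
            return collected, waits
        collected.append(batch)


def collect_all_at_once(batches):
    """Whole-result collection: a worker gathers everything, the caller waits once."""
    result = []
    worker = threading.Thread(target=lambda: result.extend(batches))
    worker.start()
    worker.join()  # single blocking wait for the complete result
    return result, 1


batches = [[1, 2], [3, 4], [5]]

out = queue.Queue()
threading.Thread(target=produce, args=(batches, out)).start()
streamed, stream_waits = collect_batchwise(out)

whole, whole_waits = collect_all_at_once(batches)
```

Both strategies return the same batches. The batch-wise path needs at least one wait per batch plus one for the end marker, while the whole-result path waits once, which is why it leaves fewer points where the collector and Python UDFs can contend.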
