Ivan Sadikov created SPARK-56019:
------------------------------------
Summary: Close JDBC connection on task kill to unblock native socket reads
Key: SPARK-56019
URL: https://issues.apache.org/jira/browse/SPARK-56019
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 4.2.0
Reporter: Ivan Sadikov
*Problem:* When a Spark task is killed via the task reaper, a task thread that
is blocked in a native JDBC socket read (e.g.
java.net.SocketInputStream.socketRead0() or
sun.nio.ch.SocketDispatcher.read0()) never terminates. This affects both the
read path (JDBCRDD.compute, blocked in ResultSet.next()) and the write path
(JdbcUtils.savePartition, blocked in PreparedStatement.executeBatch()).
*Root Cause:* The SQL Server JDBC driver, like others, performs I/O via
blocking native calls. Thread.interrupt() sets the interrupt flag but does
*not* unblock a thread stuck in a native socket read. The existing
addTaskCompletionListener callback that closes rs/stmt/conn fires only _after_
the task body exits, which never happens if the thread is stuck waiting for the
database.
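
The root cause can be reproduced without Spark or a database. The following is a minimal sketch, not code from the Spark codebase: a plain java.net.Socket stands in for the JDBC driver's connection, and the class name BlockedReadDemo is made up for illustration. It assumes ordinary platform threads on a typical JDK, where a blocking read on a socket created this way is not interruptible.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical demo class; a plain socket stands in for a JDBC connection.
public class BlockedReadDemo {

    /** Returns {stillBlockedAfterInterrupt, exitedAfterClose}. */
    public static boolean[] run() throws Exception {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket accepted = server.accept()) { // kept open so the reader never sees EOF

            CountDownLatch exited = new CountDownLatch(1);
            Thread reader = new Thread(() -> {
                try {
                    InputStream in = client.getInputStream();
                    in.read(); // blocks: the peer never writes anything
                } catch (IOException e) {
                    // Reached once another thread closes the socket.
                } finally {
                    exited.countDown();
                }
            });
            reader.setDaemon(true);
            reader.start();
            Thread.sleep(200); // give the reader time to enter the blocking read

            // Step 1: interrupt only sets a flag; the read is expected to stay blocked.
            reader.interrupt();
            boolean stillBlocked = !exited.await(500, TimeUnit.MILLISECONDS);

            // Step 2: closing the socket from this thread makes the read fail and return.
            client.close();
            boolean exitedAfterClose = exited.await(5, TimeUnit.SECONDS);

            return new boolean[] { stillBlocked, exitedAfterClose };
        }
    }

    public static void main(String[] args) throws Exception {
        boolean[] r = run();
        System.out.println("still blocked after interrupt: " + r[0]);
        System.out.println("exited after close: " + r[1]);
    }
}
```

The timeouts are arbitrary values chosen to keep the demo short; a real driver may add its own buffering or retry behaviour on top of the socket.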
*Potential Fix:* Register a mechanism that fires from the task-killer side
(before the task thread exits) to forcibly close the JDBC Connection. Closing
the connection closes the underlying TCP socket, which causes the blocked
native read to throw a SocketException (propagated as SQLException), promptly
unblocking the task thread. We will need to handle both paths:
* Write path (JdbcUtils.savePartition)
* Read path (JDBCRDD.compute)
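
One possible shape for such a kill-side mechanism is sketched below. Everything in it is an assumption made for illustration: TaskHandle, register() and kill() are hypothetical names, not Spark APIs, and how the task reaper would reach the handle is left open. The resource is typed as AutoCloseable, which java.sql.Connection implements, so a real connection could be registered the same way.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these names exist in Spark.
public class KillSideCloser {

    /** Stand-in for a per-task handle shared by the task thread and the killer thread. */
    public static final class TaskHandle {
        private final AtomicReference<AutoCloseable> resource = new AtomicReference<>();
        private final AtomicBoolean killed = new AtomicBoolean(false);

        /** Called by the task thread right after it opens the connection. */
        public void register(AutoCloseable r) throws Exception {
            resource.set(r);
            // If the kill arrived before registration, close immediately.
            if (killed.get()) {
                closeRegistered();
            }
        }

        /** Called from the killer side; does not wait for the task thread. */
        public void kill() {
            killed.set(true);
            try {
                closeRegistered();
            } catch (Exception e) {
                // Best effort: the task is being torn down anyway.
            }
        }

        // getAndSet(null) ensures close() runs at most once per registered resource.
        private void closeRegistered() throws Exception {
            AutoCloseable r = resource.getAndSet(null);
            if (r != null) {
                r.close();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        AtomicInteger closes = new AtomicInteger();
        TaskHandle task = new TaskHandle();
        task.register(closes::incrementAndGet); // fake connection that counts close() calls
        task.kill();
        task.kill(); // a second kill finds nothing left to close
        System.out.println("close() calls: " + closes.get());
    }
}
```

The killed flag covers the case where the kill arrives before the connection is registered, so the late registration still results in a close. The existing completion listener would presumably remain for the normal exit path, since closing an already-closed Connection is a no-op under the JDBC contract.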