zecookiez commented on code in PR #50123: URL: https://github.com/apache/spark/pull/50123#discussion_r2035566019
##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala:
##########
@@ -593,7 +593,10 @@ trait StateStoreProvider {
   def supportedInstanceMetrics: Seq[StateStoreInstanceMetric] = Seq.empty
 }
 
-object StateStoreProvider {
+object StateStoreProvider extends Logging {
+
+  @GuardedBy("this")

Review Comment:
Yeah, I'll add more context to this, but there were some situations that caused the `coordinatorRef` call from the `uploadSnapshot` method to freeze. I think the issue is lock contention on the `loadedProviders` lock, so RPC calls to obtain the coordinator were getting stuck.

Since these upload RPC calls seemed logically separate from what is used in the `StateStore` object, I made a separate endpoint in `StateStoreProvider`. Maybe we can put this somewhere else that would make more sense.

This was the exception and stack trace reported:
```
org.apache.spark.SparkException: [CANNOT_LOAD_STATE_STORE.UNRELEASED_THREAD_ERROR] An error occurred during loading state. StateStoreId(opId=0,partId=4,name=default): RocksDB instance could not be acquired by [ThreadId: Some(16)] for operationType=close_store as it was not released by [ThreadId: Some(314791), task: partition 4.0 in stage 372.0, TID 1145] after 120007 ms.
[info] Thread holding the lock has trace: app//org.apache.spark.sql.execution.streaming.state.StateStore$.coordinatorRef(StateStore.scala:1157)
[info] app//org.apache.spark.sql.execution.streaming.state.StateStore$.reportSnapshotUploaded(StateStore.scala:1154)
[info] app//org.apache.spark.sql.execution.streaming.state.RocksDBEventListener.reportSnapshotUploaded(RocksDBStateStoreProvider.scala:981)
...
```
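
To make the idea concrete, here is a rough sketch of guarding the coordinator reference with its own lock instead of the `loadedProviders` lock. It is not the code in this PR: `SeparateLockSketch`, `CoordinatorRef`, `withProvidersLocked`, and `lookupWhileProvidersLocked` are hypothetical names, and the real endpoint lookup is replaced by a plain object allocation.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.collection.mutable

// Hypothetical stand-in for the coordinator RPC endpoint reference.
final class CoordinatorRef(val id: Int)

object SeparateLockSketch {
  // Stand-in for the provider map whose monitor is the coarse lock.
  private val loadedProviders = new mutable.HashMap[String, String]()

  // Dedicated monitor: coordinator lookups never wait on loadedProviders.
  private val coordinatorLock = new Object
  private var cachedRef: Option[CoordinatorRef] = None // guarded by coordinatorLock

  // Lazily creates and caches the reference under the dedicated lock only.
  def coordinatorRef(): CoordinatorRef = coordinatorLock.synchronized {
    cachedRef.getOrElse {
      val ref = new CoordinatorRef(1) // assumption: the real code resolves an RPC endpoint here
      cachedRef = Some(ref)
      ref
    }
  }

  // Runs `body` while holding the loadedProviders monitor.
  def withProvidersLocked[T](body: => T): T = loadedProviders.synchronized(body)

  // Returns true if another thread can obtain the coordinator reference
  // within `timeoutMs` while this thread holds the loadedProviders lock.
  def lookupWhileProvidersLocked(timeoutMs: Long): Boolean = {
    val done = new CountDownLatch(1)
    withProvidersLocked {
      val worker = new Thread(() => {
        coordinatorRef()
        done.countDown()
      })
      worker.setDaemon(true)
      worker.start()
      done.await(timeoutMs, TimeUnit.MILLISECONDS)
    }
  }
}
```

With a single shared lock, the worker thread in `lookupWhileProvidersLocked` would block until the provider lock is released. With the dedicated lock it should return right away, which is the behavior the upload path needs.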