zecookiez commented on code in PR #50123:
URL: https://github.com/apache/spark/pull/50123#discussion_r2035566019


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/StateStore.scala:
##########
@@ -593,7 +593,10 @@ trait StateStoreProvider {
   def supportedInstanceMetrics: Seq[StateStoreInstanceMetric] = Seq.empty
 }
 
-object StateStoreProvider {
+object StateStoreProvider extends Logging {
+
+  @GuardedBy("this")

Review Comment:
   Yeah, I'll add more context to this. There were some situations where the `coordinatorRef` call from the uploadSnapshot method would freeze.
   
   I think the issue is lock contention on the `loadedProviders` lock, which left the RPC calls that obtain the coordinator stuck. Since these upload RPC calls seemed logically separate from what the StateStore object uses, I made a separate endpoint reference in `StateStoreProvider`.
   
   Maybe we can put this somewhere else that makes more sense.
   
   This was the exception and stack trace reported:
   ```
   org.apache.spark.SparkException: [CANNOT_LOAD_STATE_STORE.UNRELEASED_THREAD_ERROR] An error occurred during loading state. StateStoreId(opId=0,partId=4,name=default): RocksDB instance could not be acquired by [ThreadId: Some(16)] for operationType=close_store as it was not released by [ThreadId: Some(314791), task: partition 4.0 in stage 372.0, TID 1145] after 120007 ms.
   [info] Thread holding the lock has trace: app//org.apache.spark.sql.execution.streaming.state.StateStore$.coordinatorRef(StateStore.scala:1157)
   [info] app//org.apache.spark.sql.execution.streaming.state.StateStore$.reportSnapshotUploaded(StateStore.scala:1154)
   [info] app//org.apache.spark.sql.execution.streaming.state.RocksDBEventListener.reportSnapshotUploaded(RocksDBStateStoreProvider.scala:981)
   ...
   ```
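
To make the suspected contention concrete, here is a minimal, self-contained sketch. It is not Spark code: `SharedLockRegistry`, `SeparateLockRegistry`, and `LockContentionSketch` are hypothetical names, and the "endpoint" is just a string. It only illustrates the general pattern, assuming the freeze comes from a coordinator lookup waiting on a monitor that another thread holds for a long time, and that guarding the lookup with its own monitor avoids that wait.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

// Hypothetical stand-in: the coordinator lookup shares one monitor with
// other, potentially long-running work (the role played by the
// loadedProviders lock in the discussion above).
object SharedLockRegistry {
  private val providersLock = new Object
  private var coordinator: Option[String] = None

  // Runs `body` while holding the shared lock.
  def withProvidersLocked[T](body: => T): T = providersLock.synchronized(body)

  // Must wait for providersLock before it can return.
  def coordinatorRef(): String = providersLock.synchronized {
    if (coordinator.isEmpty) coordinator = Some("coordinator-endpoint")
    coordinator.get
  }
}

// Hypothetical alternative: the lookup is guarded by its own monitor
// (`this`), so it never waits on the provider lock.
object SeparateLockRegistry {
  private var coordinator: Option[String] = None

  def coordinatorRef(): String = this.synchronized {
    if (coordinator.isEmpty) coordinator = Some("coordinator-endpoint")
    coordinator.get
  }
}

object LockContentionSketch {
  // Holds the shared provider lock on one thread, runs `lookup` on another,
  // and reports whether the lookup finished within `timeoutMs`.
  def lookupCompletesWhileLocked(lookup: () => String, timeoutMs: Long): Boolean = {
    val lockHeld = new CountDownLatch(1)
    val release = new CountDownLatch(1)
    val done = new CountDownLatch(1)

    val holder = new Thread(() => SharedLockRegistry.withProvidersLocked {
      lockHeld.countDown() // signal that the lock is now held
      release.await()      // keep holding it until told to let go
    })
    val caller = new Thread(() => { lookup(); done.countDown() })

    holder.start()
    lockHeld.await()
    caller.start()
    val finished = done.await(timeoutMs, TimeUnit.MILLISECONDS)

    release.countDown() // let the holder exit so both threads can finish
    holder.join()
    caller.join()
    finished
  }

  def main(args: Array[String]): Unit = {
    val shared = lookupCompletesWhileLocked(() => SharedLockRegistry.coordinatorRef(), 300)
    val separate = lookupCompletesWhileLocked(() => SeparateLockRegistry.coordinatorRef(), 5000)
    println(s"shared lock lookup completed: $shared")
    println(s"separate lock lookup completed: $separate")
  }
}
```

In this sketch the lookup that shares the provider lock should time out while the lock is held, whereas the separately guarded lookup should return right away. Whether the same split is the right fix in the PR depends on what else the real lock protects.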



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

