lucasbru opened a new pull request, #12875:
URL: https://github.com/apache/kafka/pull/12875

   In this change, we enable backing off when the state directory
   is still locked during initialization of a task. For this, we
   introduce a new queue inside the state updater that keeps all
   tasks that still need to be initialized. When a new task is added
   to the state updater, it is inserted into the queue for
   initialization.
   
   When the state directory is locked, the task is reinserted into
   the initialization queue, and we reattempt to acquire the lock
   after the next round of restoration. In the rare case where
   every task waiting for initialization is still locked, we back
   off for 1 second before retrying, which avoids a busy wait on
   the lock.
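
A minimal sketch of the retry-and-back-off idea described above, not the actual state updater code. `InitializableTask`, `InitQueueSketch`, `initializeRound`, and `shouldBackOff` are hypothetical names chosen for illustration, and the exact back-off condition in the PR may differ.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for a task whose state directory may still be locked.
interface InitializableTask {
    // Returns false if the state directory lock could not be acquired.
    boolean tryInitialize();
}

// Illustrative initialization queue with retry and back-off (assumed design).
class InitQueueSketch {
    // The 1 second back-off mentioned in the description.
    static final long BACKOFF_MS = 1000L;

    private final Deque<InitializableTask> pending = new ArrayDeque<>();

    // New tasks are appended to the initialization queue.
    void add(InitializableTask task) {
        pending.addLast(task);
    }

    int pendingCount() {
        return pending.size();
    }

    // Attempts each queued task exactly once. Tasks that are still locked
    // are reinserted and retried in a later round.
    // Returns the number of tasks initialized in this round.
    int initializeRound() {
        int initialized = 0;
        int attempts = pending.size();
        for (int i = 0; i < attempts; i++) {
            InitializableTask task = pending.pollFirst();
            if (task.tryInitialize()) {
                initialized++;
            } else {
                pending.addLast(task);
            }
        }
        return initialized;
    }

    // True when tasks are waiting but none could be initialized, in which
    // case the caller would sleep for BACKOFF_MS instead of spinning.
    boolean shouldBackOff(int initializedInRound) {
        return initializedInRound == 0 && !pending.isEmpty();
    }
}
```

In this sketch the caller would run one `initializeRound()` per loop iteration, do its restoration work, and sleep for `BACKOFF_MS` only when `shouldBackOff` returns true.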
   
   During system testing, `ThreadCache` threw a
   `ConcurrentModificationException` when the state updater created
   a new cache while the main thread was computing the size of the
   caches for eviction inside `sizeBytes`. Since the data structure
   is designed to be thread-safe, this change also synchronizes
   the `sizeBytes` method.
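
The race described above can be illustrated with a simplified sketch. `CacheRegistrySketch` and `createCache` are hypothetical and stand in for the real class only loosely; the point is that both the mutating method and the iterating method hold the same monitor.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical stand-in for a cache registry; not the real ThreadCache.
class CacheRegistrySketch {
    private final Map<String, Long> cacheSizes = new HashMap<>();

    // Called by one thread (for example, a state updater) to register a new cache.
    synchronized void createCache(String name, long sizeInBytes) {
        cacheSizes.put(name, sizeInBytes);
    }

    // Called by another thread to total the cache sizes. Without `synchronized`,
    // iterating while createCache() mutates the map may throw
    // ConcurrentModificationException.
    synchronized long sizeBytes() {
        long total = 0;
        for (long size : cacheSizes.values()) {
            total += size;
        }
        return total;
    }
}
```

Synchronizing on the instance keeps the change small; a concurrent map would be an alternative, but that is a different design from the one the description names.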
   
   ### Committer Checklist (excluded from commit message)
   - [x] Verify design and implementation 
   - [x] Verify test coverage and CI build status
   - [x] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

