kamalcph opened a new pull request, #19462:
URL: https://github.com/apache/kafka/pull/19462

   For segments that have been uploaded to remote storage, RemoteIndexCache 
fetches the offset, timestamp, and transaction indexes from remote on first 
access and caches them locally. Subsequent lookups are served from the local copy.
   
   The locally cached remote indexes are removed in two cases:
   
   1. The remote segments are deleted because they breach the retention size, 
retention time, or log start-offset.
   2. The total size of the cached indexes exceeds the remote-log-index-cache 
size limit of 1 GB (default).
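
   The size-based case can be pictured with a minimal sketch. It uses a plain 
access-ordered `LinkedHashMap` with least-recently-used eviction as a stand-in 
(an assumption made for brevity; the real cache is backed by Caffeine, whose 
eviction policy differs). The class name, method names, and byte sizes are 
hypothetical.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a cache bounded by total index size in bytes.
class SizeBoundedIndexCacheSketch {
    private final long maxBytes;
    private long currentBytes = 0;
    // Access-ordered map: iteration starts at the least recently used entry.
    private final LinkedHashMap<String, Long> sizesBySegment =
        new LinkedHashMap<>(16, 0.75f, true);
    // Records evicted segment ids, standing in for an eviction listener.
    final List<String> evicted = new ArrayList<>();

    SizeBoundedIndexCacheSketch(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    synchronized void put(String segmentId, long indexBytes) {
        Long previous = sizesBySegment.put(segmentId, indexBytes);
        currentBytes += indexBytes - (previous == null ? 0L : previous);
        Iterator<Map.Entry<String, Long>> it = sizesBySegment.entrySet().iterator();
        while (currentBytes > maxBytes && it.hasNext()) {
            Map.Entry<String, Long> eldest = it.next();
            String key = eldest.getKey();
            long bytes = eldest.getValue();
            it.remove();          // the entry leaves the cache first...
            currentBytes -= bytes;
            evicted.add(key);     // ...and only then is the "listener" told
        }
    }

    // Marks the entry as recently used; returns false if it is not cached.
    synchronized boolean touch(String segmentId) {
        return sizesBySegment.get(segmentId) != null;
    }

    synchronized long totalBytes() {
        return currentBytes;
    }
}
```

   With a limit of 100 bytes, caching 60, 30, and then 40 bytes pushes the 
total over the limit, and the least recently used entry is dropped to get back 
under it.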
   
   RemoteIndexCache uses two layers of locks: a first-layer lock on the 
RemoteIndexCache itself and a second-layer lock on each 
RemoteIndexCache#Entry.
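
   The layering can be sketched roughly as follows. This is a simplified 
model, not the Kafka source: the class `TwoLayerLockSketch`, the method names, 
the choice of `ReentrantReadWriteLock` for both layers, and the use of the 
write lock in `close()` are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical, simplified model of the two lock layers.
class TwoLayerLockSketch {

    // Second layer: each entry guards its own state.
    static final class Entry {
        private final ReentrantReadWriteLock entryLock = new ReentrantReadWriteLock();
        private boolean markedForCleanup = false;

        int lookup(int offset) {
            entryLock.readLock().lock();
            try {
                if (markedForCleanup) {
                    throw new IllegalStateException("This entry is marked for cleanup");
                }
                return offset * 2; // stand-in for a real index lookup
            } finally {
                entryLock.readLock().unlock();
            }
        }

        void markForCleanup() {
            entryLock.writeLock().lock();
            try {
                markedForCleanup = true;
            } finally {
                entryLock.writeLock().unlock();
            }
        }
    }

    // First layer: guards the cache as a whole.
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private final Map<String, Entry> entries = new ConcurrentHashMap<>();

    // Readers share the first-layer read lock, so they do not block each other.
    Entry getOrLoad(String segmentId) {
        lock.readLock().lock();
        try {
            return entries.computeIfAbsent(segmentId, id -> new Entry());
        } finally {
            lock.readLock().unlock();
        }
    }

    // Assumed here: closing the cache is the exclusive operation.
    void close() {
        lock.writeLock().lock();
        try {
            entries.clear();
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```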
   
   **Issue**
   
   1. The first-layer lock coordinates the remote-log reader and deleter 
threads. So that the reader and deleter threads do not block each other, we 
take only `lock.readLock()` when accessing or deleting a cached index entry.
   2. The issue occurs when both the reader and deleter threads hold the 
readLock and the deleter thread marks the index entry as `markedForCleanup`. 
The reader thread, which already holds the `indexEntry`, then gets an 
IllegalStateException when accessing it.
   3. This is a concurrency issue: we mark the entry as `markedForCleanup` 
before removing it from the cache. See the RemoteIndexCache#remove and 
RemoteIndexCache#removeAll methods.
   4. When an entry is evicted from the cache, the cache removes the entry 
before calling the evictionListener, and the Caffeine cache performs all of 
these operations atomically.
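
   The interleaving in points 2 and 3 can be reproduced deterministically 
with a small model. This is a sketch under assumptions, not the Kafka code: 
`MarkBeforeRemoveRace`, its `Entry` class, and the latches that force the 
ordering are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model: both threads hold the shared read lock, and the
// deleter marks the entry before removing it from the cache.
class MarkBeforeRemoveRace {

    static final class Entry {
        private volatile boolean markedForCleanup = false;

        void markForCleanup() {
            markedForCleanup = true;
        }

        int lookup(int offset) {
            if (markedForCleanup) {
                throw new IllegalStateException("This entry is marked for cleanup");
            }
            return offset;
        }
    }

    // Returns whatever the reader thread failed with, or null if it succeeded.
    static Throwable reproduce() {
        ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
        Map<String, Entry> cache = new ConcurrentHashMap<>();
        cache.put("segment-0", new Entry());

        CountDownLatch readerHasEntry = new CountDownLatch(1);
        CountDownLatch entryMarked = new CountDownLatch(1);
        AtomicReference<Throwable> readerFailure = new AtomicReference<>();

        Thread reader = new Thread(() -> {
            lock.readLock().lock();
            try {
                Entry entry = cache.get("segment-0"); // reader obtains the entry
                readerHasEntry.countDown();
                entryMarked.await();                  // deleter runs in between
                entry.lookup(42);                     // fails: entry already marked
            } catch (Throwable t) {
                readerFailure.set(t);
            } finally {
                lock.readLock().unlock();
            }
        });

        Thread deleter = new Thread(() -> {
            lock.readLock().lock(); // read lock is shared, so this does not block
            try {
                readerHasEntry.await();
                cache.get("segment-0").markForCleanup(); // step 1: mark
                entryMarked.countDown();
                cache.remove("segment-0");               // step 2: remove, too late
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                lock.readLock().unlock();
            }
        });

        reader.start();
        deleter.start();
        try {
            reader.join();
            deleter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return readerFailure.get();
    }
}
```

   In this model the reader always ends with an IllegalStateException, because 
the read lock gives it no protection against the deleter's state change.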
   
   **Solution**
   
   1. When the deleter thread marks an Entry for deletion, we rename the 
underlying index files with a ".deleted" suffix and add a job to the 
remote-log-index-cleaner thread, which performs the actual cleanup. Previously, 
the indexes were not accessible once the entry was marked for deletion. Now, 
the renamed files remain accessible until they are removed from disk.
   2. As with local-log index/segment deletion, once the files are renamed 
with the ".deleted" suffix, the actual file deletion happens after the 
file.delete.delay.ms delay (1 minute by default).
   3. During this time, if the same index entry is fetched again from remote, 
it does not conflict with the deleted entry because the file names are 
different.
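
   The rename-then-delete-later flow can be sketched as below. This is an 
illustration under assumptions rather than the code in this PR: the class 
`DelayedIndexDeletionSketch`, its methods, the index file name, and the 
500 ms delay used in `demo()` are hypothetical. Only the ".deleted" suffix, 
the cleaner thread name, and the idea of a delayed delete come from the 
description above.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of "rename now, delete after a delay".
class DelayedIndexDeletionSketch {
    static final String DELETED_SUFFIX = ".deleted";

    private final ScheduledExecutorService cleaner =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "remote-log-index-cleaner");
            t.setDaemon(true);
            return t;
        });
    private final long deleteDelayMs;

    DelayedIndexDeletionSketch(long deleteDelayMs) {
        this.deleteDelayMs = deleteDelayMs;
    }

    static Path deletedPath(Path indexFile) {
        return indexFile.resolveSibling(indexFile.getFileName() + DELETED_SUFFIX);
    }

    // Renames the file immediately and schedules the real deletion for later.
    ScheduledFuture<?> markForCleanup(Path indexFile) {
        Path renamed = deletedPath(indexFile);
        try {
            Files.move(indexFile, renamed, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return cleaner.schedule(() -> {
            try {
                Files.deleteIfExists(renamed);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, deleteDelayMs, TimeUnit.MILLISECONDS);
    }

    // Walks through the scenario; returns true if every step behaved as described.
    static boolean demo() {
        DelayedIndexDeletionSketch sketch = new DelayedIndexDeletionSketch(500);
        try {
            Path dir = Files.createTempDirectory("remote-index-cache");
            Path index = dir.resolve("00000000000000000000.index"); // hypothetical name
            Files.writeString(index, "old");

            ScheduledFuture<?> deletion = sketch.markForCleanup(index);
            Path renamed = deletedPath(index);

            // A reader holding the old entry can still read the renamed file.
            boolean oldStillReadable = Files.readString(renamed).equals("old");
            // A re-fetch from remote writes the original name without conflict.
            Files.writeString(index, "new");

            deletion.get(); // wait for the delayed cleanup to run
            boolean ok = oldStillReadable
                && !Files.exists(renamed)
                && Files.readString(index).equals("new");

            Files.deleteIfExists(index);
            Files.deleteIfExists(dir);
            return ok;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            sketch.cleaner.shutdownNow();
        }
    }
}
```

   Because the old and new files never share a name, the delayed delete only 
ever touches the ".deleted" copy.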

