showuon commented on code in PR #21088:
URL: https://github.com/apache/kafka/pull/21088#discussion_r2625375514


##########
storage/src/main/java/org/apache/kafka/storage/internals/log/RemoteIndexCache.java:
##########
@@ -345,7 +368,18 @@ public Entry getIndexEntry(RemoteLogSegmentMetadata metadata) {
         lock.readLock().lock();
         try {
             throwIfCacheClosed(uuid);
-            return internalCache.get(uuid, k -> createCacheEntry(metadata));
+            Entry entry = internalCache.get(uuid, k -> createCacheEntry(metadata));
+
+            // Handle race where entry is evicted and marked for cleanup, but still returned by cache.get().
+            // Treat as cache miss and refetch to avoid IllegalStateException during subsequent lookups.
+            if (entry.isMarkedForCleanup()) {
+                log.debug("Entry for segment {} is marked for cleanup, invalidating and refetching", uuid);
+                refetchAfterEvictionCount.incrementAndGet();
+                internalCache.invalidate(uuid);

Review Comment:
   If the `evictionListener` has not been invoked, the entry is not marked for
cleanup, and its files are neither renamed nor deleted. So if
`getIndexEntry()` is called before the `evictionListener` is invoked,
everything is fine. Once the `evictionListener` has been invoked, the cache
entry will be invalidated. Also, the callback in `evictionListener` is atomic,
so we can be sure these two steps happen atomically:
   1. marking the entry for cleanup (`markForCleanup`)
   2. renaming the files

   That means that even if the files have not been deleted yet, they have
already been renamed, so it is still safe to re-fetch the indexes and save
them to disk.
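
   To make the argument above concrete, here is a minimal, self-contained sketch of the idea. It is not the `RemoteIndexCache` code: the class `EvictionRaceSketch`, its nested `Entry`, the `.index` / `.deleted` file names, and `simulateEvictionListener()` are all hypothetical, and a plain `ConcurrentHashMap` stands in for the real cache. It only assumes what the comment states, namely that marking for cleanup and renaming happen together.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model of the race discussed above; not the real RemoteIndexCache.
class EvictionRaceSketch {

    // Hypothetical stand-in for a cached index entry backed by one file.
    static final class Entry {
        private boolean markedForCleanup = false;
        private String fileName;

        Entry(String fileName) {
            this.fileName = fileName;
        }

        // Mark and rename inside one critical section, so a reader never
        // observes "marked" without "renamed" (the assumption in the review).
        synchronized void markForCleanup() {
            if (!markedForCleanup) {
                markedForCleanup = true;
                fileName = fileName + ".deleted"; // assumed suffix, for illustration only
            }
        }

        synchronized boolean isMarkedForCleanup() {
            return markedForCleanup;
        }

        synchronized String fileName() {
            return fileName;
        }
    }

    private final Map<String, Entry> cache = new ConcurrentHashMap<>();
    final AtomicInteger refetchCount = new AtomicInteger();

    Entry getIndexEntry(String id) {
        Entry entry = cache.computeIfAbsent(id, k -> new Entry(k + ".index"));
        if (entry.isMarkedForCleanup()) {
            // Stale entry: its file is already renamed, so recreating
            // "<id>.index" cannot collide with the not-yet-deleted old file.
            refetchCount.incrementAndGet();
            cache.remove(id, entry); // removes only this stale instance
            entry = cache.computeIfAbsent(id, k -> new Entry(k + ".index"));
        }
        return entry;
    }

    // Simulates the eviction listener firing while the stale entry is still visible.
    void simulateEvictionListener(String id) {
        Entry entry = cache.get(id);
        if (entry != null) {
            entry.markForCleanup();
        }
    }

    public static void main(String[] args) {
        EvictionRaceSketch sketch = new EvictionRaceSketch();
        Entry first = sketch.getIndexEntry("seg");
        sketch.simulateEvictionListener("seg");
        Entry second = sketch.getIndexEntry("seg");
        System.out.println(first.fileName() + " / " + second.fileName());
    }
}
```

   In this model the stale entry and the refetched entry never share a file name, which is the property that makes the refetch safe even while deletion of the old file is still pending.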



