[ 
https://issues.apache.org/jira/browse/KAFKA-19014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kamal Chandraprakash resolved KAFKA-19014.
------------------------------------------
    Fix Version/s: 4.1.0
         Assignee: Kamal Chandraprakash
       Resolution: Duplicate

Please reopen the ticket if you still face this issue. 

> Potential race condition in remote-log-reader and remote-log-index-cleaner 
> thread
> ---------------------------------------------------------------------------------
>
>                 Key: KAFKA-19014
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19014
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 3.8.1
>            Reporter: Hasil Sharma
>            Assignee: Kamal Chandraprakash
>            Priority: Major
>              Labels: tiered-storage
>             Fix For: 4.1.0
>
>
> A race condition between the threads below leaves a MappedByteBuffer 
> referencing a deleted file; subsequent attempts to read the file can crash 
> the JVM.
>  
> Chain of events:
> *Thread - 1 remote-log-reader*
> 1/ Fetches the offsetIndex from the indexCache, which internally maps the 
> physical offset index file as a MappedByteBuffer.
> OffsetIndex offsetIndex = indexCache.getIndexEntry(segmentMetadata).offsetIndex(); 
> ([here|https://github.com/apache/kafka/blob/cf7029c0264fd7f7b15c2e98acc874ec8c3403f2/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1772])
> *Thread - 2 index cache thread*
> Entry is marked for cleanup, i.e., the physical offset index file is renamed.
> *Thread - 3 remote-log-index-cleaner*
> Physical offset index file is deleted.
> *Thread - 1 remote-log-reader*
> Attempts to run a binary search on the MappedByteBuffer, which is now mapped 
> to a non-existent file.
> long upperBoundOffset = offsetIndex.fetchUpperBoundOffset(startOffsetPosition, fetchSize).map(position -> position.offset).orElse(segmentMetadata.endOffset() + 1); 
> ([here|https://github.com/apache/kafka/blob/3.8/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L1619])
>  
> Results in a JVM fatal error (SIGSEGV) with the stack trace:
>  
> {code:java}
> Stack: [0x000072ee9112d000,0x000072ee9122d000],  sp=0x000072ee9122b360,  free space=1016k
> Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
> J 6483 c2 java.nio.DirectByteBuffer.getInt(I)I java.base@17.0.14 (28 bytes) @ 0x000072f23d2f12f1 [0x000072f23d2f12a0+0x0000000000000051]
> j  org.apache.kafka.storage.internals.log.OffsetIndex.relativeOffset(Ljava/nio/ByteBuffer;I)I+5
> j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/OffsetPosition;+11
> j  org.apache.kafka.storage.internals.log.OffsetIndex.parseEntry(Ljava/nio/ByteBuffer;I)Lorg/apache/kafka/storage/internals/log/IndexEntry;+3
> j  org.apache.kafka.storage.internals.log.AbstractIndex.binarySearch(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;II)I+30
> j  org.apache.kafka.storage.internals.log.AbstractIndex.indexSlotRangeFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;Lorg/apache/kafka/storage/internals/log/AbstractIndex$SearchResultType;)I+126
> j  org.apache.kafka.storage.internals.log.AbstractIndex.smallestUpperBoundSlotFor(Ljava/nio/ByteBuffer;JLorg/apache/kafka/storage/internals/log/IndexSearchType;)I+8
>  {code}
>  
>  
> As per MappedByteBuffer documentation 
> ([here|https://devdocs.io/openjdk~17/java.base/java/nio/mappedbytebuffer]):
> All or part of a mapped byte buffer may become inaccessible at any time, for 
> example if the mapped file is truncated. An attempt to access an inaccessible 
> region of a mapped byte buffer will not change the buffer's content and will 
> cause an unspecified exception to be thrown either at the time of the access 
> or at some later time. It is therefore strongly recommended that appropriate 
> precautions be taken to avoid the manipulation of a mapped file by this 
> program, or by a concurrently running program, except to read or write the 
> file's content.
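
The chain of events in the description can be illustrated with a minimal sketch of one way to keep a reader and a cleanup step from overlapping. This is not taken from the Kafka code base and is not necessarily how the issue was fixed; the class GuardedIndexEntry and its methods are hypothetical names introduced for illustration. It assumes a read-write lock per cache entry: lookups run under the read lock, and marking the entry for cleanup takes the write lock, so the rename cannot start while a lookup is still reading the mapped buffer.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical wrapper (not a Kafka class): serializes index lookups against cleanup.
final class GuardedIndexEntry {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private boolean markedForCleanup = false; // guarded by lock

    /** Runs a lookup under the read lock; several readers may hold it at once. */
    <T> T read(Supplier<T> lookup) {
        lock.readLock().lock();
        try {
            if (markedForCleanup) {
                // Refuse to touch a buffer whose backing file may be renamed or deleted.
                throw new IllegalStateException("index entry is marked for cleanup");
            }
            return lookup.get();
        } finally {
            lock.readLock().unlock();
        }
    }

    /** Marks the entry and runs the cleanup step (e.g. the rename) under the write lock. */
    void markForCleanup(Runnable cleanupStep) {
        lock.writeLock().lock(); // waits until in-flight reads have finished
        try {
            if (!markedForCleanup) {
                markedForCleanup = true;
                cleanupStep.run();
            }
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

In this sketch a reader would wrap the binary search in read(...), and a later asynchronous delete is safe only because no new lookup is admitted once the entry has been marked.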



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
