noAfraidStart opened a new issue, #13020:
URL: https://github.com/apache/lucene/issues/13020
### Description
We use HDFS for file storage and write data through the `softUpdateDocuments` API.
We have found that during concurrent writes, the .dvd files selected for
merging can be deleted by other write/flush threads, which leads to a
`FileNotFoundException`. If we switch to the `updateDocuments` API for writing
data instead, the exception does not occur.
We tested Lucene 9.5.0 through 9.8.0, and all of these versions reproduce the
exception.
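A minimal sketch of the write pattern that triggers this for us. The field names (`id`, `__soft_deletes`) and the local `FSDirectory` are illustrative assumptions: in our environment the `Directory` is an HDFS-backed implementation, and this sketch is not a guaranteed standalone reproducer.

```java
import java.nio.file.Paths;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.NumericDocValuesField;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class SoftUpdateRepro {
  public static void main(String[] args) throws Exception {
    // In our setup this is an HDFS-backed Directory; FSDirectory is used
    // here only to keep the sketch self-contained.
    try (Directory dir = FSDirectory.open(Paths.get("/tmp/soft-update-repro"))) {
      IndexWriterConfig cfg = new IndexWriterConfig(new StandardAnalyzer());
      cfg.setSoftDeletesField("__soft_deletes"); // required for soft updates
      try (IndexWriter writer = new IndexWriter(dir, cfg)) {
        Runnable task = () -> {
          try {
            for (int i = 0; i < 10_000; i++) {
              String id = Integer.toString(i % 100); // re-update the same ids
              Document doc = new Document();
              doc.add(new StringField("id", id, Field.Store.NO));
              // Soft updates record deletions as doc-values updates, producing
              // .dvd/.dvm segment files; with updateDocuments(...) instead,
              // we do not see the FileNotFoundException.
              writer.softUpdateDocuments(
                  new Term("id", id),
                  java.util.List.of(doc),
                  new NumericDocValuesField("__soft_deletes", 1));
            }
          } catch (Exception e) {
            throw new RuntimeException(e);
          }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task); // concurrent writers trigger the race
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        writer.commit();
      }
    }
  }
}
```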
The exception is as follows:
```
java.io.FileNotFoundException: File does not exist: /search/test/1/index/_l5_1_Lucene90_0.dvd
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2308)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:800)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:479)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.hdfs.DFSClient.callGetBlockLocations(DFSClient.java:1403)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1390)
	at org.apache.hadoop.hdfs.DFSClient.getLocatedBlocks(DFSClient.java:1379)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchLocatedBlocksAndGetLastBlockLength(DFSInputStream.java:366)
	at org.apache.hadoop.hdfs.DFSInputStream.openInfo(DFSInputStream.java:284)
	at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1299)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1245)
	at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1224)
	at org.apache.hadoop.hdfs.DFSInputStream.fetchBlockByteRange(DFSInputStream.java:1405)
	at org.apache.hadoop.hdfs.DFSInputStream.doPread(DFSInputStream.java:1831)
	at org.apache.hadoop.hdfs.DFSInputStream.pread(DFSInputStream.java:1785)
	at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:1773)
	at org.apache.hadoop.fs.FSInputStream.readFully(FSInputStream.java:124)
	at org.apache.hadoop.fs.FSDataInputStream.readFully(FSDataInputStream.java:117)
	at org.apache.lucene.store.DataInput.readBytes(DataInput.java:72)
	at org.apache.lucene.store.ChecksumIndexInput.skipByReading(ChecksumIndexInput.java:79)
	at org.apache.lucene.store.ChecksumIndexInput.seek(ChecksumIndexInput.java:64)
	at org.apache.lucene.codecs.CodecUtil.checksumEntireFile(CodecUtil.java:618)
	at org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer.checkIntegrity(Lucene90DocValuesProducer.java:1640)
	at org.apache.lucene.codecs.perfield.PerFieldDocValuesFormat$FieldsReader.checkIntegrity(PerFieldDocValuesFormat.java:380)
	at org.apache.lucene.index.SegmentDocValuesProducer.checkIntegrity(SegmentDocValuesProducer.java:131)
Caused by: org.apache.hadoop.ipc.RemoteException: File does not exist: /search/test/1/index/_l5_1_Lucene90_0.dvd
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:125)
	at org.apache.hadoop.hdfs.server.namenode.INodeFile.valueOf(INodeFile.java:115)
	at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getBlockLocations(FSDirStatAndListingOp.java:205)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1.doCall(FSNamesystem.java:2304)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem$1.doCall(FSNamesystem.java:2301)
	at org.apache.hadoop.hdfs.server.namenode.LinkResolver.resolve(LinkResolver.java:43)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getBlockLocations(FSNamesystem.java:2308)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getBlockLocations(NameNodeRpcServer.java:800)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getBlockLocations(ClientNamenodeProtocolServerSideTranslatorPB.java:479)
```
### Version and environment details
_No response_
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]