benwtrent opened a new issue, #14429:
URL: https://github.com/apache/lucene/issues/14429
### Description
Found this in the wild. I haven't been able to replicate it :(
I don't even know what it means to hit this `fst.outputs.merge` branch, or
under what conditions it is valid or invalid. Any pointers here would be useful.
We ran into a strange postings merge error in production.
The FST compiler reaches the "merge" line when merging some segments:
```
if (lastInput.length() == input.length && prefixLenPlus1 == 1 + input.length) {
  // same input more than 1 time in a row, mapping to
  // multiple outputs
  lastNode.output = fst.outputs.merge(lastNode.output, output);
```
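To make the condition concrete, here is a minimal sketch on plain `int[]` arrays rather than Lucene's own types. It assumes `prefixLenPlus1` is one more than the length of the prefix shared by the previous and current input; the class and method names (`MergeBranchSketch`, `commonPrefixLength`, `reachesMergeBranch`) are hypothetical and not part of Lucene. Under that assumption, the branch is only taken when the exact same input is added twice in a row.

```java
public class MergeBranchSketch {
  /** Length of the prefix shared by the previous input and the current one. */
  static int commonPrefixLength(int[] lastInput, int[] input) {
    int stop = Math.min(lastInput.length, input.length);
    int pos = 0;
    while (pos < stop && lastInput[pos] == input[pos]) {
      pos++;
    }
    return pos;
  }

  /** Mirrors the quoted condition, assuming prefixLenPlus1 = shared prefix length + 1. */
  static boolean reachesMergeBranch(int[] lastInput, int[] input) {
    int prefixLenPlus1 = commonPrefixLength(lastInput, input) + 1;
    return lastInput.length == input.length && prefixLenPlus1 == 1 + input.length;
  }

  public static void main(String[] args) {
    int[] foo = {'f', 'o', 'o'};
    int[] fooAgain = {'f', 'o', 'o'};
    int[] food = {'f', 'o', 'o', 'd'};
    System.out.println(reachesMergeBranch(foo, fooAgain)); // prints true: identical inputs
    System.out.println(reachesMergeBranch(foo, food));     // prints false: lengths differ
  }
}
```

If this model is right, the question becomes which two consecutive inputs handed to the compiler were byte-for-byte identical.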
However, the `Outputs` implementation provided by `Lucene90BlockTreeTermsWriter` is
`ByteSequenceOutputs`, which does not override `merge`, so the call throws an
`UnsupportedOperationException`.
Given this, it seems like it should be "impossible" to reach the
"Outputs.merge" path when merging with the `Lucene90BlockTreeTermsWriter`, but
somehow it did.
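The failure mode can be modeled without Lucene on the classpath. The sketch below is a toy, not the real `FSTCompiler`: `SketchOutputs`, `SketchByteOutputs`, and `SketchCompiler` are hypothetical stand-ins that only assume a base class whose `merge` throws by default and a subclass that does not override it. Adding the same input twice in a row is then enough to produce the exception.

```java
public class UnsupportedMergeSketch {
  /** Hypothetical stand-in for an outputs base class whose merge is unsupported by default. */
  abstract static class SketchOutputs<T> {
    T merge(T first, T second) {
      throw new UnsupportedOperationException();
    }
  }

  /** Hypothetical stand-in for an implementation that does not override merge. */
  static final class SketchByteOutputs extends SketchOutputs<byte[]> {}

  /** Toy compiler: remembers the last input and merges outputs on an exact repeat. */
  static final class SketchCompiler<T> {
    private final SketchOutputs<T> outputs;
    private String lastInput;
    private T lastOutput;

    SketchCompiler(SketchOutputs<T> outputs) {
      this.outputs = outputs;
    }

    void add(String input, T output) {
      if (input.equals(lastInput)) {
        // Same input twice in a row: the only path that calls merge.
        lastOutput = outputs.merge(lastOutput, output);
      } else {
        lastInput = input;
        lastOutput = output;
      }
    }
  }

  /** Returns true if adding a duplicate input triggers the unsupported merge. */
  static boolean addingTwiceThrows() {
    SketchCompiler<byte[]> compiler = new SketchCompiler<>(new SketchByteOutputs());
    compiler.add("abc", new byte[] {1});
    try {
      compiler.add("abc", new byte[] {2});
      return false;
    } catch (UnsupportedOperationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(addingTwiceThrows()); // prints true
  }
}
```

This only illustrates the mechanism; it says nothing about how a duplicate input could arise during a real segment merge.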
Any ideas on where I should look?
```
Caused by: org.elasticsearch.common.io.stream.NotSerializableExceptionWrapper: unsupported_operation_exception: null
    at org.apache.lucene.util.fst.Outputs.merge(Outputs.java:95) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.util.fst.FSTCompiler.add(FSTCompiler.java:936) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$PendingBlock.append(Lucene90BlockTreeTermsWriter.java:593) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$PendingBlock.compileIndex(Lucene90BlockTreeTermsWriter.java:562) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.writeBlocks(Lucene90BlockTreeTermsWriter.java:776) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter$TermsWriter.finish(Lucene90BlockTreeTermsWriter.java:1163) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.lucene90.blocktree.Lucene90BlockTreeTermsWriter.write(Lucene90BlockTreeTermsWriter.java:402) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.FieldsConsumer.merge(FieldsConsumer.java:95) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsWriter.merge(PerFieldPostingsFormat.java:204) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.SegmentMerger.mergeTerms(SegmentMerger.java:211) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.SegmentMerger.mergeWithLogging(SegmentMerger.java:300) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:139) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:5293) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4761) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.IndexWriter$IndexWriterMergeSource.merge(IndexWriter.java:6582) ~[lucene-core-9.11.1.jar:?]
    at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:660) ~[lucene-core-9.11.1.jar:?]
    at org.elasticsearch.index.engine.ElasticsearchConcurrentMergeScheduler.doMerge(ElasticsearchConcurrentMergeScheduler.java:134) ~[elasticsearch-8.15.0.jar:?]
    at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:721) ~[lucene-core-9.11.1.jar:?]
```
### Version and environment details
Lucene 9.11.1