mayya-sharipova opened a new issue, #13103:
URL: https://github.com/apache/lucene/issues/13103
### Description
`UnifiedHighlighter` with match-based highlighting (`withWeightMatches(true)`) incorrectly fails with the error
`field 'X' was indexed without offsets, cannot highlight`.
Test to reproduce:
```java
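import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.classic.ClassicAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.FieldType;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.queries.intervals.IntervalQuery;
import org.apache.lucene.queries.intervals.Intervals;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TopDocs;
import org.apache.lucene.search.uhighlight.UnifiedHighlighter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.NIOFSDirectory;

// Stored text field with term vectors, positions, and offsets enabled.
static final FieldType textType = new FieldType(TextField.TYPE_STORED);

static {
    textType.setStoreTermVectors(true);
    textType.setStoreTermVectorPositions(true);
    textType.setStoreTermVectorOffsets(true);
    textType.freeze();
}

public void testHighlight() {
    String indexPath = "../lucene-test-indices/index1";
    Path path = Paths.get(indexPath);
    try (Directory directory = NIOFSDirectory.open(path)) {
        // ClassicAnalyzer removes stop words such as "the".
        Analyzer analyzer = new ClassicAnalyzer();
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        try (IndexWriter writer = new IndexWriter(directory, config)) {
            addDoc(writer, "The quick brown fox jumps over the lazy dog");
        }
        try (IndexReader reader = DirectoryReader.open(directory)) {
            IndexSearcher searcher = new IndexSearcher(reader);
            // Ordered interval query (maxGaps = 0) built from analyzed text that contains a stop word.
            Query query = new IntervalQuery("content",
                Intervals.analyzedText("quick brown fox jumps over the lazy dog", analyzer, "content", 0, true));
            TopDocs topDocs = searcher.search(query, 10);
            // Match-based highlighting is what triggers the exception.
            UnifiedHighlighter.Builder uhBuilder = new UnifiedHighlighter.Builder(searcher, analyzer)
                .withWeightMatches(true);
            UnifiedHighlighter highlighter = new UnifiedHighlighter(uhBuilder);
            String[] highlights = highlighter.highlight("content", query, topDocs, 1);
            System.out.println(Arrays.toString(highlights));
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}

private static void addDoc(IndexWriter writer, String content) throws IOException {
    Document doc = new Document();
    doc.add(new Field("content", content, textType));
    writer.addDocument(doc);
}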
```
produces an error:
```
java.lang.IllegalArgumentException: field 'content' was indexed without offsets, cannot highlight
    at org.apache.lucene.search.uhighlight.FieldHighlighter.highlightOffsetsEnums(FieldHighlighter.java:157)
    at org.apache.lucene.search.uhighlight.FieldHighlighter.highlightFieldForDoc(FieldHighlighter.java:83)
    at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFieldsAsObjects(UnifiedHighlighter.java:944)
    at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:814)
    at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlightFields(UnifiedHighlighter.java:792)
    at org.apache.lucene.search.uhighlight.UnifiedHighlighter.highlight(UnifiedHighlighter.java:725)
```
A workaround is to disable match-based highlighting:
```java
UnifiedHighlighter.Builder uhBuilder = new UnifiedHighlighter.Builder(searcher, analyzer)
    .withWeightMatches(false);
```
This happens because `ClassicAnalyzer` removes stop words, and because the
resulting query uses `ExtendedIntervalsSource`, which returns -1 offsets.
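To illustrate the stop-word part of this explanation, here is a minimal JDK-only sketch of how dropping a stop word leaves a hole in the token positions. It is a toy model, not Lucene's analyzer and not `ExtendedIntervalsSource`: the class `StopWordGaps`, its methods, and the stop-word list are all hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class StopWordGaps {
    // Hypothetical stop-word list for illustration only; not Lucene's actual set.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("the", "a", "an", "of"));

    /** Splits on whitespace, lowercases, and returns the positions of the tokens that survive. */
    static List<Integer> keptPositions(String text) {
        List<Integer> positions = new ArrayList<>();
        if (text.trim().isEmpty()) {
            return positions;
        }
        int position = 0;
        for (String raw : text.trim().split("\\s+")) {
            String term = raw.toLowerCase(Locale.ROOT);
            if (!STOP_WORDS.contains(term)) {
                positions.add(position);
            }
            position++; // a removed stop word still consumes a position
        }
        return positions;
    }

    /** Returns true if two consecutive kept tokens are more than one position apart. */
    static boolean hasPositionGaps(List<Integer> positions) {
        for (int i = 1; i < positions.size(); i++) {
            if (positions.get(i) - positions.get(i - 1) > 1) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Integer> positions = keptPositions("quick brown fox jumps over the lazy dog");
        System.out.println(positions + " gaps=" + hasPositionGaps(positions));
    }
}
```

For the query text from the test, the kept positions are 0 through 4, then 6 and 7. Position 5 belonged to the removed `the`, so the query text contains a position gap.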
### Version and environment details
Lucene 9.9.1