Hello list, I'm strugging again with the highlighter. I don't understand why I obtain sporadically InvalidTokenOffsetsException.
The mission: given a query, detect which field was matched, among the names of the concepts: there can be several names for a given concept, also in one language. Concepts are documents and names are in fields name-xx where xx is the two-letter-language. Here's the method I'm using: public String computeMatchedField(int docNum, Document doc, Analyzer analyzer, Query query) throws IOException { //System.out.println("----- computing matched field for query " + query + " on document " + doc.get("uri")); query = query.rewrite(this.reader); String found = null; float maxScore = 0; try { for(Field f: (List<Field>) doc.getFields()) { QueryScorer scorer = new QueryScorer(query,reader,f.name()); if(!f.name().startsWith("name-")) continue; //System.out.println("Measuring field " + f.name() + ": " + f.stringValue()); String text = f.stringValue(); TokenStream tokenStream = TokenSources.getAnyTokenStream(reader,docNum, f.name(), doc, analyzer); SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter(); Highlighter highlighter = new Highlighter(htmlFormatter, scorer); TextFragment[] frags = highlighter.getBestTextFragments(tokenStream, text, false, 1); if(frags==null || frags.length==0) continue; float score = frags[0].getScore(); //System.out.println("Score: " + score); if(score > maxScore) { maxScore = score; found = frags[0].toString(); } } } catch(Exception ex) {ex.printStackTrace();} return found; } Unfortunately, I have to catch InvalidTokenOffsetsException which does happen sometimes, not always. When it occurs, it stops the highlighting (the detected field is "null") and also costs quite some time. What am I doing wrong? I tried making my own tokenStream with no difference. thanks in advance paul --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org