Dear list,
I try to use the Term Highlighter in my webapp but I have a problem. I want
to highlight the terms in a text without extracting the most relevant
sections.
The highlighting works but the last characters are trimmed !
Here is a portion of my code :
Analyzer analyzer = new StandardAnalyzer();
Query query = null;
try {
query = QueryParser.parse(queryStr, "scientificName", analyzer);
query = query.rewrite(IndexReader.open("E:/specimenset-index"));
} catch (ParseException e) {
// TODO Auto-generated catch block
e.printStackTrace();
} catch (IOException e) {
// TODO Auto-generated catch block
e.printStackTrace();
}
QueryScorer scorer = new QueryScorer(query);
SimpleHTMLFormatter formatter = new SimpleHTMLFormatter(
"<span class=\"highlight\">", "</span>");
Highlighter highlighter = new Highlighter(formatter, scorer);
TokenStream tokenStream = analyzer.tokenStream("scientificName",
new StringReader(text));
String highlightedText = null;
try {
highlightedText = highlighter.getBestFragment(
tokenStream, text);
} catch (IOException e1) {
// TODO Auto-generated catch block
e1.printStackTrace();
}
return highlightedText ;
A value for text variable is for instance :
<a href='taxoninfo.html?id=112'><span class='genus-species'>Capparimyia
savastani</span> (Martelli)</a>
The corresponding value for highlightedText variable is :
<a href='taxoninfo.html?id=112'><span class='genus-species'><span
class="highlight">Capparimyia</span> savastani</span> (Martelli
The ")</a>" are trimmed for some mysterious reason !! I try to play with
Encoder and Fragmenter classes but without success !
Any help would be appreciate.
Best regards,
Johan
Johan Duflost
Belgian Biodiversity Information Facility (BeBIF)
Universite Libre de Bruxelles
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]