There have been a couple of alternative Highlighter contributions recently, I 
can't recall which claim to support "proper" highlighting of phrases but you 
might want to give them a try.


http://issues.apache.org/jira/browse/LUCENE-644

http://issues.apache.org/jira/browse/LUCENE-663


Ultimately "proper" highlighting is very hard to achieve if you both want to 
select a summary of the doc's "best bits" and also support queries which can be 
arbitrarily complex nestings of boolean queries which can also contain 
Spans/phrases covering large sections of the document. Some compromises are 
inevitable under these extreme circumstances and I don't think there is one 
implementation that is capable of catering for this.

Let us know how you get on.

Cheers
Mark


----- Original Message ----
From: Heikki Doeleman <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, 10 November, 2006 2:45:23 PM
Subject: Highlighting span for Phrase Queries

Hi there,

I have a question on using the Highlighter.

I'm using Lucene in a web application that allows you to search the 
catalogue of a library. The idea is to highlight, in the results, the 
terms entered by the user. I'm using a Highlighter with a NullFragmenter 
because I want the whole field highlighted (for each field the user 
searched in).

The catalogue is indexed with a StandardAnalyzer on each field. 
Highlighting a field from a result I do like this  :

            QueryScorer scorer = new QueryScorer ( theQuery, 
searcher.getIndexReader ( ) , fieldName ) ;
            Highlighter highlighter = new Highlighter ( new 
SimpleHTMLFormatter ( "<span style=\"border-style:solid;\">" , "</span>" ) 
, scorer ) ;
            highlighter. setTextFragmenter ( new NullFragmenter ( ) ) ;

        highlightedFieldContent = highlighter. getBestFragment ( analyzer 
, fieldName , originalFieldContent ) ; 

This works all very well, except for phrase queries. The spans of the 
phrase queries as such are not highlighted and instead, each of the terms 
that is in the phrase query, gets highlighted. I guess this is because the 
indexed fields have been tokenized, and what-not, by the StandardAnalyzer.

Does anyone have a good example about how to implement a highlighting 
function that works well with phrase queries, too ?

thank you very much.

Heikki DOELEMAN




                
___________________________________________________________ 
Now you can scan emails quickly with a reading pane. Get the new Yahoo! Mail. 
http://uk.docs.yahoo.com/nowyoucan.html

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to