Re: Document as Parameter (Rephrased)

Erik Hatcher Sat, 12 Nov 2005 00:06:16 -0800


On 11 Nov 2005, at 20:32, bib_lucene bib wrote:

-- The text I want to highlight is stored in the file system and in the index.
-- I can search and highlight the searched terms in the results page (just snippets).
-- I have given a download link next to the snippets (which will point to the file I stored in the ROOT webapp of Tomcat).
I understood the concept of NullFragmenter, sorry for repeating myself... It is something like a Google search. In Google, if I enter the search term "highlight" and click on search, I get back search results with the word "highlight" in bold. (I can do that.)

If you're using NullFragmenter (which I just committed to contrib/highlighter) then you're not getting snippets, you're getting the full text. But this is unrelated to your main question.

Now when I click on one of the links (e.g. GNU Source-highlight 2.2), which is http://www.gnu.org/software/src-highlite/source-highlight.html, I want the term "highlight" in bold when the page is displayed. (This I do not know how to do.)
I will index docs like HTML, PDF, Word etc.
As I have already extracted the text using textminer etc., the question is: when I click on the "show full document" link in the search results page, which I will put below the highlighted search snippet, how can I pull out the full text of the document from the index? There is no unique identifier for each doc in the index (?)

Well, as I said, from here on out this is really in the domain of your application and not Lucene and the Highlighter. Perhaps your index should have a unique identifying key field (Field.Keyword works well for this). Then your links to the full text should be augmented to carry that key, and each link needs to point to something dynamic, like a servlet, rather than a static file, passing in the id of the document to be highlighted and the query to use with Highlighter.
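
As a rough illustration of the link side of that suggestion, the sketch below builds a dynamic link that carries both the document key and the query. The servlet path "/highlight" and the parameter names "id" and "q" are assumptions made up for this example, not anything defined by Lucene or by the original application.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class FullTextLinks {
    // Hypothetical servlet mapping; adjust to wherever the highlighting servlet lives.
    static final String SERVLET_PATH = "/highlight";

    // Builds a link carrying the unique document key and the original query string.
    // The parameter names "id" and "q" are assumptions for this sketch.
    static String linkFor(String docKey, String queryStr) {
        return SERVLET_PATH + "?id=" + encode(docKey) + "&q=" + encode(queryStr);
    }

    // URL-encodes a parameter value so spaces, '&', quotes, etc. survive the round trip.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 support is mandatory on every Java platform.
            throw new IllegalStateException(e);
        }
    }
}
```

In the results page, the "show full document" link would then use something like FullTextLinks.linkFor(doc.get("id"), queryStr), assuming the key was indexed in a field named "id".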

Or do I just have to extract the text from the file stored in the file system and pass it to the highlighter?

Perhaps. I cannot decide this for you as it is, again, your application's business logic that determines where the text of a document is. You already mentioned the text is in the Lucene index in your case, so you could pass the document id (or the path, or some unique key) to the highlighting servlet, retrieve that document from Lucene, highlight the text using the NullFragmenter, and serve up that highlighted text.
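
The flow described here (key in, stored text looked up, whole text highlighted, result served) can be sketched without committing to any particular Lucene calls. In the sketch below the Map is only a stand-in for retrieving the stored text from the index by its unique key, and the BinaryOperator is a stand-in for a Highlighter configured with NullFragmenter; both, along with the class and method names, are assumptions for illustration.

```java
import java.util.Map;
import java.util.function.BinaryOperator;

final class FullTextView {
    // Stand-in for the index lookup: unique key -> stored full text.
    private final Map<String, String> storedTextByKey;
    // Stand-in for the highlighter: (full text, query) -> highlighted full text.
    private final BinaryOperator<String> highlighter;

    FullTextView(Map<String, String> storedTextByKey, BinaryOperator<String> highlighter) {
        this.storedTextByKey = storedTextByKey;
        this.highlighter = highlighter;
    }

    // Returns the highlighted full text for the given key,
    // or null when no document with that key is known.
    String render(String docKey, String queryStr) {
        String text = storedTextByKey.get(docKey);
        if (text == null) {
            return null; // the servlet would answer with a 404 here
        }
        return highlighter.apply(text, queryStr);
    }
}
```

A servlet would read the key and query from the request parameters, call render, and write the result into the response body.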

Here is the code snippet.

LuceneHitHighlighter highlighter = new LuceneHitHighlighter(queryStr, "snippet", "body");

for (int i = 0; i < hits.size(); i++) {
    Document doc = (Document) hits.get(i);
    // doHighlight stores the highlighted text in the "snippet" field of doc
    highlighter.doHighlight(doc);
    out.println("SNIPPET: " + doc.get("snippet"));
    out.println("<hr>");
}

You're adding a field to the document in doHighlight - this is not something I'd recommend. It is not persistent, just so you know. I recommend simply returning the highlighted String from that method rather than adding a field.
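
A minimal sketch of that method shape follows: doHighlight takes the text and returns a highlighted copy, leaving its input alone. The matching here is a plain case-insensitive regex used purely as a stand-in so the example runs on its own; it is not the contrib Highlighter, and the class name SnippetHighlighter is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SnippetHighlighter {
    private final Pattern termPattern;

    // Stand-in matching: a literal, case-insensitive search for a single term.
    SnippetHighlighter(String term) {
        if (term == null || term.isEmpty()) {
            throw new IllegalArgumentException("term must not be empty");
        }
        this.termPattern = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE);
    }

    // Returns a highlighted, HTML-escaped copy of the text; nothing is mutated.
    String doHighlight(String text) {
        Matcher m = termPattern.matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Escape the text between matches, then wrap the match in <b>...</b>.
            out.append(escape(text.substring(last, m.start())));
            out.append("<b>").append(escape(m.group())).append("</b>");
            last = m.end();
        }
        out.append(escape(text.substring(last)));
        return out.toString();
    }

    // Minimal HTML escaping so the text can be written into a page.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

With a method of this shape, the loop body would print the return value directly, for example out.println("SNIPPET: " + highlighter.doHighlight(doc.get("body"))), and no "snippet" field is ever added to the document.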


    Erik


