Dalton, Jeffery wrote:

You mentioned that "it will scale well in the future".  Does this imply
that it doesn't scale well now?

Absolutely not. I'm saying that the practice of storing your documents outside of Lucene is, in general, a good one to follow. Lucene is brilliant, but specialized. It should be used for its intended porpoise (everything should have an intended porpoise). I wasn't slighting the Highlighter; my project has been very much the better for having used it.

What are the current limitations of the
Lucene Highlighter? How does it perform under high query load?

I don't know. I can run ten full-page text documents through the highlighter and generate a search results page in about 50 ms, about the same amount of time it took to get the hits object. And I have reason to believe I may not have used the highlighter in the most effective manner. I stopped futzing with it when I felt I was getting results back in an acceptable amount of time.

This is just a curiosity of mine, but Nutch has a separate Summarizer:
net.nutch.search.summarizer.  The Nutch summarizer looks much more
efficient (aka more simplistic) and therefore probably more scalable?
This is probably a question for the Nutch user list, but why doesn't
Nutch use the Lucene Summarizer?
Thoughts, comments?

- Jeff

-----Original Message-----
From: Dan Funk [mailto:[EMAIL PROTECTED]]
Sent: Friday, September 23, 2005 8:28 AM
To: java-user@lucene.apache.org
Subject: Re: Displaying search context

What you are doing is a good, scalable practice.  You need to store
those email messages somewhere outside of Lucene, and use a unique id to
correlate the two.  When you want to display relevant text for a search
result, find the file on disk, and pass it through the Lucene
Highlighter (see the Lucene sandbox).  This will give you what you are
looking for, and it will scale well in the future.
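To make the pattern above concrete, here is a minimal sketch that uses only the Java standard library. It is not the Lucene Highlighter and does not use any Lucene API. The class ExternalBodyStore, its methods, and the in-memory map are all hypothetical stand-ins: the map plays the role of the files on disk, the key plays the role of the unique id kept in the index, and context() is a deliberately naive substitute for what the Highlighter would do with the retrieved text.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these names come from Lucene.
final class ExternalBodyStore {
    // Stand-in for message bodies kept on disk or in a database, keyed by
    // the same unique id that would be stored with each indexed document.
    private final Map<String, String> bodiesById = new HashMap<>();

    void put(String id, String body) {
        bodiesById.put(id, body);
    }

    // Returns null when no body is stored under the given id.
    String get(String id) {
        return bodiesById.get(id);
    }

    // Case-insensitive search that keeps offsets valid for the original text.
    static int indexOfIgnoreCase(String text, String term) {
        for (int i = 0; i + term.length() <= text.length(); i++) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                return i;
            }
        }
        return -1;
    }

    // Naive context extraction: the first occurrence of the term plus up to
    // `radius` characters on each side. Returns "" when there is no match.
    static String context(String body, String term, int radius) {
        if (body == null || term == null || term.isEmpty()) {
            return "";
        }
        int hit = indexOfIgnoreCase(body, term);
        if (hit < 0) {
            return "";
        }
        int start = Math.max(0, hit - radius);
        int end = Math.min(body.length(), hit + term.length() + radius);
        return body.substring(start, end);
    }

    public static void main(String[] args) {
        ExternalBodyStore store = new ExternalBodyStore();
        store.put("msg-1", "Hi all, the quarterly report is attached. Please review it by Friday.");

        // In a real application the id would come from a search hit.
        String id = "msg-1";
        System.out.println(context(store.get(id), "report", 10));
    }
}
```

The point of the sketch is only the division of labour: the index answers "which ids match", and the text needed for display is fetched by id from wherever it actually lives. A real setup would hand that text to the Highlighter rather than to a substring search like this one.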

Anand Kishore wrote:

Hi,

I am indexing emails through Lucene. The body of the mails is stored in
an "Unstored" field. I also have a search interface set up which returns
all Documents matching my query. What I need is to display a few lines
from the body of the mails where the query term was found. How can this
be achieved, as the body is just indexed but not stored?

Thanx

- Andy
http://da-tek-ee.blogspot.com

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

--
Dan Funk
Software Engineer

Information Technology Solutions
Battelle Charlottesville Operations
1000 Research Park Boulevard, Suite 105
Charlottesville, Virginia 22911

434.984.0951 x244
434.984.0947 (fax)
[EMAIL PROTECTED]