to filter or not to filter

Dan Funk Wed, 17 Aug 2005 12:34:09 -0700

Currently I'm working with a single index where content is indexed by its original printed page. I have to show the total number of matching documents, so I end up running through all the hits and taking an order of magnitude hit on performance as I calculate the number of unique documents. It's stupid for many, many reasons.
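
To make the cost concrete, here is a minimal sketch of that counting step in plain Python rather than Lucene. The hit structure and the `doc_id` field name are assumptions made for illustration; the point is only that every page-level hit has to be visited before the unique-document total is known.

```python
def count_unique_documents(page_hits):
    """Count distinct source documents among page-level hits.

    Every hit must be visited, so the cost grows with the total
    number of matching pages, not with the size of one results page.
    """
    seen = set()
    for hit in page_hits:
        # "doc_id" is a hypothetical stored field naming the source document.
        seen.add(hit["doc_id"])
    return len(seen)


# Three matching pages that come from only two documents.
page_hits = [
    {"doc_id": "A", "page": 1, "score": 0.9},
    {"doc_id": "A", "page": 7, "score": 0.4},
    {"doc_id": "B", "page": 3, "score": 0.7},
]
unique_count = count_unique_documents(page_hits)
```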

To correct all this, I've decided to create two (maybe three) indexes for the same set of documents: in the first index there is a one-to-one relationship between the original document and the Lucene Document object. The other index is a paragraph index, where each Lucene document represents a single paragraph. I may even throw in a third index where each Lucene document represents a logical section/chapter.
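
The layout can be sketched with toy inverted indexes in plain Python. This is not Lucene code: `build_indexes`, the whitespace tokenizer, and the blank-line paragraph split are all assumptions chosen to keep the example short. It only shows the same text being indexed at two granularities.

```python
from collections import defaultdict


def tokenize(text):
    # Naive whitespace tokenizer; a real analyzer would do much more.
    return text.lower().split()


def build_indexes(documents):
    """Build a document-level and a paragraph-level index.

    documents maps a document id to its full text, with paragraphs
    separated by blank lines (an assumption for this sketch).
    """
    doc_index = defaultdict(set)   # term -> {doc_id}
    para_index = defaultdict(set)  # term -> {(doc_id, paragraph_number)}
    for doc_id, text in documents.items():
        paragraphs = [p for p in text.split("\n\n") if p.strip()]
        for para_no, para in enumerate(paragraphs):
            for term in tokenize(para):
                doc_index[term].add(doc_id)
                para_index[term].add((doc_id, para_no))
    return doc_index, para_index


documents = {
    "A": "lucene search\n\nfilter query",
    "B": "lucene index",
}
doc_index, para_index = build_indexes(documents)
```

With the document-level index, the number of matching documents for a term is just the size of one posting set, with no need to walk page or paragraph hits.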

When I'm building the search results page I'll have to execute a fair number of queries. The first query will execute on the document index; then, for each of the 10 to 20 results I'm displaying at the time, I'll execute another query to find the best paragraph and/or section.
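
A rough sketch of that two-phase flow, again in plain Python rather than Lucene. The `search` function, the term-count scoring, and the data layout are hypothetical stand-ins; the shape to notice is one document-level query followed by one paragraph-level lookup per displayed result.

```python
def score(text, term):
    # Toy relevance: how many times the term occurs in the text.
    return text.lower().split().count(term.lower())


def search(documents, term, top_n=10):
    """Two-phase search over documents given as lists of paragraphs.

    Phase 1 ranks whole documents and yields the total match count.
    Phase 2 runs once per displayed document to pick its best paragraph.
    """
    # Phase 1: document-level query.
    doc_scores = []
    for doc_id, paragraphs in documents.items():
        s = score(" ".join(paragraphs), term)
        if s > 0:
            doc_scores.append((s, doc_id))
    total_matches = len(doc_scores)
    # Highest score first, ties broken by document id.
    doc_scores.sort(key=lambda pair: (-pair[0], pair[1]))

    # Phase 2: one paragraph-level lookup per displayed result.
    results = []
    for _, doc_id in doc_scores[:top_n]:
        paragraphs = documents[doc_id]
        best_para = max(range(len(paragraphs)),
                        key=lambda i: score(paragraphs[i], term))
        results.append((doc_id, best_para))
    return total_matches, results


documents = {
    "A": ["intro text", "filter filter query"],
    "B": ["filter once", "nothing here"],
    "C": ["unrelated"],
}
total_matches, results = search(documents, "filter", top_n=10)
```

The number of second-phase lookups is bounded by the page size (10 to 20 here), not by the total number of hits.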

Is this a reasonable solution to the problem?

Thanks for the advice.
Dan

