On 21 Sep 2007, at 08:23, Jarvis wrote:

I have a question about document length and search efficiency.

There are two ways to index some HTML pages (ignoring some information): one is to both store and index the HTML content in the Lucene index, the other is to
only index the content. Does the first method have an efficiency problem
compared to the second, besides the increase in folder size?

Not sure I understand your question, but I'll give it a go.

As far as I know, storing data in a document will not affect search speed. However, loading large amounts of stored data into a Document will of course consume resources. For that reason you can pass a FieldSelector to the IndexReader when you retrieve a Document, which lets you define which fields to ignore, load, lazy load, etc.
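
To make the idea concrete, here is a rough sketch of what selective field loading amounts to. It is a toy model in plain Java, not Lucene code: every name in it (FieldSelectionSketch, Decision, Selector, loadDocument, readStored) is made up for illustration, and the "stored fields" are just a map. In Lucene itself the corresponding pieces are the FieldSelector interface and IndexReader.document(int, FieldSelector).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Toy model only: these types are hypothetical stand-ins, not Lucene classes.
public class FieldSelectionSketch {

    /** What to do with one stored field when a document is retrieved. */
    public enum Decision { LOAD, LAZY_LOAD, NO_LOAD }

    /** Maps a field name to a decision, in the spirit of a field selector. */
    public interface Selector {
        Decision accept(String fieldName);
    }

    /** Counts simulated reads of stored values, to make the cost visible. */
    public static int reads = 0;

    /** Simulates fetching one stored value from the index files. */
    static String readStored(Map<String, String> stored, String name) {
        reads++;
        return stored.get(name);
    }

    /** Builds a document view that only touches the fields the selector asks for. */
    public static Map<String, Supplier<String>> loadDocument(
            Map<String, String> stored, Selector selector) {
        Map<String, Supplier<String>> doc = new LinkedHashMap<>();
        for (String name : stored.keySet()) {
            switch (selector.accept(name)) {
                case LOAD: {
                    // Read the value right away.
                    String value = readStored(stored, name);
                    doc.put(name, () -> value);
                    break;
                }
                case LAZY_LOAD:
                    // Defer the read until the value is requested
                    // (this sketch re-reads on every get() and does not cache).
                    doc.put(name, () -> readStored(stored, name));
                    break;
                case NO_LOAD:
                    // Skip the field entirely.
                    break;
            }
        }
        return doc;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new LinkedHashMap<>();
        stored.put("title", "Some page");
        stored.put("body", "<html>... a large page ...</html>");
        stored.put("path", "pages/some-page.html");

        Selector selector = name -> {
            if (name.equals("title")) return Decision.LOAD;
            if (name.equals("body")) return Decision.LAZY_LOAD;
            return Decision.NO_LOAD;
        };

        Map<String, Supplier<String>> doc = loadDocument(stored, selector);
        System.out.println("fields available: " + doc.keySet());
        System.out.println("reads after load: " + reads);
        doc.get("body").get(); // the large field is only read here
        System.out.println("reads after touching body: " + reads);
    }
}
```

The point of the sketch is only that the cost of large stored fields is paid when a document is retrieved, and only for the fields you ask for; it says nothing about how Lucene lays stored fields out on disk.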

I hope this helps.

--
karl
