On 21 Sep 2007, at 08:23, Jarvis wrote:
> There is a question about document length and search efficiency.
> There are two ways to index some HTML pages (ignoring some information):
> one is to both store and index the HTML content in the Lucene index, the
> other is to only index the content. Does the first method have an
> efficiency problem compared to the second, besides the increase in
> folder size?
Not sure I understand your question, but I'll give it a go.
As far as I know, storing data in a document will not affect search
speed. However, loading large amounts of stored data into a Document
will of course consume resources. Therefore it is possible to pass a
FieldSelector to the IndexReader when you retrieve a Document,
allowing you to define which fields to ignore, load, lazy load, etc.
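
To make the idea of per-field loading concrete, here is a minimal sketch in plain Java. It deliberately does not use Lucene, so it runs without the library; every type in it (FieldSelectionSketch, Choice, Selector, StoredFields) and the field names "url", "title" and "html" are hypothetical stand-ins invented for illustration. In Lucene 2.x the corresponding pieces are the FieldSelector interface, the FieldSelectorResult constants (LOAD, LAZY_LOAD, NO_LOAD among others) and IndexReader.document(int, FieldSelector); check the Javadoc of your version for the exact behaviour.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Supplier;

// Plain-Java model of per-field loading; none of these types are Lucene classes.
class FieldSelectionSketch {

    // The three basic choices named above: load now, load on access, ignore.
    enum Choice { LOAD, LAZY_LOAD, NO_LOAD }

    // Hypothetical stand-in for a selector: decides per field name.
    interface Selector {
        Choice accept(String fieldName);
    }

    // Simulated stored fields; read() stands for an expensive read from disk.
    static final class StoredFields {
        private final Map<String, String> onDisk = new LinkedHashMap<>();
        int reads = 0; // number of simulated disk reads so far

        void put(String name, String value) {
            onDisk.put(name, value);
        }

        private String read(String name) {
            reads++;
            return onDisk.get(name);
        }

        // Builds a view of the document that contains only the selected fields.
        Map<String, Supplier<String>> load(Selector selector) {
            Map<String, Supplier<String>> result = new LinkedHashMap<>();
            for (String name : onDisk.keySet()) {
                switch (selector.accept(name)) {
                    case LOAD: {
                        String value = read(name); // read immediately
                        result.put(name, () -> value);
                        break;
                    }
                    case LAZY_LOAD:
                        // read is deferred until get() is called (not cached here)
                        result.put(name, () -> read(name));
                        break;
                    case NO_LOAD:
                        break; // field is skipped entirely
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        StoredFields stored = new StoredFields();
        stored.put("url", "http://example.org/page.html");
        stored.put("title", "Example page");
        stored.put("html", "<html>... large stored page ...</html>");

        // Load the small url field eagerly, defer the large html field, skip the rest.
        Selector selector = name -> {
            if (name.equals("url")) return Choice.LOAD;
            if (name.equals("html")) return Choice.LAZY_LOAD;
            return Choice.NO_LOAD;
        };

        Map<String, Supplier<String>> doc = stored.load(selector);
        System.out.println("fields visible: " + doc.keySet());
        System.out.println("reads after load: " + stored.reads);
        doc.get("html").get(); // only now is the large field read
        System.out.println("reads after touching html: " + stored.reads);
    }
}
```

The point of the sketch is that the cost of a large stored field is only paid when that field is actually read, which is why storing the HTML need not slow down searching or result listing as long as the large field is skipped or loaded lazily.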
I hope this helps.
--
karl