Hi Brian,

I don't know the internals of highlighting ("explanation") in Lucene. But I know that XTF ( http://xtf.wiki.sourceforge.net/underHood_Documents#tocunderHood_Documents5 ) can handle highlighting on very large documents (above 100 MByte) very fast. The difference from your approach is that XTF divides the document into small (overlapping) chunks and stores the original text separately as XML, linked to the Lucene-indexed fields via numbered XML nodes. For large texts (above 200 KByte), it is the best tool I know.
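Very roughly, the chunking idea looks like this (just a sketch of the idea, not XTF's actual code; the Chunk class, the field names mentioned in the comments, and the 4 KB / 200 character sizes are only my example values):

import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of chunk-based indexing for fast highlighting. */
public class ChunkSketch {

    /** A chunk of the original text plus its position, so a hit in the
     *  chunk can be mapped back to the separately stored full document. */
    static class Chunk {
        final int ordinal;   // chunk number, the link back to the stored text
        final int start;     // character offset of the chunk in the original text
        final String text;   // the small text the highlighter actually has to scan
        Chunk(int ordinal, int start, String text) {
            this.ordinal = ordinal;
            this.start = start;
            this.text = text;
        }
    }

    /**
     * Split a large field into small overlapping chunks. Each chunk would be
     * indexed as its own small Lucene document (e.g. fields "chunkText",
     * "chunkNo", "docId"), so the highlighter only ever fragments a few KB
     * instead of the whole 10 MB field.
     */
    static List<Chunk> chunk(String text, int chunkSize, int overlap) {
        List<Chunk> chunks = new ArrayList<Chunk>();
        int step = chunkSize - overlap;
        int ordinal = 0;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(new Chunk(ordinal++, start, text.substring(start, end)));
            if (end == text.length()) break;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // e.g. ~4 KB chunks with 200 characters of overlap, so phrases
        // crossing a chunk boundary are not lost
        List<Chunk> chunks = chunk(bigDocumentText(), 4096, 200);
        System.out.println("chunks: " + chunks.size());
    }

    static String bigDocumentText() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) sb.append("some document text ");
        return sb.toString();
    }
}

At search time you would then highlight only the matching chunk documents and use the chunk number / offset to show the snippet in the context of the stored full text.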
Best regards
Karsten

Beard, Brian wrote:
>
> We index some documents which have an "all" field containing all of the
> data which can be searched on.
>
> One of the problems we're having is when this field is say 10 MBytes the
> highlighter takes about a second to calculate the best fragments. The
> search only takes 30 milliseconds. I've accommodated the load time for
> the text, which is about 5-10X faster in general, so 0.1-0.2 seconds for
> loading text from the document, and the other 0.8-0.9 performing
> highlighting.