Hi Brian,

I don't know the internals of highlighting ("explanation") in Lucene. But I know that XTF ( http://xtf.wiki.sourceforge.net/underHood_Documents#tocunderHood_Documents5 ) can handle highlighting on very large documents (above 100 MByte) very fast. The difference from your approach is that XTF divides the document into small (overlapping) chunks and stores the original text separately as XML, linked to the Lucene-indexed fields via numbered XML nodes. For large texts (above 200 KByte), it is the best tool I know.
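Very roughly, the chunking idea looks like this (just a sketch of the idea, not XTF's actual code; the Chunk class, the field names mentioned in the comments, and the 4 KB / 200 character sizes are only my example values):

import java.util.ArrayList;
import java.util.List;

/** Minimal sketch of chunk-based indexing for fast highlighting. */
public class ChunkSketch {

    /** A chunk of the original text plus its position, so a hit in the
     *  chunk can be mapped back to the separately stored full document. */
    static class Chunk {
        final int ordinal;   // chunk number, the link back to the stored text
        final int start;     // character offset of the chunk in the original text
        final String text;   // the small text the highlighter actually has to scan
        Chunk(int ordinal, int start, String text) {
            this.ordinal = ordinal;
            this.start = start;
            this.text = text;
        }
    }

    /**
     * Split a large field into small overlapping chunks. Each chunk would be
     * indexed as its own small Lucene document (e.g. fields "chunkText",
     * "chunkNo", "docId"), so the highlighter only ever fragments a few KB
     * instead of the whole 10 MB field.
     */
    static List<Chunk> chunk(String text, int chunkSize, int overlap) {
        List<Chunk> chunks = new ArrayList<Chunk>();
        int step = chunkSize - overlap;
        int ordinal = 0;
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(new Chunk(ordinal++, start, text.substring(start, end)));
            if (end == text.length()) break;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // e.g. ~4 KB chunks with 200 characters of overlap, so phrases
        // crossing a chunk boundary are not lost
        List<Chunk> chunks = chunk(bigDocumentText(), 4096, 200);
        System.out.println("chunks: " + chunks.size());
    }

    static String bigDocumentText() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100000; i++) sb.append("some document text ");
        return sb.toString();
    }
}

At search time you would then highlight only the matching chunk documents and use the chunk number / offset to show the snippet in the context of the stored full text.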
Best regards
Karsten

Beard, Brian wrote:
>
> We index some documents which have an "all" field containing all of the
> data which can be searched on.
>
> One of the problems we're having is when this field is say 10 MBytes the
> highlighter takes about a second to calculate the best fragments. The
> search only takes 30 milliseconds. I've accommodated the load time for
> the text, which is about 5-10X faster in general, so 0.1-0.2 seconds for
> loading text from the document, and the other 0.8-0.9 performing
> highlighting.