We index some documents which have an "all" field containing all of the data which can be searched on.
One of the problems we're having is that when this field is, say, 10 MB, the highlighter takes about a second to calculate the best fragments. The search itself only takes 30 milliseconds. I've accounted for the time to load the text, which is generally about 5-10x faster than the highlighting: roughly 0.1-0.2 seconds goes to loading the text from the document and the remaining 0.8-0.9 seconds to performing the highlighting.

I've overridden maxDocBytesToAnalyze so that the highlighter analyzes the entire field of the document. At least at the moment, we need to try to match against the entire document. I've also tried using a SimpleAnalyzer when the highlighting is performed, but this doesn't seem to affect performance much.

Also, I've modified the QueryScorer so it can do wildcard term matches without extracting the terms from the index (we're using a ConstantScoreQuery to get around the MaxBooleanClauses exception, and highlighting doesn't work with it). Basically, if a term doesn't match in the highlighter, it then tries to pattern match against the wildcard search terms. That adds some processing, but disabling it doesn't seem to affect performance much.

One other thing I tried was a simple regex search without using a scorer or analyzer. This runs about 2x faster, but is still relatively slow.

Has anyone had any good experience with performing fragmentation and highlighting for larger documents?

Thanks,
Brian Beard
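
For reference, the plain regex approach mentioned above can be sketched roughly as follows. This is only an illustration, not the code we actually run: the class and method names (RegexFragmenter, wildcardToPattern, fragments) are made up for the example, it uses nothing but java.util.regex, and it simply takes the first few matches instead of scoring fragments the way the highlighter does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of Lucene.
public class RegexFragmenter {

    /**
     * Converts a wildcard term such as "high*" into a case-insensitive regex.
     * Assumption: '*' and '?' are only allowed to match word characters.
     */
    static Pattern wildcardToPattern(String term) {
        StringBuilder sb = new StringBuilder("\\b");
        for (char c : term.toCharArray()) {
            if (c == '*') {
                sb.append("\\w*");
            } else if (c == '?') {
                sb.append("\\w");
            } else {
                // Quote literal characters so regex metacharacters are not interpreted.
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        sb.append("\\b");
        return Pattern.compile(sb.toString(), Pattern.CASE_INSENSITIVE);
    }

    /**
     * Returns up to maxFragments snippets, each with up to `context` characters
     * on either side of a match, and the match wrapped in <B>...</B>.
     * Scanning stops as soon as enough fragments have been collected.
     */
    static List<String> fragments(String text, Pattern p, int context, int maxFragments) {
        List<String> out = new ArrayList<>();
        Matcher m = p.matcher(text);
        int from = 0;
        while (out.size() < maxFragments && m.find(from)) {
            int start = Math.max(0, m.start() - context);
            int end = Math.min(text.length(), m.end() + context);
            out.add(text.substring(start, m.start())
                    + "<B>" + m.group() + "</B>"
                    + text.substring(m.end(), end));
            // Resume after this fragment so fragments do not overlap.
            from = end;
        }
        return out;
    }
}
```

Because the loop stops once maxFragments snippets are found, it only pays for the whole 10 MB when matches are rare. Picking the best fragments rather than the first ones would still mean scanning the entire field.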