I'm not sure if it would help my particular situation, but is there any way to provide the option of specifying the compression level? The level used by Lucene (level 9) is the maximum possible compression level. Ideally I would like to be able to alter the compression level on the basis of the field size. This way I can smooth out the compression times across the various document sizes. I am more interested in consistent time than in consistent compression.
I agree, we should make the compression level configurable. It's disturbing that it takes minutes to compress a 4.5 MB document! I'll open a Jira issue for this.
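To make the idea concrete, here is a rough sketch of choosing the level from the field size, written directly against java.util.zip.Deflater. This is not Lucene API: the class name FieldCompressor, the method levelForSize, and the size thresholds are all made up for illustration, and the thresholds would need tuning against real documents.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.Deflater;

// Hypothetical helper, not part of Lucene.
class FieldCompressor {

    // Assumed thresholds: small fields get the best compression,
    // large fields get the fastest level so compression time stays bounded.
    static int levelForSize(int numBytes) {
        if (numBytes < 64 * 1024) {
            return Deflater.BEST_COMPRESSION; // level 9
        }
        if (numBytes < 1024 * 1024) {
            return 6; // zlib's usual default trade-off
        }
        return Deflater.BEST_SPEED; // level 1
    }

    // Compress a byte array with the given Deflater level (0-9).
    static byte[] compress(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out =
                new ByteArrayOutputStream(Math.max(32, input.length / 2));
            byte[] buf = new byte[8192];
            while (!deflater.finished()) {
                int n = deflater.deflate(buf);
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }
}
```

A caller would then do something like FieldCompressor.compress(bytes, FieldCompressor.levelForSize(bytes.length)) before storing the field.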
Or... could there be some other reason my document takes this long to index (and holds up all other threads)?
You might want to try just running the command-line "zip" utility, specifying best compression, to see how long it takes? Lucene is just using java.util.zip.* APIs (which is the same compression as "zip").
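If it's easier to test from Java, a small timing harness along these lines would show how the level affects compression time. This is only a sketch: the class name DeflateTiming is made up, and the synthetic 4.5 MB input is a stand-in, so you would want to substitute the bytes of your real document to get meaningful numbers.

```java
import java.util.Random;
import java.util.zip.Deflater;

// Hypothetical timing harness, not part of Lucene.
class DeflateTiming {

    // Deflate the input at the given level and return the compressed size in bytes.
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buf = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buf);
            }
            return total;
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    public static void main(String[] args) {
        // Stand-in data: roughly 4.5 MB of words drawn from a small vocabulary.
        String[] words = {"lucene", "index", "field", "document", "compress", "thread"};
        Random random = new Random(42);
        StringBuilder sb = new StringBuilder();
        while (sb.length() < 4_500_000) {
            sb.append(words[random.nextInt(words.length)]).append(' ');
        }
        byte[] data = sb.toString().getBytes(java.nio.charset.StandardCharsets.UTF_8);

        for (int level : new int[] {Deflater.BEST_SPEED, 6, Deflater.BEST_COMPRESSION}) {
            long start = System.nanoTime();
            int size = compressedSize(data, level);
            long millis = (System.nanoTime() - start) / 1_000_000;
            System.out.printf("level %d: %d -> %d bytes in %d ms%n",
                level, data.length, size, millis);
        }
    }
}
```

If level 9 on your actual document takes seconds rather than minutes here, the slowdown is probably coming from somewhere other than the compression itself.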
One correction: this compression should not block other threads. This runs outside of "synchronized" code, meaning, if you have other threads adding documents, they can do so fully in parallel with your one thread that's doing the slow compression.
Mike