Michael,

Our application includes indexing and archiving documents to meet compliance requirements.

A couple of reasons led us to the merge approach:

- Source documents are written to archive media, and retrieval is relatively slow. Add to that our processing pipeline (including text extraction)... Retrieving and merging minis is faster than re-processing and re-indexing from sources.
- In addition to index recovery, mini indexes may be combined into custom indexes based on policy. From a compliance viewpoint, the mini indexes contain logically related documents. For example: based on a retention policy, documents of type x are to be kept for y years. One use for a custom index would be legal discovery.

Thanks,
david.

On 4/18/07, Michael D. Curtin <[EMAIL PROTECTED]> wrote:
d m wrote:

> I'd like to share index merge performance data and have a couple
> of questions about it...
>
> We (AXS-One, www.axsone.com) build one "master" index per day.
> For backup and recovery purposes, we also build many individual
> "mini" indexes from the docs added to the master index.
>
> Should one of our master indexes become unusable (for whatever
> reason - and I'm glad to say this has not yet happened), we plan to
> reconstruct it by merging its mini indexes.

The possible merge bug notwithstanding, let's take a step back in abstraction: are you sure the relatively-complex iterative merge process you've described buys you anything over a simple backup-the-whole-index approach? Or a backup-the-source-data-and-reindex approach? Merging is I/O intensive, and the scheme you've outlined is re-reading and re-writing all the index data several times anyway -- it might not be saving you much over a full reindex.

Since the scenario you're trying to protect against is a very rare occurrence (so far at least), would it be better to spend your development time on improving the application than devising (and debugging, and testing, ...) a complicated backup and recovery scheme?

--MDC

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]