Hi,
we have a system to archive mails and are facing some issues that we are
having with search and indexing performance, the following is what we
are currently facing challenges with, we are currently using lucene
version 2.2 the platform is SLES10.1 and the application is written in
Java.
* Index merging and optimization.
The index merging/optimization takes too long when large
number of documents are being
added, we have resolved this by having two indexes and
keeping one writable at all times so that mails are
constantly processed. however it would be good to find a
solution to this issue. and have the index optimization take less time
and be more efficient,
also this process consumes a lot of memory. and sometime the
application runs of memory.
* email searching
o We are creating very large indexes for emails we are
processing, the size is upto +150GB for indexes only (not
including data content), this we thought would improve
search performance since less indexes to open and read from,
however the searching taking upto minutes and sometime never
returns results
We have also tried upgrading the lucene version to 2.3 in hope to
improve performance but the results were quite the opposite. but from my
research on the internet the Lucene version 2.3 is much faster and
better so why are we seeing such inconsistency.
we are now adopting a slightly different architecture where we will be
able to split the indexes and reduce the size, however I wanted to get
some expert help, to help us with the decision of which lucene version
to use, and how to use the lucene to get the best balance between index
size, merging indexes and most importantly search performance,
Any help would be very much appreciated.
Many thanks in advance.
Maz
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]