I run Lucene.Net as well, and your indexing performance is dependent on more
factors aside from whether you're using the Java or C# version.  As a basic
suggestion, learn what you can about minMergeDocs and mergeFactor as well as
the compound file format.  Try different combinations to understand what is
faster vs. slower.

As a strategy for your specific scenario, you might consider building
several indexes in parallel, then merging the indexes at the end.

Hope this helps.

-- j


On 3/22/06, hu andy <[EMAIL PROTECTED]> wrote:
>
> Hi,everyone. I have a large mount of xml files of size 1G. I use
> lucene(the
> dotNet edition) to index . There are 8 fields for a document, with 4
> keyword
> fields and 4 unstored fields. I have set the minMergeDocs to 10000 and
> mergeFactor to 100. It took about 2.5 hours (main memeory 3G, CPU p4 ) .I
> also try in-memory indexing  which is also more than 2.5hours.  Due to the
> performance requirement , I need complete the indexing in one hour without
> the use of distributing or clustering system . Cant it be possible?  Is it
> faster to use java Lucene than dotNet one? Any advice will be appreciated.
> Thank you in advance.
>
>

Reply via email to