Hi, everyone. I have a large number of XML files, 1 GB in size. I use Lucene (the
.NET edition) to index them. Each document has 8 fields, with 4 keyword
fields and 4 unstored fields. I have set the minMergeDocs to 10000 and
mergeFactor to 100. Indexing took about 2.5 hours (3 GB of main memory, P4 CPU). I
also tried in-memory indexing, which also took more than 2.5 hours. Due to the
performance requirement, I need to complete the indexing in one hour without
using a distributed or clustered system. Is that possible? Is it
faster to use Java Lucene than the .NET one? Any advice will be appreciated.
Thank you in advance.