One way to mitigate the cost of this kind of thing is to create a series of
indexes on portions of your corpus and then merge them. Say you have 10,000
documents: create 10 separate indexes of 1,000 documents each, then use
IndexWriter.addIndexes to merge them all into a single index.

This presupposes you have a neat way of partitioning your files so that you
don't index any of them in more than one of the sub-indexes.
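
A minimal sketch of the merge step, assuming a Lucene 2.x-era API (the
paths, the partition count, and the StandardAnalyzer choice here are just
placeholders, and exact method signatures vary by Lucene version):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MergeSubIndexes {
    public static void main(String[] args) throws Exception {
        // Create the target index that will hold the merged result.
        IndexWriter merger = new IndexWriter(
                "C:/sweetpea/merged_index", new StandardAnalyzer(), true);

        // Open each existing sub-index, built earlier in its own pass
        // (hypothetical paths; one directory per 1,000-doc partition).
        Directory[] parts = new Directory[10];
        for (int i = 0; i < parts.length; i++) {
            parts[i] = FSDirectory.getDirectory(
                    "C:/sweetpea/sub_index_" + i, false);
        }

        merger.addIndexes(parts); // merge all sub-indexes into the target
        merger.optimize();        // optionally collapse to fewer segments
        merger.close();
    }
}

Each sub-index gets built by its own IndexWriter in a separate run, so a
crash at hour 7 only costs you the partition that was in progress.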

This doesn't really solve the underlying problem, but at least when the
thing crashes, you lose less work before trying the next thing <G>....

Erick

On 1/25/07, maureen tanuwidjaja <[EMAIL PROTECTED]> wrote:

  Hi,
  I am indexing thousands of XML documents, then it stops after indexing
for about 7 hrs

  ...
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37003.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37004.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37008.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\3701.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37010.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37015.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\3702.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37022.xml
  Indexing C:\sweetpea\wikipedia_xmlfiles\part-0\37027.xml
  java.io.IOException: Lock obtain timed out: [EMAIL PROTECTED]:\sweetpea\dual_index\DI\write.lock
  java.lang.NullPointerException

  Can anyone suggest how to overcome this lock time out error?

  Thanks and Regards,
  Maureen



