Jay Booth wrote:
I had a similar problem with threading. The cause turned out to be in the back end of what I believe was the FSDirectory class: there was a synchronized block on the actual RandomAccessFile resource when reading a block of data from it. In high-concurrency situations, threads stacked up in front of this synchronized block, and our CPU time wound up being spent thrashing between blocked threads instead of doing anything useful.
This is correct. In Lucene, multiple streams per file are created by cloning, and all clones of an FSDirectory input stream share a single RandomAccessFile and must synchronize their reads from it. MMapDirectory does not have this limitation. If your indexes are smaller than a few GB, or you are using 64-bit hardware, then MMapDirectory should work well for you. Otherwise, it would be simple to write an nio-based Directory that is unsynchronized without relying on mmap. Such a contribution would be welcome.
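To make the difference concrete, here is a minimal sketch of the two read patterns. It is not Lucene code: the class and method names are hypothetical, and it assumes a modern JDK for the temp-file and lambda conveniences. The first method mirrors the shared RandomAccessFile case, where seek and read must happen under one lock. The second uses the nio positional read, FileChannel.read(ByteBuffer, long), which takes the offset as an argument and leaves the channel's own position untouched, so callers need no lock of their own.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; names are not taken from Lucene.
public class SharedFileReads {

    /** Shared RandomAccessFile: seek + read must be atomic, so every reader takes the same lock. */
    public static void readSynchronized(RandomAccessFile file, long position, byte[] dest)
            throws IOException {
        synchronized (file) {
            file.seek(position);
            file.readFully(dest);
        }
    }

    /** Positional nio read: the offset is passed explicitly, so no shared file pointer is mutated. */
    public static void readPositional(FileChannel channel, long position, byte[] dest)
            throws IOException {
        ByteBuffer buffer = ByteBuffer.wrap(dest);
        long offset = position;
        while (buffer.hasRemaining()) {
            int n = channel.read(buffer, offset);
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
            offset += n;
        }
    }

    public static void main(String[] args) throws Exception {
        // Write a file in which each byte equals its offset, truncated to a byte.
        Path path = Files.createTempFile("shared-reads", ".bin");
        byte[] data = new byte[1 << 16];
        for (int i = 0; i < data.length; i++) {
            data[i] = (byte) i;
        }
        Files.write(path, data);

        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            Thread[] readers = new Thread[8];
            for (int t = 0; t < readers.length; t++) {
                final long start = t * 1024L;
                readers[t] = new Thread(() -> {
                    byte[] block = new byte[1024];
                    try {
                        // All threads share one channel and take no lock here.
                        readPositional(channel, start, block);
                    } catch (IOException e) {
                        throw new RuntimeException(e);
                    }
                    for (int i = 0; i < block.length; i++) {
                        if (block[i] != (byte) (start + i)) {
                            throw new IllegalStateException("corrupt read at " + (start + i));
                        }
                    }
                });
                readers[t].start();
            }
            for (Thread reader : readers) {
                reader.join();
            }
        } finally {
            Files.delete(path);
        }
    }
}
```

Whether the positional reads actually proceed in parallel depends on the JDK and the operating system, so this should be measured under your own load before relying on it.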
Making multiple IndexSearchers and FSDirectories didn't help, because in the back end Lucene consults a singleton HashMap of some kind (I don't remember the implementation) that maintains a single FSDirectory for any given index being accessed from the JVM. Multiple calls to FSDirectory.getDirectory actually return the same FSDirectory object, with synchronization at the same point.
This does not make sense to me. FSDirectory does keep a cache of FSDirectory instances, but I/O should not be synchronized on these. One should be able to open multiple input streams on the same file from a single FSDirectory. But this would not be a great solution, since file handle limits would soon become a problem.
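As a rough illustration of why an instance cache need not serialize I/O, here is a minimal sketch of a per-path cache. It is hypothetical and not Lucene's actual implementation: the class names, the use of a HashMap, and the canonical-path key are all assumptions. The point is only that the lock covers the map lookup and nothing else, so callers that receive the same instance are not thereby forced to read through a common lock.

```java
import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration; not Lucene's FSDirectory.
public class DirectoryCache {

    /** Stand-in for a directory object; it holds only the path it was opened on. */
    public static final class Dir {
        final File path;

        Dir(File path) {
            this.path = path;
        }
    }

    // One shared instance per canonical path.
    private static final Map<File, Dir> CACHE = new HashMap<>();

    /** Returns the cached instance for this path; the lock is held only for the lookup. */
    public static Dir getDirectory(File path) throws IOException {
        File canonical = path.getCanonicalFile();
        synchronized (CACHE) {
            return CACHE.computeIfAbsent(canonical, Dir::new);
        }
    }
}
```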
Doug