Ray,

The throughput is worse with NioFSDirectory than with FSDirectory
(patched and unpatched). The bottleneck still seems to be synchronization,
this time in NioFile.getChannel (7 of the 8 threads were blocked there
during one snapshot).  I tried this with 4 and 8 channels.

The throughput with the patched FSDirectory was about the same as before the
patch.
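
For reference, here is a minimal sketch of the kind of unsynchronized read
Doug describes below. It is not the LUCENE-414 code and not anything in
Lucene; the class PositionalReadSketch and the method readAt are hypothetical
names used only for illustration. It relies on FileChannel.read(ByteBuffer,
long), which reads at an explicit offset and leaves the channel's own position
alone, so threads sharing one channel do not need a lock around a seek
followed by a read. Whether this scales in practice depends on the JVM and
the OS, so treat it as something to benchmark rather than a fix.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only; not part of Lucene or the LUCENE-414 patch.
public class PositionalReadSketch {

    // Reads exactly len bytes starting at file offset pos.
    // The positional read does not modify the channel's position,
    // so no synchronized block is needed in this method.
    static byte[] readAt(FileChannel channel, long pos, int len) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(len);
        while (buf.hasRemaining()) {
            int n = channel.read(buf, pos + buf.position());
            if (n < 0) {
                throw new EOFException("read past end of file");
            }
        }
        return buf.array();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("positional-read", ".bin");
        try {
            // Fill a scratch file with a repeating byte pattern.
            byte[] data = new byte[8 * 1024];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(file, data);

            // One shared channel, eight reader threads, each with its own offset.
            try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
                Thread[] readers = new Thread[8];
                for (int t = 0; t < readers.length; t++) {
                    final long offset = t * 1024L;
                    readers[t] = new Thread(() -> {
                        try {
                            byte[] chunk = readAt(channel, offset, 1024);
                            System.out.println("offset " + offset + ": " + chunk.length + " bytes");
                        } catch (IOException e) {
                            throw new RuntimeException(e);
                        }
                    });
                    readers[t].start();
                }
                for (Thread reader : readers) {
                    reader.join();
                }
            }
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

Each clone of an input stream would keep its own offset and pass it to
readAt, instead of sharing a file pointer with the other clones.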

Thanks,
Peter


On 1/26/06, Ray Tsang <[EMAIL PROTECTED]> wrote:
>
> Speaking of NioFSDirectory, I thought there was one posted a while
> ago, is this something that can be used?
> http://issues.apache.org/jira/browse/LUCENE-414
>
> ray,
>
> On 11/22/05, Doug Cutting <[EMAIL PROTECTED]> wrote:
> > Jay Booth wrote:
> > > I had a similar problem with threading, the problem turned out to be
> > > that in the back end of the FSDirectory class I believe it was, there
> > > was a synchronized block on the actual RandomAccessFile resource when
> > > reading a block of data from it... high-concurrency situations caused
> > > threads to stack up in front of this synchronized block and our CPU
> > > time wound up being spent thrashing between blocked threads instead of
> > > doing anything useful.
> >
> > This is correct.  In Lucene, multiple streams per file are created by
> > cloning, and all clones of an FSDirectory input stream share a
> > RandomAccessFile and must synchronize input from it.  MmapDirectory does
> > not have this limitation.  If your indexes are less than a few GB or you
> > are using 64-bit hardware, then MmapDirectory should work well for you.
> > Otherwise it would be simple to write an nio-based Directory that does
> > not use mmap that is also unsynchronized.  Such a contribution would be
> > welcome.
> >
> > > Making multiple IndexSearchers and FSDirectories didn't help because
> > > in the back end, lucene consults a singleton HashMap of some kind
> > > (don't remember implementation) that maintained a single FSDirectory
> > > for any given index being accessed from the JVM... multiple calls to
> > > FSDirectory.getDirectory actually return the same FSDirectory object
> > > with synchronization at the same point.
> >
> > This does not make sense to me.  FSDirectory does keep a cache of
> > FSDirectory instances, but i/o should not be synchronized on these.  One
> > should be able to open multiple input streams on the same file from an
> > FSDirectory.  But this would not be a great solution, since file handle
> > limits would soon become a problem.
> >
> > Doug
>
