[
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979189#action_12979189
]
Michael McCandless commented on LUCENE-2324:
--------------------------------------------
bq. The proposed change is simply the thread calling add doc will flush its
DWPT if needed, take it offline while doing so, and return it when completed.
Wait -- this is the "addDocument" case right? (I thought we were still talking
about the "flush the world" case...).
bq. I think the risk is a new DWPT likely will have been created during flush,
which'd make the returning DWPT inutile?
A new DWPT will have been created only if more than one thread is indexing docs,
right? In which case this is fine? I.e., the old DWPT (just flushed) will just
go back into rotation, and when another thread comes in it can take it?
But, you're right: maybe we should sometimes "prune" DWPTs. Or simply stop
recycling any RAM, so that a just-flushed DWPT is an empty shell.
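To make the "take it offline, flush, put it back into rotation" idea concrete, here is a minimal sketch. Everything in it is hypothetical (the Dwpt and Pool classes, maxBufferedDocs, and strings standing in for documents); it is not the realtime branch's code, just one way the checkout/return could look without a global lock:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

final class DwptPoolSketch {
    /** Hypothetical stand-in for a DWPT; it only buffers strings. */
    static final class Dwpt {
        private final List<String> buffered = new ArrayList<>();

        void add(String doc) { buffered.add(doc); }

        int numDocs() { return buffered.size(); }

        /** "Flush": return the buffered docs as a private segment, leaving an empty shell. */
        List<String> flush() {
            List<String> segment = new ArrayList<>(buffered);
            buffered.clear();
            return segment;
        }
    }

    /** Hypothetical pool of idle DWPTs; a checked-out DWPT is invisible to other threads. */
    static final class Pool {
        private final ConcurrentLinkedQueue<Dwpt> free = new ConcurrentLinkedQueue<>();
        private final ConcurrentLinkedQueue<List<String>> segments = new ConcurrentLinkedQueue<>();
        private final AtomicInteger created = new AtomicInteger();
        private final int maxBufferedDocs;

        Pool(int maxBufferedDocs) { this.maxBufferedDocs = maxBufferedDocs; }

        void addDocument(String doc) {
            Dwpt dwpt = free.poll();            // check out an idle DWPT, if any
            if (dwpt == null) {                 // none idle: another thread holds them all
                dwpt = new Dwpt();
                created.incrementAndGet();
            }
            try {
                dwpt.add(doc);
                if (dwpt.numDocs() >= maxBufferedDocs) {
                    // The DWPT is still checked out here, so it is "offline" during the flush.
                    segments.add(dwpt.flush());
                }
            } finally {
                free.add(dwpt);                 // the (possibly just-flushed) DWPT goes back into rotation
            }
        }

        int dwptsCreated() { return created.get(); }

        int segmentsFlushed() { return segments.size(); }
    }
}
```

With a single indexing thread this never creates a second DWPT, since the one DWPT is always back in the pool before the next addDocument call. The sketch never prunes, so under many threads the pool only grows; that is the case the "prune" remark above is about.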
bq. However I think we may still need the global lock for close, eg, today
we're preventing the user from adding docs during close, after this issue is
merged that behavior would change?
Well, the threads still adding docs will hit AlreadyClosedException? (But,
that's just "best effort"). The behavior of calling IW.close while other
threads are still adding docs has never been defined (and, shouldn't be) except
that we won't corrupt your index, and all docs that were indexed before .close
was called will be committed. So I think even for this case we don't need a
global lock.
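A rough sketch of what "best effort" could mean here, assuming a volatile closed flag that addDocument checks without holding any writer-wide lock. The Writer class and its fields are made up for illustration, and AlreadyClosedException below is a local stand-in, not Lucene's class:

```java
import java.util.ArrayList;
import java.util.List;

final class BestEffortCloseSketch {
    /** Local stand-in for the exception named above; not Lucene's class. */
    static final class AlreadyClosedException extends IllegalStateException {
        AlreadyClosedException(String msg) { super(msg); }
    }

    /** Hypothetical writer: strings stand in for documents. */
    static final class Writer {
        private volatile boolean closed;
        private final Object bufferLock = new Object();   // guards the two lists only
        private final List<String> buffered = new ArrayList<>();
        private final List<String> committed = new ArrayList<>();

        void addDocument(String doc) {
            // Best effort: an unlocked check, so a racing close() may slip in right after it.
            if (closed) {
                throw new AlreadyClosedException("this writer is closed");
            }
            synchronized (bufferLock) {
                buffered.add(doc);
            }
        }

        void close() {
            closed = true;   // adds that start after this point fail fast
            synchronized (bufferLock) {
                // Every add that returned before close() was called is in the buffer by now.
                committed.addAll(buffered);
                buffered.clear();
            }
        }

        List<String> committedDocs() {
            synchronized (bufferLock) {
                return new ArrayList<>(committed);
            }
        }
    }
}
```

A doc whose addDocument call races with close() may or may not be committed in this sketch, which matches the "never been defined" behavior; a doc whose add returned before close() was called always is.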
> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
> Key: LUCENE-2324
> URL: https://issues.apache.org/jira/browse/LUCENE-2324
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.