[
https://issues.apache.org/jira/browse/LUCENE-2324?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981192#action_12981192
]
Michael Busch commented on LUCENE-2324:
---------------------------------------
I made some progress on the concurrency model, mainly by removing the need
for several locks, which simplifies the code.
- DocumentsWriterPerThreadPool.ThreadState now extends ReentrantLock, which
means that standard methods like lock() and unlock() can be used to reserve a
DWPT for a task.
- All DWPTs up to the allowed maximum (config.maxThreadStates) are now
instantiated up-front. Creating a DWPT is cheap, so this is not a performance
concern, and it makes it easier to push config changes to the DWPTs without
synchronizing on the pool and without having to worry about whether newly
created DWPTs get the same config settings.
- DocumentsWriterPerThreadPool.getActivePerThreadsIterator() gives the caller a
static snapshot of the active DWPTs at the time the iterator was acquired, e.g.
for flushAllThreads() or DW.abort(). Here synchronizing on the pool isn't
necessary either.
- Deletes are now pushed to DW.pendingDeletes() if no active DWPTs are present.
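
To illustrate the locking points above, here is a minimal sketch, not the code
from the patch. SketchThreadPool, acquire(), setMaxBufferedDocs(), and the
fields docsIndexed and maxBufferedDocs are hypothetical names made up for this
example, and "active" is assumed to mean "has been handed out at least once".
It only shows the general shape: each state is its own ReentrantLock, all
states are created up-front, and the iterator walks a private copy of the
active states.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for DocumentsWriterPerThreadPool (not the Lucene class).
class SketchThreadPool {

    // Each state IS a lock, so plain lock()/unlock() reserve it for a task.
    static final class ThreadState extends ReentrantLock {
        volatile boolean active;      // assumption: set once the state is handed out
        volatile int maxBufferedDocs; // example config value (assumption)
        int docsIndexed;              // only touched while holding this lock

        ThreadState(int maxBufferedDocs) {
            this.maxBufferedDocs = maxBufferedDocs;
        }
    }

    // Fixed-size array, filled in the constructor and never resized.
    private final ThreadState[] states;

    SketchThreadPool(int maxThreadStates, int maxBufferedDocs) {
        states = new ThreadState[maxThreadStates];
        for (int i = 0; i < states.length; i++) {
            states[i] = new ThreadState(maxBufferedDocs);
        }
    }

    // Returns a state locked by the calling thread; the caller must unlock() it.
    ThreadState acquire() {
        // First try to grab any state that is free right now.
        for (ThreadState s : states) {
            if (s.tryLock()) {
                s.active = true;
                return s;
            }
        }
        // All busy: block on a randomly chosen state.
        ThreadState s = states[ThreadLocalRandom.current().nextInt(states.length)];
        s.lock();
        s.active = true;
        return s;
    }

    // Config changes reach every state without synchronizing on the pool,
    // because all states already exist.
    void setMaxBufferedDocs(int value) {
        for (ThreadState s : states) {
            s.maxBufferedDocs = value;
        }
    }

    // Copies the currently active states; later activations do not affect
    // an iterator that was already handed out.
    Iterator<ThreadState> getActivePerThreadsIterator() {
        List<ThreadState> snapshot = new ArrayList<>();
        for (ThreadState s : states) {
            if (s.active) {
                snapshot.add(s);
            }
        }
        return snapshot.iterator();
    }
}
```

A caller would reserve a state with acquire(), do its work, and release it in
a finally block via unlock(). Something like flushAllThreads() would walk the
snapshot iterator and lock() each state in turn before touching it.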
TODOs:
- fix remaining testcases that still fail
- fix RAM tracking and flush-by-RAM
- write new testcases to test thread pool, thread assignment, etc.
- review whether all cases that were discussed in the recent comments here
work as expected (likely not :) )
- performance testing and code cleanup
> Per thread DocumentsWriters that write their own private segments
> -----------------------------------------------------------------
>
> Key: LUCENE-2324
> URL: https://issues.apache.org/jira/browse/LUCENE-2324
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Index
> Reporter: Michael Busch
> Assignee: Michael Busch
> Priority: Minor
> Fix For: Realtime Branch
>
> Attachments: LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch, LUCENE-2324-SMALL.patch,
> lucene-2324.patch, lucene-2324.patch, LUCENE-2324.patch, test.out, test.out
>
>
> See LUCENE-2293 for motivation and more details.
> I'm copying here Mike's summary he posted on 2293:
> Change the approach for how we buffer in RAM to a more isolated
> approach, whereby IW has N fully independent RAM segments
> in-process and when a doc needs to be indexed it's added to one of
> them. Each segment would also write its own doc stores and
> "normal" segment merging (not the inefficient merge we now do on
> flush) would merge them. This should be a good simplification in
> the chain (eg maybe we can remove the *PerThread classes). The
> segments can flush independently, letting us make much better
> concurrent use of IO & CPU.