Hi Yonik,
Your patch has corrected the thread thrashing problem on multi-CPU systems.
I've tested it with both 1.4.3 and 1.9. I haven't seen a 100X performance
gain, but that's because I'm caching QueryFilters and Lucene is caching the
sort fields.
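(By "caching QueryFilters" I mean constructing each filter once and reusing the
instance across searches, so its per-reader BitSet is only computed the first
time -- roughly like the sketch below, where the field/value and the 'searcher'
and 'userQuery' variables are just placeholders:)

  // Build the filter once and reuse it; QueryFilter caches its BitSet per
  // IndexReader, so later searches against the same reader skip re-running
  // the filter query.
  Filter activeOnly = new QueryFilter(new TermQuery(new Term("status", "active")));
  Hits hits = searcher.search(userQuery, activeOnly);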
Thanks for the fast response!
btw, I had previous
Here's the patch:
http://issues.apache.org/jira/browse/LUCENE-454
It resulted in quite a performance boost indeed!
On 10/12/05, Yonik Seeley <[EMAIL PROTECTED]> wrote:
>
> Thanks for the trace Peter, and great catch!
> It certainly does look like avoiding the construction of the docMap for a
> MultiTermEnum will be a significant optimization.
Thanks for the trace Peter, and great catch!
It certainly does look like avoiding the construction of the docMap for a
MultiTermEnum will be a significant optimization.
-Yonik
Now hiring -- http://tinyurl.com/7m67g
On 10/12/05, Peter Keegan <[EMAIL PROTECTED]> wrote:
>
> Here is one stack trace:
Here is one stack trace:
Full thread dump Java HotSpot(TM) Client VM (1.5.0_03-b07 mixed mode):
"Thread-6" prio=5 tid=0x6cf7a7f0 nid=0x59e50 waiting for monitor entry
[0x6d2cf000..0x6d2cfd6c]
at org.apache.lucene.index.SegmentReader.isDeleted(SegmentReader.java:241)
- waiting to lock <0x04e40278>
I'm pretty sure it doesn't solve the problem in general (it isn't a
thread-safe solution, for sure; you mentioned the memory barrier, and I'd add
compiler optimizations). If it works, it must be something
application-specific: maybe synchronization isn't really needed there,
or you just don't do an
> We've been using this in production for a while and it fixed the
> extremely slow searches when there are deleted documents.
Who was the caller of isDeleted()? There may be an opportunity for an easy
optimization: grab the BitVector and reuse it instead of repeatedly
calling isDeleted() on the reader.
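Something along these lines, for example (just a sketch -- it assumes code that
lives in org.apache.lucene.index so it can read SegmentReader's package-private
deletedDocs BitVector; the variable names are made up):

  // Read the deleted-docs vector once under the reader's lock, then iterate
  // without re-entering the synchronized isDeleted(n) for every document.
  BitVector deleted;
  synchronized (segmentReader) {
    deleted = segmentReader.deletedDocs;
  }
  int maxDoc = segmentReader.maxDoc();
  for (int i = 0; i < maxDoc; i++) {
    if (deleted != null && deleted.get(i))
      continue;                          // skip deleted docs
    // ... process live document i ...
  }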
I'm not sure that looks like a safe patch.
Synchronization does more than help prevent races... it also introduces
memory barriers.
Removing synchronization around objects that can change is very tricky business
(witness the double-checked locking antipattern).
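For anyone who hasn't run into it, the broken double-checked locking idiom
looks like this ('Helper' is just a placeholder class). Without the memory
barrier that a synchronized block (or volatile) provides, the unsynchronized
first check can observe a reference to an object whose fields aren't written yet:

  class Lazy {
    private static Helper instance;            // note: not volatile
    static Helper getInstance() {
      if (instance == null) {                  // unsynchronized read: no memory barrier
        synchronized (Lazy.class) {
          if (instance == null)
            instance = new Helper();           // reference can become visible before
        }                                      // the Helper's fields are initialized
      }
      return instance;
    }
  }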
-Yonik
Now hiring -- http://tinyurl.com/7m67g
Hi Peter,
I observed the same issue on a multiprocessor machine. I included a
small fix for this in the NIO patch (against the 1.9 trunk) here:
http://issues.apache.org/jira/browse/LUCENE-414#action_12322523
The change amounts to the following methods in SegmentReader.java, to
remove the need for synchronization.
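For context, the method in question looks roughly like this in the 1.4/1.9
sources (paraphrased -- see the JIRA issue above for the exact diff); the
contention comes from every call taking the SegmentReader's monitor:

  // Every isDeleted() call locks the SegmentReader, so concurrent searchers
  // serialize on this monitor for each document they check.
  public synchronized boolean isDeleted(int n) {
    return (deletedDocs != null && deletedDocs.get(n));
  }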
> If the index is in 'search/read-only' mode, is there a way around this
bottleneck?
The obvious answer (to answer my own question) is to optimize the index.
But the question remains: why is the docMap created and never used?
Peter
On a multi-CPU system, this loop that builds the docMap array can cause severe
thread thrashing because of the synchronized method 'isDeleted'. I have
observed this on an index with over 1 million documents (which contains a
few thousand deleted docs) when multiple threads perform a search with
either
Lokesh Bajaj wrote:
> For a very large index where we might want to delete/replace some documents,
> this would require a lot of memory (for 100 million documents, this would need
> 381 MB of memory). Is there any reason why this was implemented this way?
In practice this has not been an issue. A
I noticed the following code that builds the "docMap" array in
SegmentMergeInfo.java for the case where some documents might be deleted from
an index:
  // build array which maps document numbers around deletions
  if (reader.hasDeletions()) {
    int maxDoc = reader.maxDoc();
    docMap = new int[maxDoc];
    int j = 0;
    for (int i = 0; i < maxDoc; i++)
      docMap[i] = reader.isDeleted(i) ? -1 : j++;   // one synchronized call per document
  }
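One way to avoid paying that cost when the map is never consulted would be to
build it lazily behind an accessor, e.g. a hypothetical getDocMap() like the
sketch below (this just illustrates the idea, not necessarily what the
LUCENE-454 patch does):

  private int[] docMap;                 // built on first use, not in the constructor

  int[] getDocMap() {
    if (docMap == null && reader.hasDeletions()) {
      int maxDoc = reader.maxDoc();
      docMap = new int[maxDoc];
      int j = 0;
      for (int i = 0; i < maxDoc; i++)
        docMap[i] = reader.isDeleted(i) ? -1 : j++;
    }
    return docMap;
  }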