Michael McCandless created LUCENE-6381:
------------------------------------------

             Summary: DocumentsWriterStallControl's .wait() should have a time limit
                 Key: LUCENE-6381
                 URL: https://issues.apache.org/jira/browse/LUCENE-6381
             Project: Lucene - Core
          Issue Type: Bug
            Reporter: Michael McCandless
            Assignee: Michael McCandless
             Fix For: Trunk, 5.1


This build hung:
http://build-us-00.elastic.co/job/es_core_15_centos/230/testReport/junit/org.elasticsearch.index.engine/InternalEngineTests/testDeletesAloneCanTriggerRefresh/

Only one thread was stalled in DocumentsWriterStallControl, which means we have 
a bug somewhere, because that thread should have un-stalled once the other (too 
many) threads finished flushing their segments.

I think we should make a simple defensive change here: instead of calling 
wait() with no arguments, which waits forever for a notify() or notifyAll() to 
wake it up, we should wait for up to a time limit.  That way, when a 
concurrency bug like this strikes, we won't hang forever.
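
A minimal sketch of the idea, not the actual DocumentsWriterStallControl code: 
the class name StallControlSketch, its methods, and the maxWaitMillis parameter 
are all hypothetical and exist only to illustrate a bounded wait. It relies on 
the standard Object.wait(long) and Object.notifyAll() calls.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; names do not come from Lucene.
final class StallControlSketch {
    private boolean stalled;
    private final long maxWaitMillis;

    StallControlSketch(long maxWaitMillis) {
        // Object.wait(0) means "wait forever", so reject non-positive limits.
        if (maxWaitMillis <= 0) {
            throw new IllegalArgumentException("maxWaitMillis must be > 0");
        }
        this.maxWaitMillis = maxWaitMillis;
    }

    // Called by the flushing side to stall or un-stall indexing threads.
    synchronized void updateStalled(boolean stalled) {
        this.stalled = stalled;
        if (!stalled) {
            notifyAll(); // wake every thread blocked in waitIfStalled()
        }
    }

    // Blocks while stalled, but never longer than maxWaitMillis in total.
    // Returns true if the caller gave up while still stalled.
    synchronized boolean waitIfStalled() throws InterruptedException {
        long deadline = System.nanoTime()
            + TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        while (stalled) {
            long remainingNanos = deadline - System.nanoTime();
            if (remainingNanos <= 0) {
                break; // time limit reached: stop waiting instead of hanging
            }
            // Round up to at least 1 ms so we never call wait(0).
            wait(Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos)));
        }
        return stalled;
    }
}
```

With this shape, a missed notifyAll() costs at most maxWaitMillis of delay 
rather than a permanent hang. The loop also re-checks the flag after each 
wake-up, which guards against spurious wake-ups.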

I cannot reproduce that particular hang... what's unique about that test is 
that it uses a positively minuscule (1 KB) IndexWriter RAM buffer.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
