[ 
https://issues.apache.org/jira/browse/LUCENE-5407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13877747#comment-13877747
 ] 

Mikhail Khludnev commented on LUCENE-5407:
------------------------------------------

bq. The test does not recreate IndexWriter instances; the writer is created 
once in the main thread
Yep. My fault. 

OK, I reproduced this case. I think the problem is the following:

"main" ....
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
....
        at java.io.PipedReader.read(PipedReader.java:309)
        - locked <0x00000007ea28ada8> (a java.io.PipedReader)
        at org.apache.lucene.index.TestIndex2$ParsingReader.read(TestIndex2.java:85)
        at .....
        at org.apache.lucene.analysis.util.FilteringTokenFilter.incrementToken(FilteringTokenFilter.java:55)
....
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1179)
        at org.apache.lucene.index.TestIndex2.doIndex(TestIndex2.java:45)

"main" obtains some locks in IndexWriter and then blocks in the reader, 
waiting for the stream to be closed. The thread that should close the stream 
("Thread-1") is itself blocked trying to obtain those IndexWriter locks.

"Thread-1" ...
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for  <0x00000007e9ad1b80> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
        at ....org.apache.lucene.index.DocumentsWriterFlushControl.obtainAndLock(DocumentsWriterFlushControl.java:445)
...
        at org.apache.lucene.index.IndexWriter.addDocument(IndexWriter.java:1179)
        at org.apache.lucene.index.TestIndex2$ParsingReader.run(TestIndex2.java:99)
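
The shape of this cycle can be shown without Lucene. The following is only a 
minimal sketch under stated assumptions: a plain ReentrantLock stands in for 
whatever lock IndexWriter takes internally, and the class and method names are 
made up for illustration. The consumer holds the lock while reading from a 
pipe; the producer needs the same lock before it can write. A timed tryLock is 
used instead of lock() so that the sketch terminates rather than hanging.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class LockCycleSketch {

    /** Returns true if the producer could not get the lock while the consumer held it. */
    public static boolean producerWasBlocked() {
        // Hypothetical stand-in for a lock held during addDocument (assumption, not Lucene code).
        ReentrantLock writerLock = new ReentrantLock();
        AtomicBoolean blocked = new AtomicBoolean(false);
        try {
            PipedWriter out = new PipedWriter();
            PipedReader in = new PipedReader(out);

            // Plays the role of "Thread-1": it must take the lock before it can write.
            Thread producer = new Thread(() -> {
                try {
                    // With a plain lock() this thread would park forever.
                    if (writerLock.tryLock(500, TimeUnit.MILLISECONDS)) {
                        try {
                            out.write("nested document");
                        } finally {
                            writerLock.unlock();
                        }
                    } else {
                        blocked.set(true);
                    }
                } catch (Exception e) {
                    throw new RuntimeException(e);
                } finally {
                    // Closing the pipe is what finally releases the consumer.
                    try { out.close(); } catch (IOException ignored) { }
                }
            });

            // Plays the role of "main": it holds the lock while reading from the pipe.
            writerLock.lock();
            try {
                producer.start();
                while (in.read() != -1) {
                    // drain until the producer closes the pipe
                }
            } finally {
                writerLock.unlock();
            }
            producer.join();
            in.close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return blocked.get();
    }
}
```

Here the producer gives up after the timeout and closes the pipe, which is the 
only reason the consumer ever returns. Without the timeout, both threads would 
wait on each other indefinitely, as in the dump above.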

It sounds like you expect reentrancy ("reenterability", a J2EE meme) from 
IndexWriter and Lucene Analysis, which was never promised. Sad.
Overall, I don't think that acquiring any locks underneath the Lucene Analysis 
API is a good idea. 

I'm not familiar with Tika, but I suppose you need to pre-process the docs 
somehow and feed them into Lucene one by one or in parallel. Once again, Lucene 
indexes concurrently really well, but without thread chaining. 
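
One way to read "pre-process, then feed" is sketched below. This is an 
illustration under assumptions, not Tika or Lucene code: RawDoc, flatten and 
indexAll are hypothetical names, and the Consumer<String> stands in for a 
wrapper around IndexWriter.addDocument that builds the document from 
already-extracted text instead of a Reader. The point is that extraction 
finishes before indexing starts, so no addDocument call is ever made from 
inside another one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

public class FlatIndexingSketch {

    /** Hypothetical parsed document: its own text plus any embedded documents. */
    public static final class RawDoc {
        final String text;
        final List<RawDoc> embedded;

        public RawDoc(String text, List<RawDoc> embedded) {
            this.text = text;
            this.embedded = embedded;
        }
    }

    /** Step 1: walk the parent and all embedded docs up front; nothing is indexed here. */
    public static List<String> flatten(RawDoc root) {
        List<String> out = new ArrayList<>();
        collect(root, out);
        return out;
    }

    private static void collect(RawDoc doc, List<String> out) {
        out.add(doc.text);
        for (RawDoc child : doc.embedded) {
            collect(child, out);
        }
    }

    /** Step 2: feed the flat list in parallel; each task makes one independent call. */
    public static void indexAll(List<String> texts, Consumer<String> addDocument, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String text : texts) {
            pool.execute(() -> addDocument.accept(text));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
                throw new IllegalStateException("indexing did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The trade-off is memory: each document's text is held as a String rather than 
streamed, so very large inputs may need a bounded queue or temporary files 
instead.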





> Deadlock? while indexing reader fields in cascaded threads
> ----------------------------------------------------------
>
>                 Key: LUCENE-5407
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5407
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/index
>    Affects Versions: 4.6
>         Environment: Windows 7 64 bits, JRE 1.7.0_25 64 bits
>            Reporter: Luis Filipe Nassif
>         Attachments: Test.java, thread_dump.txt
>
>
> Apparently I found a deadlock problem with IndexWriter when using Reader 
> fields in a cascaded thread design to add documents (I am working on an 
> application integrating Tika, which has the capability to add embedded 
> documents to the index as independent documents as they are found). The 
> attached code illustrates the problem. Sometimes it stops processing and at 
> least one of the threads remains in the WAITING state. In my environment, 
> running it at most 5 times is enough to trigger the problem.


