collector (e.g. Java 7 G1 Collector or Java 6 CMS Collector). Other garbage
collectors may do GCs in a single thread ("stop-the-world").
Uwe
ported" setup :-) Lucene has no problem with that setup and can index.
Be sure:
- Don't give too much heap to your indexing app. Larger heaps create much
  more GC load.
- Use a suitable Garbage collector (e.g. Java 7 G1 Collector or Java
Uwe
-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Maybe you should turn on garbage collection logging to confirm that you
are running into some kind of memory problem (start the JVM with -verbose:gc).
If the GC runs very often once your indexing process slows
down, I would suggest that you create a heap dump and check what the memory
is us
-----Original Message-----
From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru]
Sent: Saturday, November 23, 2013 4:46 PM
To: java-user@lucene.apache.org
Subject: Re: Lucene multithreaded indexing problems

So we return to the initially described setup: multiple parallel workers, each
making "parse + indexWriter.addDocument()" for single documents with no
synchronization at my side. This setup was also bad on memory consumption and
thread blocking, as I reported.
Or did I misunderstand you?
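
For readers following the thread, the setup described above (several workers,
each doing "parse + add" for single documents with no extra synchronization)
could be sketched roughly as below. This is only an illustration under
assumptions: DocumentSink and parse() are hypothetical stand-ins invented for
this sketch so that it runs without Lucene on the classpath; in the real setup
the sink would be the shared IndexWriter and the call would be
indexWriter.addDocument().

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelIndexingSketch {

    /** Hypothetical stand-in for a thread-safe writer (assumption, not a Lucene type). */
    public interface DocumentSink {
        void addDocument(String parsedDoc);
    }

    /** Hypothetical parse step; a real one would build a Lucene Document. */
    public static String parse(String raw) {
        return raw.trim().toLowerCase();
    }

    /** Each task parses one input and adds it to the shared sink, with no extra locking. */
    public static void indexAll(List<String> rawDocs, DocumentSink sink, int threads)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (String raw : rawDocs) {
            // One document per task: parse, then add to the shared sink.
            pool.execute(() -> sink.addDocument(parse(raw)));
        }
        pool.shutdown();
        if (!pool.awaitTermination(1, TimeUnit.MINUTES)) {
            throw new IllegalStateException("indexing did not finish in time");
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // A concurrent queue plays the role of the index in this sketch.
        ConcurrentLinkedQueue<String> indexed = new ConcurrentLinkedQueue<>();
        indexAll(Arrays.asList(" Doc A ", "Doc B", "doc c "), indexed::add, 4);
        System.out.println(indexed.size() + " documents added");
    }
}
```

The sketch relies on the sink being safe to call from many threads at once,
which is why no lock is taken around addDocument().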
--
I
Hi,
Don't use addDocuments. This method is intended for so-called block indexing
(where all documents need to be in one block for block joins). Call addDocument
for each document, possibly from many threads. This way Lucene can better handle
multithreading and free memory early. There is really no
Thanks Uwe!
I changed the logic so that my workers only parse input docs into Documents,
and indexWriter does addDocuments() by itself for the chunks of 100 Documents.
Unfortunately, the behaviour still reproduces: memory usage slightly increases
with the number of processed documents, and at
Hi,
Why are you doing this? Lucene's IndexWriter can handle addDocuments in
multiple threads. And, since Lucene 4, it will process them almost completely
in parallel!
If you do the addDocuments single-threaded, you are adding an additional
bottleneck in your application. If you are doing a synchron