RE: Lucene multithreaded indexing problems

2013-11-26 Thread Uwe Schindler
tor (e.g. Java 7 G1 Collector or Java 6 CMS Collector). Other garbage collectors may do GCs in a single thread ("stop-the-world"). Uwe - U

Re: Lucene multithreaded indexing problems

2013-11-26 Thread Igor Shalyminov
ollector (e.g. Java 7 G1 Collector or Java 6 CMS Collector). Other garbage collectors may do GCs in a single thread ("stop-the-world"). Uwe - Uwe Schindl

RE: Lucene multithreaded indexing problems

2013-11-25 Thread Uwe Schindler
>> http://www.thetaphi.de > >> eMail: u...@thetaphi.de > >>> -Original Message- > >>> From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru] > >>> Sent: Saturday, November 23, 2013 4:46 PM > >>> To: java-user@luce

Re: Lucene multithreaded indexing problems

2013-11-25 Thread Desidero
ported" setup :-) Lucene has no problem with that setup and can index. Be sure:
- Don't give too much heap to your indexing app. Larger heaps create much more GC load.
- Use a suitable Garbage collector (e.g. Java 7 G1 Collector or Java

Re: Lucene multithreaded indexing problems

2013-11-25 Thread Desidero
>> - > >> Uwe Schindler > >> H.-H.-Meier-Allee 63, D-28213 Bremen > >> http://www.thetaphi.de > >> eMail: u...@thetaphi.de > >>> -Original Message- > >>> From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru] >

Re: Lucene multithreaded indexing problems

2013-11-25 Thread Igor Shalyminov
>>  Uwe >>  - >>  Uwe Schindler >>  H.-H.-Meier-Allee 63, D-28213 Bremen >>  http://www.thetaphi.de >>  eMail: u...@thetaphi.de >>>  -Original Message- >>>  From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru] >>>  Sent:

Re: Lucene multithreaded indexing problems

2013-11-23 Thread Daniel Penning
riginal Message- From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru] Sent: Saturday, November 23, 2013 4:46 PM To: java-user@lucene.apache.org Subject: Re: Lucene multithreaded indexing problems So we return to the initially described setup: multiple parallel workers, each making "p

Re: Lucene multithreaded indexing problems

2013-11-23 Thread Daniel Penning
Maybe you should turn on garbage collection logging to confirm that you are running into some kind of memory problem (start the JVM with -verbose:gc). If the GC runs very often as soon as your indexing process slows down, I would suggest creating a heap dump and checking what the memory is us
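
Editorial sketch, not part of the original message: besides starting the JVM with -verbose:gc, GC activity can also be sampled from inside the process through the standard java.lang.management API. The class name GcActivitySketch and the allocation loop in main are hypothetical and only stand in for a real indexing workload.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class GcActivitySketch {

    /** Sums collection counts over all collectors; undefined (-1) values are skipped. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count >= 0) {
                total += count;
            }
        }
        return total;
    }

    /** Sums approximate accumulated collection time in milliseconds; undefined (-1) values are skipped. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis >= 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Hypothetical workload: allocate many short-lived strings,
        // roughly what parsing documents might do.
        StringBuilder keep = new StringBuilder();
        for (int i = 0; i < 1_000_000; i++) {
            String token = "token-" + i;
            if (i % 100_000 == 0) {
                keep.append(token.length());
            }
        }

        System.out.println("collections during workload: " + (totalCollections() - countBefore));
        System.out.println("approx. GC time (ms): " + (totalCollectionMillis() - millisBefore));
    }
}
```

If the collection count grows quickly while indexing throughput drops, that would be consistent with the memory pressure suspected above; a heap dump is still needed to see what actually occupies the heap.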

RE: Lucene multithreaded indexing problems

2013-11-23 Thread Uwe Schindler
e eMail: u...@thetaphi.de > -Original Message- > From: Igor Shalyminov [mailto:ishalymi...@yandex-team.ru] > Sent: Saturday, November 23, 2013 4:46 PM > To: java-user@lucene.apache.org > Subject: Re: Lucene multithreaded indexing problems > > So we return to the initially de

Re: Lucene multithreaded indexing problems

2013-11-23 Thread Igor Shalyminov
wrong, too. >>>  Uwe >>> >>>  - >>>  Uwe Schindler >>>  H.-H.-Meier-Allee 63, D-28213 Bremen >>>  http://www.thetaphi.de >>>  eMail: u...@thetaphi.de >>>>   -Original Message- >>>>   From: Igor Shalym

Re: Lucene multithreaded indexing problems

2013-11-22 Thread Uwe Schindler
hope you will not do), things >will go wrong, too. >> >> Uwe >> >> - >> Uwe Schindler >> H.-H.-Meier-Allee 63, D-28213 Bremen >> http://www.thetaphi.de >> eMail: u...@thetaphi.de >> >>>  -Original Message- >>>  From: Igor Shalym

Re: Lucene multithreaded indexing problems

2013-11-22 Thread Igor Shalyminov
r Shalyminov [mailto:ishalymi...@yandex-team.ru] >>  Sent: Thursday, November 21, 2013 4:45 PM >>  To: java-user@lucene.apache.org >>  Subject: Lucene multithreaded indexing problems >> >>  Hello! >> >>  I tried to perform indexing multithreadedly, with a F

Lucene multithreaded indexing problems

2013-11-21 Thread Igor Shalyminov
Hello! I tried to perform indexing in multiple threads, with a FixedThreadPool of Callable workers. The main operation - parsing a single document and addDocument() to the index - is done by a single worker. After parsing a document, a lot (really a lot) of Strings appear, and at the end of the wo
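
Editorial sketch, not part of the original message: the setup described here (a fixed thread pool of Callable workers, each parsing one document and adding it to a shared index) might look roughly like the following. DocumentSink and parse() are hypothetical stand-ins so that the example runs without Lucene; in a real application the sink would be one shared IndexWriter, which is safe to call from several threads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelIndexingSketch {

    /** Hypothetical stand-in for a shared, thread-safe index writer. */
    public interface DocumentSink {
        void addDocument(List<String> tokens);
    }

    /** Hypothetical parser: splits raw text on whitespace. */
    public static List<String> parse(String rawText) {
        List<String> tokens = new ArrayList<>();
        for (String token : rawText.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Runs one Callable per document on a fixed pool; returns the total token count. */
    public static int indexAll(List<String> rawDocs, DocumentSink sink, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (String raw : rawDocs) {
                tasks.add(() -> {
                    // Parsed tokens live only inside this task, so they become
                    // garbage as soon as the document has been handed over.
                    List<String> tokens = parse(raw);
                    sink.addDocument(tokens);
                    return tokens.size();
                });
            }
            int total = 0;
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } catch (ExecutionException e) {
            throw new RuntimeException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        ConcurrentLinkedQueue<List<String>> indexed = new ConcurrentLinkedQueue<>();
        int total = indexAll(Arrays.asList("a b c", "d e", "f"), indexed::add, 4);
        System.out.println(indexed.size() + " documents, " + total + " tokens");
    }
}
```

Keeping the parsed strings local to each task is a deliberate choice in this sketch: nothing holds on to them after addDocument() returns, so the pool size bounds how much parsed data is alive at once.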

RE: Lucene multithreaded indexing problems

2013-11-21 Thread Uwe Schindler
.ru] > Sent: Thursday, November 21, 2013 4:45 PM > To: java-user@lucene.apache.org > Subject: Lucene multithreaded indexing problems > > Hello! > > I tried to perform indexing multithreadedly, with a FixedThreadPool of > Callable workers. > The main operation - parsing a si

Indexing Problems

2006-04-07 Thread trupti mulajkar
Hello, I have modified IndexFiles.java to read the document numbers from within the TREC files, which are also being read correctly. However, the index fails to create the .cfs file, so the search query does not return the correct document number. Any suggestions on how this can be sorted out? chee

Re: indexing problems

2006-03-07 Thread Apache Lucene
BTW, I could access that index using Luke. It works fine. On 3/7/06, Apache Lucene <[EMAIL PROTECTED]> wrote: > > This line is throwing a null pointer exception for the index I created as > I mentioned in my previous emails. > > searcher = new IndexSearcher(IndexReader.open(indexPath) ); > > Any

Re: indexing problems

2006-03-07 Thread Apache Lucene
This line is throwing a null pointer exception for the index I created as I mentioned in my previous emails. searcher = new IndexSearcher(IndexReader.open(indexPath) ); Any ideas? I made sure the indexPath is a valid path. thanks, lucenenator On 3/7/06, Erik Hatcher <[EMAIL PROTECTED]> wrote:

Re: indexing problems

2006-03-07 Thread Erik Hatcher
On Mar 7, 2006, at 10:41 AM, Apache Lucene wrote:
> Is it advisable to use compound file format? or should I revert it back to simple file format? How do I revert it back?
There is a setter on IndexWriter to set it back if you like. The compound format avoids the issues that cropped up a

Re: indexing problems

2006-03-07 Thread Apache Lucene
Is it advisable to use compound file format? or should I revert it back to simple file format? How do I revert it back? thanks, lucenenator On 3/7/06, Yonik Seeley <[EMAIL PROTECTED]> wrote: > > You are using the compound file format (the default since 1.4) and the > .cfs file contains all thos

Re: indexing problems

2006-03-07 Thread Yonik Seeley
You are using the compound file format (the default since 1.4) and the .cfs file contains all those individual parts. -Yonik On 3/7/06, Apache Lucene <[EMAIL PROTECTED]> wrote: > Hi, >I am using Lucene 1.9.1 to index the files. The index writer created > the following files > (1) segment

indexing problems

2006-03-07 Thread Apache Lucene
Hi, I am using Lucene 1.9.1 to index the files. The index writer created the following files: (1) segment file "segments" (2) deletable file "deletable" (3) compound file "cfs". None of the other files, like term info, frequency, etc., were created. Is there something obvious I am doing wrong?

Re: Indexing problems in a dictionary

2005-09-03 Thread Ahmet Aksoy
Hi Paul, I decided to use a minimum number of stop words in my application. I hope it will work better. Following your suggestion, I made a few trials and found my optimum values. The following values look best in my case: mergeFactor = 100; minMergeDocs = 500; maxMergeDocs = 1000

Re: Indexing problems in a dictionary

2005-09-03 Thread Paul Elschot
Ahmet, On Saturday 03 September 2005 10:12, Ahmet Aksoy wrote: > Hi, > I'm using Lucene in an open source java project at > http://belletmen.dev.java.net . > In the project there are several dictionaries with a simple structure. > All items are composed of a "phrase", and a "definition". Both pa

Indexing problems in a dictionary

2005-09-03 Thread Ahmet Aksoy
Hi, I'm using Lucene in an open source java project at http://belletmen.dev.java.net . In the project there are several dictionaries with a simple structure. All items are composed of a "phrase", and a "definition". Both parts might contain a single word, or have lots of words. Since both part