Re: IndexWriter.addIndexes with LeafReader parameter

2016-01-18 Thread Manner Róbert
Hi Christoph, Thanks for the suggestion, it seems to work fine. I have somehow missed this class. Br, Robert On Wed, Jan 13, 2016 at 9:30 AM, Christoph Kaser wrote: > You could try using org.apache.lucene.index.SlowCodecReaderWrapper to wrap > your index reader: > SlowCodecReaderWrapper.wrap(ind

Re: IndexWriter.addIndexes with LeafReader parameter

2016-01-13 Thread Christoph Kaser
You could try using org.apache.lucene.index.SlowCodecReaderWrapper to wrap your index reader: SlowCodecReaderWrapper.wrap(indexReader) returns a CodecReader from an index reader. Regards Christoph On 13.01.2016 at 09:09, Manner Róbert wrote: Unfortunately I can not use that, because I do not wa

Re: IndexWriter.addIndexes with LeafReader parameter

2016-01-13 Thread Manner Róbert
Unfortunately I cannot use that, because I do not want to copy all the indexes. Our use case is "archiving" of indexes: we would like to copy part of the indexes (for example, the part that is more than a month old) to a separate file and then remove it from the original. We achieved it by writing a Reader which does the filtering,

Re: IndexWriter.addIndexes with LeafReader parameter

2016-01-12 Thread Dawid Weiss
You can addIndexes(Directory... dirs) -- then you don't have to deal with CodecReader? Dawid On Tue, Jan 12, 2016 at 4:43 PM, Manner Róbert wrote: > Hi, > > we have used lucene 4.7.0 before, we are on the way to upgrade to 5.4.0. > > The problem I have is that writer.addIndexes now needs CodecRe

RE: IndexWriter.addIndexes() multithread correct?

2014-05-22 Thread Uwe Schindler
e Schindler > Cc: java-user@lucene.apache.org > Subject: Re: IndexWriter.addIndexes() multithread correct? > > Hi Uwe, > > thanks for the reply. > I see slightly different order in the results, and I was thinking are docs > with > the same score and the differe

Re: IndexWriter.addIndexes() multithread correct?

2014-05-22 Thread Erick Erickson
Right, for docs with the same score, ties are broken by the internal Lucene ID. This may even change _on the same node_ due to merges! If you want to control this, consider always specifying a secondary sort by, say, your id field if you have one, or date stamp or... Best, Erick On Thu, May 2
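
The secondary-sort advice above can be illustrated without Lucene. The following is a minimal plain-Java sketch, not Lucene's API: the Hit class, its id field, and the rank method are hypothetical names made up for this illustration. It orders hits by descending score and breaks ties on a stable id, so the result does not depend on the order in which the documents were inserted or merged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TieBreakSketch {
    // Hypothetical stand-in for a search hit: an application-level id plus a score.
    static final class Hit {
        final String id;
        final float score;

        Hit(String id, float score) {
            this.id = id;
            this.score = score;
        }
    }

    // Returns the ids ordered by descending score, with ties broken by ascending id.
    static List<String> rank(List<Hit> hits) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort((a, b) -> {
            int byScore = Float.compare(b.score, a.score); // higher score first
            return byScore != 0 ? byScore : a.id.compareTo(b.id); // stable tie-break
        });
        List<String> ids = new ArrayList<>();
        for (Hit h : sorted) {
            ids.add(h.id);
        }
        return ids;
    }

    public static void main(String[] args) {
        // The same three hits in two different insertion orders.
        List<Hit> first = Arrays.asList(new Hit("b", 1.0f), new Hit("a", 1.0f), new Hit("c", 2.0f));
        List<Hit> second = Arrays.asList(new Hit("a", 1.0f), new Hit("c", 2.0f), new Hit("b", 1.0f));
        System.out.println(rank(first));  // [c, a, b]
        System.out.println(rank(second)); // [c, a, b]
    }
}
```

Without the tie-break on id, the relative order of "a" and "b" would follow the input order, which is the effect described in this thread.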

Re: IndexWriter.addIndexes() multithread correct?

2014-05-22 Thread Nicola Buso
Hi Uwe, thanks for the reply. I see a slightly different order in the results, and I was wondering whether these are docs with the same score, so that the difference between the two indexes (one built with multithreaded "addIndexes" and one without) comes from the order of insertion into the merged index; could this be the difference

RE: IndexWriter.addIndexes() multithread correct?

2014-05-22 Thread Uwe Schindler
Hi Nicola, Yes, it is thread safe, like all other methods in IndexWriter. In the case of the one taking IndexReaders, Lucene will also do the merging concurrently. Say hello to Jo McEntyre! :-) Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@theta

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-08 Thread Michael McCandless
On Mon, Jun 7, 2010 at 7:19 AM, Regan Heath wrote: > >>> That's pretty much exactly what I suspected was happening.  I've had the > same >>> problem myself on another occasion... out of interest is there any way to >>> force the file closed without flushing? >> >>No, IndexOutput has no such method

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-07 Thread Regan Heath
>> That's pretty much exactly what I suspected was happening.  I've had the same >> problem myself on another occasion... out of interest is there any way to >> force the file closed without flushing? > >No, IndexOutput has no such method. We could consider adding one... That sounds useful in ge

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-07 Thread Michael McCandless
On Mon, Jun 7, 2010 at 6:18 AM, Regan Heath wrote: > > That's pretty much exactly what I suspected was happening.  I've had the same > problem myself on another occasion... out of interest is there any way to > force the file closed without flushing? No, IndexOutput has no such method. We could

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-07 Thread Regan Heath
That's pretty much exactly what I suspected was happening. I've had the same problem myself on another occasion... out of interest is there any way to force the file closed without flushing? From memory I tried everything I could think of at the time but couldn't manage it. Best I could do was

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-07 Thread Michael McCandless
This is a bug in how Lucene handles IOException while closing files. Look at SegmentMerger's sources, for 2.3.2: https://svn.apache.org/repos/asf/lucene/java/tags/lucene_2_3_2/src/java/org/apache/lucene/index/SegmentMerger.java Look at the finally clause in mergeTerms: } finally {
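
The Lucene source referenced above is cut off in this listing, so the following is not that code. It is a generic plain-Java sketch of one common way such a bug arises: when several files are closed one after another in a single finally clause, an IOException from an early close() can skip the remaining calls and leave those files open. The names closeAll and Tracked are hypothetical and exist only for this illustration.

```java
import java.io.Closeable;
import java.io.IOException;

class CloseAllSketch {
    // Hypothetical resource that records whether close() was attempted
    // and can simulate a failing close.
    static final class Tracked implements Closeable {
        final boolean failOnClose;
        boolean closeAttempted;

        Tracked(boolean failOnClose) {
            this.failOnClose = failOnClose;
        }

        @Override
        public void close() throws IOException {
            closeAttempted = true;
            if (failOnClose) {
                throw new IOException("simulated failure while closing");
            }
        }
    }

    // Attempts to close every resource even if some of them throw.
    // The first IOException is rethrown at the end; later ones are attached as suppressed.
    static void closeAll(Closeable... resources) throws IOException {
        IOException first = null;
        for (Closeable r : resources) {
            try {
                if (r != null) {
                    r.close();
                }
            } catch (IOException e) {
                if (first == null) {
                    first = e;
                } else {
                    first.addSuppressed(e);
                }
            }
        }
        if (first != null) {
            throw first;
        }
    }

    public static void main(String[] args) {
        Tracked failing = new Tracked(true);
        Tracked healthy = new Tracked(false);
        try {
            closeAll(failing, healthy);
        } catch (IOException e) {
            System.out.println("close failed: " + e.getMessage());
        }
        // The second resource was still closed despite the first failure.
        System.out.println("second close attempted: " + healthy.closeAttempted);
    }
}
```

Calling failing.close() and healthy.close() back to back in a plain finally block would instead stop at the first exception and never reach the second resource.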

Re: indexWriter.addIndexes, Disk space, and open files

2010-06-07 Thread Regan Heath
If you don't want to use the ImDisk software, a small flash drive will do just as well... Regan Heath wrote: > > Windows XP. > > The problem occurs on the local file system, but to replicate it more > easily I am using http://www.ltr-data.se/opencode.html#ImDisk to mount a > virtual 10mb dis

Re: indexWriter.addIndexes, Disk space, and open files

2010-05-25 Thread Regan Heath
Windows XP. The problem occurs on the local file system, but to replicate it more easily I am using http://www.ltr-data.se/opencode.html#ImDisk to mount a virtual 10 MB disk on F:\. It is formatted as an NTFS file system. The files can be removed normally (delete from explorer or command promp

Re: indexWriter.addIndexes, Disk space, and open files

2010-05-25 Thread Erick Erickson
What operating system and what file system are you using? Is the file system local or networked? What does it take to remove the files? That is, can you do it manually after the program shuts down? Best Erick On Tue, May 25, 2010 at 5:42 AM, Regan Heath < regan.he...@bridgeheadsoftware.com> wrote: > >

Re: IndexWriter.addIndexes & optimizatio

2006-06-27 Thread Karel Tejnora
Depends on the document type; look at the setOmitNorms method in the Field class. heritrix.lucene wrote: Hi, I have processed approx. 50 million up to now. I set both maxMergeFactor and maxBufferedDocs to 1000. I arrived at this value after several rounds of test runs. Indexing rate for each document in 50 M, is

Re: IndexWriter.addIndexes & optimizatio

2006-06-12 Thread heritrix . lucene
Hi, I have processed approx. 50 million up to now. I set both maxMergeFactor and maxBufferedDocs to 1000. I arrived at this value after several rounds of test runs. The indexing rate over the 50 M documents is 1 document per 4.85 ms. I am only using FSDirectory. Is there any other way to reduce this time?

Re: IndexWriter.addIndexes & optimizatio

2006-06-12 Thread Erick Erickson
A billion? Wow! First, I really, really, really doubt you can use a RAMdir to index a billion documents. I'd be interested in the parameters of your problem if you could. I'd be especially interested in providing a home for any of your old hardware, since I bet it beats mine all to hell. Second,

Re: IndexWriter.addIndexes & optimizatio

2006-06-12 Thread heritrix . lucene
: vipin sharma [mailto:[EMAIL PROTECTED] > Sent: Monday, June 12, 2006 12:31 PM > To: java-user@lucene.apache.org; Otis Gospodnetic > Subject: Re: IndexWriter.addIndexes & optimizatio > > - > Just set your maxBufferedDocs to as high a number as your RAM/heap will > let you, a

RE: IndexWriter.addIndexes & optimizatio

2006-06-11 Thread Flik Shen
per values according to your box physical setting. > -Original Message- > From: vipin sharma [mailto:[EMAIL PROTECTED] > Sent: Monday, June 12, 2006 12:31 PM > To: java-user@lucene.apache.org; Otis Gospodnetic > Subject: Re: IndexWriter.addIndexes & optimizatio > >

Re: IndexWriter.addIndexes & optimizatio

2006-06-11 Thread vipin sharma
ut doesn't get you in trouble with open files. Otis - Original Message From: Dan Armbrust <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, June 7, 2006 4:05:49 PM Subject: Re: IndexWriter.addIndexes & optimization Benjamin Stein wrote: > >

Re: IndexWriter.addIndexes & optimizatio

2006-06-08 Thread Yonik Seeley
On 6/8/06, Otis Gospodnetic <[EMAIL PROTECTED]> wrote: When writing a unit test that compares RAMDirectory and FSDirectory performance for Lucene in Action I had a very hard time showing that RAMDirectory really is faster. :) For indexing, even if you open IndexWriter with an FSDirectory, it i

Re: IndexWriter.addIndexes & optimization

2006-06-08 Thread Otis Gospodnetic
doesn't get you in trouble with open files. Otis - Original Message From: Dan Armbrust <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Wednesday, June 7, 2006 4:05:49 PM Subject: Re: IndexWriter.addIndexes & optimization Benjamin Stein wrote: > > I cou

Re: IndexWriter.addIndexes & optimization

2006-06-07 Thread Dan Armbrust
Benjamin Stein wrote: I could probably store the little RAMDirectories to disk as many FSDirectories, and then addIndexes() of *all* the FSDirectories at the end instead of every time. That would probably be smart. Glad I asked myself! That was what I was going to suggest - you may also wa

Re: IndexWriter.addIndexes & optimization

2006-06-07 Thread Grant Ingersoll
My understanding of the IndexWriter code is that it more or less manages this for you. It has an internal RAMDirectory which it uses to index in memory and then periodically flushes to disk based on your merge factor settings (amongst other settings). So I am not sure if the extra work you ar

Re: IndexWriter.addIndexes & optimization

2006-06-07 Thread Benjamin Stein
On 6/7/06, Benjamin Stein <[EMAIL PROTECTED]> wrote: During indexing, I have been using a RAMDirectory to store many thousands of documents in memory before flushing the buffer to disk using IndexWriter.addIndexes. For the most part this works very well, except that performance degrades tremendo

RE: IndexWriter.addIndexes

2006-03-21 Thread Frank Kunemann
Ok, thank you Otis! Frank -Original Message- From: Otis Gospodnetic [mailto:[EMAIL PROTECTED] Sent: Tuesday, March 21, 2006 4:44 PM To: java-user@lucene.apache.org Subject: Re: IndexWriter.addIndexes Hi, Yes, no IOException means all went well, I believe. Otis - Original

Re: IndexWriter.addIndexes

2006-03-21 Thread Otis Gospodnetic
Hi, Yes, no IOException means all went well, I believe. Otis - Original Message From: Frank Kunemann <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Tuesday, March 21, 2006 7:29:16 AM Subject: IndexWriter.addIndexes Hi, all I want to know about IndexWriter.addIndexes() is