Re: Index Merging

2012-02-20 Thread Ian Lea
There is nothing in core Lucene to do this and I don't recall seeing anything in contrib. One approach would be to loop through all the docs in the second index, deleting them if present in the first index, commit that change, then merge the two indexes. -- Ian. On Mon, Feb 20, 2012 at 11:58 AM, Kar
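The delete-then-merge approach Ian describes can be sketched without any Lucene calls. The snippet below is only an illustration in plain Python: each "index" is a list of dicts, and the unique field name `id` and the function name `merge_without_duplicates` are assumptions, not anything from the thread or from the Lucene API.

```python
def merge_without_duplicates(first, second, key="id"):
    """Merge two toy 'indexes' (lists of dicts), keeping the first index's
    copy of any document that appears in both.

    `key` is a hypothetical unique-identifier field; a real index would need
    some equivalent field to decide whether a document is "present".
    """
    # Collect the identifiers already present in the first index.
    first_keys = {doc[key] for doc in first}

    # Step 1: drop from the second index every doc also found in the first.
    # (In a real index this would be a delete followed by a commit.)
    second_pruned = [doc for doc in second if doc[key] not in first_keys]

    # Step 2: merge the first index with what is left of the second.
    return first + second_pruned


index_a = [{"id": 1, "body": "alpha"}, {"id": 2, "body": "beta"}]
index_b = [{"id": 2, "body": "beta (copy)"}, {"id": 3, "body": "gamma"}]
merged = merge_without_duplicates(index_a, index_b)
print([doc["id"] for doc in merged])
```

The design choice here mirrors the message: duplicates are removed from the second index before merging, so the merge itself never has to resolve conflicts.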

Re: Index Merging Space Requirements

2008-03-13 Thread Michael McCandless
Well ... yes and no? Yes, the Log*MergePolicy will still at certain times merge the index all the way down to one segment. If mergeFactor is 10 then this will happen every "power of 10" flushed segments. I.e., after 10 flushes a merge will merge them down to 1 segment, then after 100 flush
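The "power of 10" behaviour described above can be illustrated with a small simulation. This is a simplified model, not Lucene's actual Log*MergePolicy: it assumes every flush produces one equal-sized segment and that a merge fires whenever the last `merge_factor` segments sit at the same level. The function name `collapse_points` is hypothetical.

```python
def collapse_points(merge_factor, total_flushes):
    """Return the flush counts at which a merge leaves exactly one segment.

    Simplified model (an assumption, not Lucene's real policy): each flush
    adds a level-0 segment, and whenever the newest `merge_factor` segments
    share a level they are merged into one segment at the next level.
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")

    segments = []  # one entry per segment, holding that segment's level
    points = []
    for flush in range(1, total_flushes + 1):
        segments.append(0)  # a newly flushed segment
        merged = False
        # Cascade merges while the tail holds merge_factor same-level segments.
        while (len(segments) >= merge_factor
               and len(set(segments[-merge_factor:])) == 1):
            level = segments[-1]
            del segments[-merge_factor:]
            segments.append(level + 1)
            merged = True
        if merged and len(segments) == 1:
            points.append(flush)
    return points


print(collapse_points(10, 1000))  # → [10, 100, 1000]
```

Under these assumptions the index collapses to a single segment only at flush counts that are powers of the merge factor, which matches the pattern in the message.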

Re: Index Merging Space Requirements

2008-03-13 Thread Mark Miller
Thanks a lot Mike...one more question: I remember reading that a regular addDocument call could basically trigger an optimize on a given call. Is this true? Maybe not true anymore? It doesn't sound right to me, but I do remember reading about it. This was pre background merging when it was men

Re: Index Merging Space Requirements

2008-03-13 Thread Michael McCandless
Yes this should reduce transient (while merging) disk usage. However, optimize disregards this parameter, so it will still use the same disk space. However, if you call optimize(N) then that should use less space since it does not merge all the way down to 1 segment. Note that the limit

RE: Index merging and optimizing

2008-01-15 Thread spring
> But it also seems that the parallel/not parallel decision is > something you control on the back end, so I'm not sure the user > is involved in the merge question at all. In other words, you could > easily split the indexing task up amongst several machines and/or > processes and combine all the

Re: Index merging and optimizing

2008-01-14 Thread Erick Erickson
OK, I think I'm getting a better handle here. I can't imagine how it would work to combine indexes that use *different* analyzers on the *same* field. Regardless of what Lucene did, you simply could NOT explain this to a user. To take a simple example, index part of your data for field1 with Keywor

RE: Index merging and optimizing

2008-01-14 Thread spring
> Then why would you want to combine them? > > I really think you need to explain what you're trying to accomplish > rather than obsess on the details. I have to create indexes in parallel because the amount of data is very high. Then I want to merge them into bigger indexes and move them to the s

Re: Index merging and optimizing

2008-01-14 Thread Erick Erickson
Then why would you want to combine them? I really think you need to explain what you're trying to accomplish rather than obsess on the details. Erick On Jan 14, 2008 10:17 AM, <[EMAIL PROTECTED]> wrote: > > I admit I've never used IndexMergeTool, I've always used > > IndexWriter.addIndexes and

RE: Index merging and optimizing

2008-01-14 Thread spring
> I admit I've never used IndexMergeTool, I've always used > IndexWriter.addIndexes and then execute > IndexWriter.optimize(). > > And I've seen no problems. That call takes no > analyzer. So you take the first index and add the remaining indexes via addIndexes? What happens if the indexes were crea

Re: Index merging and optimizing

2008-01-14 Thread Erick Erickson
I admit I've never used IndexMergeTool, I've always used IndexWriter.addIndexes and then execute IndexWriter.optimize(). And I've seen no problems. That call takes no analyzer. Erick On Jan 14, 2008 6:12 AM, <[EMAIL PROTECTED]> wrote: > > See org.apache.lucene.misc.IndexMergeTool > > Thank you.

RE: Index merging and optimizing

2008-01-14 Thread spring
> See org.apache.lucene.misc.IndexMergeTool Thank you. But this uses a hardcoded analyzer and deprecated API calls. How does the analyzer used affect the merge process? Is everything reindexed with this new analyzer again? Does this make sense? What if the source indexes had other analyzers us

Re: Index merging and optimizing

2008-01-13 Thread Erick Erickson
See IndexWriter.addIndexes. See org.apache.lucene.misc.IndexMergeTool Erick On Jan 13, 2008 12:10 PM, <[EMAIL PROTECTED]> wrote: > Hi, > > are there any ready-to-use tools out there which I can use for merging and > optimizing? > > I have seen that Luke can optimize, but not merge? > > Or do I ha

Re: index merging

2006-02-15 Thread Daniel Noll
Omar Didi wrote: I have tried to use the isCurrent() method of IndexReader to figure out if an index is merging, but since I have to do this every time I need to add a document, the performance got so slow. Here is what I am doing: I create 4 indexes and I am running with 4 threads. I do a round r

RE: index merging

2006-02-15 Thread Omar Didi
rging? thanks for any hints, - Omar -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: Monday, February 06, 2006 10:03 AM To: java-user@lucene.apache.org Subject: Re: index merging On 2/6/06, Vanlerberghe, Luc <[EMAIL PROTECTED]> wrote: > Sorry to contradict you

Re: index merging

2006-02-06 Thread Yonik Seeley
On 2/6/06, Vanlerberghe, Luc <[EMAIL PROTECTED]> wrote: > Sorry to contradict you Yonik, but I'm pretty sure the commit lock is > *not* locked during a merge, only while the "segments" file is being > updated. Oops, you're right. Good thing too... if the commit lock was held during merges, one co

RE: index merging

2006-02-05 Thread Vanlerberghe, Luc
ile yet. Luc -Original Message- From: Yonik Seeley [mailto:[EMAIL PROTECTED] Sent: donderdag 2 februari 2006 22:25 To: java-user@lucene.apache.org Subject: Re: index merging On 2/2/06, Omar Didi <[EMAIL PROTECTED]> wrote: > Thanks Yonik, > I can't set the merge factor to

Re: index merging

2006-02-02 Thread Yonik Seeley
On 2/2/06, Omar Didi <[EMAIL PROTECTED]> wrote: > Thanks Yonik, > I can't set the merge factor too high because I will end up with the too many > files open problem. Right. I meant only for adding a lot of documents. After a lot of adds, then you could set the mergefactor back down to a reasona

RE: index merging

2006-02-02 Thread Omar Didi
eeley [mailto:[EMAIL PROTECTED] Sent: Thursday, February 02, 2006 3:53 PM To: java-user@lucene.apache.org Subject: Re: index merging No, there isn't anything in the API to tell you that. A merge may be triggered on any IndexWriter.add() call. If you want to avoid merges, you can set the merge fac

Re: index merging

2006-02-02 Thread Yonik Seeley
No, there isn't anything in the API to tell you that. A merge may be triggered on any IndexWriter.add() call. If you want to avoid merges, you can set the merge factor really high so that merges will never happen, and set maxBufferedDocs to the size of the segments you want. -Yonik On 2/2/06, Om
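To make the effect of the two settings concrete, here is a toy simulation in plain Python. It is a sketch under simplifying assumptions, not Lucene code: documents are buffered until `max_buffered_docs` is reached, each flush writes one segment, and a merge fires when the last `merge_factor` segments have equal size. The function name `index_documents` and its parameter names are hypothetical stand-ins for the settings mentioned above.

```python
def index_documents(num_docs, max_buffered_docs, merge_factor):
    """Simulate adding documents and return (segment_sizes, merge_count).

    Simplified model (an assumption): a flush happens every
    `max_buffered_docs` adds, and the newest `merge_factor` equal-sized
    segments are merged into one. Docs still buffered at the end are not
    flushed in this sketch.
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")

    segments = []  # size (in docs) of each on-disk segment
    buffered = 0
    merges = 0
    for _ in range(num_docs):
        buffered += 1
        if buffered == max_buffered_docs:
            segments.append(buffered)  # flush the buffer as a new segment
            buffered = 0
            # Any add that causes a flush may also cascade into merges.
            while (len(segments) >= merge_factor
                   and len(set(segments[-merge_factor:])) == 1):
                merged_size = sum(segments[-merge_factor:])
                del segments[-merge_factor:]
                segments.append(merged_size)
                merges += 1
    return segments, merges


# Moderate merge factor: the tenth flush triggers a merge.
print(index_documents(1000, 100, 10))

# Very high merge factor: no merges, segments stay at the buffer size.
print(index_documents(1000, 100, 10**9))
```

With the very high merge factor, the model never merges and every segment has exactly `max_buffered_docs` documents, which is the outcome the message describes.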