If you want to stick with the approach of multiple indexes you'll have
to add some logic to work around it.
Option 1.
Post merge, loop through all docs, identifying duplicates and deleting
the one(s) you don't want (see the sketch after these options).
Option 2.
Pre merge, read all indexes in parallel, identifying and deleting
duplicates before they ever reach the merged index.
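Something along these lines would do for Option 1 (Lucene 3.x API; the stored unique-key field, called "id" here, and the index path are only placeholders):

import java.io.File;
import java.util.HashSet;
import java.util.Set;

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class DedupAfterMerge {
    public static void main(String[] args) throws Exception {
        Directory dir = FSDirectory.open(new File("/path/to/merged-index"));
        // Open read-write (readOnly = false) so deleteDocument() is allowed.
        IndexReader reader = IndexReader.open(dir, false);
        Set<String> seen = new HashSet<String>();
        try {
            for (int i = 0; i < reader.maxDoc(); i++) {
                if (reader.isDeleted(i)) continue;
                // "id" is an assumed stored unique-key field.
                String id = reader.document(i).get("id");
                if (!seen.add(id)) {
                    reader.deleteDocument(i);   // keep the first copy, drop later ones
                }
            }
        } finally {
            reader.close();                     // flushes the deletions to the index
        }
    }
}

Option 2 is the same idea run over each source index before you call addIndexes().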
Hello
I have a peculiar problem to deal with and I am sure there must be some
way to handle it.
1. Indexes exist on the server for existing files.
2. Index generation is automated, so newly generated files also lead to
index generation.
3. I am merging the newly generated indexes with the existing ones.
> ...optimizing the index just after merging it, no matter if I use the Lucene
> 3.x addIndexes or addIndexesNoOptimize, as the sum of time of doing both
> things will be the same in one case or the other. Am I right?
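For reference, roughly the sequence being asked about, written against the Lucene 3.x API (directories and analyzer are placeholders, not anything from the original post):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

// Merge the new indexes into the target and then collapse to one segment.
// With the 3.x API, addIndexes(Directory...) copies segments without
// optimizing (it replaced the old addIndexesNoOptimize); the explicit
// optimize() afterwards is what does the big merge.
void mergeThenOptimize(Directory targetDir, Directory[] newIndexDirs) throws Exception {
    IndexWriter w = new IndexWriter(targetDir,
            new IndexWriterConfig(Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36)));
    try {
        w.addIndexes(newIndexDirs);
        w.optimize();   // deprecated in later 3.x releases, but it is the call being discussed
    } finally {
        w.close();
    }
}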
> ...index using compound file (the merged one). If I do:
> w.optimize();
> w.close(false);
> Would I be getting any benefit from the w.close(false)?
> Thanks in advance
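As far as I understand the 3.x API, the two calls work out like this (w is the poster's IndexWriter; the comments are my reading, not a definitive answer):

// optimize() with no arguments blocks until the index has been merged down
// to a single segment, so when it returns there is normally no background
// merge left for close() to wait on.
w.optimize();
// close(false) tells the writer not to wait for (and to abort) any still
// running merges; after a blocking optimize() the difference is small.
w.close(false);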
> ...into 6... till getting a single big index. Could this be faster?
> Does anyone have experience with this? Any advice?
> Thanks in advance
>> ...Field.Store.YES, Field.Index.ANALYZED));
>>
>> doc.add(new Field("contents", indexForm.getContent(),
>> Field.Store.YES, Field.Index.ANALYZED));
>>
>> writer.updateDocument(new Term("" + i), doc);
>>
> no changes still .. Am I doing something wrong??? Help me
...ANALYZED));
doc.add(new Field("contents", indexForm.getContent(),
Field.Store.YES, Field.Index.ANALYZED));
writer.updateDocument(new Term("id"), doc);
but still no change .. where am I going wrong??
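One likely fix: updateDocument() deletes every document matching the Term and then adds the new one, so the Term needs both a field name and a value, and that value has to be indexed verbatim on the documents. A sketch against the 3.x API, where the "id" field name is an assumption and i, indexForm and writer come from the snippets above:

import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.Term;

// Sketch only: "id" as the unique-key field is an assumption; indexForm,
// writer and i are from the original snippets.
Document doc = new Document();
doc.add(new Field("id", String.valueOf(i),
        Field.Store.YES, Field.Index.NOT_ANALYZED));   // indexed as-is so the Term matches
doc.add(new Field("contents", indexForm.getContent(),
        Field.Store.YES, Field.Index.ANALYZED));

// The Term must carry both the field name and the value of the doc to replace.
writer.updateDocument(new Term("id", String.valueOf(i)), doc);
writer.commit();   // make the change visible to newly opened readers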
...appends to the existing indexes; can anyone help?
How do I remove duplicate documents by updating the existing index??
Pretty sure you can delete the small indexes after the merge.
BTW: how long do your indexing and merging take, respectively?
--
Chris Lu
Hello:
Currently, I have a large index being built in sections. The indexing
program has multiple threads which it uses to optimize time; each thread
makes its own, separate index to avoid threads fighting over resources. At
the end of the program, the indexes are merged into a single index.
...
With ConcurrentMergeScheduler, adding all indexes at once to a single
IndexWriter will use multiple threads to do the merging, assuming you
have enough total segments that need merging (> 2 X mergeFactor will
use 2 threads; > 3 X mergeFactor will use 3, etc.; CMS defaults to a max
of 3 merge threads).
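A rough sketch of that setup against the Lucene 3.x API (target and source directories are placeholders; older releases would use IndexWriter.setMergeScheduler() and addIndexesNoOptimize() instead):

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.ConcurrentMergeScheduler;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.util.Version;

public class ParallelMerge {
    public static void merge(Directory targetDir, Directory[] sourceDirs) throws Exception {
        // CMS is the default scheduler; constructing it explicitly just makes
        // the point. Its thread count can be tuned via setMaxThreadCount().
        ConcurrentMergeScheduler cms = new ConcurrentMergeScheduler();

        IndexWriterConfig cfg = new IndexWriterConfig(
                Version.LUCENE_36, new StandardAnalyzer(Version.LUCENE_36));
        cfg.setMergeScheduler(cms);

        IndexWriter writer = new IndexWriter(targetDir, cfg);
        try {
            writer.addIndexes(sourceDirs);  // copy all source indexes in one call
            writer.maybeMerge();            // ask the merge policy for pending merges;
                                            // CMS runs them on background threads
        } finally {
            writer.close();                 // waits for the background merges to finish
        }
    }
}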
Subject: Merging indexes & multicore/multithreading
Let's say I have 8 indexes on a 4 core system and I want to merge them
(inside a single vm instance).
Is it better to do a single merge of all 8, or to merge them in pairs in
parallel threads until there is only a single index left? I guess the
question involves how multi-threaded the merging is and whether it...
Thanks Karsten,
I decided first to delete all duplicates from the master (iW) and then to
insert all the temporary indices (other).
I reached the same conclusion. As your code shows, it's a simple enough
solution. You had a good point with the iW.abort() in the rollback case.
Antony
> a) ...delete from the master the documents matching the UIDs
> of the incoming documents.
>
> b) iterate through the Documents in the temporary index and add them to
> the master
>
> b sounds worse as it seems an IndexWriter's Analyzer cannot be null and I
> guess there's a penalty in assembling...
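A rough sketch of option a), assuming a stored, un-analyzed "uid" field as the unique key (the field name is an assumption) and the 3.x addIndexes(Directory...); older releases would use addIndexesNoOptimize():

import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.Directory;

public class MergeWithoutDuplicates {
    public static void merge(IndexWriter master, Directory tempDir) throws Exception {
        IndexReader temp = IndexReader.open(tempDir);
        try {
            for (int i = 0; i < temp.maxDoc(); i++) {
                if (temp.isDeleted(i)) continue;
                String uid = temp.document(i).get("uid");
                // Remove any existing copy in the master before the merge.
                master.deleteDocuments(new Term("uid", uid));
            }
        } finally {
            temp.close();
        }
        master.addIndexes(tempDir);  // now pull the temporary index in
        master.commit();
    }
}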
I am creating several temporary batches of indexes in separate indices and
periodically merge those batches into a set of master indices. I'm using
IndexWriter#addIndexesNoOptimize(), but the problem that gives me is that the
master may already contain the index for that document and I get a duplicate...
...deletion is done, just get a new IndexReader instance to access
the new documents.
Aviran
http://www.aviransplace.com
Subject: merging indexes together
Hello All.
In my program I index new information to a temporary dir and then I
delete outdated information from the main index and add the new information by
calling indexWriter.addIndexes(). This works fine when the doc number
is relatively small, but when the index size grows, every call to addIndexes()...
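If the slowdown comes from the implicit optimize that the old addIndexes(Directory[]) performed on every call, later releases (2.3 and up) added addIndexesNoOptimize(), which only copies the new segments over, so the cost stops growing with the size of the main index. A minimal sketch, with mainWriter and tempDir as placeholders:

// Sketch (Lucene 2.3+ era API; mainWriter is the writer on the main index,
// tempDir the temporary directory): addIndexesNoOptimize() copies the new
// segments without optimizing the whole main index on every call.
mainWriter.addIndexesNoOptimize(new Directory[] { tempDir });
mainWriter.close();   // or keep it open and commit/flush, depending on version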