Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-11 Thread Wojtek
queryBuilder.add(new TermQuery(new Term(MAILBOX_ID_FIELD, mailboxId.serialize())), BooleanClause.Occur.MUST); queryBuilder.add(createQuery(MessageRange.one(uid)),

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
doc = searcher.doc(sDoc.doc); doc.removeFields(FLAGS_FIELD); indexFlags(doc, f); // somehow the document getting from the search lost DocValues data for the uid field, we need to re-define the field with proper DocValues. long uid

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
writer.updateDocument(new Term(ID_FIELD, doc.get(ID_FIELD)), doc); } } } ``` I was wondering if Lucene/writer configuration is not a culprit (that would result in tokenizing even StringField) but it looks fairly straightforward: ``` this.directory = directory; this.wri

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
Hey, I don't think I understand the email well but I'll try my best. In your printed docs, I see that the flag data is still tokenized. See the string that you printed: DOCS stored,indexed,tokenized,omitNorms. What does your code for adding the doc look like? Are you using StringField for adding

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Addendum, output is: ``` maxDoc: 3 maxDoc (after second flag): 3 Document stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored> Document stored,indexed,tokenized,omitNorms,indexOptions=DOCS stored,indexed,tokenized,omitNorms,indexOpti

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Thank you Gautam! This works. Now I went back to Lucene and I'm hitting the wall. In James they set document with "id" being constructed as "flag--" (e.g. ""). I run the code that updates the documents with flags and afterwards check the result. In the simple code I use a new reader from the wri

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Gautam Worah
Hey, Use a StringField instead of a TextField for the title and your test will pass. Tokenization which is enabled for TextFields, is breaking your fancy title into tokens split by spaces, which is causing your docs to not match. https://lucene.apache.org/core/9_11_0/core/org/apache/lucene/documen
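The difference can be seen in a minimal sketch (field names and the use of ByteBuffersDirectory are illustrative; assumes a recent Lucene, e.g. 9.x, on the classpath):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class StringVsTextField {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            try (IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
                Document doc = new Document();
                // StringField: indexed as one untokenized term, exactly as given
                doc.add(new StringField("id", "My Fancy Title", Field.Store.YES));
                // TextField: analyzed, so it becomes the terms [my, fancy, title]
                doc.add(new TextField("title", "My Fancy Title", Field.Store.YES));
                writer.addDocument(doc);
            }
            try (IndexReader reader = DirectoryReader.open(dir)) {
                IndexSearcher searcher = new IndexSearcher(reader);
                // Exact-term lookup matches the StringField...
                int idHits = searcher.count(new TermQuery(new Term("id", "My Fancy Title")));
                // ...but not the tokenized TextField
                int titleHits = searcher.count(new TermQuery(new Term("title", "My Fancy Title")));
                System.out.println(idHits + " " + titleHits); // 1 0
            }
        }
    }
}
```

This is why a delete-by-term (and hence updateDocument) keyed on an analyzed field silently matches nothing.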

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-10 Thread Wojtek
Hi Froh, thank you for the information. I updated the code and re-opened the reader - it seems that the update is reflected and a search for the old document doesn't yield anything, but the search for the new term fails. I output all documents (there are 2) and the second one has the new title, but when searching

Re: Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-09 Thread Michael Froh
Hi Wojtek, Thank you for linking to your test code! When you open an IndexReader, it is locked to the view of the Lucene directory at the time that it's opened. If you make changes, you'll need to open a new IndexReader before those changes are visible. I see that you tried creating a new IndexS
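The point-in-time behavior can be sketched like this (a minimal example, not the thread's actual code; assumes a recent Lucene and uses ByteBuffersDirectory for brevity):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class ReopenDemo {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document d1 = new Document();
            d1.add(new StringField("id", "1", Field.Store.YES));
            writer.addDocument(d1);
            writer.commit();

            DirectoryReader reader = DirectoryReader.open(dir); // point-in-time view: 1 doc

            Document d2 = new Document();
            d2.add(new StringField("id", "2", Field.Store.YES));
            writer.addDocument(d2);
            writer.commit();

            int stale = reader.numDocs(); // still 1: the old reader cannot see the new commit
            DirectoryReader newer = DirectoryReader.openIfChanged(reader);
            if (newer != null) {  // null would mean nothing changed
                reader.close();   // release the stale view
                reader = newer;
            }
            int fresh = reader.numDocs(); // 2: the reopened reader sees the change
            System.out.println(stale + " " + fresh); // 1 2
            reader.close();
        }
    }
}
```

A new IndexSearcher built on the old reader sees exactly what the old reader sees; the reader, not the searcher, has to be reopened.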

Updating document with IndexWriter#updateDocument doesn't seem to take effect

2024-08-09 Thread Wojtek
Hi all! There is an effort in Apache James to update to a more modern version of Lucene (ref: https://github.com/apache/james-project/pull/2342). I'm digging into the issue as others have done but I'm stumped - it seems that `org.apache.lucene.index.IndexWriter#updateDocument` doesn't update th

Re: IndexWriter concurrent flushing

2019-02-17 Thread Michael McCandless
perfection :) Mike McCandless http://blog.mikemccandless.com On Fri, Feb 15, 2019 at 4:11 PM Michael Sokolov wrote: > I noticed that commit() was taking an inordinately long time. It turned out > IndexWriter was flushing using only a single thread because it relies on > its caller to sup

IndexWriter concurrent flushing

2019-02-15 Thread Michael Sokolov
I noticed that commit() was taking an inordinately long time. It turned out IndexWriter was flushing using only a single thread because it relies on its caller to supply it with threads (via updateDocument, deleteDocument, etc), which it then "hijacks" to do flushing. If (as we do

Re: Issue with re-opening IndexWriter

2019-02-07 Thread damian.pawski
Hi I am having a similar issue on Win2012, we have 1 Master and N+ Slaves. This issue intermittently happens on the Slaves, sometimes when the Master Core is reloaded due to the schema update, but not always, there no single trigger for this issue (at least from what I have noticed). Currently,

SearcherManager not seeing changes in IndexWriter

2018-11-09 Thread Boris Petrov
Hi all, I'm using Lucene version 7.5.0. We have a test that does something like: Thread 1:             Field idStringField = new StringField("id", id, Field.Store.YES);             Field contentsField = new TextField("contents", reader);             Document document = new Document();           

Re: IndexWriter updateDocument is removing doc from index

2018-03-16 Thread Michael McCandless
wrote: > While writing some tools to build and maintain lucene indexes I noticed > some strange behavior during testing. > A doc disappears from lucene index while using IndexWriter updateDocument. > > The API of lucene 6.4.2 states: > "Updates a document by first deleting th

IndexWriter updateDocument is removing doc from index

2018-03-15 Thread Bernd Fehling
While writing some tools to build and maintain lucene indexes I noticed some strange behavior during testing. A doc disappears from lucene index while using IndexWriter updateDocument. The API of lucene 6.4.2 states: "Updates a document by first deleting the document(s) containing term and
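The documented delete-then-add behavior can be sketched as follows (a minimal illustration, not the poster's code; assumes a recent Lucene):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class UpdateDocumentDemo {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document v1 = new Document();
            v1.add(new StringField("id", "42", Field.Store.YES));
            v1.add(new StringField("title", "old", Field.Store.YES));
            writer.addDocument(v1);

            Document v2 = new Document();
            v2.add(new StringField("id", "42", Field.Store.YES));
            v2.add(new StringField("title", "new", Field.Store.YES));
            // Atomically delete every doc whose untokenized "id" term is "42",
            // then add v2. If the term is built from an analyzed field, the
            // delete half can match nothing (or the wrong docs), which is a
            // common way for documents to seem to vanish or duplicate.
            writer.updateDocument(new Term("id", "42"), v2);
            writer.commit();

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                System.out.println(reader.numDocs()); // 1: exactly one live doc remains
            }
        }
    }
}
```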

Re: Open IndexWriter to prior commit

2017-09-05 Thread Michael McCandless
m/2012/03/transactional-lucene.html > > I'm interested in the part that references distributed transactions and > says: > > "if Lucene completed its 2nd phase commit but the database's 2nd phase > hit some error or crash or power loss, you can easily rollback > Luc

Open IndexWriter to prior commit

2017-09-05 Thread Bryan Bende
database's 2nd phase hit some error or crash or power loss, you can easily rollback Lucene's commit by opening an IndexWriter on the prior commit. " I see that you can pass in an IndexWriterConfig with an IndexCommit which will tell the IndexWriter where to open from... What is the
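A hedged sketch of opening on a prior commit (assumes a recent Lucene; NoDeletionPolicy is used here only so the older commit survives — in practice a SnapshotDeletionPolicy or similar would retain it):

```java
import java.util.List;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexCommit;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.NoDeletionPolicy;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class RollbackToPriorCommit {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory()) {
            IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer());
            iwc.setIndexDeletionPolicy(NoDeletionPolicy.INSTANCE); // keep every commit around
            try (IndexWriter writer = new IndexWriter(dir, iwc)) {
                Document d1 = new Document();
                d1.add(new StringField("id", "1", Field.Store.YES));
                writer.addDocument(d1);
                writer.commit(); // commit #1: 1 doc
                Document d2 = new Document();
                d2.add(new StringField("id", "2", Field.Store.YES));
                writer.addDocument(d2);
                writer.commit(); // commit #2: 2 docs
            }
            List<IndexCommit> commits = DirectoryReader.listCommits(dir); // oldest first
            IndexCommit prior = commits.get(0);
            IndexWriterConfig rollbackCfg = new IndexWriterConfig(new StandardAnalyzer());
            rollbackCfg.setIndexDeletionPolicy(NoDeletionPolicy.INSTANCE);
            rollbackCfg.setIndexCommit(prior); // open the writer on the prior commit
            try (IndexWriter writer = new IndexWriter(dir, rollbackCfg);
                 DirectoryReader reader = DirectoryReader.open(writer)) {
                System.out.println(reader.numDocs()); // 1: commit #2 is no longer visible
            }
        }
    }
}
```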

Re: Issue with re-opening IndexWriter

2017-03-15 Thread Michael McCandless
Hi, This is a known windows specific issue when you close IndexWriter while IndexReaders are still open, because windows prevents deletion of still-open-for-reading files. Can you close all open IndexReaders before closing the first IndexWriter? This way IndexWriter will be able to delete the

Issue with re-opening IndexWriter

2017-03-13 Thread Vitaly Stroutchkov
Hello, We are using Lucene 6.4.2 with a file-based index, Oracle JDK 1.8.0_121, Windows 10. We found that the following steps generate unrecoverable error (we have to restart our JVM in order to resume normal work which is not acceptable): 1. Create an index directory, open IndexWriter

Re: making realtime deletion through indexWriter, deletion not synced to indexReader

2017-02-20 Thread ximing
responsible for the index generating and deletion, and Searcher.java is responsible for doing the search. In order to make the deletion immediately change the search result, I expose the indexWriter used in the generation/deletion period and

Re: making realtime deletion through indexWriter, deletion not synced to indexReader

2017-02-20 Thread Michael McCandless
As trying to make a real-time deletion, I made two singletons: IndexGenerator.java is responsible for the index generating and deletion, and Searcher.java is responsible for doing the search. In order to make the deletion immediately change the search result, I expo

making realtime deletion through indexWriter, deletion not synced to indexReader

2017-02-19 Thread ximing
expose the indexWriter used in the generation/deletion period and build the indexSearcher on that. I have checked the documentation of IndexWriter [https://lucene.apache.org/core/5_0_0/core/org/apache/lucene/index/DirectoryReader.html] and quoting: "flushing just moves the internal buffered

Re: IndexWriter and IndexReader in a shared environment

2016-12-14 Thread Michael McCandless
From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, July 07, 2016 8:52 AM To: Lucene Users; myshar...@gmail.com Subject: Re: IndexWriter and IndexReader in a shared environment The API is pretty simple. Create IndexWriter and leave it open forev

RE: IndexWriter and IndexReader in a shared environment

2016-12-14 Thread Siraj Haider
From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, July 07, 2016 8:52 AM To: Lucene Users; myshar...@gmail.com Subject: Re: IndexWriter and IndexReader in a shared environment The API is pretty simple. Create IndexWriter and leave it open forever, using it to index/delete docu

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-28 Thread William Moss
a machine or JVM crashes we are in a coherent state. To that end, we need to call commit on Lucene and then commit back what we've read so far to Kafka. Calling commit is the only way to ensure this, right? Correct: com

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-28 Thread Shai Erera
to ensure this, right? Correct: commit in Lucene, then notify Kafka what offset you had indexed just before you called IW.commit. But you may want to replicate the index across machines if you don't want to have a single point of failure. We recently

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-28 Thread Michael McCandless
To make sure I understand how maybeRefresh works: ignoring whether or not we commit for a second, if I add a document via IndexWriter, it will not be reflected in IndexSearchers I get by calling acquire on SearcherAndTaxonomy until I call maybeRefresh? Correct. Now, on to the concur

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-28 Thread Michael McCandless
I've been digging into our realtime indexing code and how we use Lucene and I wanted to check a few assumptions around synchronization, since we see some periodic exceptions[1] that I can't quite explain.

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-27 Thread William Moss
I understand how maybeRefresh works, ignoring whether or not we commit for a second, if I add a document via IndexWriter, it will not be reflected in IndexSearchers I get by calling acquire on SearcherAndTaxonomy until I call maybeRefresh? Now, on to the concurrency issue. I was thinking a lit

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-26 Thread Shai Erera
into our realtime indexing code and how we use Lucene and I wanted to check a few assumptions around synchronization, since we see some periodic exceptions[1] that I can't quite explain. First, a tiny bit of background: 1. We use facet

Re: IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-26 Thread Michael McCandless
assumptions around synchronization, since we see some periodic exceptions[1] that I can't quite explain. First, a tiny bit of background: 1. We use facets and therefore are writing realtime updates using both an IndexWriter and DirectoryTaxonomyWriter. 2. We have multiple upda

IndexWriter, DirectoryTaxonomyWriter & SearcherTaxonomyManager synchronization

2016-09-26 Thread William Moss
we see some periodic exceptions[1] that I can't quite explain. First, a tiny bit of background: 1. We use facets and therefore are writing realtime updates using both an IndexWriter and DirectoryTaxonomyWriter. 2. We have multiple update threads, consuming messages (from Kafka) and updating the in

Re: Reduce index size without reopening IndexWriter

2016-07-13 Thread Jaime
seem to be erased until I close and reopen the Index Writer. I've tried calling IndexWriter.deleteUnusedFiles after each commit, but it didn't help. Is there any way to free space without closing the IndexWriter?

Re: Reduce index size without reopening IndexWriter

2016-07-12 Thread Adrien Grand
Index Writer. I've tried calling IndexWriter.deleteUnusedFiles after each commit, but it didn't help. Is there any way to free space without closing the IndexWriter?

Reduce index size without reopening IndexWriter

2016-07-11 Thread Jaime
close and reopen the Index Writer. I've tried calling IndexWriter.deleteUnusedFiles after each commit, but it didn't help. Is there any way to free space without closing the IndexWriter?

Re: IndexWriter and IndexReader in a shared environment

2016-07-07 Thread Michael McCandless
The API is pretty simple. Create IndexWriter and leave it open forever, using it to index/delete documents, and periodically calling IW.commit when you need durability. Create a SearcherManager, passing it the IndexWriter, and use it per-search to acquire/release the searcher. Periodically
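That pattern can be sketched in one runnable example (illustrative names; assumes a recent Lucene, where SearcherManager built on the writer gives near-real-time visibility):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.IndexSearcher;
import org.apache.lucene.search.SearcherFactory;
import org.apache.lucene.search.SearcherManager;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class SharedWriterReaderDemo {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            // One long-lived SearcherManager per index, built on the long-lived writer
            SearcherManager manager = new SearcherManager(writer, new SearcherFactory());

            Document doc = new Document();
            doc.add(new StringField("id", "1", Field.Store.YES));
            writer.addDocument(doc);

            manager.maybeRefresh(); // periodically (e.g. a refresh thread), so searches see changes
            IndexSearcher searcher = manager.acquire(); // per search: acquire...
            int hits;
            try {
                hits = searcher.count(new TermQuery(new Term("id", "1")));
            } finally {
                manager.release(searcher); // ...and always release in a finally
            }
            System.out.println(hits); // 1, even though nothing was committed yet
            writer.commit(); // periodically, for durability only
            manager.close();
        }
    }
}
```

Note that the search sees the document before any commit: refresh controls visibility, commit controls durability.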

IndexWriter and IndexReader in a shared environment

2016-07-05 Thread Desteny Child
Hi, In my Spring Boot application I have implemented 2 API endpoints - one for Lucene(I use 5.2.1) document indexing and another one for searching. Right now I open every time on each request IndexWriter and IndexReader. With a big index it works pretty slow. I know that there is a possibility

Obtain list of indexed fields from IndexWriter

2016-07-01 Thread Ishan Chattopadhyaya
Hi, If an update to a non-existent dv field is attempted, IndexWriter throws an exception: "can only update existing numeric-docvalues fields!". This exception is thrown after checking with the globalFieldNumberMap (which is obtained from the SegmentInfos). Is there a way, given an I

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-07-01 Thread Michael McCandless
That looks like correct usage of MultiSearcher ... just be sure the "release" happens in a finally clause. But have a look at my prior email ... it's best to keep a single IndexWriter open per index, and pass that writer when you create your SearcherManagers. Mike McCandless

RE: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Re: LockFactory issue observed in lucene while getting instance of indexWriter: But do you open any near-real-time readers from this writer? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 16, 2016 at 1:01 PM, Mukul Ranjan wrote: Hi Michael, Than

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Michael McCandless
But do you open any near-real-time readers from this writer? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 16, 2016 at 1:01 PM, Mukul Ranjan wrote: Hi Michael, Thanks for your reply. I'm running it on windows. I have checked my code, I'm closin

RE: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Hi Michael, Thanks for your reply. I'm running it on windows. I have checked my code; I'm closing the IndexWriter after adding documents to it. We are not getting this issue always, but its frequency is high in our application. Can you please provide your suggestion? Thanks, Mukul From: Michael

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Michael McCandless
Are you running on Windows? This is not a LockFactory issue ... it's likely caused because you closed IndexWriter, and then opened a new one, before closing NRT readers you had opened from the first writer? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 16, 2016 at 6:19 AM,

Re: LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Ian Lea
Sounds to me like it's related to the index not having been closed properly or still being updated or something. I'd worry about that. -- Ian. On Thu, Jun 16, 2016 at 11:19 AM, Mukul Ranjan wrote: Hi, I'm observing below exception while getting

LockFactory issue observed in lucene while getting instance of indexWriter

2016-06-16 Thread Mukul Ranjan
Hi, I'm observing below exception while getting instance of indexWriter- java.lang.IllegalArgumentException: Directory MMapDirectory@"directoryName" lockFactory=org.apache.lucene.store.NativeFSLockFactory@1ec79746 still has pending deleted files; cannot initialize IndexWriter I

Re: IndexWriter is not closing the FDs (deleted files)

2015-09-01 Thread Marcio Napoli
try { final FSDirectory directory = FSDirectory.open(indexFullPath); this.openIndex(directory); } catch (Exception e) { logger.error("Problema no índice lucene para

Re: IndexWriter is not closing the FDs (deleted files)

2015-09-01 Thread András Péteri
e) { logger.error("Problema no índice lucene para Cidadão", e); throw EJBUtil.rollbackNE(e); } } private void openIndex(Directory directory) throws IOException { final IndexWriterConfig config = new IndexW

Re: IndexWriter is not closing the FDs (deleted files)

2015-09-01 Thread Marcio Napoli
, e); throw EJBUtil.rollbackNE(e); } } private void openIndex(Directory directory) throws IOException { final IndexWriterConfig config = new IndexWriterConfig(LUCENE_4_10_3, DEFAULT_ANALYZER); config.setMaxThreadStates(2

Re: IndexWriter is not closing the FDs (deleted files)

2015-09-01 Thread Ian Lea
logger.error("Problema no índice lucene para Cidadão", e); throw EJBUtil.rollbackNE(e); } } private void openIndex(Directory directory) throws IOException { final IndexWriterConfig config = new IndexWriterConfig(LUCENE_4_10_3, DEFAULT_ANALYZER); config.set

Re: IndexWriter is not closing the FDs (deleted files)

2015-09-01 Thread Marcio Napoli
new IndexWriterConfig(LUCENE_4_10_3, DEFAULT_ANALYZER); config.setMaxThreadStates(2); config.setCheckIntegrityAtMerge(true); this.writer = new IndexWriter(directory, config); this.reader = DirectoryReader.open(this.writer, true); } private void addToIndex(final CidadaoBean cidadaoBean, boolean commi

Re: IndexWriter is not closing the FDs (deleted files)

2015-08-31 Thread Anton Zenkov
Are you sure you are not holding open readers somewhere? On Mon, Aug 31, 2015 at 7:46 PM, Marcio Napoli wrote: Hey! :) It seems IndexWriter is not closing the descriptors of the removed files, see the log below. Thanks, Napoli [root@server01 log]# l

IndexWriter is not closing the FDs (deleted files)

2015-08-31 Thread Marcio Napoli
Hey! :) It seems IndexWriter is not closing the descriptors of the removed files, see the log below. Thanks, Napoli [root@server01 log]# ls -l /proc/59491/fd | grep index l-wx--. 1 wildfly wildfly 64 Ago 31 11:26 429 -> /usr/local/wildfly-2.0/standalone/data/index/cidadao/write.lock l

Re: Changing analyzer in an indexwriter

2015-04-21 Thread Michael McCandless
On Sunday, April 19, 2015 1:37 PM, Lisa Ziri wrote: Hi, I'm upgrading to lucene 5.1.0 from lucene 4. In our index we have documents in different languages which are analyzed with the correct analyzer. We used the me

Re: Changing analyzer in an indexwriter

2015-04-20 Thread Anna Elisabetta Ziri
On Sunday, April 19, 2015 1:37 PM, Lisa Ziri wrote: Hi, I'm upgrading to lucene 5.1.0 from lucene 4. In our index we have documents in different languages which are analyzed with the correct analyzer. We used the method addDocume

Re: Changing analyzer in an indexwriter

2015-04-20 Thread Michael McCandless
different languages which are analyzed with the correct analyzer. We used the method addDocument of IndexWriter giving the correct analyzer for every different document. Now I see that I can define the analyzer used by the IndexWriter only in the creation and I cannot switch ana

Re: Changing analyzer in an indexwriter

2015-04-19 Thread Ahmet Arslan
which are analyzed with the correct analyzer. We used the method addDocument of IndexWriter giving the correct analyzer for every different document. Now I see that I can define the analyzer used by the IndexWriter only in the creation and I cannot switch analyzer on the same IndexWriter. We allow to do

Changing analyzer in an indexwriter

2015-04-19 Thread Lisa Ziri
Hi, I'm upgrading to lucene 5.1.0 from lucene 4. In our index we have documents in different languages which are analyzed with the correct analyzer. We used the method addDocument of IndexWriter giving the correct analyzer for every different document. Now I see that I can define the analyzer

Lucene IndexWriter created write.lock and on commit the thread locks permanently.

2014-10-31 Thread aditya_pratap
Hi, When IndexWriter initializes, it creates write.lock in the index directory. And after writer.close() it is automatically deleted from there, and I am able to see several files with extensions fdx, fdt, si and many more. This process works fine in Windows, but in case of Linux (Red Hat

Re: re-use IndexWriter

2014-07-08 Thread Ian Lea
IndexWriter's close operation is really costly, and Lucene's doc suggests re-using the IndexWriter instance. I did it: I kept the indexWriter instance and give it back to every request thread. But there comes a big problem: searches never see the index changes beca

re-use IndexWriter

2014-07-08 Thread Jason.H
Nowadays, I've been trying every way to improve the performance of indexing. IndexWriter's close operation is really costly, and Lucene's doc suggests re-using the IndexWriter instance. I did it: I kept the indexWriter instance and give it back to every request thr

AW: IndexWriter#updateDocument(Term, Document)

2014-06-19 Thread Clemens Wyss DEV
There is a bug in your test: you cannot use reader.maxDoc(). It's expected this would be 2 when (*) is commented out, because you have 2 docs, one of which is deleted. Use numDocs instead? Mike McCandless
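The maxDoc/numDocs distinction can be demonstrated directly (a minimal sketch, not the thread's test; assumes a recent Lucene):

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class NumDocsVsMaxDoc {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            Document v1 = new Document();
            v1.add(new StringField("key", "a", Field.Store.NO));
            writer.addDocument(v1);
            Document v2 = new Document();
            v2.add(new StringField("key", "a", Field.Store.NO));
            writer.updateDocument(new Term("key", "a"), v2); // deletes v1, adds v2
            writer.commit();
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                // numDocs counts live documents only; maxDoc also counts deleted
                // documents that still occupy slots in not-yet-merged segments.
                System.out.println("numDocs=" + reader.numDocs()); // 1
                System.out.println("maxDoc=" + reader.maxDoc());   // typically 2 here, until a merge reclaims the deleted slot
            }
        }
    }
}
```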

Re: IndexWriter#updateDocument(Term, Document)

2014-06-19 Thread Michael McCandless
Clemens Wyss DEV wrote: directory = new SimpleFSDirectory( indexLocation ); IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, new WhitespaceAnalyzer( Version.LUCENE_47 )); indexWriter = new IndexWriter( directory, config ); Document doc = new Document(); String va

AW: IndexWriter#updateDocument(Term, Document)

2014-06-19 Thread Clemens Wyss DEV
directory = new SimpleFSDirectory( indexLocation ); IndexWriterConfig config = new IndexWriterConfig(Version.LUCENE_47, new WhitespaceAnalyzer( Version.LUCENE_47 )); indexWriter = new IndexWriter( directory, config ); Document doc = new Document(); String value = "hello"; String k

Re: IndexWriter#updateDocument(Term, Document)

2014-06-18 Thread Michael McCandless
order to omit duplicate entries I am making use of IndexWriter#updateDocument(Term, Document): open an IndexWriter; foreach( element in elementsToBeUpdatedWhichHaveDuplicates ) { doc = element.toDoc(); indexWriter.updateDocument( uniqueTermForElement, doc ); }

IndexWriter#updateDocument(Term, Document)

2014-06-18 Thread Clemens Wyss DEV
I would like to perform a batch update on an index. In order to omit duplicate entries I am making use of IndexWriter#updateDocument(Term, Document) open an IndexWriter; foreach( element in elementsToBeUpdatedWhichHaveDuplicates ) { doc = element.toDoc(); indexWriter.updateDocument

Aw: RE: 回复: Never close IndexWriter/Reader?

2014-05-06 Thread Sascha Janz
Hello, many thanks for your answers. I modified our sources to leave the IndexWriter open. Also we changed the commit strategy; maybe we will do commit only at night. We used to create our own IndexReader for search. Now we changed this to use SearcherManager, which performs quite well. We

Re: Never close IndexWriter/Reader?

2014-05-04 Thread Denis Bazhenov
Hello, Sascha. That's right. You should close the IndexWriter instance only when the application itself is stopping. To make documents visible to newly created IndexReader instances, a commit() call is enough. On May 4, 2014, at 2:46 AM, Sascha Janz wrote: Hi, We

RE: 回复: Never close IndexWriter/Reader?

2014-05-04 Thread Uwe Schindler
Sent: Sunday, May 04, 2014 6:05 AM To: java-user Subject: 回复: Never close IndexWriter/Reader? Hi, Mike. Instead of periodically reopening the NRT reader, I open/close it for every search query; will this result in a performance issue? Thanks, lubi

回复: Never close IndexWriter/Reader?

2014-05-03 Thread 308181687
To: "Lucene Users"; Subject: Re: Never close IndexWriter/Reader? Just leave your IW open forever and periodically reopen your NRT reader. Be sure you close your old NRT reader after opening a new one; SearcherManager makes this easy when multiple threads are using the readers. Committing every

Re: Never close IndexWriter/Reader?

2014-05-03 Thread Michael McCandless
Mostly emails. We need the update near real time, so we open the IndexReader with Directory.open and IndexWriter. Periodically we do a commit, e.g. every 200 documents. We used to close the IndexWriter on commit, and then open a new one. I read

Never close IndexWriter/Reader?

2014-05-03 Thread Sascha Janz
Hi, We use lucene 4.6, our application receives continuously new documents. Mostly emails. We need the update near real time, so we open the IndexReader with Directory.open and IndexWriter. Periodically we do a commit, e.g. every 200 documents. We used to close the IndexWriter on

Re: IndexReplication Client and IndexWriter

2014-04-16 Thread Christoph Kaser
NRT replication branch (LUCENE-5438), InfosRefCounts (weird name), whose purpose is to do what IndexFileDeleter does for IndexWriter, ie keep track of which files are still referenced, delete them when they are done, etc. This could be used on the client side to hold a lease for another client. Mike M

Re: IndexReplication Client and IndexWriter

2014-04-15 Thread Shai Erera
(LUCENE-5438), InfosRefCounts (weird name), whose purpose is to do what IndexFileDeleter does for IndexWriter, ie keep track of which files are still referenced, delete them when they are done, etc. This could be used on the client side to hold a lease for another client.

Re: IndexReplication Client and IndexWriter

2014-04-11 Thread Christoph Kaser
(LUCENE-5438), InfosRefCounts (weird name), whose purpose is to do what IndexFileDeleter does for IndexWriter, ie keep track of which files are still referenced, delete them when they are done, etc. This could be used on the client side to hold a lease for another client. Mike McCand

Re: IndexReplication Client and IndexWriter

2014-04-08 Thread Michael McCandless
You might be able to use a class on the NRT replication branch (LUCENE-5438), InfosRefCounts (weird name), whose purpose is to do what IndexFileDeleter does for IndexWriter, ie keep track of which files are still referenced, delete them when they are done, etc. This could be used on the client side

Re: IndexReplication Client and IndexWriter

2014-04-08 Thread Shai Erera
IndexRevision uses the IndexWriter for deleting unused files when the revision is released, as well as to obtain the SnapshotDeletionPolicy. I think that you will need to implement two things on the "client" side: * Revision, which doesn't use IndexWriter. * Replicator which kee

Re: IndexReplication Client and IndexWriter

2014-04-08 Thread Michael McCandless
It's not safe also opening an IndexWriter on the client side. But I agree, supporting tree topology would make sense; it seems like we just need a way for the ReplicationClient to also be a Replicator. It seems like it should be possible, since it's clearly aware of the SessionToken i

IndexReplication Client and IndexWriter

2014-04-08 Thread Christoph Kaser
Hi all, I am trying out the (highly useful) index replicator module (with the HttpReplicator) and have stumbled upon a question: It seems, the IndexReplicationHandler is working directly on the index directory, without using an indexwriter. Could there be a problem if I open an IndexWriter on

Re: Sending a document to IndexWriter field by field

2014-02-20 Thread Michael McCandless
combinatorially in size and blow the index up in some sense. I definitely want to think about breaking them into pieces, thank you for the advice! -- Best Regards, Igor Shalyminov 21.02.2014, 00:50, "Michael McCandless": Yes, i

Re: Sending a document to IndexWriter field by field

2014-02-20 Thread Igor Shalyminov
in some sense. I definitely want to think about breaking them into pieces, thank you for the advice! -- Best Regards, Igor Shalyminov 21.02.2014, 00:50, "Michael McCandless": Yes, in 4.x IndexWriter now takes an Iterable that enumerates the fields one at a time.

Re: Sending a document to IndexWriter field by field

2014-02-20 Thread Michael McCandless
Yes, in 4.x IndexWriter now takes an Iterable that enumerates the fields one at a time. You can also pass a Reader to a Field. That said, there will still be massive RAM required by IW to hold the inverted postings for that one document, likely much more RAM than the original document's S
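That Iterable-based signature can be exercised with a lazily generated field stream; a minimal sketch (illustrative field names; in practice the fields would come from a streaming parser, and note Mike's caveat that IndexWriter still buffers the whole document's inverted postings in RAM):

```java
import java.util.Iterator;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexableField;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

public class LazyFieldsDemo {
    public static void main(String[] args) throws Exception {
        try (Directory dir = new ByteBuffersDirectory();
             IndexWriter writer = new IndexWriter(dir, new IndexWriterConfig(new StandardAnalyzer()))) {
            // addDocument takes Iterable<? extends IndexableField>, so fields can be
            // produced one at a time instead of collected in a Document first.
            Iterable<IndexableField> lazyFields = () -> new Iterator<IndexableField>() {
                int i = 0;
                public boolean hasNext() { return i < 3; }
                public IndexableField next() {
                    return new StringField("part", "chunk-" + i++, Field.Store.NO);
                }
            };
            writer.addDocument(lazyFields);
            writer.commit();
            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                System.out.println(reader.numDocs()); // 1 document holding all three fields
            }
        }
    }
}
```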

Sending a document to IndexWriter field by field

2014-02-20 Thread Igor Shalyminov
sent to IndexWriter, and Document is just a collection of all the fields, all in RAM. With my huge fields, it would be so much better to have the ability of sending document fields for writing one by one, keeping no more than a single field in RAM. Is it possible in the latest Lucene? -- Best Regards,

Re: IndexWriter croaks on large file

2014-02-19 Thread Tri Cao
question. Will there be a problem with adding multiple Document objects to the IndexWriter that have the same field names and values for the StoredFields? They all have different TextFields (the content). I've tried doing this and haven't found any problems with it, but I'm just wonder

Re: IndexWriter croaks on large file

2014-02-19 Thread John Cecere
Thanks Tri. I've tried a variation of the approach you suggested here and it appears to work well. Just one question. Will there be a problem with adding multiple Document objects to the IndexWriter that have the same field names and values for the StoredFields ? They all have diff

Re: IndexWriter croaks on large file

2014-02-14 Thread Tri Cao
following exception: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=-2147483648, endOffset=-2147483647. Essentially, I'm doing this: Directory directory = new MMapDirectory(indexPath); Analyzer analyzer = new StandardAnalyzer()

Re: IndexWriter croaks on large file

2014-02-14 Thread Glen Newton
with the following exception: java.lang.IllegalArgumentException: startOffset must be non-negative, and endOffset must be >= startOffset, startOffset=-2147483648, endOffset=-2147483647 Essentially, I'm doing this:

Re: IndexWriter croaks on large file

2014-02-14 Thread John Cecere
doing this: Directory directory = new MMapDirectory(indexPath); Analyzer analyzer = new StandardAnalyzer(); IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_45, analyzer); IndexWriter iw = new IndexWriter(directory, iwc); InputStream is = ; InputStreamReader reader = new InputStreamReader(is); Do

Re: IndexWriter croaks on large file

2014-02-14 Thread Michael McCandless
Analyzer analyzer = new StandardAnalyzer(); > IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_45, analyzer); > IndexWriter iw = new IndexWriter(directory, iwc); > > InputStream is = ; > InputStreamReader reader = new InputStreamReader(is); > > Document doc = new Document(

IndexWriter croaks on large file

2014-02-14 Thread John Cecere
-2147483647 Essentially, I'm doing this: Directory directory = new MMapDirectory(indexPath); Analyzer analyzer = new StandardAnalyzer(); IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_45, analyzer); IndexWriter iw = new IndexWriter(directory, iwc); InputStream is = ; InputStreamRea

Re: IndexWriter and IndexReader

2014-02-13 Thread Michael McCandless
ig.OpenMode.CREATE seems enough. I am considering that > SearcherFactory can warm and prepare my IndexReader for live system. Is > this a good way or am I totally in wrong direction? > 3. During IndexWriter operations such as overwriting indexes, what are > the consequences of "s

IndexWriter and IndexReader

2014-02-13 Thread Cemo
, I want to overwrite them. Opening my indexes with IndexWriterConfig.OpenMode.CREATE seems enough. I am considering that SearcherFactory can warm and prepare my IndexReader for live system. Is this a good way or am I totally in wrong direction? 3. During IndexWriter operations such as
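For the overwrite question, OpenMode.CREATE on the IndexWriterConfig is the supported way to discard an existing index on open, rather than deleting index files by hand. A sketch against the Lucene 4.x-era API; the version constant and path handling are illustrative:

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.IndexWriterConfig.OpenMode;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

class RebuildIndexExample {
    static IndexWriter openForRebuild(File path) throws IOException {
        Directory dir = FSDirectory.open(path);
        IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_45,
                new StandardAnalyzer(Version.LUCENE_45));
        // CREATE drops any existing index in the directory when the writer
        // opens, so the rebuild starts from an empty index.
        iwc.setOpenMode(OpenMode.CREATE);
        return new IndexWriter(dir, iwc);
    }
}
```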

Re: IndexWriter flush/commit exception

2013-12-19 Thread Michael McCandless
On Wed, Dec 18, 2013 at 11:34 PM, Ravikumar Govindarajan wrote: >> You could make a custom Dir wrapper that always caches in RAM, but >> that sounds a bit terrifying :) > > This was exactly what I implemented:) I see :) > A commit-thread runs periodically > every 30 seconds, while RAM-Monitor th

Re: IndexWriter flush/commit exception

2013-12-18 Thread Ravikumar Govindarajan
> You could make a custom Dir wrapper that always caches in RAM, but > that sounds a bit terrifying :) This was exactly what I implemented:) A commit-thread runs periodically every 30 seconds, while RAM-Monitor thread runs every 5 seconds to commit data in case sizeInBytes>=70%-of-maxCachedBytes.
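The two-thread scheme described above (a 30-second commit thread plus a 5-second RAM monitor) can be sketched with a ScheduledExecutorService. The thresholds and periods are the poster's numbers; the wiring is illustrative, and `ramSizeInBytes()` is used here as a stand-in for the poster's check on the RAM-caching Directory wrapper:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.lucene.index.IndexWriter;

class PeriodicCommitExample {
    static void startCommitThreads(final IndexWriter writer, final long maxCachedBytes) {
        ScheduledExecutorService ses = Executors.newScheduledThreadPool(2);
        // Commit every 30 seconds regardless of RAM usage.
        ses.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try { writer.commit(); } catch (Exception e) { /* log; decide retry vs. close */ }
            }
        }, 30, 30, TimeUnit.SECONDS);
        // Every 5 seconds, commit early once buffered bytes cross 70% of the budget.
        ses.scheduleAtFixedRate(new Runnable() {
            public void run() {
                try {
                    if (writer.ramSizeInBytes() >= maxCachedBytes * 7 / 10) {
                        writer.commit();
                    }
                } catch (Exception e) { /* log */ }
            }
        }, 5, 5, TimeUnit.SECONDS);
    }
}
```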

Re: IndexWriter flush/commit exception

2013-12-18 Thread Michael McCandless
On Wed, Dec 18, 2013 at 3:15 AM, Ravikumar Govindarajan wrote: > Thanks Mike for a great explanation on Flush IOException You're welcome! > I was thinking on the perspective of a HDFSDirectory. In addition to the > all causes of IOException during flush you have listed, a HDFSDirectory > also ha

Re: IndexWriter flush/commit exception

2013-12-18 Thread Ravikumar Govindarajan
Thanks Mike for a great explanation on Flush IOException I was thinking on the perspective of a HDFSDirectory. In addition to the all causes of IOException during flush you have listed, a HDFSDirectory also has to deal with network issues, which is not lucene's problem at all. But I would ideally

Re: IndexWriter flush/commit exception

2013-12-17 Thread Michael McCandless
On Mon, Dec 16, 2013 at 7:33 AM, Ravikumar Govindarajan wrote: > I am trying to model a transaction-log for lucene, which creates a > transaction-log per-commit > > Things work fine during normal operations, but I cannot fathom the effect > during > > a. IOException during Index-Commit > > Will th

IndexWriter flush/commit exception

2013-12-16 Thread Ravikumar Govindarajan
I am trying to model a transaction-log for lucene, which creates a transaction-log per-commit Things work fine during normal operations, but I cannot fathom the effect during a. IOException during Index-Commit Will the index be restored to previous commit-point? Can I blindly re-try operations f
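On point (a): after a failed commit the on-disk index still reflects the previous commit point, and IndexWriter exposes rollback() to discard un-committed in-memory changes (it also closes the writer). A sketch of the retry pattern being asked about — the recovery policy here is illustrative, not from the thread:

```java
import java.io.IOException;

import org.apache.lucene.index.IndexWriter;

class CommitWithRollbackExample {
    // Returns true if the commit became durable; on failure the index on
    // disk is unchanged and the caller can reopen a writer and replay its
    // transaction log.
    static boolean tryCommit(IndexWriter writer) {
        try {
            writer.commit(); // on success, a new durable commit point exists
            return true;
        } catch (IOException e) {
            try {
                writer.rollback(); // closes the writer, dropping un-committed changes
            } catch (IOException e2) {
                /* log secondary failure */
            }
            return false;
        }
    }
}
```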

Re: 2 exceptions in IndexWriter

2013-07-25 Thread Michael McCandless
Are you sure each test always closes the IndexWriter? Still, it's best to have each test use its own directory to avoid any risk of confusing failures ... Mike McCandless http://blog.mikemccandless.com On Thu, Jul 25, 2013 at 12:24 PM, Yonghui Zhao wrote: > My test case first c
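The two hygiene points in this reply — always close the writer, and give each test its own directory — in sketch form. The temp-directory handling is illustrative; the point is that a fresh directory per test leaves no stale write.lock or segments behind to trigger "Lock obtain timed out":

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

class IsolatedTestExample {
    static void runIsolatedTest() throws IOException {
        // A fresh directory per test: no leftover lock or index files
        // from an earlier test can interfere.
        File tmp = File.createTempFile("lucene-test", "");
        tmp.delete();
        tmp.mkdir();
        Directory dir = FSDirectory.open(tmp);
        IndexWriter writer = new IndexWriter(dir,
                new IndexWriterConfig(Version.LUCENE_45, new StandardAnalyzer(Version.LUCENE_45)));
        try {
            // ... exercise the writer ...
        } finally {
            writer.close(); // releases the write lock even if the test body throws
            dir.close();
        }
    }
}
```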

Re: 2 exceptions in IndexWriter

2013-07-25 Thread Yonghui Zhao
"? It's best to open IndexWriter with OpenMode.CREATE to purge > (rather than remove the files yourself). > > Lock obtain timed out means another IndexWriter is currently using > that directory. > > > > Mike McCandless > > http://blog.mikemccandless.com
