Hi,
thanks for the link and indeed
https://issues.apache.org/jira/browse/LUCENE-7171 /
https://github.com/apache/lucene/issues/8226 seems to be the issue
here.
> Maybe try a simple `new TermQuery(new Term("id", "flags-1-1"))` query
> during update and see if it returns the correct ans?
That was t
I'm confused as to what could be happening.
Google led me to this StackOverflow link:
https://stackoverflow.com/questions/36402235/lucene-stringfield-gets-tokenized-when-doc-is-retrieved-and-stored-again
which references some long-standing issues about fields changing their
"types" and so on.
Th
Hi,
thank you for the reply, and apologies for being somewhat "all over the
place".
Regarding "tokenization" - should it happen if I use StringField?
When the document is created (before writing) I see in the debugger that
it's not tokenized and is of type StringField:
```
doc = {Document@4830}
"Documen
Hey,
I don't think I understand the email well but I'll try my best.
In your printed docs, I see that the flag data is still tokenized. See the
string that you printed: DOCS
stored,indexed,tokenized,omitNorms. What does your code for adding the doc
look like?
Are you using StringField for adding
Addendum, output is:
```
maxDoc: 3
maxDoc (after second flag): 3
Document
stored,indexed,tokenized,omitNorms,indexOptions=DOCS
stored,indexed,tokenized,omitNorms,indexOptions=DOCS
stored>
Document
stored,indexed,tokenized,omitNorms,indexOptions=DOCS
stored,indexed,tokenized,omitNorms,indexOpti
Thank you Gautam!
This works. Now I went back to Lucene and I'm hitting a wall.
In James, the document "id" is constructed as
"flag--" (e.g. "").
I run the code that updates the documents with flags and check the
result afterwards. In the simple code I use a new reader from the
wri
Hey,
Use a StringField instead of a TextField for the title and your test will
pass.
Tokenization, which is enabled for TextFields, breaks your fancy title
into tokens split on spaces, which causes your docs not to match.
https://lucene.apache.org/core/9_11_0/core/org/apache/lucene/documen
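To illustrate the point about tokenization, here is a minimal sketch in plain Java. It is a toy model only: `TokenizationSketch`, `indexTokenized`, `indexVerbatim`, and `matchesTerm` are hypothetical names, not Lucene classes or methods, and a real analyzer does more than split on whitespace. It only shows why an exact-term lookup for the whole value can miss once the value has been split into tokens.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Toy model only: none of this is Lucene API or Lucene's real analysis chain.
final class TokenizationSketch {

    // Models a tokenized field: the value is split on whitespace
    // and each piece becomes a separate term.
    static Set<String> indexTokenized(String value) {
        return new HashSet<>(Arrays.asList(value.trim().split("\\s+")));
    }

    // Models an untokenized field: the whole value is kept as one term.
    static Set<String> indexVerbatim(String value) {
        Set<String> terms = new HashSet<>();
        terms.add(value);
        return terms;
    }

    // Models an exact term lookup: the term must equal an indexed term.
    static boolean matchesTerm(Set<String> terms, String term) {
        return terms.contains(term);
    }

    public static void main(String[] args) {
        String title = "my fancy title"; // hypothetical example value

        // The tokenized field holds "my", "fancy", "title", so the full string misses.
        System.out.println(matchesTerm(indexTokenized(title), title)); // false

        // The verbatim field holds the full string as a single term.
        System.out.println(matchesTerm(indexVerbatim(title), title));  // true
    }
}
```

The same reasoning applies to an update or delete keyed on an exact term: if the key field was tokenized, the exact term may not exist in the index, so nothing matches.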
Hi Froh,
thank you for the information.
I updated the code to re-open the reader. It seems that the update
is reflected: a search for the old document doesn't yield anything, but
the search for the new term fails.
I output all documents (there are 2) and the second one has the new title,
but when searching
Hi Wojtek,
Thank you for linking to your test code!
When you open an IndexReader, it is locked to the view of the Lucene
directory at the time that it's opened.
If you make changes, you'll need to open a new IndexReader before those
changes are visible. I see that you tried creating a new IndexS
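The point-in-time behaviour described above can be sketched with a toy model in plain Java. `SnapshotSketch`, `add`, and `openReader` are hypothetical names used only for illustration; this is not how Lucene is implemented and none of it is Lucene API. The "reader" here is simply an immutable copy taken when it is opened.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Toy model only: a "reader" is an immutable copy of the index contents
// taken at the moment it is opened. None of this is Lucene API.
final class SnapshotSketch {
    private final List<String> docs = new ArrayList<>();

    // Models a write to the index.
    void add(String doc) {
        docs.add(doc);
    }

    // Models opening a reader: later writes do not affect the returned view.
    List<String> openReader() {
        return Collections.unmodifiableList(new ArrayList<>(docs));
    }

    public static void main(String[] args) {
        SnapshotSketch index = new SnapshotSketch();
        index.add("doc-1");

        List<String> oldReader = index.openReader();
        index.add("doc-2"); // written after oldReader was opened

        List<String> newReader = index.openReader();

        System.out.println(oldReader.size()); // 1: the old view does not see doc-2
        System.out.println(newReader.size()); // 2: a freshly opened view does
    }
}
```

Any searcher built on the old view keeps returning the old results, which is why a new searcher has to be created from a newly opened reader.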
Hi all!
There is an effort in Apache James to update to a more modern version of Lucene (ref:
https://github.com/apache/james-project/pull/2342). I'm digging into the issue as others have done
but I'm stumped - it seems that `org.apache.lucene.index.IndexWriter#updateDocument` doesn't update
th
ED]>
> To: java-user@lucene.apache.org
> Sent: Friday, January 5, 2007 12:53:12 AM
> Subject: efficient ways of updating document
>
> It seems to me that updating a document is rather tedious and slow in
> lucene, especially for updating large number of documents. Before opening
> an In
at the state of those changes is (whether they
are still in Lucene's JIRA, or whether they are in CVS).
Otis
----- Original Message -----
From: John Song <[EMAIL PROTECTED]>
To: java-user@lucene.apache.org
Sent: Friday, January 5, 2007 12:53:12 AM
Subject: efficient ways of updating document
It seems to me that updating a document is rather tedious and slow in Lucene,
especially for updating a large number of documents. Before opening an
IndexWriter to add documents, one has to open an IndexReader/IndexSearcher to
search for the document of a particular id. Upon finding its docnum,
I'm sending a snippet of code showing how to reconstruct UNSTORED fields.
It has two parts:
DB+terms
Class.forName("org.postgresql.Driver").newInstance();
con = DriverManager.getConnection("jdbc:postgresql:lucene",
"lucene", "lucene");
PreparedStatement psCompany=con.prepareStatemen
Well, you can have it! :-) I have not tested this, it's just an idea.
You can get the document id after the add (numDocs()) and then insert into
the DB; if the DB fails, you can delete the document from the RAMDir.
Or, as in my case with batches: I'm adding documents to the DB with a
savepoint, then creating a clean index (create=true), and at the end if
This strategy can also be nicely abstracted from your main app. Whilst I
haven't yet implemented it, my plan is to create a template style structure
which tells me which fields are in Lucene, and which are externalized. This
way I don't bother storing data in Lucene that is stored elsewhere, but
Jason is right. I think (though I'm not an expert on Lucene either) that
your newly added document can't recreate the terms for a field with an
analyzer, because the field text is empty.
There is a very hairy solution: hack an IndexReader and FieldInfosWriter and
use addIndexes.
Lucene is "only" a fulltext search library, n
ess). So
when
> you retrieve the document, you lose non-stored fields.
>
Yes, we have some important fields that are not stored in the index. Is
there a way to overcome this problem while updating a document? Will I
face the same problem with IndexModifier? (Now I am using IndexReader
and Ind
l.. they can be reconstructed but it's a lossy process). So when
> you retrieve the document, you lose non-stored fields.
>
Yes, we have some important fields that are not stored in the index. Is
there a way to overcome this problem while updating a document? Will I
face the same problem with Index
Hi,
I'm facing a similar problem. I found a possible way to copy part of an
index (without copying the whole index, deleting, and optimizing), but I
don't know how to change/add/remove a field (or add a term vector, in my
case) in an existing index.
To copy part of an index, override these methods in IndexReader:
/** Returns
Are you storing the contents of the fields in the index? That is,
specifying Field.Store.YES when creating the field?
In my experience fields which are not stored are not recoverable from the
index (well.. they can be reconstructed but it's a lossy process). So when
you retrieve the document,
On Thu, 2006-08-10 at 09:16 -0400, Erick Erickson wrote:
> You say "Those documents that we updated are not searchable now". I've got
> to ask the obvious question, did you close and re-open the *searcher*
> (really, the indexreader you use in your searcher)? I suspect you have, but
> thought I'd a
You say "Those documents that we updated are not searchable now". I've got
to ask the obvious question, did you close and re-open the *searcher*
(really, the indexreader you use in your searcher)? I suspect you have, but
thought I'd ask explicitly.
I'd also get a copy of Luke (http://www.getopt.o
Hi Deepan, The steps below seem correct, given that all the fields of the
original document are also stored - the javadoc for
indexReader.document(int n) (which I assume is what you are using) says: "
Returns the stored fields of the nth Document in this index." - so, only
stored fields would exis
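The consequence of the javadoc quoted above can be sketched with a toy model in plain Java. `StoredFieldSketch`, `Field`, and `retrieve` are hypothetical names, not Lucene classes, and the field names are made up; the sketch only shows that a document rebuilt from what retrieval returns no longer carries its unstored fields.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: shows why "retrieve, modify, re-add" drops unstored fields.
// None of this is Lucene API.
final class StoredFieldSketch {

    // A field value plus a flag saying whether the original text was stored.
    static final class Field {
        final String value;
        final boolean stored;

        Field(String value, boolean stored) {
            this.value = value;
            this.stored = stored;
        }
    }

    // Models retrieving a document from the index: only stored fields come back.
    static Map<String, Field> retrieve(Map<String, Field> indexed) {
        Map<String, Field> result = new LinkedHashMap<>();
        for (Map.Entry<String, Field> e : indexed.entrySet()) {
            if (e.getValue().stored) {
                result.put(e.getKey(), e.getValue());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Field> original = new LinkedHashMap<>();
        original.put("id", new Field("42", true));            // stored
        original.put("body", new Field("long text", false));  // indexed but not stored

        // Retrieve the document, add the new field, and treat the result as the re-added doc.
        Map<String, Field> updated = retrieve(original);
        updated.put("extra", new Field("new value", true));

        System.out.println(updated.containsKey("id"));    // true
        System.out.println(updated.containsKey("extra")); // true
        System.out.println(updated.containsKey("body"));  // false: the unstored field is gone
    }
}
```

In this model the only way to keep `body` is to rebuild the document from the original source data rather than from the retrieved copy.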
Hi,
We have to update a few documents in our index: we have to add an
additional field to them. We did as follows:
1) read the documents of interest using IndexReader
2) copy them to a temporary doc object (temp_doc)
3) delete the document in the index
4) close the IndexReader
5) open the IndexWriter
6)