Non-tokenized fields become tokenized when a document is deleted and added back
-------------------------------------------------------------------------------
Key: LUCENE-3854
URL: https://issues.apache.org/jira/browse/LUCENE-3854
Project: Lucene - Java
Issue Type: Bug
Components: core/index
Affects Versions: 4.0
Reporter: Benson Margulies
https://github.com/bimargulies/lucene-4-update-case is a JUnit test case that
seems to show a problem with the current trunk. It creates a Document with a
Field typed as StringField.TYPE_STORED and a value containing a "-". Initially,
a TermQuery finds the value, since the field is not tokenized.
Then the test reads the Document back out through a reader. In the copy of the
Document that is read out, the Field now has the tokenized bit turned on.
Next, the test deletes the Document and adds the copy back. The 'tokenized' bit
is respected, so the field now gets tokenized, and the TermQuery on the term
containing the "-" no longer matches.
So I think the defect is in the code that reconstructs the Document when it is
read from the index, which turns on the tokenized bit.
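
The sequence described above can be sketched in plain Java without Lucene. This
is only a toy model: ToyField, ToyIndex, and the split-on-non-alphanumerics
"analyzer" are hypothetical stand-ins, not Lucene classes, and read() simply
assumes the reported behavior (the tokenized flag comes back turned on) rather
than reproducing the real stored-fields code.

```java
import java.util.HashSet;
import java.util.Set;

final class TokenizedBitDemo {

    /** Hypothetical stand-in for a field: a name, a value, and a tokenized flag. */
    static final class ToyField {
        final String name;
        final String value;
        final boolean tokenized;

        ToyField(String name, String value, boolean tokenized) {
            this.name = name;
            this.value = value;
            this.tokenized = tokenized;
        }
    }

    /** Hypothetical one-document index: keeps the stored value and the indexed terms apart. */
    static final class ToyIndex {
        private String storedName;
        private String storedValue;
        private final Set<String> terms = new HashSet<>();

        void add(ToyField field) {
            storedName = field.name;
            storedValue = field.value;
            if (field.tokenized) {
                // Toy analyzer: split on anything that is not a letter or digit.
                for (String token : field.value.split("[^A-Za-z0-9]+")) {
                    if (!token.isEmpty()) {
                        terms.add(token);
                    }
                }
            } else {
                // Untokenized: the whole value is indexed as a single term.
                terms.add(field.value);
            }
        }

        void delete() {
            storedName = null;
            storedValue = null;
            terms.clear();
        }

        boolean matchesTerm(String term) {
            return terms.contains(term);
        }

        /** Assumption modeling the report: the flag is not preserved and comes back as true. */
        ToyField read() {
            return new ToyField(storedName, storedValue, true);
        }
    }

    public static void main(String[] args) {
        ToyIndex index = new ToyIndex();
        index.add(new ToyField("id", "abc-def", false));
        System.out.println(index.matchesTerm("abc-def")); // true: indexed as one term

        ToyField readBack = index.read();                 // tokenized flag is now on
        index.delete();
        index.add(readBack);
        System.out.println(index.matchesTerm("abc-def")); // false: indexed as "abc" and "def"
    }
}
```

In this model the second lookup fails because the re-added value is indexed as
"abc" and "def" rather than as the single term "abc-def", which matches the
symptom the test case reports.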