It seems like you need to scrutinize exactly which documents were indexed in step 3.

How exactly did you copy documents out of the old index?  Note that
when Lucene's IndexReader returns a Document, it's not the same
Document that was indexed in the first place: it will only have fields
that were stored, and it does not store certain metadata about how
those field values were indexed.  But I don't see how that alone can
lead to indexing an empty string token.
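
One way to scrutinize step 3 is to check every retrieved Document for
empty-string field values just before it is added to the new index.
Below is a minimal sketch in plain Java. The name/value pairs stand in
for what you would collect from Document.getFields() (using
IndexableField.name() and stringValue()). EmptyFieldCheck and
findEmptyFields are hypothetical names for illustration, not Lucene API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical diagnostic helper; not part of Lucene.
// Each entry is (field name, string value) as read back from a stored document.
final class EmptyFieldCheck {
    private EmptyFieldCheck() {}

    /**
     * Returns the names of all fields whose string value is the empty string.
     * Fields with a null value (for example binary or numeric fields, which
     * have no string value) are skipped, since they cannot produce an
     * empty-string token.
     */
    static List<String> findEmptyFields(List<Map.Entry<String, String>> fields) {
        List<String> emptyNames = new ArrayList<>();
        for (Map.Entry<String, String> field : fields) {
            String value = field.getValue();
            if (value != null && value.isEmpty()) {
                emptyNames.add(field.getKey());
            }
        }
        return emptyNames;
    }
}
```

If this reports the field in question for some document, the empty value
is coming out of the stored fields of the source index, and you can
log that document's ID and skip or fix the field before re-indexing.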

Mike McCandless

http://blog.mikemccandless.com


On Sun, Aug 28, 2016 at 7:56 PM, Trejkaz <trej...@trypticon.org> wrote:
> Updating this with newly-obtained info.
>
> 1. The original index was created in Lucene 3.x. In 3.x, if I call
> getMin(), it returns non-empty values. So far so good.
>
> 2. The index then gets migrated to 5.x using multiple IndexUpgrader
> steps. Now, when I call getMin(), it still returns a non-empty value.
>
> 3. At some point, the user performs an operation where we copy
> documents out of the current index into a new index. When we get the
> Document, it has the field in question, even though no value was set
> into the field. This then gets indexed, and when the destination index
> is finally opened, getMin() returns an empty string.
>
> Something doesn't quite add up though.
>
> Surely if we had put an empty string into a field back in 3.x, it
> would have indexed it, and then getMin() would have always returned
> the empty string, but that isn't what we're seeing at all. Even after
> upgrading the index to the 5.x format, getMin() still returns the
> lowest real value. Therefore, it seems reasonable to assume that we
> weren't putting the empty field into the document. But if we didn't
> put it into the document, why is the field now coming back in Lucene
> 5.x?
>
> TX
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
>

