Or, if it's hard to reduce this to a compact test, can you post the
code you are using and describe where/how it finds a difference? I'd
like to get to the bottom of this.
Mike
Grant Ingersoll wrote:
Can you reduce this down to a unit test?
Thanks,
Grant
On May 16, 2008, at 11:37 AM, Dan Rugg wrote:
After upgrading to version 2.3.x from 2.2.0, we started experiencing
issues with our index searches. Some searches produced false
positives,
while others produce no hits for terms known to be in specific
documents
that where digested. After setting up tests that created indexes
containing single documents we found that version 2.3.x did not
add all
the tokens from a document the index while 2.2.0 did. The only thing
that changed between the tests were the lucene jar being used, and a
fresh index was created for each test.
It seems to be some random action that 2.3.x is taking, or not
taking.
While tokens such as 'traffic' will not be digested in one
document, it
will in another. Token frequency, order, and relative position
seem to
not matter, as indexed and non-indexed tokens where across the board.
The documents being ingested where XML, and the tokenizer for the
documents were the same for 2.2.0 and 2.3.x. We even did a token
dump
of the documents and verified the documents where being tokenized
correctly.
I did notice rebuilding the index was quicker with 2.3.x and the
index
was smaller, but I guess if you aren't adding tokens to the index
it is
bound to smaller. BTW, we tested versions 2.3.1, 2.3.2, and
2.2.0. We
are now back to using 2.2.0.
Daniel Rugg
--------------------------
Grant Ingersoll
Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]