[
https://issues.apache.org/jira/browse/LUCENE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Uwe Schindler updated LUCENE-2690:
----------------------------------
Attachment: LUCENE-2690.patch
Updated patch that also checks for duplicate terms in the fuzzy rewrite. This
should be fine now, but we need to fix the FuzzyQuery tests to check for
multiple segments with the same terms, a case that should fail with this patch.
Maybe we need a separate MTQ test that creates two IndexWriters, which add
documents with an overlapping term set to both indexes. Queries are then run
using MultiReader, so we can control merging and make sure the term really
appears in two "segments". I will work on a test for that.
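
To illustrate the duplicate-term problem, here is a minimal sketch in plain
Java. It is not code from the patch and uses no Lucene classes: each "segment"
is modeled as a hypothetical map from term text to that segment's document
frequency, and the class name PerSegmentTermCollector and its collect() method
are made up for this example. It only shows why a hashed map is needed once
terms are collected per segment: the same term can show up in more than one
segment and must be merged rather than added twice.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the MultiTermQuery implementation.
final class PerSegmentTermCollector {

    // Visits each segment in turn and merges its terms into one map.
    // A term seen in several segments is kept once, with its per-segment
    // document frequencies summed.
    static Map<String, Integer> collect(List<Map<String, Integer>> segments) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (Map<String, Integer> segment : segments) {
            for (Map.Entry<String, Integer> entry : segment.entrySet()) {
                merged.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // Two "segments" with an overlapping term set, as in the proposed test.
        Map<String, Integer> first = new HashMap<>();
        first.put("fuzzy", 2);
        first.put("lucene", 1);

        Map<String, Integer> second = new HashMap<>();
        second.put("lucene", 2);
        second.put("query", 5);

        List<Map<String, Integer>> segments = new ArrayList<>();
        segments.add(first);
        segments.add(second);

        // "lucene" occurs in both segments but is reported only once.
        Map<String, Integer> merged = collect(segments);
        for (Map.Entry<String, Integer> entry : merged.entrySet()) {
            System.out.println(entry.getKey() + " docFreq=" + entry.getValue());
        }
    }
}
```

Without the map, a naive per-segment loop would emit "lucene" twice, which is
the situation a two-index MultiReader test would be meant to catch.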
> Do MultiTermQuery boolean rewrites per segment
> ----------------------------------------------
>
> Key: LUCENE-2690
> URL: https://issues.apache.org/jira/browse/LUCENE-2690
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 4.0
>
> Attachments: LUCENE-2690.patch, LUCENE-2690.patch
>
>
> MultiTermQuery currently rewrites FuzzyQuery (using
> TopTermsBooleanQueryRewrite), the auto constant rewrite method and the
> ScoringBQ rewrite methods using a MultiFields wrapper on the top-level
> reader. This is inefficient.
> This patch changes the rewrite modes to do the rewrites per segment and uses
> some additional data structures (hashed sets/maps) to exclude duplicate terms.
> All tests currently pass, but FuzzyQuery's tests should not, because its
> minimum score handling depends on the terms being collected in
> order.
> Robert will fix FuzzyQuery in this issue, too. This patch is just a start.