[
https://issues.apache.org/jira/browse/LUCENE-2690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921066#action_12921066
]
Yonik Seeley commented on LUCENE-2690:
--------------------------------------
bq. This has a big speed degradation for lots of MTQs if we don't reorder clauses
intelligently.
Seems like the right place for sorting is in the MTQ rewrite to a BQ.
The current patch makes BQ rewrite quite a bit more expensive: a clone is
always made, and equals() is always called on the clone afterwards.
For normal boolean queries (caused by someone typing in a few words), it seems
like a real-world speedup is unlikely (since the terms would need to be in the
same tii block). People generating very large boolean queries should also be
able to pre-sort them and not have the overhead imposed every time.
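To illustrate the pre-sorting idea, here is a minimal sketch of ordering terms
once, before the clauses of a large boolean query are built, so nothing has to
be reordered on every rewrite. FieldTerm and sortForDictionaryOrder are
hypothetical names, not Lucene classes or methods, and String.compareTo only
stands in for the index's real term order, which may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PreSortedClauses {
    /** Hypothetical stand-in for a (field, text) pair; NOT Lucene's Term class. */
    static final class FieldTerm {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /**
     * Returns a new list ordered by field, then by term text.
     * Assumption: String.compareTo approximates the term dictionary order.
     * The input list is left unmodified.
     */
    static List<FieldTerm> sortForDictionaryOrder(List<FieldTerm> terms) {
        List<FieldTerm> sorted = new ArrayList<>(terms);
        sorted.sort((a, b) -> {
            int byField = a.field.compareTo(b.field);
            return byField != 0 ? byField : a.text.compareTo(b.text);
        });
        return sorted;
    }

    public static void main(String[] args) {
        List<FieldTerm> terms = Arrays.asList(
                new FieldTerm("title", "zebra"),
                new FieldTerm("body", "lucene"),
                new FieldTerm("body", "apple"));
        // Sort once up front, then add one clause per term in this order.
        System.out.println(sortForDictionaryOrder(terms));
    }
}
```

A caller generating a very large query would do this once while building it,
rather than paying for a sort inside every BQ rewrite.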
> Do MultiTermQuery boolean rewrites per segment
> ----------------------------------------------
>
> Key: LUCENE-2690
> URL: https://issues.apache.org/jira/browse/LUCENE-2690
> Project: Lucene - Java
> Issue Type: Improvement
> Affects Versions: 4.0
> Reporter: Uwe Schindler
> Assignee: Uwe Schindler
> Fix For: 4.0
>
> Attachments: LUCENE-2690-attributes.patch,
> LUCENE-2690-attributes.patch, LUCENE-2690-attributes.patch,
> LUCENE-2690-hack.patch, LUCENE-2690.patch, LUCENE-2690.patch,
> LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch,
> LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch,
> LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch, LUCENE-2690.patch
>
>
> MultiTermQuery currently rewrites FuzzyQuery (using
> TopTermsBooleanQueryRewrite), the auto constant rewrite method and the
> ScoringBQ rewrite methods using a MultiFields wrapper on the top-level
> reader. This is inefficient.
> This patch changes the rewrite modes to do the rewrites per segment and uses
> some additional data structures (hashed sets/maps) to exclude duplicate terms.
> All tests currently pass, but FuzzyQuery's tests should not, because its
> minimum score handling depends on the terms being collected in order.
> Robert will fix FuzzyQuery in this issue, too. This patch is just a start.
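
The per-segment collection with duplicate exclusion described above can be
sketched roughly as follows. This is not the patch's code: SegmentTerm and
mergeSegments are hypothetical names, segments are modeled as plain lists, and
summing the per-segment document frequencies is an assumption made only for
illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerSegmentTermMerge {
    /** Hypothetical per-segment match: a term text and its docFreq in that segment. */
    static final class SegmentTerm {
        final String text;
        final int docFreq;

        SegmentTerm(String text, int docFreq) {
            this.text = text;
            this.docFreq = docFreq;
        }
    }

    /**
     * Visits each segment in turn and merges its matching terms into one
     * hashed map, so a term found in several segments is kept only once.
     * Assumption: the merged value is the sum of the per-segment docFreqs.
     */
    static Map<String, Integer> mergeSegments(List<List<SegmentTerm>> segments) {
        Map<String, Integer> merged = new LinkedHashMap<>();
        for (List<SegmentTerm> segment : segments) {
            for (SegmentTerm term : segment) {
                merged.merge(term.text, term.docFreq, Integer::sum);
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        List<List<SegmentTerm>> segments = Arrays.asList(
                Arrays.asList(new SegmentTerm("foo", 2), new SegmentTerm("bar", 1)),
                Arrays.asList(new SegmentTerm("bar", 3), new SegmentTerm("baz", 1)));
        // "bar" occurs in both segments but ends up as a single entry.
        System.out.println(mergeSegments(segments));
    }
}
```

Note that the map preserves first-seen order, not term order, which is the kind
of ordering loss the FuzzyQuery remark above is concerned with.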
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.