[
https://issues.apache.org/jira/browse/LUCENE-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated LUCENE-5579:
---------------------------------
Attachment: spatial.alg
LUCENE-5579_CompositeSpatialStrategy.patch
This is the latest patch, along with a benchmark .alg file (which depends on
LUCENE-6399 to make the comparison in a single run). For whatever reason, I'm
only seeing a 40% increase in speed now; I'm not sure whether something changed
in the benchmark or in recent trunk commits. Again, YMMV a ton.
* Added missing equals/hashCode, and fixed our QueryEqualsHashCodeTest to
actually work ;-) (note that assertNotSame() does NOT call .equals; it only
compares references).
* Added support for most predicates to CompositeSpatialStrategy. I left
Disjoint as a TODO; it could be implemented using DocValues.getDocsWithField,
but I don't see it as worth bothering with right now.
* Added optimized path for when all hits are exact -- no geometries need
double-checking.
* Pulled out the inner Query instances to live in the "composite" package, so
that they might be used by users who want to build with these constructs.
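
To illustrate the Disjoint TODO above: Disjoint can be read as "documents that
have a value in the spatial field, minus the documents that intersect the
query". The following is only a minimal sketch of that set arithmetic, using
plain java.util.BitSet as a stand-in for Lucene's Bits/DocIdSet. The class and
method names are hypothetical and are not part of the patch.

```java
import java.util.BitSet;

public class DisjointSketch {

    /**
     * Hypothetical helper: docs that have a shape (docsWithField) but do not
     * intersect the query shape (intersecting). Inputs are left unmodified.
     */
    public static BitSet disjoint(BitSet docsWithField, BitSet intersecting) {
        BitSet result = (BitSet) docsWithField.clone();
        result.andNot(intersecting); // remove every intersecting doc
        return result;
    }

    public static void main(String[] args) {
        // Docs 0, 1, 3 and 4 have a shape; doc 2 has no value in the field.
        BitSet docsWithField = new BitSet();
        docsWithField.set(0);
        docsWithField.set(1);
        docsWithField.set(3);
        docsWithField.set(4);

        // Docs 1 and 4 intersect the query shape.
        BitSet intersecting = new BitSet();
        intersecting.set(1);
        intersecting.set(4);

        System.out.println(disjoint(docsWithField, intersecting)); // prints {0, 3}
    }
}
```

Note that doc 2 matches neither Intersects nor Disjoint, which is why the set of
docs with a value in the field is needed rather than a plain negation.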
[~jpountz] In this patch I added a method to BitDocIdSet.Builder:
{code:java}
/**
 * Is this builder definitely empty? If so, {@link #build()} will return null.
 * This is usually the same as simply being empty but if this builder was
 * constructed with the {@code full} option or if an iterator was passed that
 * iterated over no documents, then we're not sure.
 */
public boolean isDefinitelyEmpty() {
  return sparseSet == null && denseSet == null;
}
{code}
Cool? Another non-spatial change in this patch is making ConstantScoreWeight
public instead of package-private. Should it be @lucene.experimental or
@lucene.internal? It seems generic enough.
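
For readers unfamiliar with the isDefinitelyEmpty() contract described in the
Javadoc above, here is a toy model of it. This is NOT the real
BitDocIdSet.Builder: it collapses the sparse and dense sets into one BitSet,
and the class name, constructor and or() signature are assumptions made purely
for illustration.

```java
import java.util.BitSet;
import java.util.PrimitiveIterator;
import java.util.stream.IntStream;

public class ToyDocIdSetBuilder {
    private final int maxDoc;
    private BitSet denseSet; // stays null until something *may* have been added

    public ToyDocIdSetBuilder(int maxDoc, boolean full) {
        this.maxDoc = maxDoc;
        if (full) {
            denseSet = new BitSet(maxDoc);
            denseSet.set(0, maxDoc); // start with every doc set
        }
    }

    /** Adds the given docs; allocates the backing set even if the iterator is empty. */
    public void or(PrimitiveIterator.OfInt docs) {
        if (denseSet == null) {
            denseSet = new BitSet(maxDoc);
        }
        while (docs.hasNext()) {
            denseSet.set(docs.nextInt());
        }
    }

    /** True only when nothing was ever passed in; false means "not sure". */
    public boolean isDefinitelyEmpty() {
        return denseSet == null;
    }

    /** Returns null when the builder is definitely empty. */
    public BitSet build() {
        return denseSet;
    }

    public static void main(String[] args) {
        ToyDocIdSetBuilder untouched = new ToyDocIdSetBuilder(8, false);
        System.out.println(untouched.isDefinitelyEmpty()); // true

        // An empty iterator was passed: the result is empty, but the cheap
        // check can no longer prove it.
        ToyDocIdSetBuilder fed = new ToyDocIdSetBuilder(8, false);
        fed.or(IntStream.empty().iterator());
        System.out.println(fed.isDefinitelyEmpty()); // false
    }
}
```

The point of the method is that a true result is a cheap guarantee, while a
false result only means the caller has to look at the built set.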
At this point I think it's ready to commit. I haven't cared enough about where
the code should live to bother changing it from its current form as an
independent SpatialStrategy.
> Spatial, enhance RPT to differentiate confirmed from non-confirmed hits, then
> validate with SDV
> -----------------------------------------------------------------------------------------------
>
> Key: LUCENE-5579
> URL: https://issues.apache.org/jira/browse/LUCENE-5579
> Project: Lucene - Core
> Issue Type: New Feature
> Components: modules/spatial
> Reporter: David Smiley
> Assignee: David Smiley
> Attachments: LUCENE-5579_CompositeSpatialStrategy.patch,
> LUCENE-5579_CompositeSpatialStrategy.patch,
> LUCENE-5579_SPT_leaf_covered.patch, spatial.alg
>
>
> If a cell is within the query shape (doesn't straddle the edge), then you can
> be sure that all documents it matches are a confirmed hit. But if some
> documents are only on the edge cells, then those documents could be validated
> against SerializedDVStrategy for precise spatial search. This should be
> *much* faster than using RPT and SerializedDVStrategy independently on the
> same search, particularly when a lot of documents match.
> Perhaps this'll be a new RPT subclass, or maybe an optional configuration of
> RPT. This issue is just for the Intersects predicate, which will apply to
> Disjoint. Until resolved in other issues, the other predicates can be
> handled in a naive/slow way by creating a filter that combines RPT's filter
> and SerializedDVStrategy's filter using BitsFilteredDocIdSet.
> One thing I'm not sure of is how to expose to Lucene-spatial users the
> underlying functionality such that they can put other query/filters
> in-between RPT and the SerializedDVStrategy. Maybe that'll be done by simply
> ensuring the predicate filters have this capability and are public.
> It would be ideal to implement this capability _after_ the PrefixTree term
> encoding is modified to differentiate edge leaf-cells from non-edge leaf
> cells. This distinction will allow the code here to make more confirmed
> matches.
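
The idea in the issue description can be sketched roughly as follows: documents
found only via cells fully within the query shape are accepted as-is, and only
documents found via edge cells get the expensive exact check. This is a minimal
sketch using java.util.BitSet and an IntPredicate as stand-ins for the RPT
results and the SerializedDVStrategy check; all names here are hypothetical and
the actual patch is structured differently.

```java
import java.util.BitSet;
import java.util.function.IntPredicate;

public class TwoPhaseSpatialSketch {

    /**
     * @param confirmed   docs matched by cells fully within the query shape
     * @param unconfirmed docs matched only by cells straddling the query edge
     * @param exactCheck  precise (and costly) geometry test for a single doc
     */
    public static BitSet search(BitSet confirmed, BitSet unconfirmed,
                                IntPredicate exactCheck) {
        BitSet hits = (BitSet) confirmed.clone();
        if (unconfirmed.isEmpty()) {
            return hits; // fast path: all hits are exact, nothing to double-check
        }
        for (int doc = unconfirmed.nextSetBit(0); doc >= 0;
             doc = unconfirmed.nextSetBit(doc + 1)) {
            if (exactCheck.test(doc)) {
                hits.set(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        BitSet confirmed = new BitSet();
        confirmed.set(1);
        confirmed.set(2);

        BitSet unconfirmed = new BitSet();
        unconfirmed.set(5);
        unconfirmed.set(6);
        unconfirmed.set(7);

        // Pretend that only odd-numbered docs pass the precise geometry test.
        BitSet hits = search(confirmed, unconfirmed, doc -> doc % 2 == 1);
        System.out.println(hits); // prints {1, 2, 5, 7}
    }
}
```

The exact check runs once per unconfirmed doc only, which is where the savings
come from when many documents match through non-edge cells.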