[ 
https://issues.apache.org/jira/browse/LUCENE-5579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5579:
---------------------------------
    Attachment: spatial.alg
                LUCENE-5579_CompositeSpatialStrategy.patch

This is the latest patch, along with a benchmark .alg file (which depends on 
LUCENE-6399 to make a comparison in one run).  For whatever reason, I'm only 
seeing a 40% increase in speed now; I'm not sure whether something in the 
benchmark or a recent trunk change is responsible.  Again, YMMV a ton.

* Added missing equals/hashCode, and fixed our QueryEqualsHashCodeTest to 
actually work ;-)  (note assertNotSame() does NOT call .equals).
* Added support for most predicates to CompositeSpatialStrategy.  I left 
Disjoint as a TODO; it could be implemented using DocValues.getDocsWithField, 
but I don't see it as worth bothering with right now.
* Added optimized path for when all hits are exact -- no geometries need 
double-checking.
* Pulled out the inner Query instances to live in the "composite" package, so 
that they might be used by users who want to build with these constructs.
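
On the assertNotSame() note in the first bullet: a minimal, JUnit-free sketch of 
why an identity check cannot validate equals/hashCode.  ToyQuery and both helper 
methods are hypothetical names made up for illustration; they are not part of 
the patch or of Lucene.

```java
import java.util.Objects;

final class SameVersusEquals {
  // Hypothetical stand-in for a query class; not a Lucene type.
  static final class ToyQuery {
    final String field;
    final int level;

    ToyQuery(String field, int level) {
      this.field = field;
      this.level = level;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof ToyQuery)) return false;
      ToyQuery other = (ToyQuery) o;
      return level == other.level && field.equals(other.field);
    }

    @Override
    public int hashCode() {
      return Objects.hash(field, level);
    }
  }

  // The same comparison JUnit's assertNotSame makes: reference identity only.
  static boolean notSame(Object a, Object b) {
    return a != b;
  }

  // What an equals/hashCode test actually needs to check.
  static boolean equalWithSameHash(Object a, Object b) {
    return a.equals(b) && a.hashCode() == b.hashCode();
  }

  public static void main(String[] args) {
    ToyQuery a = new ToyQuery("geo", 5);
    ToyQuery b = new ToyQuery("geo", 5);
    // Two distinct instances pass the identity check even though they are equal.
    System.out.println("notSame: " + notSame(a, b));
    System.out.println("equalWithSameHash: " + equalWithSameHash(a, b));
  }
}
```

Both checks come out true for a and b, so a test built only on assertNotSame() 
would pass whether or not equals() was implemented.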

[~jpountz] In this patch I added a method to BitDocIdSet.Builder:
{code:java}
    /**
     * Is this builder definitely empty?  If so, {@link #build()} will return null.
     * This is usually the same as simply being empty but if this builder was
     * constructed with the {@code full} option or if an iterator was passed
     * that iterated over no documents, then we're not sure.
     */
    public boolean isDefinitelyEmpty() {
      return sparseSet == null && denseSet == null;
    }
{code}
Cool?  Another non-spatial change in this patch is making ConstantScoreWeight 
public, not package-private.  Should it be @lucene.experimental or 
@lucene.internal?  It seems generic enough.
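
To illustrate the isDefinitelyEmpty() contract described in the javadoc above, 
here is a toy sketch.  ToyDocIdSetBuilder is a hypothetical stand-in backed by 
java.util.BitSet; it is NOT BitDocIdSet.Builder and only assumes the same 
"lazily allocated backing set" idea.

```java
import java.util.BitSet;

// Toy stand-in for a doc-id-set builder; NOT Lucene's BitDocIdSet.Builder.
final class ToyDocIdSetBuilder {
  private final int maxDoc;
  private BitSet bits; // stays null until something might have been added

  ToyDocIdSetBuilder(int maxDoc, boolean full) {
    this.maxDoc = maxDoc;
    if (full) {
      bits = new BitSet(maxDoc);
      bits.set(0, maxDoc);
    }
  }

  // Allocates the backing set even when no ids are passed, which models
  // the "iterator that iterated over no documents" case.
  void or(int... docs) {
    if (bits == null) {
      bits = new BitSet(maxDoc);
    }
    for (int doc : docs) {
      bits.set(doc);
    }
  }

  // True only when nothing was ever allocated; false means "not sure".
  boolean isDefinitelyEmpty() {
    return bits == null;
  }

  // Returns null when the builder is definitely empty.
  BitSet build() {
    return bits;
  }

  public static void main(String[] args) {
    ToyDocIdSetBuilder untouched = new ToyDocIdSetBuilder(10, false);
    System.out.println("untouched: " + untouched.isDefinitelyEmpty());

    ToyDocIdSetBuilder emptyOr = new ToyDocIdSetBuilder(10, false);
    emptyOr.or(); // no documents, but the backing set now exists
    System.out.println("after empty or(): " + emptyOr.isDefinitelyEmpty()
        + ", built set empty: " + emptyOr.build().isEmpty());
  }
}
```

The point of the sketch is the asymmetry: a true result guarantees build() 
returns null, while a false result says nothing about whether any document 
was actually added.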

At this point I think it's ready to commit.  I haven't cared enough about where 
the code should live to bother changing it from its current form as an 
independent SpatialStrategy.

> Spatial, enhance RPT to differentiate confirmed from non-confirmed hits, then 
> validate with SDV
> -----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-5579
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5579
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>         Attachments: LUCENE-5579_CompositeSpatialStrategy.patch, 
> LUCENE-5579_CompositeSpatialStrategy.patch, 
> LUCENE-5579_SPT_leaf_covered.patch, spatial.alg
>
>
> If a cell is within the query shape (doesn't straddle the edge), then you can 
> be sure that all documents it matches are a confirmed hit. But if some 
> documents are only on the edge cells, then those documents could be validated 
> against SerializedDVStrategy for precise spatial search. This should be 
> *much* faster than using RPT and SerializedDVStrategy independently on the 
> same search, particularly when a lot of documents match.
> Perhaps this'll be a new RPT subclass, or maybe an optional configuration of 
> RPT.  This issue is just for the Intersects predicate, which will apply to 
> Disjoint.  Until resolved in other issues, the other predicates can be 
> handled in a naive/slow way by creating a filter that combines RPT's filter 
> and SerializedDVStrategy's filter using BitsFilteredDocIdSet.
> One thing I'm not sure of is how to expose to Lucene-spatial users the 
> underlying functionality such that they can put other query/filters 
> in-between RPT and the SerializedDVStrategy.  Maybe that'll be done by simply 
> ensuring the predicate filters have this capability and are public.
> It would be ideal to implement this capability _after_ the PrefixTree term 
> encoding is modified to differentiate edge leaf-cells from non-edge leaf 
> cells. This distinction will allow the code here to make more confirmed 
> matches.
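
The confirmed / non-confirmed split described in the issue text above can be 
sketched roughly as follows.  This is a toy model under stated assumptions, not 
the patch's code: unit grid cells stand in for RPT cells, documents are plain 
points, the query is a circle, and an exact point-in-circle test stands in for 
the SerializedDVStrategy check.  All names here are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: unit grid cells stand in for RPT cells, and an exact
// point-in-circle test stands in for the SerializedDVStrategy check.
final class CompositeSketch {
  enum Relation { DISJOINT, STRADDLES, WITHIN }

  static final class Result {
    final List<Integer> hits = new ArrayList<>();
    int exactChecks; // how many docs needed the expensive verification
  }

  private static double dist2(double ax, double ay, double bx, double by) {
    double dx = ax - bx, dy = ay - by;
    return dx * dx + dy * dy;
  }

  private static double clamp(double v, double lo, double hi) {
    return Math.max(lo, Math.min(hi, v));
  }

  // Relates the unit cell [cx, cx+1] x [cy, cy+1] to a circle query.
  static Relation relate(int cx, int cy, double qx, double qy, double r) {
    double r2 = r * r;
    double nearX = clamp(qx, cx, cx + 1), nearY = clamp(qy, cy, cy + 1);
    if (dist2(nearX, nearY, qx, qy) > r2) return Relation.DISJOINT;
    // The farthest corner decides whether the whole cell is inside the circle.
    double farX = Math.abs(qx - cx) >= Math.abs(qx - (cx + 1)) ? cx : cx + 1;
    double farY = Math.abs(qy - cy) >= Math.abs(qy - (cy + 1)) ? cy : cy + 1;
    return dist2(farX, farY, qx, qy) <= r2 ? Relation.WITHIN : Relation.STRADDLES;
  }

  // docs[i] = {x, y}; the doc id is the array index.
  static Result search(double[][] docs, double qx, double qy, double r) {
    Result res = new Result();
    for (int id = 0; id < docs.length; id++) {
      double x = docs[id][0], y = docs[id][1];
      Relation rel = relate((int) Math.floor(x), (int) Math.floor(y), qx, qy, r);
      if (rel == Relation.WITHIN) {
        res.hits.add(id); // confirmed hit: no exact check needed
      } else if (rel == Relation.STRADDLES) {
        res.exactChecks++; // edge cell: verify against the exact geometry
        if (dist2(x, y, qx, qy) <= r * r) res.hits.add(id);
      }
    }
    return res;
  }

  public static void main(String[] args) {
    double[][] docs = { {5.5, 5.5}, {7.2, 6.9}, {7.9, 6.9}, {9.5, 9.5} };
    Result res = search(docs, 5, 5, 3);
    System.out.println(res.hits + " after " + res.exactChecks + " exact checks");
  }
}
```

With the four sample points and a radius-3 circle centered at (5, 5), only the 
two documents in the straddling cell go through the exact check; the first 
document is accepted from its cell alone and the last is rejected the same way. 
When exactChecks stays at zero, every hit was confirmed by its cell, which is 
the case the "all hits are exact" fast path targets.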



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

