David Smiley created LUCENE-5779:
------------------------------------

             Summary: Improve BBox AreaSimilarity algorithm to consider lines and points
                 Key: LUCENE-5779
                 URL: https://issues.apache.org/jira/browse/LUCENE-5779
             Project: Lucene - Core
          Issue Type: Improvement
          Components: modules/spatial
            Reporter: David Smiley


GeoPortal's area overlap algorithm didn't consider lines and points; they end
up driving the score to 0.  I've thought about this for a bit and come up with
an alternative scoring algorithm (already coded, tested, and documented).
New Javadocs:
{code:java}
/**
 * The algorithm is implemented as envelope-on-envelope overlays rather than
 * complex polygon-on-complex-polygon overlays.
 * <p/>
 * Spatial relevance scoring algorithm:
 * <DL>
 *   <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
 *   <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene
 *   document)</DD>
 *   <DT>intersectionArea</DT> <DD>the area of the intersection between the
 *   query and target envelopes</DD>
 *   <DT>queryTargetProportion</DT> <DD>A 0-1 factor that divides the score
 *   proportion between query and target.
 *   0.5 divides it evenly.</DD>
 *
 *   <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
 *   <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
 *   <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
 *   <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
 *   <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
 * </DL>
 * Note: The actual computation of queryRatio and targetRatio is more
 * complicated so that it considers points and lines. Lines have the ratio of
 * overlap, and points are either 1.0 or 0.0 depending on whether they
 * intersect or not.
 * <p />
 * Based on Geoportal's
 * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
 *   SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will
 * yield a score of 0 if either a line or point is compared, and it doesn't
 * output a 0-1 normalized score (it multiplies the factors).
 *
 * @lucene.experimental
 */
{code}
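
To make the formula in the Javadoc concrete, here is a minimal standalone sketch of the scoring. It is not the code attached to this issue: the class name BBoxOverlapSketch, the Envelope helper, and the ratio() method are hypothetical names made up for illustration. It assumes plain axis-aligned envelopes with no dateline wrapping, and how it handles degenerate shapes (a zero-area envelope falls back to a length ratio for lines and to 1.0 or 0.0 for points) is only one reading of the note above.

```java
/** Illustrative sketch only; all names here are hypothetical, not Lucene API. */
final class BBoxOverlapSketch {

  /** Axis-aligned envelope. Assumes minX <= maxX and minY <= maxY (no dateline wrap). */
  static final class Envelope {
    final double minX, maxX, minY, maxY;

    Envelope(double minX, double maxX, double minY, double maxY) {
      this.minX = minX;
      this.maxX = maxX;
      this.minY = minY;
      this.maxY = maxY;
    }

    double width() { return maxX - minX; }
    double height() { return maxY - minY; }
  }

  /** Length of the overlap of two 1-D intervals, or 0 if they are disjoint. */
  static double overlap(double aMin, double aMax, double bMin, double bMax) {
    return Math.max(0.0, Math.min(aMax, bMax) - Math.max(aMin, bMin));
  }

  /** True if the envelopes share at least one point (touching counts). */
  static boolean intersects(Envelope a, Envelope b) {
    return a.minX <= b.maxX && b.minX <= a.maxX
        && a.minY <= b.maxY && b.minY <= a.maxY;
  }

  /**
   * Fraction of 'shape' covered by its intersection with 'other'.
   * Assumption: areas use an area ratio, lines a length ratio, and
   * points yield 1.0 when they intersect and 0.0 otherwise.
   */
  static double ratio(Envelope shape, Envelope other) {
    if (!intersects(shape, other)) {
      return 0.0;
    }
    double w = shape.width();
    double h = shape.height();
    double iw = overlap(shape.minX, shape.maxX, other.minX, other.maxX);
    double ih = overlap(shape.minY, shape.maxY, other.minY, other.maxY);
    if (w > 0 && h > 0) {
      return (iw * ih) / (w * h); // regular envelope: area ratio
    }
    if (w > 0) {
      return iw / w;              // horizontal line: length ratio
    }
    if (h > 0) {
      return ih / h;              // vertical line: length ratio
    }
    return 1.0;                   // point that intersects
  }

  /** score = queryRatio * p + targetRatio * (1 - p), with p = queryTargetProportion. */
  static double score(Envelope query, Envelope target, double queryTargetProportion) {
    double queryFactor = ratio(query, target) * queryTargetProportion;
    double targetFactor = ratio(target, query) * (1.0 - queryTargetProportion);
    return queryFactor + targetFactor;
  }
}
```

With queryTargetProportion = 0.5, a 10x10 query that half-overlaps a 10x10 target scores 0.5 in this sketch. A target point inside the query also scores 0.5, since targetRatio is 1.0 while queryRatio stays 0. Because the factors are added rather than multiplied, a degenerate target no longer forces the score to 0, and the result stays within 0-1.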


