[jira] [Commented] (LUCENE-3320) Explore Proximity Scoring

Andrzej Bialecki (JIRA) Fri, 15 Jul 2011 13:59:27 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-3320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066199#comment-13066199
 ]


Andrzej Bialecki  commented on LUCENE-3320:
-------------------------------------------

An interesting concept to consider under this topic is sentence-level proximity 
scoring. It rests on the assumption that co-occurrence of terms within a single 
sentence is often enough to indicate a stronger-than-average association between 
them. When sentence boundaries are known, term positions can therefore be reduced 
to sentence numbers (i.e. all postings from the same sentence share one position, 
which is the sentence number).
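
To make the idea concrete, here is a minimal sketch of collapsing term positions 
to sentence numbers. It is plain Python and not Lucene code: the function names 
(sentence_level_postings, shared_sentences), the regex-based sentence splitter 
and the sample text are all assumptions made for illustration only.

```python
import re
from collections import defaultdict


def sentence_level_postings(text):
    """Map each term to the sorted list of sentence numbers it occurs in.

    Assumption: sentences end at '.', '!' or '?'. A real analyzer would
    use a proper sentence detector instead of this naive split.
    """
    postings = defaultdict(set)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    for sentence_no, sentence in enumerate(sentences):
        for term in re.findall(r"\w+", sentence.lower()):
            # Every occurrence inside this sentence gets the same "position".
            postings[term].add(sentence_no)
    return {term: sorted(nums) for term, nums in postings.items()}


def shared_sentences(postings, term_a, term_b):
    """Count sentences containing both terms (a crude proximity signal)."""
    return len(set(postings.get(term_a, [])) & set(postings.get(term_b, [])))


doc = ("Lucene stores term positions. Positions enable phrase queries. "
       "Lucene is fast.")
postings = sentence_level_postings(doc)
print(postings["lucene"])     # [0, 2]
print(postings["positions"])  # [0, 1]
print(shared_sentences(postings, "lucene", "positions"))  # 1
```

A scorer could, for example, boost documents in which the query terms share at 
least one sentence. How such a signal would be weighted is left open here.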

This is a middle ground between no proximity data (omitPositions) and full 
proximity data. There is some literature indicating that this approach is 
promising: 
http://www.springerlink.com/content/t5355418276v7115 . It is also mentioned in 
the papers on static index pruning.

> Explore Proximity Scoring 
> --------------------------
>
>                 Key: LUCENE-3320
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3320
>             Project: Lucene - Java
>          Issue Type: Sub-task
>          Components: core/search
>    Affects Versions: Positions Branch
>            Reporter: Simon Willnauer
>             Fix For: Positions Branch
>
>
> Positions will be first class citizens sooner rather than later. We should 
> explore proximity scoring possibilities as well as collection / scoring 
> algorithms like those proposed on LUCENE-2878 (2 phase collection).
> This paper might provide some basis for actual scoring implementation: 
> http://plg.uwaterloo.ca/~claclark/sigir2006_term_proximity.pdf

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
