Sorry, here's the example I meant to show. Doc 1 and doc 2 both contain the
terms "hey look, the quick brown fox jumped very high", but in Doc 1 all the
terms are indexed at the same position. In doc 2, the terms are indexed in
adjacent positions (normal way). For the query "the quick brown fox", d
> I suppose SpanTermQuery could override the weight/scorer methods so that
> it behaved more like a TermQuery if it was executed directly ... but
> that's really not what it's intended for.

This is currently the only way to boost a term via payloads.
BoostingTermQuery extends SpanTermQuery.
> if
: For a SpanNearQuery that contains SpanTermQueries, the score for a match on
: "the quick brown fox" would be lower than a match on "brown fox" because of
: the edit distance (4 vs 2). This seems counterintuitive, too.
you have to clarify what you mean ...
if you're talking about a SpanNearQu
: For a 'SpanNearQuery', this reduces the effect of the term frequency on the
: score as the number of terms in the span increases. So, for a simple phrase
: query (using spans), the longer the phrase, the lower the TF. For a simple
: SpanTermQuery, the TF is cut in half (1.0f / (1 + 1)).
:
: I
The reason I asked about Span scoring is that the behavior changed when I
switched from TermQuery to BoostingTermQuery to take advantage of payloads.
It seems to me that a SpanTermQuery and BoostingTermQuery should behave the
same as TermQuery with respect to term frequency. The 'edit distance' is
The DefaultSimilarity class defines sloppyFreq as:
  public float sloppyFreq(int distance) {
    return 1.0f / (distance + 1);
  }
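
To make the numbers concrete, here is a small standalone sketch that evaluates the same formula for a few distances. It does not use Lucene at all. The class name SloppyFreqDemo and the chosen distances are made up for illustration, and the method is only a copy of the formula quoted above, not the library class itself.

```java
public class SloppyFreqDemo {
    // Copy of the formula quoted above from DefaultSimilarity.sloppyFreq;
    // this is a standalone illustration, not the Lucene class.
    public static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    public static void main(String[] args) {
        // Illustrative distances only: 1 for a single-term span,
        // 2 and 4 for the two-term and four-term phrases discussed in this thread.
        int[] distances = {0, 1, 2, 4};
        for (int distance : distances) {
            System.out.println("distance=" + distance
                    + " sloppyFreq=" + sloppyFreq(distance));
        }
    }
}
```

With this formula a distance of 1 gives 0.5 and a distance of 4 gives 0.2, so the contribution shrinks as the distance grows.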
For a 'SpanNearQuery', this reduces the effect of the term frequency on the
score as the number of terms in the span increases. So, for a simple phrase
query (using spans),