Re: Document Boost

Erick Erickson Thu, 19 Apr 2007 17:24:22 -0700

I hate to ask this (actually, I don't hate it, but...) "what behavior
of the scoring are you actually finding doesn't fit your needs?".


The reason I ask is that I've been asked to change the scoring,
that is, set boosts, based on some vague notion of "how
things should work" that is often just the fevered imagination
of someone rather than an actual set of search results that are
unsatisfactory.

Of course, a geek's response is "Sure, I can make it do that."
without the concomitant analysis of whether the users would
ever notice.

Anyway, just a rant on implementing features that do no real
good and delay delivery for no good reason.

You may very well have very good reasons to want to fiddle with
boosts, but I thought I'd mention it <G>....

Erick

On 4/19/07, Les Fletcher <[EMAIL PROTECTED]> wrote:


Oooooo.... I like the BAR_significant field idea.  It seems that you'd
have to have one of those for every different level of boosting in your
document.  But that is significantly easier than reforming a query for
30-odd fields.

The next quersion would be should you omit the boosted field words from
the default searchg field to get the "appropriate" score for that
document.  In other words, would matching the query in the
BAR_significant and the default search field return a different score
than just matching the BAR_significant field for the query
"BAR_significant:queryword queryword"? They should return the same set
of results, just curious about how the ranking order might differ.  Not
overly important, more for my own personal curiosity.

Thanks Chris for the quick response to the previous question, it helped
a lot.

Les


Chris Hostetter wrote:

>The full post Erick alluded too may be helpful...
>
>
http://mail-archives.apache.org/mod_mbox/lucene-java-user/200609.mbox/[EMAIL 
PROTECTED]
>
>in general, if your goal is that words in the "metadata" of a document
>should be worth more then words in the "body" then you should have two
>fields: "metadata" and "body", you shoudl query for the same word in both
>fields, and your query should have a query boost on the "metadata" part.
>
>: way it seems to work, is that if you boost a field then you have to
>: actually specify that field in your query to benefit from that field
>
>correct ... you are saying "this documents field_X is better then other
>document's field_X" .. but that info only comes into play if you actually
>use "field_X"
>
>: ignored.  I hacked around this by just adding the field's text to the
>: default search field n times where n is the boost for that field.  I
>: seriously doubt that this the ideal way to do it, but I couldn't figure
>: out how to do it other than reforming all of my queries to search all
>: the fields.
>
>that is a feasible solution to the use case where "this doc's use of the
>word FOO in field BAR is more significant then the use of the word FOO in
>field BAR for any other docs" but 90% of the time those use cases can be
>better solved by making a BAR_significant field that only contains the
>significant words in BAR and querying both.
>
>for the other 10% of the time, you'll have to wait for Payload based
>queries to come along, so when you add FOO to BAR you can give it a
>payload that says how important it is.
>
>(or so the Query Payload people say ... i choose to believe them even
>though i don't understand them)
>
>
>
>-Hoss
>
>
>---------------------------------------------------------------------
>To unsubscribe, e-mail: [EMAIL PROTECTED]
>For additional commands, e-mail: [EMAIL PROTECTED]
>
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Re: Document Boost

Reply via email to