Re: Several questions about scoring/sorting + random sorting in an image/related application

Antoine Baudoux Sun, 17 Jun 2007 09:07:04 -0700

Hi Chris,


I've really only had a chance to skim this thread so far, but if i
understand correctly, the goal is to get documents back in a "blended"
order based on:
  1) textual relevancy to the search input
  2) recentness
  3) a mapping of field values to arbitrary numeric weights which need to
     be specified at query time (ie: score collection:A better than
     collection:C better than collection:Q etc...)


You have perfectly understood my question, thanks for trying to help!

In that case i think a "function query" is the way to go ... I haven't
really had a chance to catch up on the way the Solr FunctionQuery class
morphed when it was adopted into the Lucene core, but i believe all the
relevant pieces are in the org.apache.lucene.search.function package, and
it seems to have some good package level javadocs...

That's what I discovered. The question is: is ValueSourceQuery robust and fast enough to be used confidently in a production environment? I looked at the source code and it seems pretty straightforward, so I would say yes, as long as I use the caches correctly. Can you confirm?

http://lucene.zones.apache.org:8080/hudson/job/Lucene-Nightly/javadoc/org/apache/lucene/search/function/package-summary.html
You seemed to be on the right track asking about ValueSourceQuery ... but
that's only part of the puzzle: for the "recentness" aspect a
ValueSourceQuery composed on a ReverseOrdFieldSource should take care of
things ... but the arbitrary weighting by "collection" will really require
you to provide your own ValueSource implementation -- most likely you'll
want to leverage the FieldCache, but map your "collectionIds" (whatever
they are) to the numeric values you want to use.
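
The mapping described above can be sketched in plain Java, without any Lucene classes. This is only an illustration of the lookup a custom ValueSource would perform: the class name CollectionWeights, the method weightsByDoc, and the default-weight parameter are hypothetical, and the String[] argument merely stands in for whatever per-document field values the FieldCache would supply.

```java
import java.util.Map;

// Hypothetical helper, not part of Lucene: turns per-document collection ids
// into per-document numeric weights chosen at query time.
final class CollectionWeights {

    // collectionByDoc[docId] is the collection id of that document (assumed to
    // come from a field cache); unknown or missing ids get defaultWeight.
    static float[] weightsByDoc(String[] collectionByDoc,
                                Map<String, Float> weightByCollection,
                                float defaultWeight) {
        float[] weights = new float[collectionByDoc.length];
        for (int doc = 0; doc < weights.length; doc++) {
            Float w = weightByCollection.get(collectionByDoc[doc]);
            weights[doc] = (w != null) ? w : defaultWeight;
        }
        return weights;
    }
}
```

A real ValueSource would return the precomputed weight for each document id; building the array once per reader and per weight map keeps the per-hit cost to a single array lookup.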

then you'll have all the pieces, the only thing left to do will be to
decide if you want to combine them with a regular BooleanQuery or use a
CustomScoreQuery.
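
The difference between the two ways of combining can be sketched as plain arithmetic. This assumes, roughly, that a BooleanQuery adds up the scores of its matching clauses (ignoring coord and normalization factors) and that CustomScoreQuery by default multiplies the text score by the value-source score; the class BlendedScore and its method names are hypothetical and do not call Lucene.

```java
// Hypothetical illustration of the two blending styles; no Lucene calls.
final class BlendedScore {

    // Additive blend: each component contributes independently, so a document
    // with a weak text match can still rank well on recency or collection weight.
    static float additive(float textScore, float recency, float collectionWeight) {
        return textScore + recency + collectionWeight;
    }

    // Multiplicative blend: the boost scales the text score, so a document
    // with a zero text score stays at zero regardless of the boost.
    static float multiplicative(float textScore, float boost) {
        return textScore * boost;
    }
}
```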

Yes, I will have to implement my own ValueSource, but it seems it's really not complicated, looking at the existing ValueSource implementations.

As for your comments about "random scoring" ... this is really, Really,
REALLY hard to get "right" for a variety of reasons that i don't really
want to go into right now ... my advice: don't attempt to commit to
"random" ordering.   Instead commit to promoting N randomly selected
documents to the front of the results ... this is easy to do by writing a
custom query (again ValueSourceQuery can probably help you) where you
pick N random numbers between 0 and maxDoc and score them really high ...
then let the rest of the docs score as they normally would.
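
A minimal sketch of the "promote N random documents" idea, written in plain Java rather than as a Lucene query. The class RandomPromotion, its method names, the seed parameter, and the promotionBoost value are all assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Random;
import java.util.Set;

// Hypothetical helper, not part of Lucene.
final class RandomPromotion {

    // Picks up to n distinct document ids in [0, maxDoc). A fixed seed makes
    // the selection repeatable, e.g. stable for the lifetime of one searcher.
    static Set<Integer> pickPromoted(int maxDoc, int n, long seed) {
        Random rnd = new Random(seed);
        Set<Integer> picked = new HashSet<>();
        int target = Math.min(n, maxDoc);
        while (picked.size() < target) {
            picked.add(rnd.nextInt(maxDoc));
        }
        return picked;
    }

    // Promoted documents get a large constant added; all others keep their
    // normal score and therefore their normal relative order.
    static float score(int docId, float normalScore,
                       Set<Integer> promoted, float promotionBoost) {
        return promoted.contains(docId) ? normalScore + promotionBoost : normalScore;
    }
}
```

Note that this sketch ignores deleted documents and documents that do not match the user's query, both of which a real implementation would have to handle.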

What's wrong with this idea:

Each day I generate and shuffle a vector of maxDoc integers, from 0 to maxDoc.

Then I use a ValueSourceQuery with a ValueSource that uses this vector to randomly score the documents. Of course I have to somehow normalize those random scores so that their "contribution factor" remains constant when maxDoc increases.
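
The daily shuffle with normalization could look roughly like the following plain-Java sketch. The class DailyShuffleScores, the daySeed parameter, and the choice of dividing the shuffled rank by maxDoc are assumptions for illustration, not something taken from Lucene.

```java
import java.util.Random;

// Hypothetical helper, not part of Lucene.
final class DailyShuffleScores {

    // Returns one pseudo-random score per document id, in the range (0, 1].
    // The same daySeed always yields the same scores, so the order is stable
    // within a day and changes when the seed changes.
    static float[] scores(int maxDoc, long daySeed) {
        int[] perm = new int[maxDoc];
        for (int i = 0; i < maxDoc; i++) {
            perm[i] = i;
        }
        // Fisher-Yates shuffle driven by the seeded generator.
        Random rnd = new Random(daySeed);
        for (int i = maxDoc - 1; i > 0; i--) {
            int j = rnd.nextInt(i + 1);
            int tmp = perm[i];
            perm[i] = perm[j];
            perm[j] = tmp;
        }
        // Dividing by maxDoc keeps every score in (0, 1] however large the
        // index grows, so the random component's weight stays comparable.
        float[] result = new float[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            result[doc] = (perm[doc] + 1) / (float) maxDoc;
        }
        return result;
    }
}
```

One possible seed is the day number, for example java.time.LocalDate.now().toEpochDay(). Because the scores are a permutation of 1/maxDoc ... 1, their mean stays close to 0.5 as maxDoc increases.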



Thanks for your advice!


Antoine

