Re: Scoring formula - Average number of terms in IDF

2009-12-18 Thread Michael McCandless
>>>> re: https://issues.apache.org/jira/browse/LUCENE-2091 about how Lucene could track avg field/doc length, but they are just brainstorming type discussions now. You could always

Re: Scoring formula - Average number of terms in IDF

2009-12-18 Thread kdev
>>> just brainstorming type discussions now. You could always do something approximate outside of Lucene? E.g., make a TokenFilter that counts how many tokens are produced for each field/doc, aggregate & store that yourself, and use it in
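The approach suggested in the quoted reply (count tokens per field/doc, aggregate, store the result yourself) could be sketched roughly as follows. This is a Lucene-free Python illustration, not Lucene's API: `CountingTokenStream` and `FieldLengthStats` are hypothetical names, and whitespace splitting stands in for a real analyzer.

```python
from collections import defaultdict


class CountingTokenStream:
    """Wraps any iterable of tokens and counts them as they pass through.

    Hypothetical stand-in for a counting TokenFilter; not a Lucene class.
    """

    def __init__(self, tokens):
        self._tokens = tokens
        self.count = 0

    def __iter__(self):
        for token in self._tokens:
            self.count += 1
            yield token


class FieldLengthStats:
    """Aggregates per-field token counts so an average length can be derived."""

    def __init__(self):
        self._total_tokens = defaultdict(int)
        self._doc_count = defaultdict(int)

    def record(self, field, token_count):
        self._total_tokens[field] += token_count
        self._doc_count[field] += 1

    def average_length(self, field):
        docs = self._doc_count.get(field, 0)
        return self._total_tokens[field] / docs if docs else 0.0


# Assumed toy corpus; whitespace splitting replaces a real analyzer.
stats = FieldLengthStats()
for text in ["a b c", "a b c d e"]:
    stream = CountingTokenStream(text.split())
    indexed_tokens = list(stream)  # consuming the stream drives the counter
    stats.record("body", stream.count)

average_body_length = stats.average_length("body")
```

The stored average would then have to be passed to a custom similarity by whatever means the application chooses; the thread does not say how.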

Re: Scoring formula - Average number of terms in IDF

2009-12-17 Thread Michael McCandless
>> TokenFilter that counts how many tokens are produced for each field/doc, aggregate & store that yourself, and use it in your similarity impl? >> Mike >> On Tue, Dec 15, 2009 at 5:04 AM, kdev wrote: >>> any ideas please? >>> --

Re: Scoring formula - Average number of terms in IDF

2009-12-17 Thread kdev
> similarity impl? > Mike > On Tue, Dec 15, 2009 at 5:04 AM, kdev wrote: >> any ideas please? >> -- >> View this message in context: >> http://old.nabble.com/Scoring-formula---Average-number-of-terms-in-IDF

Re: Scoring formula - Average number of terms in IDF

2009-12-17 Thread Michael McCandless
how many tokens are produced for each field/doc, aggregate & store that yourself, and use it in your similarity impl? Mike On Tue, Dec 15, 2009 at 5:04 AM, kdev wrote: > any ideas please? > -- > View this message in context: > http://old.nabble.com/Scoring-formula---Average

Re: Scoring formula - Average number of terms in IDF

2009-12-15 Thread kdev
any ideas please? -- View this message in context: http://old.nabble.com/Scoring-formula---Average-number-of-terms-in-IDF-tp26282578p26792364.html Sent from the Lucene - Java Users mailing list archive at Nabble.com. - To

Scoring formula - Average number of terms in IDF

2009-11-10 Thread kdev
Hi, I want to change the default scoring formula of Lucene, and one of the changes I want to make is to the idf term. What I want to do is include the average number of terms of the documents indexed in the collection in the idf method of the Similarity class. In order to change the

Re: changing scoring formula

2008-03-08 Thread John Wang
> time for a given query the results will be sorted according to "lucene scoring formula + an equation". > How can I do that? I saw the Lucene scoring page but I am not getting exactly how to do that. > Please advise me. > -- > View this message in context:

Re: changing scoring formula

2008-03-05 Thread Michael Stoppelman
> a given query the results will be sorted according to "lucene scoring formula + an equation". > How can I do that? I saw the Lucene scoring page but I am not getting exactly how to do that. > Please advise me. > -- > View this message in context: > http://www.nab

changing scoring formula

2008-03-05 Thread sumittyagi
Is there any way to change the score of the documents? Actually, I want to modify the scores of the documents dynamically: every time, for a given query, the results will be sorted according to "lucene scoring formula + an equation". How can I do that? I saw the Lucene scoring page bu

Re: Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-12 Thread Doron Cohen
"Karl Koch" <[EMAIL PROTECTED]> wrote: > For the documents Lucene employs > its norm_d_t which is explained as: > > norm_d_t : square root of number of tokens in d in the same field as t Actually (by default) it is: 1 / sqrt(#tokens in d with same field as t) > basically just the square root

Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-12 Thread Soeren Pekrul
Hello Karl, I’m very interested in the details of Lucene’s scoring as well. Karl Koch wrote: For this reason, I do not understand why Lucene (in version 1.2) normalises the query(!) with norm_q : sqrt(sum_t((tf_q*idf_t)^2)) which is also called cosine normalisation. This is a technique that
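As a small illustration of the cosine normalisation quoted above, norm_q : sqrt(sum_t((tf_q*idf_t)^2)), the sketch below divides each query term weight by norm_q, which leaves a weight vector of length 1. The function name and the input weights are hypothetical; how tf_q and idf_t themselves are computed is not covered here.

```python
import math


def cosine_normalise(weights):
    """Divide each weight by norm_q = sqrt(sum of squared weights)."""
    norm_q = math.sqrt(sum(w ** 2 for w in weights))
    if norm_q == 0.0:
        # Nothing to normalise when every weight is zero.
        return list(weights)
    return [w / norm_q for w in weights]


# Hypothetical tf_q * idf_t values for a two-term query.
query_weights = [3.0, 4.0]
normalised = cosine_normalise(query_weights)
unit_length = math.sqrt(sum(w ** 2 for w in normalised))
```

Because every weight is divided by the same constant, normalising the query this way cannot change the relative order of documents for that query.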

Re: Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-12 Thread Karl Koch
Subject: Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed) > > Well, it doesn't, since there is no justification of why it is the way it is. It's like saying, here is that car with 5 wheels... enjoy driving. > > - I think the e

Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-11 Thread Doron Cohen
> Well, it doesn't, since there is no justification of why it is the way it is. It's like saying, here is that car with 5 wheels... enjoy driving. > > - I think the explanations there would also answer at least some of your questions. I hoped it would answer *some* of the questions... (not al

Re: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-11 Thread Karl Koch
.apache.org Subject: Re: Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed) > [EMAIL PROTECTED] wrote: > > According to these sources, the Lucene scoring formula in version 1.2 is: > > score(q,d) = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t /

Questions about Lucene scoring (was: Lucene 1.2 - scoring formula needed)

2006-12-09 Thread TheRanger
/200307.mbox/[EMAIL PROTECTED] ). According to these sources, the Lucene scoring formula in version 1.2 is: score(q,d) = sum_t(tf_q * idf_t / norm_q * tf_d * idf_t / norm_d_t * boost_t) * coord_q_d where * score (q,d) : score for document d given query q * sum_t : sum for all terms t in
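The formula as quoted can be transcribed into a short Python sketch. This only restates the arithmetic of the quoted expression; it is not taken from Lucene's source. The per-term inputs (tf_q, idf_t, tf_d, norm_d_t, boost_t) are assumed to be supplied by the caller, and norm_q is computed as sqrt(sum_t((tf_q*idf_t)^2)), the definition quoted elsewhere in this thread.

```python
import math


def score(terms, coord_q_d):
    """score(q,d) = sum_t(tf_q*idf_t/norm_q * tf_d*idf_t/norm_d_t * boost_t) * coord_q_d

    `terms` is a list of dicts, one per query term, holding the
    precomputed factors named in the quoted formula.
    """
    norm_q = math.sqrt(sum((t["tf_q"] * t["idf_t"]) ** 2 for t in terms))
    if norm_q == 0.0:
        return 0.0  # no query term carries any weight

    total = 0.0
    for t in terms:
        query_weight = t["tf_q"] * t["idf_t"] / norm_q
        doc_weight = t["tf_d"] * t["idf_t"] / t["norm_d_t"]
        total += query_weight * doc_weight * t["boost_t"]
    return total * coord_q_d


# Hypothetical single-term query with made-up factor values.
example_terms = [
    {"tf_q": 1.0, "idf_t": 2.0, "tf_d": 3.0, "norm_d_t": 2.0, "boost_t": 1.0},
]
full_match = score(example_terms, coord_q_d=1.0)
half_match = score(example_terms, coord_q_d=0.5)
```

With a single term the query weight is always 1 after normalisation, so the example score reduces to tf_d * idf_t / norm_d_t * boost_t * coord_q_d.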

Re: scoring formula

2006-08-04 Thread Zhao, Xin
Hi, Erik, What do you think about the difference? Thank you very much for your reply, Xin - Original Message - From: "Erik Hatcher" <[EMAIL PROTECTED]> To: Sent: Wednesday, August 02, 2006 3:56 PM Subject: Re: scoring formula Please disregard my previous quick re

Re: scoring formula

2006-08-02 Thread Erik Hatcher
Please disregard my previous quick reply as I did not fully read your message before replying. *ugh* Erik On Aug 2, 2006, at 2:32 PM, Zhao, Xin wrote: Hi, I noticed the scoring formula in the errata of book "Lucene in Action" is a little different from the one in

Re: scoring formula

2006-08-02 Thread Erik Hatcher
Xin, You're correct. This was noted as an erratum here: <http://www.lucenebook.com/blog/errata/2005/01/24/scoring_formula_omission.html> All other known errata are here: <http://www.lucenebook.com/blog/errata/> (and searchable via Lucene, as in <http://lucenebook.com/search?quer

scoring formula

2006-08-02 Thread Zhao, Xin
Hi, I noticed the scoring formula in the errata of book "Lucene in Action" is a little different from the one in Javadoc. I enclosed the one in Javadoc at the end of email. getBoost(t in q) is in Javadoc's formula (which I assume is the correct one), but not in "lucene in a

Re: Scoring formula

2005-11-05 Thread Otis Gospodnetic
case? > Karl > > --- Original Message --- > > From: Yonik Seeley <[EMAIL PROTECTED]> > > To: java-user@lucene.apache.org > > Subject: Re: Scoring formula > > Date: Sat, 5 Nov 2005 17:49:40 -0500 > > Lucene 1.2 is before my time, but check if

Re: Scoring formula

2005-11-05 Thread Otis Gospodnetic
s the score is always between 0.0 and 1.0 (without any boosting)... Is this the case? > Karl > > --- Original Message --- > > From: Otis Gospodnetic <[EMAIL PROTECTED]> > > To: java-user@lucene.apache.org > > Subject: Re: Scoring formula

Re: Scoring formula

2005-11-05 Thread Karl Koch
I always thought that a Lucene search always returns a Hits object. On what occasion would this not be the case? Karl > --- Original Message --- > From: Yonik Seeley <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: Re: Scoring formula > Date: Sat

Re: Scoring formula

2005-11-05 Thread Yonik Seeley
Lucene 1.2 is before my time, but check if the functions are implemented the same as the current version (they probably are). Scores are not naturally <= 1, but for most search methods (including all that return Hits) they are normalized to be between 1 and 0 if the highest score is greater than 1
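The normalisation described here (raw scores are scaled into the range 0 to 1 only when the highest score exceeds 1) might look like the following sketch. The function name is hypothetical and this is not Lucene's Hits code; it only mirrors the rule as stated in the message.

```python
def normalize_scores(raw_scores):
    """Scale scores by the top score, but only if the top score is above 1."""
    if not raw_scores:
        return []
    top = max(raw_scores)
    if top <= 1.0:
        # Already within range: leave the raw scores untouched.
        return list(raw_scores)
    return [s / top for s in raw_scores]


scaled = normalize_scores([4.0, 2.0, 1.0])
untouched = normalize_scores([0.8, 0.4])
```

One consequence of this rule is that normalised scores are only comparable within a single result list, since the divisor differs from query to query.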

Re: Scoring formula

2005-11-05 Thread Karl Koch
> --- Original Message --- > From: Otis Gospodnetic <[EMAIL PROTECTED]> > To: java-user@lucene.apache.org > Subject: Re: Scoring formula > Date: Fri, 4 Nov 2005 12:12:52 -0800 (PST) > > The formula should also be in the javadoc for Similarity class, if it was there in 1.2.

Re: Scoring formula

2005-11-04 Thread Otis Gospodnetic
The formula should also be in the javadoc for Similarity class, if it was there in 1.2. Otis --- Karl Koch <[EMAIL PROTECTED]> wrote: > Hello group, > the scoring formula for Lucene is well explained in "Lucene in Action". > However, is this formula also v

Scoring formula

2005-11-04 Thread Karl Koch
Hello group, the scoring formula for Lucene is well explained in "Lucene in Action". However, is this formula also valid for Lucene 1.2 (which I am using)? I need to know that for documentation purposes. If not, where can I find the correct formula since I do not want to interpret i