I think there has been enough interest in this in the past that a patch exposing the wi information would probably be useful to others, if you can put one together. (I am not saying it would be committed, as I can't speak for the committers on the project.)
>>> [EMAIL PROTECTED] 6/3/2005 8:19:16 AM >>>
Thanks for the reply. It looks like I can use parts of Similarity. I'll post back once I get it working or at least closer ;-)

Andrew

-----Original Message-----
From: Grant Ingersoll <[EMAIL PROTECTED]>
Sent: Jun 3, 2005 6:51 AM
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi * IDFi for each document.

I think the TermFreqVector (reader.getTermVector) has the info you want per document. You will need to sort it by frequency to get the top terms in each document. It doesn't give you the wi, just tfi, but the whole score is implied by the fact that you have the top 10 documents, I think.

-Grant

>>> [EMAIL PROTECTED] 6/2/2005 3:21:35 PM >>>
Ok. So if I get 10 Documents back from a search and I want to get the top 5 weighted terms for each of the 10 documents what API call should I use? I'm unable to find the connection between Similarity and a Document. I know I'm missing the elephant that must be in the middle of the room. Or maybe it's not there. Is what I'm trying to do do-able?

Thanks,

Andrew

-----Original Message-----
From: Max Pfingsthorn <[EMAIL PROTECTED]>
Sent: Jun 2, 2005 5:33 AM
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi * IDFi for each document.

Hi,

DefaultSimilarity uses exactly this weighting scheme. Makes sense since it's a pretty standard relevance measure...

Bye!
max

-----Original Message-----
From: Andrew Boyd [mailto:[EMAIL PROTECTED]
Sent: Thursday, June 02, 2005 11:39
To: java-user@lucene.apache.org
Subject: calculate wi = tfi * IDFi for each document.

If I have search results how can I calculate, using lucene's API, wi = tfi * IDFi for each document.

wi = term weight
tfi = term frequency in a document
IDFi = inverse document frequency = log(D/dfi)
dfi = document frequency or number of documents containing term i
D = number of documents in my search result

Thanks,

Andrew

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Andrew Boyd
Software Architect
Sun Certified J2EE Architect
B&B Technical Services Inc.
205.422.2557
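
For anyone following along, the weighting defined in the original question can be sketched in plain Java without touching the index. This is only an illustration under assumptions, not Lucene's own scoring and not a proposed patch: the class TermWeights and its methods are hypothetical names, each hit is assumed to be already reduced to a term -> frequency map (for example from the term vector Grant mentions), and D is taken to be the number of documents in the result set, as in the question. As far as I recall, DefaultSimilarity dampens tf with a square root and smooths idf, so its numbers would not match this sketch exactly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Lucene.
final class TermWeights {

    // dfi: number of documents in the result set that contain term i.
    static Map<String, Integer> documentFrequencies(List<Map<String, Integer>> docs) {
        Map<String, Integer> df = new HashMap<>();
        for (Map<String, Integer> tf : docs) {
            for (String term : tf.keySet()) {
                df.merge(term, 1, Integer::sum);
            }
        }
        return df;
    }

    // wi = tfi * log(D / dfi) for every term of one document.
    static Map<String, Double> weights(Map<String, Integer> tf,
                                       Map<String, Integer> df,
                                       int numDocs) {
        Map<String, Double> w = new HashMap<>();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            double idf = Math.log((double) numDocs / df.get(e.getKey()));
            w.put(e.getKey(), e.getValue() * idf);
        }
        return w;
    }

    // The n heaviest terms of one document, heaviest first; ties broken alphabetically.
    static List<String> topTerms(Map<String, Double> weights, int n) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(weights.entrySet());
        entries.sort((a, b) -> {
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(n, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }

    public static void main(String[] args) {
        // Made-up term frequencies standing in for two search hits.
        Map<String, Integer> doc0 = new HashMap<>();
        doc0.put("lucene", 3);
        doc0.put("search", 1);
        Map<String, Integer> doc1 = new HashMap<>();
        doc1.put("lucene", 1);
        doc1.put("index", 2);

        List<Map<String, Integer>> docs = Arrays.asList(doc0, doc1);
        Map<String, Integer> df = documentFrequencies(docs);
        for (Map<String, Integer> tf : docs) {
            // Top 5 weighted terms for each document in the result set.
            System.out.println(topTerms(weights(tf, df, docs.size()), 5));
        }
    }
}
```

One consequence of taking D from the result set: a term that occurs in every hit (typically the query term itself) gets log(D/dfi) = log(1) = 0, so it drops to the bottom of each document's list. If that is not what you want, D and dfi would have to come from the whole index instead.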