Thanks for bearing with me, Max. I do understand that the hits come back sorted by descending score after their Similarity has been computed relative to the query vector. What I was hoping to do was use the built-in functionality of Lucene to calculate some term weights, specifically wi = tfi * IDFi.
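To make the target quantity concrete, here is a minimal sketch of the weight itself in plain Java, with no Lucene calls. It assumes the per-document term frequencies (tfi) and the document frequencies (dfi) have already been collected into maps. The class name TermWeights and its methods are hypothetical, not part of Lucene, and the natural logarithm is an assumption since the formula does not name a base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not a Lucene class): wi = tfi * IDFi, with IDFi = log(D / dfi). */
final class TermWeights {

    /** IDFi = log(D / dfi). Natural log is assumed here. */
    static double idf(int numDocs, int docFreq) {
        if (numDocs <= 0 || docFreq <= 0) {
            throw new IllegalArgumentException("numDocs and docFreq must be positive");
        }
        return Math.log((double) numDocs / docFreq);
    }

    /**
     * Computes wi for every term of one document.
     *
     * @param termFreqs term -> tfi for this document
     * @param docFreqs  term -> dfi, the number of the D documents containing the term
     * @param numDocs   D, the number of documents considered
     */
    static Map<String, Double> weights(Map<String, Integer> termFreqs,
                                       Map<String, Integer> docFreqs,
                                       int numDocs) {
        Map<String, Double> result = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            Integer df = docFreqs.get(e.getKey());
            if (df == null) {
                throw new IllegalArgumentException("no document frequency for " + e.getKey());
            }
            // wi = tfi * IDFi
            result.put(e.getKey(), e.getValue() * idf(numDocs, df));
        }
        return result;
    }
}
```

With D = 10, a term that occurs 3 times in a document and appears in 2 documents gets a weight of 3 * log(5), while a term that appears in all 10 documents gets a weight of 0.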
Assuming I had Hits, I was hoping to do something like this:

    for (int idx = 0; idx < hits.length(); idx++) {
        int id = hits.id(idx);
        TermFreqVector[] termFreqVec = indexReader.getTermFreqVectors(id);
        // Using the termFreqVec, calculate the wi for each term in that document.
        // (Similarity.wi is the kind of method I was hoping to find.)
        for (termFreqVec) {
            TermWeight wi = Similarity.wi(termFreqVec[], termFreqVec.length);
            ...
        }
    }

Andrew

-----Original Message-----
From: Max Pfingsthorn <[EMAIL PROTECTED]>
Sent: Jun 3, 2005 4:13 AM
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi * IDFi for each document.

Hi,

When IndexSearcher.search gives you a Hits object back, all results are already sorted by their score, which is computed internally using the Similarity. You can access it via Hits.score(n) (see http://lucene.apache.org/java/docs/api/org/apache/lucene/search/Hits.html).

This is also shown in the demo org.apache.lucene.demo.SearchFiles from SVN (see http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/demo/org/apache/lucene/demo/SearchFiles.java?rev=150739&view=markup).

Hope that helps.
max

-----Original Message-----
From: Andrew Boyd [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 02, 2005 21:22
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi * IDFi for each document.

Ok. So if I get 10 Documents back from a search and I want to get the top 5 weighted terms for each of the 10 documents, what API call should I use? I'm unable to find the connection between Similarity and a Document. I know I'm missing the elephant that must be in the middle of the room. Or maybe it's not there. Is what I'm trying to do doable?

Thanks,
Andrew

-----Original Message-----
From: Max Pfingsthorn <[EMAIL PROTECTED]>
Sent: Jun 2, 2005 5:33 AM
To: java-user@lucene.apache.org
Subject: RE: calculate wi = tfi * IDFi for each document.

Hi,

DefaultSimilarity uses exactly this weighting scheme. Makes sense, since it's a pretty standard relevance measure...

Bye!
max

-----Original Message-----
From: Andrew Boyd [mailto:[EMAIL PROTECTED]]
Sent: Thursday, June 02, 2005 11:39
To: java-user@lucene.apache.org
Subject: calculate wi = tfi * IDFi for each document.

If I have search results, how can I calculate, using Lucene's API, wi = tfi * IDFi for each document?

wi = term weight
tfi = term frequency in a document
IDFi = inverse document frequency = log(D/dfi)
dfi = document frequency, or number of documents containing term i
D = number of documents in my search result

Thanks,
Andrew

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Andrew Boyd
Software Architect
Sun Certified J2EE Architect
B&B Technical Services Inc.
205.422.2557
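
For the "top 5 weighted terms for each document" step asked about in the thread, one possible sketch is below. It assumes the wi values for a single document are already available as a map from term to weight. The class TopTerms and the method topK are hypothetical names for illustration and are not Lucene API; ties are broken alphabetically, which is an arbitrary choice.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper (not a Lucene class): picks the k highest-weighted terms of one document. */
final class TopTerms {

    /** Returns up to k terms, ordered by descending weight, then alphabetically on ties. */
    static List<String> topK(Map<String, Double> weights, int k) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(weights.entrySet());
        entries.sort((a, b) -> {
            // Higher weight first.
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, entries.size()); i++) {
            top.add(entries.get(i).getKey());
        }
        return top;
    }
}
```

Calling topK(weights, 5) once per hit would give the five heaviest terms for each of the returned documents; a document with fewer than five terms simply yields a shorter list.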