Hi Grant Ingersoll and everybody.

> The Term Vector code can be used to get the term frequencies from a
> specific document. Search this list, see the Lucene In Action book or
> look at http://www.cnlp.org/apachecon2005 for examples on how to use
> Term Vectors

Maybe I didn't explain my question well. Below is the code we are using now. We were wondering whether Lucene could give us more information (for example, the maximum term frequency in a document) so that we could optimize the calculations.

The first method starts the Tf/Idf calculation using the class TTfIdf, whose constructor is shown after it.

    public TTfIdf getFieldTfIdf(long tid, long aid, String field)
            throws RisorseMultipleException, IOException,
                   RisorsaNonTrovataException, TTfIdfException {
        reader = IndexReader.open(indexDir);
        int id = getDocumentId(tid, aid);
        TermFreqVector tfv = reader.getTermFreqVector(id, field);
        int[] freqs = tfv.getTermFrequencies();
        String[] terms = tfv.getTerms();
        // Document frequency of each term in the field
        int[] df = new int[terms.length];
        for (int i = 0; i < df.length; i++)
            df[i] = reader.docFreq(new Term(field, terms[i]));
        TTfIdf tfidf = new TTfIdf(terms, freqs, df, reader.numDocs());
        reader.close();
        return tfidf;
    }

    public TTfIdf(String[] terms, int[] freqs, int[] df, int docs)
            throws TTfIdfException {
        if (terms.length != freqs.length || terms.length != df.length)
            // Message: "The term and frequency vectors have different lengths!"
            throw new TTfIdfException("I vettori dei termini e delle frequenze sono di diversa lunghezza!");
        this.terms = terms;
        int l = freqs.length;
        int maxfreq = 0;
        for (int i = 0; i < l; i++) {   // CAN BE OPTIMIZED IN SOME WAY?
            if (freqs[i] > maxfreq)
                maxfreq = freqs[i];
        }
        this.freqs = new double[l];
        double tf;
        double idf;
        for (int i = 0; i < l; i++) {   // CAN BE OPTIMIZED IN SOME WAY?
            tf = (double) freqs[i] / (double) maxfreq;
            idf = Math.log((double) docs / (double) df[i]);
            this.freqs[i] = tf * idf;
        }
    }

Do you have any suggestions?

**** 1000 KBye ****

[) /\ |\| | |_ ()
web: www.ciconet.it
Web Portal Now: www.webportalnow.com
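
To make the arithmetic in the constructor above easier to discuss, here is a minimal standalone sketch of the same weighting, (freq / maxfreq) * ln(docs / df), with no Lucene dependency. The class name TfIdfSketch and the method name weights are hypothetical, chosen only for illustration. The sketch moves the two loop-invariant values (the reciprocal of the maximum frequency and the logarithm of the document count) out of the second loop and guards against an empty or all-zero frequency array. Whether this makes any measurable difference has not been tested here.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Lucene or of the code in this message.
public final class TfIdfSketch {

    private TfIdfSketch() {}

    /**
     * Computes (freqs[i] / maxFreq) * ln(docs / df[i]) for every term.
     * Assumption: every df[i] is at least 1, which holds when the terms
     * come from a document that is actually in the index.
     */
    public static double[] weights(int[] freqs, int[] df, int docs) {
        if (freqs.length != df.length) {
            throw new IllegalArgumentException("freqs and df must have the same length");
        }

        // First pass: the maximum has to be known before any term can be normalized.
        int maxFreq = 0;
        for (int f : freqs) {
            if (f > maxFreq) {
                maxFreq = f;
            }
        }

        double[] result = new double[freqs.length];
        if (maxFreq == 0) {
            return result; // empty or all-zero input: avoid dividing by zero
        }

        // Loop-invariant values, computed once instead of once per term.
        double invMax = 1.0 / maxFreq;
        double logDocs = Math.log(docs);

        // Second pass: ln(docs / df) is rewritten as ln(docs) - ln(df).
        for (int i = 0; i < freqs.length; i++) {
            result[i] = freqs[i] * invMax * (logDocs - Math.log(df[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        // Two terms in a 10-document index; the second term occurs in every document.
        double[] w = weights(new int[] {2, 4}, new int[] {1, 10}, 10);
        System.out.println(Arrays.toString(w));
    }
}
```

The first pass cannot be merged into the second, because each normalized value depends on the maximum over the whole array. Avoiding that pass would require the maximum to be available from somewhere else, which is the information asked about above.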