Re: Max Frequency and Tf/Idf

2006-04-18 Thread karl wettin
On 18 Apr 2006, at 11:45, Danilo Cicognani wrote: The following is the code we are using now. We were considering the possibility of getting more information from Lucene (for example, the maximum term frequency in one document) to optimize the calculations. The first method is the one that starts the ca

Re: Max Frequency and Tf/Idf

2006-04-18 Thread Danilo Cicognani
Hi Grant Ingersoll and everybody.
> The Term Vector code can be used to get the term frequencies from a
> specific document. Search this list, see the Lucene in Action book or
> look at http://www.cnlp.org/apachecon2005 for examples on how to use
> Term Vectors
Maybe I didn't explain well my

Re: Max Frequency and Tf/Idf

2006-04-14 Thread Grant Ingersoll
The Term Vector code can be used to get the term frequencies from a specific document. Search this list, see the Lucene in Action book, or look at http://www.cnlp.org/apachecon2005 for examples on how to use Term Vectors.
Danilo Cicognani wrote: Hello everybody. We are building a complex autom

Max Frequency and Tf/Idf

2006-04-14 Thread Danilo Cicognani
Hello everybody. We are building a complex automatic classification system using Lucene. We need to manage normalized Tf/Idf (Term Frequency / Inverse Document Frequency). We understand that Lucene can give us Tf and Df, and we are using these values to calculate the normalized Tf/Idf, but we would l
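
For readers following the thread, here is a minimal sketch of the kind of calculation being discussed. The messages above are truncated and do not show the poster's actual formula, so this assumes one common variant: the term frequency is divided by the maximum term frequency in the same document, then multiplied by the natural log of (number of documents / document frequency). The class name NormalizedTfIdf and its methods are hypothetical, and the code uses plain Java maps rather than any Lucene API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene and not the code from the thread.
final class NormalizedTfIdf {

    // Assumed formula: (tf / maxTf) * ln(numDocs / df).
    // Returns 0 when any input is non-positive, e.g. for an unknown term.
    static double weight(int tf, int maxTf, int df, int numDocs) {
        if (tf <= 0 || maxTf <= 0 || df <= 0 || numDocs <= 0) {
            return 0.0;
        }
        double normalizedTf = (double) tf / maxTf;
        double idf = Math.log((double) numDocs / df);
        return normalizedTf * idf;
    }

    // The maximum term frequency within one document, i.e. the value the
    // original poster hoped to obtain more directly.
    static int maxFrequency(Map<String, Integer> termFreqs) {
        int max = 0;
        for (int f : termFreqs.values()) {
            max = Math.max(max, f);
        }
        return max;
    }

    // Computes a weight for every term of one document.
    // termFreqs: term -> frequency in this document
    // docFreqs:  term -> number of documents containing the term
    static Map<String, Double> weights(Map<String, Integer> termFreqs,
                                       Map<String, Integer> docFreqs,
                                       int numDocs) {
        int maxTf = maxFrequency(termFreqs);
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Integer> e : termFreqs.entrySet()) {
            int df = docFreqs.getOrDefault(e.getKey(), 0);
            result.put(e.getKey(), weight(e.getValue(), maxTf, df, numDocs));
        }
        return result;
    }
}
```

In this sketch the per-document term frequencies would come from the term vectors mentioned in the reply, and the document frequencies and document count from the index. Finding the maximum frequency needs one pass over the document's terms, which is the extra work the original question hoped to avoid.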