Re: Max Frequency and Tf/Idf

Grant Ingersoll Fri, 14 Apr 2006 04:30:52 -0700

The Term Vector code can be used to get the term frequencies from a specific document. Search this list, see the Lucene in Action book, or look at http://www.cnlp.org/apachecon2005 for examples of how to use Term Vectors.
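
As a rough sketch of the calculation asked about below: once a document's terms and frequencies are available as parallel arrays (in Lucene releases of this period, IndexReader.getTermFreqVector(docNumber, field) returns a TermFreqVector whose getTerms() and getTermFrequencies() have that shape, provided the field was indexed with term vectors), the maximum frequency is a single pass over the array. Everything in the class below is hypothetical and Lucene-free so it runs on its own: the class and method names are made up for illustration, and the weighting (tf / maxTf) * ln(N / df) is an assumed definition of "normalized Tf/Idf", since the thread does not say which variant is wanted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Lucene. It works on plain arrays shaped
// like the output of TermFreqVector.getTerms() / getTermFrequencies().
public class NormalizedTfIdf {

    /** Largest term frequency in one document's frequency array (0 if empty). */
    static int maxFrequency(int[] freqs) {
        int max = 0;
        for (int f : freqs) {
            if (f > max) {
                max = f;
            }
        }
        return max;
    }

    /**
     * Assumed weighting: (tf / maxTf) * ln(numDocs / df).
     * docFreqs[i] is the document frequency of terms[i]; it is expected to be
     * at least 1 because every term here occurs in the document itself.
     */
    static Map<String, Double> weights(String[] terms, int[] freqs,
                                       int[] docFreqs, int numDocs) {
        Map<String, Double> result = new LinkedHashMap<>();
        int maxTf = maxFrequency(freqs);
        if (maxTf == 0) {
            return result; // empty document: nothing to weight
        }
        for (int i = 0; i < terms.length; i++) {
            double tf = (double) freqs[i] / maxTf;
            double idf = Math.log((double) numDocs / docFreqs[i]);
            result.put(terms[i], tf * idf);
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample values, for illustration only.
        String[] terms = {"lucene", "index", "the"};
        int[] freqs = {4, 2, 8};
        int[] docFreqs = {10, 100, 1000};
        Map<String, Double> w = weights(terms, freqs, docFreqs, 1000);
        for (Map.Entry<String, Double> e : w.entrySet()) {
            System.out.println(e.getKey() + " " + e.getValue());
        }
    }
}
```

In a real index the docFreqs and numDocs inputs would presumably come from IndexReader.docFreq(Term) and IndexReader.numDocs(). Computing the maximum from the term vector this way avoids changing the Lucene source, at the cost of storing term vectors for the field.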


Danilo Cicognani wrote:

Hello everybody.
We are building a complex automatic classification system using Lucene.
We need to manage normalized Tf/Idf (Term Frequency / Inverse Document
Frequency).
We understood that Lucene can give us Tf and Df and we are using these
values to calculate the normalized Tf/Idf but we would like to optimize this
calculation for better performance.
Is there any way to expose the maximum term frequency in a document from
Lucene, and maybe to obtain the normalized Tf/Idf from Lucene?
There are no public methods to get these values, but maybe Lucene holds
this information privately, and with a modification to the Lucene source we
could have the work done there to speed up the system.


P.S. Sorry for my English; I hope I explained my question clearly.

**** 1000 KBye ****

 [) /\ |\| | |_ ()

web: www.ciconet.it
Web Portal Now: www.webportalnow.com


---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

--

Grant Ingersoll
Sr. Software Engineer
Center for Natural Language Processing
Syracuse University
School of Information Studies
335 Hinds Hall
Syracuse, NY 13244
http://www.cnlp.org
Voice: 315-443-5484
Fax: 315-443-6886


