Re: Calculating term and document frequency for multiple word terms

Erik Hatcher Mon, 10 Apr 2006 13:20:45 -0700

Have a look at using SpanNearQuery for phrases, and walking the spans (via getSpans, I believe).
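
To make the counting step concrete, here is a minimal, hedged sketch of what "walking the spans" amounts to. It does not call Lucene. SpanCursor is a hypothetical stand-in for the two methods of Lucene's Spans that the loop needs (next() and doc()), and phraseCursor is an in-memory illustration of an exact, in-order phrase match. With Lucene of this era, the cursor would presumably be the Spans returned by getSpans(reader) on a SpanNearQuery built from one SpanTermQuery per word, with slop 0 and inOrder set to true.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the part of Lucene's Spans used below:
// next() advances to the next match, doc() reports that match's document.
interface SpanCursor {
    boolean next();
    int doc();
}

class PhraseStats {

    // In-memory cursor over exact, in-order phrase matches in tokenized
    // documents. Illustration only; a real index would supply the matches.
    static SpanCursor phraseCursor(List<String[]> docs, String[] phrase) {
        final List<Integer> matchDocs = new ArrayList<>();
        for (int d = 0; d < docs.size(); d++) {
            String[] tokens = docs.get(d);
            for (int i = 0; i + phrase.length <= tokens.length; i++) {
                boolean hit = true;
                for (int j = 0; j < phrase.length && hit; j++) {
                    hit = tokens[i + j].equals(phrase[j]);
                }
                if (hit) {
                    matchDocs.add(d); // one entry per phrase occurrence
                }
            }
        }
        return new SpanCursor() {
            private int pos = -1;
            public boolean next() { return ++pos < matchDocs.size(); }
            public int doc() { return matchDocs.get(pos); }
        };
    }

    // Walk the matches once. The number of matches in a document is the
    // phrase's term frequency in that document.
    static Map<Integer, Integer> termFrequencies(SpanCursor spans) {
        Map<Integer, Integer> tf = new LinkedHashMap<>();
        while (spans.next()) {
            tf.merge(spans.doc(), 1, Integer::sum);
        }
        return tf;
    }

    // The number of distinct documents with at least one match is the
    // phrase's document frequency.
    static int documentFrequency(Map<Integer, Integer> tf) {
        return tf.size();
    }
}
```

The same loop shape should carry over to a real index: count matches per doc() for the term frequency, and count distinct documents for the document frequency.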


        Erik



On Apr 10, 2006, at 12:12 PM, Vishal Bathija wrote:

Hi,
I was wondering how I can get the document frequency and term
frequency of a phrase in a corpus. I am currently using:


IndexReader rd = IndexReader.open("C:\\Documents and Settings\\Owner\\My Documents\\Thesis\\luceneTest\\index");
Term t1 = new Term("contents", "\"increases aesthetic\"");
TermDocs tdTest2 = rd.termDocs(t1);
while (tdTest2.next()) {
    System.out.println(tdTest2.freq());
}

This seems to work for a single word term such as "increases", but not
for multiple word terms such as "increases aesthetic".

Any suggestions would be greatly appreciated.


Kind Regards
Vishal Bathija
                
                

--
Vishal Bathija
Graduate Student
Department of Computer Science & Systems Analysis
Miami University
Oxford,Ohio
Phone: (513)-461-9239

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


