Have a look at using SpanNearQuery for phrases, and walking the spans
(via getSpans, I believe).
Erik
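
As a rough illustration of what walking the spans gives you: each span is one match of the phrase in one document, so counting spans per document yields the phrase's term frequency there, and counting the distinct documents that produced at least one span yields its document frequency. The sketch below shows only that counting logic in plain Java over pre-tokenized documents. It does not call Lucene; the class PhraseStats and the method phraseFreqByDoc are hypothetical names invented for this example, and the in-memory token arrays stand in for the positions an index would supply.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: counts exact in-order phrase
// matches per document, which is what iterating over spans would report.
public final class PhraseStats {
    private PhraseStats() {}

    /** Returns docId -> number of exact, in-order occurrences of the phrase. */
    public static Map<Integer, Integer> phraseFreqByDoc(List<String[]> docs, String[] phrase) {
        Map<Integer, Integer> freq = new LinkedHashMap<>();
        if (phrase.length == 0) {
            return freq; // an empty phrase matches nothing
        }
        for (int doc = 0; doc < docs.size(); doc++) {
            String[] tokens = docs.get(doc);
            // Every start position where the whole phrase fits is a candidate "span".
            for (int start = 0; start + phrase.length <= tokens.length; start++) {
                if (matchesAt(tokens, start, phrase)) {
                    freq.merge(doc, 1, Integer::sum);
                }
            }
        }
        return freq;
    }

    // True if the phrase tokens appear consecutively beginning at 'start'.
    private static boolean matchesAt(String[] tokens, int start, String[] phrase) {
        for (int i = 0; i < phrase.length; i++) {
            if (!tokens[start + i].equals(phrase[i])) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String[]> docs = Arrays.asList(
            "design increases aesthetic appeal and increases aesthetic value".split(" "),
            "increases cost".split(" "),
            "it increases aesthetic quality".split(" "));
        Map<Integer, Integer> freq = phraseFreqByDoc(docs, new String[] {"increases", "aesthetic"});
        // Document frequency is the number of documents with at least one match.
        System.out.println("document frequency: " + freq.size());
        for (Map.Entry<Integer, Integer> e : freq.entrySet()) {
            System.out.println("doc " + e.getKey() + " phrase frequency: " + e.getValue());
        }
    }
}
```

Against a real index, the inner loops would presumably be replaced by advancing the spans returned for the SpanNearQuery and reading each span's document number; the tallying into the map would stay the same.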
On Apr 10, 2006, at 12:12 PM, Vishal Bathija wrote:
Hi,
I was wondering how I can get the document frequency and term
frequency of a phrase in a corpus. I am currently using
IndexReader rd = IndexReader.open("C:\\Documents and Settings\\Owner\\My Documents\\Thesis\\luceneTest\\index");
Term t1 = new Term("contents", "\"increases aesthetic\"");
TermDocs tdTest2 = rd.termDocs(t1);
while (tdTest2.next()) {
    System.out.println(tdTest2.freq());
}
This seems to work for a single word term such as "increases", but not
for multiple word terms such as "increases aesthetic".
Any suggestions would be greatly appreciated.
Kind Regards
Vishal Bathija
--
Vishal Bathija
Graduate Student
Department of Computer Science & Systems Analysis
Miami University
Oxford,Ohio
Phone: (513)-461-9239
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------