This may be of interest: http://issues.apache.org/jira/browse/LUCENE-474
Cheers Mark ----- Original Message ---- From: Ryan McKinley <[EMAIL PROTECTED]> To: java-user@lucene.apache.org Sent: Friday, 23 March, 2007 3:25:02 AM Subject: Re: How can I index Phrases in Lucene? Is there any way to find frequent phrases without knowing what you are looking for? I could index "A B C D E" as "A B C", "B C D", "C D E" etc, but that seems kind of clunky particularly if the phrase length is large. Is there any position offset magic that will surface frequent phrases automatically? thanks ryan On 3/22/07, Erick Erickson <[EMAIL PROTECTED]> wrote: > Well, you don't index phrases, it's done for you. You should try > something like the following.... > > Create a SpanNearQuery with your terms. Specify an appropriate > slop (probably 0 assuming you want them all next to each other). > > Now use call getSpans and count <G>... You may have to do > something with overlapping spans, but you'll need to experiment > a bit to understand it. > > Erick > > On 3/22/07, Maryam <[EMAIL PROTECTED]> wrote: > > > > Hi, > > > > I know how to index terms in lucene, now I wanna see > > how can I index phrases like "information retreival" > > in lucene and calculate the number of times that > > phrase has appeared in the document. Is there any way > > to do it in Lucene? > > > > Thanks > > > > > > > > > > ____________________________________________________________________________________ > > It's here! Your new message! > > Get new email alerts with the free Yahoo! Toolbar. > > http://tools.search.yahoo.com/toolbar/features/mail/ > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: [EMAIL PROTECTED] > > For additional commands, e-mail: [EMAIL PROTECTED] > > > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] ___________________________________________________________ New Yahoo! Mail is the ultimate force in competitive emailing. Find out more at the Yahoo! Mail Championships. Plus: play games and win prizes. http://uk.rd.yahoo.com/evt=44106/*http://mail.yahoo.net/uk --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]