EMAIL PROTECTED]>
: Reply-To: java-user@lucene.apache.org
: To: java-user@lucene.apache.org
: Subject: RE: n-gram indexing
:
: Hi Chris,
: The method you suggested is definitely a good solution. However
there is one more reason I would like to do n-gram generation at indexing time.
There might be other ways to do this that I am not aware of. Let me know
what your thoughts are on this. I would really appreciate any suggestions you might
have.
thanks,
Rajesh Munavalli
-----Original Message-----
From: [EMAIL PROTECTED] on behalf of Chris Hostetter
Sent: Fri 7/2
stions like this...
: -----Original Message-----
: Sent: Monday, July 18, 2005 5:11 PM
: To: java-user@lucene.apache.org
: Subject: RE: n-gram indexing
:
:
: Your intuition is right, but i can't think of any reason why you need to
: add the n-grams at indexing time -- or why using phrase q
I was wondering how Lucene's phrase query would work in case of n-gram
indexing. There are two scenarios for position increments while adding
the index for n-grams. For example consider tri-grams of "united states
of america".
Scenario 1:
Index position token
0
Hi Rajeev,
I wrote a filter for generating n-grams a while back; I intended to
use it for statistics, but I guess you can also use it for search. I
also thought of the "boosting effect" you describe when I implemented
it, though I never actually tried whether it works that way.
It's in the Lucene
Quoting Rajesh Munavalli <[EMAIL PROTECTED]>:
> Let me explain a scenario where I would need to add the n-grams at
> indexing time.
I see your point and I do agree. As it stands, Lucene does not innately support
n-gram indexing. However it is not impossible to adapt Lucene to serve
se
: queries. I am not sure if there is a better way to achieve the same
: effect.
:
: Thanks,
:
: Rajesh
:
:
: -----Original Message-----
: From: Andy Roberts [mailto:[EMAIL PROTECTED]
: Sent: Monday, July 18, 2005 5:56 PM
: To: java-user@lucene.apache.org
: Subject: Re: n-gram indexing
:
: On Mo
d without using phrase
> queries. I am not sure if there is a better way to achieve the same
> effect.
>
> Thanks,
>
> Rajesh
>
>
> -----Original Message-----
> From: Andy Roberts [mailto:[EMAIL PROTECTED]
> Sent: Monday, July 18, 2005 5:56 PM
> To: java-user@lucene.apa
On Monday 18 Jul 2005 22:06, Rajesh Munavalli wrote:
> The intuition behind adding n-grams is to boost naturally occurring larger
> phrases versus using phrase queries. For example, if I am searching for
> "united states of america", I want the search results to return the
> documents ordered as follows
e same
: effect.
:
: Thanks,
:
: Rajesh
:
:
: -----Original Message-----
: From: Andy Roberts [mailto:[EMAIL PROTECTED]
: Sent: Monday, July 18, 2005 5:56 PM
: To: java-user@lucene.apache.org
: Subject: Re: n-gram indexing
:
: On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote:
: > At what point
Message-
From: Andy Roberts [mailto:[EMAIL PROTECTED]
Sent: Monday, July 18, 2005 5:56 PM
To: java-user@lucene.apache.org
Subject: Re: n-gram indexing
On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote:
> At what point do I add n-grams? Does the order in which I add n-grams
> affec
On Monday 18 Jul 2005 21:27, Rajesh Munavalli wrote:
> At what point do I add n-grams? Does the order in which I add n-grams
> affect exact phrase queries later? My questions are
>
> (1) Should I add all the 1-grams followed by 2-grams followed by
> 3-grams, etc., sentence by sentence, OR
>
> (2) Add
At what point do I add n-grams? Does the order in which I add n-grams
affect exact phrase queries later? My questions are:
(1) Should I add all the 1-grams followed by 2-grams followed by
3-grams, etc., sentence by sentence, OR
(2) Add all the 1-grams of the entire document first before starting 2-grams
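
For what it's worth, the two orderings in the question can be sketched in plain Python. This is not Lucene code and makes no claim about how Lucene's analyzers behave; the function names (word_ngrams, add_sentence_by_sentence, add_size_by_size), whitespace tokenization, and the choice to keep n-grams inside sentence boundaries are all assumptions made for illustration. It only shows that both orderings emit the same terms in a different sequence.

```python
from typing import List


def word_ngrams(tokens: List[str], n: int) -> List[str]:
    """Return every contiguous word n-gram of exactly size n (hypothetical helper)."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def add_sentence_by_sentence(sentences: List[str], max_n: int) -> List[str]:
    """Option (1): for each sentence, emit its 1-grams, then 2-grams, and so on."""
    terms = []
    for sentence in sentences:
        tokens = sentence.split()  # assumption: plain whitespace tokenization
        for n in range(1, max_n + 1):
            terms.extend(word_ngrams(tokens, n))
    return terms


def add_size_by_size(sentences: List[str], max_n: int) -> List[str]:
    """Option (2): emit all 1-grams of the whole document, then all 2-grams, and so on.

    Assumption: n-grams still do not cross sentence boundaries.
    """
    terms = []
    for n in range(1, max_n + 1):
        for sentence in sentences:
            terms.extend(word_ngrams(sentence.split(), n))
    return terms


if __name__ == "__main__":
    doc = ["united states of america", "new york"]
    print(add_sentence_by_sentence(doc, 3))
    print(add_size_by_size(doc, 3))
```

Both functions return the same multiset of terms, so a term-level match would not see a difference. Phrase queries in Lucene depend on the position assigned to each token, which this sketch does not model, so it cannot by itself answer whether the insertion order matters for exact phrase matching.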
13 matches