Hi Erick, Thanks for the great idea, it's exactly the kind of suggestion I was looking for!
Lucifer On Dec 12, 2007 2:34 PM, Erick Erickson <[EMAIL PROTECTED]> wrote: > I faced a very similar requirement and solved it by indexing multiple > tokens at the same place. For instance, say you're indexing > the word "foxes". Index something like fox$ and foxes at the same > position (see SynonymAnalyzer in Lucene In Action for an example). > You probably MUST index the multiple terms with an increment gap > of 0 (more later). > > Example phrase "red foxes are plentiful" > > Now you have the capability of distinguishing between a stemmed > and unstemmed version of the word and can search for exactly > "red foxes". But if instead you want to search for the stemmed > version, you can search for "red fox$". But "red fox" will NOT match. > > The reason you need to index these with an increment gap of 0 is so > phrase queries work. If you let the gap increment for each token, and > indexed a phrase like "red foxes are plentiful", then did a > proximity search on "red plentiful"~2, it would fail because > you'd have fox$ and foxes each taking up one position. But if > fox$ and foxes both have the same position, it'll work. > > And it's all in the same index, one field, etc. > > Hope this helps > Erick > > On Dec 12, 2007 1:25 PM, Lucifer Hammer <[EMAIL PROTECTED]> wrote: > > > Hi, > > > > We've got a requirement that we need to give our users the ability to > > search on exact phrases within a field, or, if they prefer, they can > match > > on plurals(either via stems, or another plural algorithm). However, the > > cases are mutually exclusive, for example given the following field in > the > > index: > > > > IndexField1: "The quick brown dog jumped over the lazy fox" > > > > If the user chooses an exact phrase search such as "lazy dog jumped", > then > > it'll match, however, if they also choose an exact phrase and search > for: > > "lazy dogs" it shouldn't match. > > > > If the user chooses a plural search, then both of the above searches > > should > > match. > > > > So... the question really is: Can I do this all in one field, or will I > > have to index the data twice, once in a field that has the exact text, > and > > in a second field in which I index the terms, and wordstack stems(or > > plurals). > > > > If it's possible to do this in a single field, that would be much > > preferred... > > > > Thanks for any help! > > Lucifer > > >