Ah, very cool. Thanks for the tip.

-M

On Feb 11, 2008 10:58 AM, Erick Erickson <[EMAIL PROTECTED]> wrote:
> You have to be a bit clever. You can certainly inject the original with an
> increment of 0. See SynonymAnalyzer in Lucene in Action. This will not
> break phrase queries since your two tokens occupy the same position.
>
> But you'll have to do something like add a $ to the original at index time.
> That way, for exact matches you can search on olive$, boosted however
> you want. When you want the stemmed version you can search for olive.
> Or you could add a clause with the unstemmed version boosted. Or
> something like that <G>. Note that whether you add the $ to the stemmed
> or unstemmed version is up to you.
>
> Watch what analyzer you use to be sure it doesn't strip out the special
> symbol.
>
> Best
> Erick
>
> On Feb 11, 2008 12:56 PM, Michael Stoppelman <[EMAIL PROTECTED]> wrote:
>
> > Hi all,
> > I've got an index with tokens that are stemmed. Sometimes I really need to
> > boost the unstemmed version of a query word to get the most relevant
> > documents.
> >
> > Example:
> > Query: [olives].
> >
> > I don't want to match documents with the words: oliver, oliver's, etc...
> >
> > Since I'm stemming when creating the index, is there a way to store both
> > versions (stemmed/unstemmed) with setPositionIncrement()? Is that the
> > correct way to deal with this? I was reading old archives and this didn't
> > seem to be a great decision since it breaks PhraseQuery [1].
> >
> > It seems like it would be useful if, at query scoring time, I could see
> > the original string values of the tokens, in this case at least.
> >
> > Thanks in advance,
> >
> > -M
> >
> > [1] http://www.mail-archive.com/[EMAIL PROTECTED]/msg07416.html
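
For anyone finding this thread later, here is a rough, library-free sketch of the idea Erick describes: emit the stemmed token with a position increment of 1, then the original token (marked with a trailing $) with an increment of 0, so both share a position. It is written in plain Python rather than against Lucene's Java API, so Token, toy_stem, analyze, positions and phrase_match are all hypothetical names made up for illustration, and the stemmer is a toy suffix stripper, not a real one.

```python
from dataclasses import dataclass

EXACT_MARKER = "$"  # assumption: a symbol the real analyzer would not strip


@dataclass(frozen=True)
class Token:
    term: str
    position_increment: int


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer: strips one common suffix."""
    for suffix in ("er's", "es", "er", "s", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


def analyze(text):
    """Emit the stem (increment 1), then the marked original (increment 0)."""
    tokens = []
    for word in text.lower().split():
        tokens.append(Token(toy_stem(word), 1))
        tokens.append(Token(word + EXACT_MARKER, 0))
    return tokens


def positions(tokens):
    """Resolve position increments into (term, absolute position) pairs."""
    pos = -1
    resolved = []
    for token in tokens:
        pos += token.position_increment
        resolved.append((token.term, pos))
    return resolved


def phrase_match(indexed, terms):
    """True if the terms occur at consecutive positions, in order."""
    by_term = {}
    for term, pos in indexed:
        by_term.setdefault(term, set()).add(pos)
    return any(
        all(start + i in by_term.get(t, ()) for i, t in enumerate(terms))
        for start in by_term.get(terms[0], ())
    )


doc_a = positions(analyze("green olives"))
doc_b = positions(analyze("oliver twist"))

# Exact clause: only the document containing the literal word matches.
print(phrase_match(doc_a, ["olives$"]), phrase_match(doc_b, ["olives$"]))
# Stemmed clause: both documents match.
print(phrase_match(doc_a, ["oliv"]), phrase_match(doc_b, ["oliv"]))
# Phrase queries still work because both variants share one position.
print(phrase_match(doc_a, ["green", "oliv"]))
```

In a real query you would presumably combine the two clauses, with the marked (exact) term boosted above the stemmed one, as suggested above. Whether the marker goes on the stemmed or the unstemmed token is a free choice, as long as index-time and query-time analysis agree.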