Thank you, Chris. You have confirmed what I had all but resigned myself to (and you summarized my goal precisely). I am sticking with the two versions of the field and just accepting the fact that the search clients will need to use my custom query parser.
Even if one doesn't get the answer one wants, it's at least comforting to get a definitive answer so that the speculation can stop and the work can get done. thanks again, -- Robert On Mon, 14 Aug 2006, Chris Hostetter wrote:
Therre's a lot of information in your email, and a lot of questions that relate to similar topics and address different ways of acomplishing similar but different things ... too much for me to digest all at once, so lemme start by seeing if i can summarize your goal, and then give you my suggestion based on the goal as i see it... You want simple term matches to be "stemmed" but you want phrase ueries to be "unstemmed" so if i user queries for the word... jumped ...you want that to match any of the words: jump, jumps, jumped, etc... if a user queries for... "the dogs" ...you want that to only match the exact phrase and not something with the tokens "the dog" you want these ideas to work, even if phrases and terms are mixed in the users query... foo:jumped bar:"the dogs" My first though is that you kepe using two versions of hte field (one stemmed and one unstemmed) and you then subclass QueryParser and override the getFieldQuery(String field, String queryText) method ... if the second arg looks like a phrase to you (ie: it has spaces or what not) them return super.getField(field, queryText). If it's not a phrase, then call super.getField(field + "_STEMMED", queryText). where this breaks down is if you want the non-stemmed behavior even if hte users "phrase" only contains one word, ie... foo:jumped bar:"dogs" ...because the information that "dogs" was in quotes is lost by the time getFieldQuery is called. You'd have to write a lot more QueryParsing code to get that behavior. In general, for your goal, i would not attempt to put both teh stemmed and unstemmed tokens in the same field -- because as i think you mentioned, there is not way to tell them apart at query time.
--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]