I added the following to both TestStandardAnalyzer and TestClassicAnalyzer in branches/lucene_solr_3_6/, and it passed in both cases:

    public void testWhitespaceHyphenWhitespace() throws Exception {
      BaseTokenStreamTestCase.assertAnalyzesTo(
          a, "drinks - water", new String[]{"drinks", "water"});
    }

So I'm not seeing the same behavior as you guys - the hyphen is not part of any emitted token.

Steve

-----Original Message-----
From: lis...@alphamatrix.org [mailto:lis...@alphamatrix.org]
Sent: Monday, June 25, 2012 11:33 AM
To: java-user@lucene.apache.org
Subject: Re: how to remove the dash

On Monday, June 25, 2012 16:10:38 Ian Lea wrote:
> My apologies - you are right.
>
> With both ClassicAnalyzer and StandardAnalyzer, "drinks - water" comes
> out as "drinks -water" whereas "drinks-water" comes out as "drinks
> water", as I'd expected.
>
> I guess this is fixable in JFlex, or I think there is some replace
> tokenizer somewhere that can replace character X with character Y e.g.
> "-" with " ". Or pre-process your text/queries with a regexp. Maybe
> someone else has better ideas.

I guess the same... I'm already using my own Tokenizer (based on StandardTokenizer) to mark some strings for replacement or removal, and I'm using one filter to replace them and another filter to remove them. I tried to do the same with the "-", but it didn't work: I can't even mark the "-".

I'm avoiding pre-processing. I'm hoping somebody can tell me what to change in the StandardTokenizer JFlex grammar to change this behavior.

Thanks

>
>
> --
> Ian.
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
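
For the regexp pre-processing route Ian mentions in the thread, here is a minimal sketch using only java.util.regex. It is not the JFlex grammar change that was asked about, and it is not part of Lucene: the class name HyphenPreprocessor and the method stripStandaloneHyphens are made up for illustration. It assumes that only a hyphen with whitespace (or the start/end of the string) on both sides should be dropped, so "drinks-water" is left for the analyzer to split as before.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not a Lucene class: cleans text before it reaches the analyzer.
public class HyphenPreprocessor {

    // One or more hyphens that are neither preceded nor followed by a
    // non-whitespace character, i.e. a free-standing " - ".
    private static final Pattern STANDALONE_HYPHEN =
            Pattern.compile("(?<!\\S)-+(?!\\S)");

    // Replaces each free-standing hyphen run with a single space.
    public static String stripStandaloneHyphens(String text) {
        return STANDALONE_HYPHEN.matcher(text).replaceAll(" ");
    }

    public static void main(String[] args) {
        // The free-standing hyphen is replaced; the surrounding spaces remain.
        System.out.println(stripStandaloneHyphens("drinks - water"));
        // A hyphen inside a word is left untouched.
        System.out.println(stripStandaloneHyphens("drinks-water"));
    }
}
```

The same call would have to be applied to both the indexed text and the query strings, otherwise the two sides would tokenize differently.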