Re: Solr SynonymFilter in Lucene analyzer

2010-08-26 Thread Arun Rangarajan
Thanks, Lance. After exploring for a while, I used Lucene's ShingleFilter followed by the SynonymFilter from the Lucene in Action book. Then, using the type attribute, I removed all the shingles that did not belong to any category. On Wed, Aug 18, 2010 at 10:28 PM, Lance Norskog wrote: > Yes, you need
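
For readers following the thread, the shingle-then-synonym approach described above can be sketched roughly as follows. This is a library-free Python illustration of the idea, not the Lucene API and not the poster's actual code: the synonym map, the type labels, and the function names `shingles` and `analyze` are all hypothetical, and keeping the single-word tokens alongside the mapped shingle is an assumption.

```python
# Library-free sketch of "shingle, map synonyms, drop unmatched shingles".
# NOT the Lucene API; all names and the synonym map are hypothetical.
SYNONYMS = {"new york": "CONCEPTYcity", "boston": "CONCEPTYcity"}


def shingles(words, max_size=2):
    """Yield (text, type) for each single word and each n-word shingle."""
    for i in range(len(words)):
        for n in range(1, max_size + 1):
            if i + n <= len(words):
                text = " ".join(words[i:i + n])
                yield text, ("word" if n == 1 else "shingle")


def analyze(text, max_size=2):
    """Map terms found in SYNONYMS; keep plain words; drop unmatched shingles."""
    out = []
    for term, typ in shingles(text.lower().split(), max_size):
        if term in SYNONYMS:
            out.append((SYNONYMS[term], "SYNONYM"))
        elif typ == "word":
            # Assumption: single words are kept as ordinary tokens.
            out.append((term, typ))
        # A shingle that matched no category is dropped, based on its type.
    return out
```

Calling `analyze("flights to New York")` keeps the four single words and adds one `CONCEPTYcity` token for the "new york" shingle, while "flights to" and "to new" are discarded.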

Re: Solr SynonymFilter in Lucene analyzer

2010-08-18 Thread Lance Norskog
Yes, you need an analyzer that leaves successive words together as one long term. This might be easier to do with the new CharFilter tool, which processes text before it goes to the tokenizer. What you are doing here is similar to part-of-speech analysis, where text analysis software parses a sen
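
The idea of rewriting text before it reaches the tokenizer can be illustrated with a small sketch. It is plain Python rather than Solr's CharFilter API, and the phrase table and the names `join_phrases` and `tokenize` are hypothetical; it only shows how joining a known phrase ahead of whitespace tokenization leaves it as one term.

```python
import re

# Hypothetical phrase table: multi-word phrase -> single joined term.
PHRASES = {"new york": "new_york"}


def join_phrases(text):
    """Rewrite known multi-word phrases in the raw text, before tokenizing."""
    for phrase, joined in PHRASES.items():
        pattern = r"\b" + re.escape(phrase) + r"\b"
        text = re.sub(pattern, joined, text, flags=re.IGNORECASE)
    return text


def tokenize(text):
    """Whitespace tokenization applied after the character-level rewrite."""
    return join_phrases(text).split()
```

With this in place, `tokenize("Flights to New York today")` yields `new_york` as a single token that a later single-term synonym lookup could match.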

Re: Solr SynonymFilter in Lucene analyzer

2010-08-18 Thread Arun Rangarajan
I think the Lucene WhitespaceAnalyzer I am using inside Solr's SynonymFilter is what prevents multi-word synonyms like "New York" from getting mapped to a generic synonym name like CONCEPTYcity. It appears to me that an analyzer which recognizes that a whitespace is inside a synonym like
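
A simplified picture of the problem described here, again in plain Python rather than Solr or Lucene code: once text is split on whitespace, a per-token lookup never sees "new york" as a unit, so only single-word entries are mapped. The synonym map and the name `map_tokens` are hypothetical.

```python
# Hypothetical synonym map with one multi-word and one single-word entry.
SYNONYMS = {"new york": "CONCEPTYcity", "boston": "CONCEPTYcity"}


def map_tokens(text):
    """Split on whitespace, then look up each token on its own."""
    return [SYNONYMS.get(tok.lower(), tok) for tok in text.split()]
```

Here `map_tokens("Boston to New York")` maps "Boston" but leaves "New" and "York" untouched, because neither token alone is a key in the map.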