On 6/28/05, Aigner, Thomas <[EMAIL PROTECTED]> wrote: > Thanks for the info Chris. > > > > I'd thought I'd provide some more infomation. One problem is the > descriptions are not easily formatted. In other words, the description > doesn't follow a certain set of rules (num num - alpha alpha etc). They > are literally anything a supplier has put in for them. > > > > The example below (21-MA-GAB) is stored differently by these analyzers: > > WhitespaceAnalyzer: [21-MA-GAB] > > SimpleAnalyzer: [ma] [gab] > > StopAnalyzer: [ma] [gab] > > StandardAnalyzer: [21-ma] [gab] > > SynonymAnalyzer: [21-ma] [gab] > > (One I created for synonyms.. much like the standard one) > > SnowballAnalyzer: [21-ma] [gab] > > > > My problem is searching for 21magab returns nothing as well as 21ma* > etc.. > > > > This is just one of my punctuation problems.. there can be "" for inches > and 1/2 items etc.. > > > > I am currently using my SynonymnAnalyzer for some aliases to build the > index and the SnowballAnalyzer to query the index (nice stemming in it) > > > > Tom
You can write an analyzer to do your tokenizing so that you end up with 21-MA-GAB being stored as [21magab] in the index. Assuming the codes are formatted mostly the same. That's what I was suggesting, not use a different analyzer. (If they're not the same then it becomes more difficult) The other problems you're describing with the descriptions could also be solved with a proper analyzer. Add a "FRACTION" type to the lexical grammar, and don't strip punctuation like the quote. Or synonym "1/2" to "half" I guess (I haven't done much work with synonyms). Lastly, and someone should correct me if I'm wrong, but you should always use the same analyzer to create and to query the index. Otherwise queries that should return hits wont. For instance the following. The canoist paddles Could be indexed as [boater] [strokes]... And the query contents:paddles would be parsed to [paddle] and likely would not get the hit you expect. Cheers, Chris --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]