> a.b.c.d.e.f.g.h is not broken apart like how the snowball demo > indicates it should do.
I am not sure about the "should" here - the way I see it, this is just how the demo works: Snowball stemmers operate on words, so the demo first breaks the input text into words and only then applies stemming. > For my lucene testing, I indexed one text file with one > "a.b.c.d.e.f.g.h" string in it and opened the index up using Luke. > It only indexed the string a.b.c.d.e.f.g.h (and didn't parse the > string based on the periods). In Lucene the way text is "broken" into words is up to application - and depends on the analyzer being used. WhitespaceAnalyzer would break on white space. StandardAnalyzer would do more sophisticated work. Analyzers are extendable, so you could modify their behavior. The wiki page "AnalysisParalysis" has some relevant info. Using Lucene's SimpleAnalyzer btw would break "a.b.c" into "a b c" which seems to be what you are looking for? HTH, Doron --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]