Hi! I ran into a strange behaviour of the StandardTokenizer. Terms containing a '-' are tokenized differently depending on the context. For example, the term 'nl-lt' is split into 'nl' and 'lt'. The term 'nl-lt0' is tokenized into 'nl-lt0'. Is this a bug or a feature? Can I avoid it somehow? I'm using Lucene 3.0.0.
Best, Anna --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org