Simon Willnauer wrote:
>
> > I already responded... again...
>
> sorry, I've been in answering and seen your post right after sending.
Simon Willnauer wrote:
>
> Tokenizer splits the input stream into tokens (Token.java) and
> TokenFilter subclasses operate on those. I expect from a Tokenizer
> that it provides me a stream of tokens :) - how those tokens are
> created is the responsibility of the Tokenizer.

Given those requirements:

* one programmer will write a simplistic Tokenizer that converts the whole char input into one huge token.
* another programmer will write a simplistic Tokenizer that converts each single char of the input into a 1-char token, ending up with a huge number of 1-char tokens.

Moreover, both will claim the job was done brilliantly, because the Tokenizer is based on a one-line statement in Java...

Who did the better job?

That said, I'd love to hear more specific requirements for a Tokenizer, to avoid odd deliveries like the ones above :)

regards
Valery
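
To make the two extremes concrete, here is a minimal sketch in plain Java. It deliberately does not use Lucene's real Tokenizer API; the class and method names are hypothetical and the "token stream" is just a list of strings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not Lucene's Tokenizer API.
class DegenerateTokenizers {

    // "Programmer one": the whole input becomes a single huge token.
    static List<String> wholeInputAsOneToken(String input) {
        return Collections.singletonList(input);
    }

    // "Programmer two": every char of the input becomes its own 1-char token.
    static List<String> oneTokenPerChar(String input) {
        List<String> tokens = new ArrayList<>();
        for (char c : input.toCharArray()) {
            tokens.add(String.valueOf(c));
        }
        return tokens;
    }

    public static void main(String[] args) {
        String input = "C++ and C#";
        System.out.println(wholeInputAsOneToken(input)); // a list with one token
        System.out.println(oneTokenPerChar(input));      // one token per char
    }
}
```

Both methods formally "provide a stream of tokens", yet neither is of much use for searching text that mentions C++ or C#, which is why the requirement needs to say more than that.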