Simon Willnauer wrote:
> 
> I already responded... again...
> 
Sorry, I was in the middle of answering and saw your post right after sending.


Simon Willnauer wrote:
> 
> Tokenizer splits the input stream into tokens (Token.java) and
> TokenFilter subclasses operate on those. I expect from a Tokenizer
> that it provides me a stream of tokens :) - how those tokens are
> created is the responsibility of the Tokenizer.

According to your requirements:

 * one programmer will write a simplistic Tokenizer that converts the whole
char input into one huge token.

 * another programmer will write a simplistic Tokenizer that converts each
single char of the input into a 1-char token, ending up with a huge
number of 1-char tokens.

Moreover, both will claim the job was done brilliantly, because the
Tokenizer boils down to a one-line statement in Java...
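
To make the two extremes concrete, here is a minimal sketch in plain Java.
It deliberately does not use the Lucene Tokenizer API: the names
TokenizerSketch, wholeInput and singleChars are made up for illustration,
and tokens are plain Strings rather than Token instances.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; not Lucene code.
class TokenizerSketch {

    // First programmer: the whole input becomes one huge token.
    static List<String> wholeInput(String input) {
        return Collections.singletonList(input);
    }

    // Second programmer: every char of the input becomes a 1-char token.
    // Assumes a non-empty input made of ordinary (non-surrogate) chars.
    static List<String> singleChars(String input) {
        return Arrays.asList(input.split(""));
    }

    public static void main(String[] args) {
        String input = "C++, C#, .NET";
        System.out.println(wholeInput(input).size() + " token(s)");
        System.out.println(singleChars(input).size() + " token(s)");
    }
}
```

Both methods satisfy "provides me a stream of tokens", and each body is a
single statement.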

Who did the work better?

That said, I'd love to hear more specific requirements for a Tokenizer, to
avoid odd deliveries like the ones above :)

regards
Valery
