> Is 256 also some inner maximum in some Lucene internal that causes this?
> What seems to be happening is that the long word is split into smaller
> words of up to 256 characters, and then the min and max limits are
> applied. Is that correct? I have removed LengthFilter and still see the
> splitting at 256 happen. I would like not to have this, and instead
> remove altogether any word longer than max, without decomposing it into
> smaller ones. Is there a way to achieve this?
>
> Using Lucene 3.0.1
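Yes, that is what CharTokenizer-based tokenizers do. A quick way to see it (a minimal sketch; WhitespaceTokenizer here just stands in for whatever CharTokenizer subclass you are actually using, and the 600-character word is only an example):

  import java.io.IOException;
  import java.io.StringReader;

  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.WhitespaceTokenizer;
  import org.apache.lucene.analysis.tokenattributes.TermAttribute;

  public class SplitAt255Demo {
    public static void main(String[] args) throws IOException {
      // One 600-character "word" with no whitespace in it.
      StringBuilder sb = new StringBuilder();
      for (int i = 0; i < 600; i++) {
        sb.append('a');
      }

      // WhitespaceTokenizer extends CharTokenizer, so it inherits the
      // 255-character cap on a single token.
      TokenStream ts = new WhitespaceTokenizer(new StringReader(sb.toString()));
      TermAttribute term = ts.addAttribute(TermAttribute.class);
      while (ts.incrementToken()) {
        // Should print 255, 255 and 90 -- one word comes out as three tokens.
        System.out.println("token length: " + term.termLength());
      }
      ts.close();
    }
  }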
Assuming your Tokenizer extends CharTokenizer: CharTokenizer.java has this field:

  private static final int MAX_WORD_LEN = 255;

and its incrementToken() emits the buffered token as soon as it reaches that length, which is why you still see the splitting even with LengthFilter removed. You can modify CharTokenizer.java according to your needs; since the field is private and the method is final, that means patching the Lucene source rather than subclassing.
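If you would rather not patch Lucene itself, another option is a small custom Tokenizer that buffers the whole word and silently drops it when it exceeds your maximum, instead of emitting 255-character pieces. A rough sketch only: the class name and the whitespace-only word boundaries are my own simplifications, and it skips the reset()/end() housekeeping a production tokenizer should have.

  import java.io.IOException;
  import java.io.Reader;

  import org.apache.lucene.analysis.Tokenizer;
  import org.apache.lucene.analysis.tokenattributes.OffsetAttribute;
  import org.apache.lucene.analysis.tokenattributes.TermAttribute;

  // Hypothetical tokenizer: emits whitespace-separated words, but drops any
  // word longer than maxWordLen outright instead of splitting it.
  public final class DropLongWordsTokenizer extends Tokenizer {

    private final int maxWordLen;
    private final TermAttribute termAtt = addAttribute(TermAttribute.class);
    private final OffsetAttribute offsetAtt = addAttribute(OffsetAttribute.class);
    private int pos = 0; // absolute character position in the reader

    public DropLongWordsTokenizer(Reader input, int maxWordLen) {
      super(input);
      this.maxWordLen = maxWordLen;
    }

    @Override
    public boolean incrementToken() throws IOException {
      clearAttributes();
      StringBuilder word = new StringBuilder();
      int start = pos;
      while (true) {
        int c = input.read();
        if (c != -1) pos++;
        boolean isTokenChar = c != -1 && !Character.isWhitespace((char) c);
        if (isTokenChar) {
          if (word.length() == 0) start = pos - 1;
          word.append((char) c);              // accumulate the whole word, no 255 cap
        } else {
          if (word.length() > 0) {
            if (word.length() <= maxWordLen) {  // emit only if short enough
              termAtt.setTermBuffer(word.toString());
              offsetAtt.setOffset(correctOffset(start),
                                  correctOffset(start + word.length()));
              return true;
            }
            word.setLength(0);                // too long: drop it whole, keep scanning
          }
          if (c == -1) return false;          // end of input
        }
      }
    }
  }

You would use it in place of your current tokenizer; the max side of LengthFilter then becomes unnecessary, though you can keep the filter for the minimum length.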