Hi Uwe,

Thanks for your quick response, and sorry for my late reply. I managed to solve my problem; your comment was enough to guide me in the right direction.
The problem was indeed inside my custom Analyzers/Tokenizers. The key point here is that createComponents() is called only once per field, while the old tokenStream() was called each time a stream was requested, right? It turned out I had code that was executed only for the first token, so some refactoring was needed. I'll provide an example in case it might help someone else out there. Below is the design evolution until it finally worked properly in 4.3:

* 3.6, WORKING -> CustomAnalyzer.tokenStream() returns new CustomTokenizer(tokenize(input)), where the tokenize() function performs some analysis on the input.

* 4.3, NOT WORKING -> CustomAnalyzer.createComponents() returns new TokenStreamComponents(new CustomTokenizer(tokenize(input))); here tokenize() is actually called only once.

* 4.3, WORKING -> CustomAnalyzer.createComponents() returns new TokenStreamComponents(new CustomTokenizer(), new CustomFilter(...)), where all the logic from tokenize(input) has been moved into a separate filter (CustomFilter), inside incrementToken().

Hope it makes sense! And yes, of course the built-in Lucene analyzers work just fine; I shouldn't even have mentioned them, especially without actually testing against them.

--
View this message in context: http://lucene.472066.n3.nabble.com/Lucene-4-0-tokenstream-logic-tp4077203p4078427.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
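In case a runnable illustration helps: the sketch below models the working 4.3 design without any Lucene dependency (all class names are hypothetical stand-ins, not the real Lucene API). The point it demonstrates is the one above: construction-time code runs once per stream, so per-token analysis must live in the filter's incrementToken(), which is called once per token.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Locale;

// Stand-in for a tokenizer: its constructor runs ONCE per stream,
// just like the body of createComponents() in Lucene 4.x.
class CustomTokenizer {
    private final Iterator<String> tokens;

    CustomTokenizer(String input) {
        // Only stream setup belongs here -- anything else executes once,
        // which was the bug in the "4.3 NOT WORKING" design above.
        tokens = Arrays.asList(input.trim().split("\\s+")).iterator();
    }

    // Returns the next raw token, or null when the stream is exhausted.
    String incrementToken() {
        return tokens.hasNext() ? tokens.next() : null;
    }
}

// Stand-in for a TokenFilter: the per-token logic that used to live in
// tokenize(input) is moved here, because incrementToken() runs per token.
class CustomFilter {
    private final CustomTokenizer input;

    CustomFilter(CustomTokenizer input) {
        this.input = input;
    }

    String incrementToken() {
        String t = input.incrementToken();
        if (t == null) return null;          // end of stream
        return t.toLowerCase(Locale.ROOT);   // per-token analysis happens here
    }
}

public class AnalyzerSketch {
    public static void main(String[] args) {
        CustomFilter stream = new CustomFilter(new CustomTokenizer("Foo BAR Baz"));
        for (String t; (t = stream.incrementToken()) != null; ) {
            System.out.println(t);           // every token is processed, not just the first
        }
    }
}
```

Running the sketch prints foo, bar, and baz on separate lines, showing that the filter logic is applied to every token rather than only the first.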