Hi, the standard tokenizer works pretty well for me, but I have found one problem with my usage.
I want to tokenize "TheRing6,Proposal6,GuyandGirl6" as three separate tokens, but the standard analyzer treats it as one word because each part contains a digit.

Expected three tokens:
1. thering6
2. proposal6
3. guyandgirl6

I want to change this behaviour of the standard tokenizer, but I don't know where to make the change. Do I need to comment out some rule in the StandardTokenizer.jj file? I am confused by this file. Any pointers?

- Bhavin
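
P.S. To make the expected output concrete, here is a minimal plain-Java sketch of the behaviour I am after. It is not Lucene code and does not touch StandardTokenizer.jj; the class name ExpectedTokens and the method tokenize are hypothetical names used only for illustration, and it assumes that splitting on commas and lowercasing is all that is needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class ExpectedTokens {

    // Hypothetical helper (not part of Lucene): split the input on commas,
    // trim whitespace, lowercase each piece, and skip empty pieces.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split(",")) {
            String token = part.trim().toLowerCase(Locale.ROOT);
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [thering6, proposal6, guyandgirl6]
        System.out.println(tokenize("TheRing6,Proposal6,GuyandGirl6"));
    }
}
```

This only shows the tokens I expect; the open question is still how to get the standard tokenizer itself to produce them.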