Re: Lucene tokenization

2012-03-27 Thread Paul Libbrecht
Nilesh, the StandardAnalyzer is full of generally useful special cases, including email and number detection. I suppose you ran into one such special case, which has some justification behind it. I can't tell you why, but I can tell you it's really hard to change because others rely on this somehow

RE: Lucene tokenization

2012-03-27 Thread Steven A Rowe
Hi Nilesh,

Which version of Lucene are you using? StandardTokenizer behavior changed in v3.1.

Steve

-----Original Message-----
From: Nilesh Vijaywargiay [mailto:nilesh.vi...@gmail.com]
Sent: Tuesday, March 27, 2012 2:04 PM
To: java-user@lucene.apache.org
Subject: Lucene tokenization

I have a