Ah, you want to do it the hard way. Sorry, can't help you there - I
prefer to do things the simple way - easier to write and to maintain
and, in my experience, usually more robust in the long run.
--
Ian.
On Tue, Feb 17, 2015 at 11:42 AM, Ravikumar Govindarajan
wrote:
Thanks Ian
What I am currently doing is duplicating the data into two different fields
and using my own PerFieldAnalyzerWrapper, just as you pointed out.
Is there a good way to do this in a single pass, the way Bi-Grams or
Common-Grams do?
--
Ravi
On Tue, Feb 17, 2015 at 3:08 PM, Ian Lea wrote:
Sounds like a job for
org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper.
--
Ian.
On Tue, Feb 17, 2015 at 8:51 AM, Ravikumar Govindarajan
wrote:
We have a requirement that e-mail addresses be added in tokenized form to
one field, while the untokenized form is added to another field.
Ex:
"I have mailed a...@xyz.com". It should tokenize as below:
body = {"I", "have", "mailed", "abc", "xyz", "com"};
I also have a body-addr field. To
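
To make the requested output concrete, here is a minimal sketch in plain Java, not Lucene code. It walks the text once and fills both fields in the same pass. The class name EmailFieldSplitter, the loose e-mail pattern, and the sample address abc@xyz.com are assumptions made for illustration, as is the guess that body-addr should hold the whole, untokenized address. In Lucene the same logic would presumably live in a custom tokenizer or token filter.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailFieldSplitter {
    // Assumption: a deliberately loose pattern. Group 1 is an e-mail-like
    // token, group 2 is a plain alphanumeric word.
    private static final Pattern TOKEN = Pattern.compile(
        "([A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,})|([A-Za-z0-9]+)");

    /** Scans the text once and returns the tokens for both fields. */
    public static Map<String, List<String>> split(String text) {
        List<String> body = new ArrayList<>();
        List<String> bodyAddr = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            String address = m.group(1);
            if (address != null) {
                // Untokenized form goes to body-addr (assumed field content).
                bodyAddr.add(address);
                // Tokenized form goes to body, split on non-alphanumerics.
                for (String part : address.split("[^A-Za-z0-9]+")) {
                    if (!part.isEmpty()) {
                        body.add(part);
                    }
                }
            } else {
                body.add(m.group(2));
            }
        }
        Map<String, List<String>> fields = new LinkedHashMap<>();
        fields.put("body", body);
        fields.put("body-addr", bodyAddr);
        return fields;
    }

    public static void main(String[] args) {
        // Hypothetical sample input, modelled on the example in this message.
        System.out.println(split("I have mailed abc@xyz.com"));
    }
}
```

The point of the sketch is only that the address is recognised once and then emitted in two shapes, which avoids analysing the same text twice.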