Re: Email id tokenizer (actual email id & multiple terms)

2016-12-21 Thread Trejkaz
On Wed, Dec 21, 2016 at 11:23 PM, suriya prakash wrote:
> Hi,
>
> Thanks for your reply.
>
> I might have one or more email ids in a single record.

Just so you know, you can add the same field more than once with the field analysed by KeywordAnalyzer, and it will still become multiple tokens. This

Re: Email id tokenizer (actual email id & multiple terms)

2016-12-21 Thread suriya prakash
Hi, Thanks for your reply. I might have one or more email ids in a single record, so I have to index them with a whitespace analyser after filtering out the email ids alone (maybe using an email id tokenizer). Tokenization will happen twice (for normal indexing and for the special email id field indexing) which is co
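
As an illustration of the "filter the email ids alone" step described above, here is a minimal sketch in plain Java. It is not Lucene API: the class name EmailIdExtractor and the regular expression are assumptions made for this example, and the pattern is deliberately simple rather than a full RFC 5322 matcher.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): pulls every email id out of a
// record so the ids can be indexed in a dedicated field.
final class EmailIdExtractor {
    // Simplified pattern, an assumption for this sketch; it does not cover
    // every address form that the mail RFCs allow.
    private static final Pattern EMAIL =
            Pattern.compile("[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    // Returns the email ids in the order they appear in the record text.
    static List<String> extract(String record) {
        List<String> ids = new ArrayList<>();
        Matcher m = EMAIL.matcher(record);
        while (m.find()) {
            ids.add(m.group());
        }
        return ids;
    }
}
```

Each extracted id could then be added as its own value of the dedicated field, while the unmodified record text still goes through the normal analyzer. This keeps the two tokenization passes separate, which is the cost the message above is concerned about.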

Re: Email id tokenizer (actual email id & multiple terms)

2016-12-20 Thread Trejkaz
On Wed, Dec 21, 2016 at 1:21 AM, Ahmet Arslan wrote:
> Hi,
>
> You can index whole address in a separate field.
> Otherwise, how would you handle positions of the split tokens?
>
> By the way, speed of phrase search may be just fine, so consider trying first.

Speed aside, phrase search is difficu

Re: Email id tokenizer (actual email id & multiple terms)

2016-12-20 Thread Ahmet Arslan
Hi,
You can index the whole address in a separate field. Otherwise, how would you handle positions of the split tokens?
By the way, speed of phrase search may be just fine, so consider trying first.
Ahmet
On Tuesday, December 20, 2016 5:15 PM, suriya prakash wrote: Hi, I am using standard anal

Email id tokenizer (actual email id & multiple terms)

2016-12-20 Thread suriya prakash
Hi, I am using the standard analyzer and want to split the token for the email id "luc...@gmail.com" into "lucene", "gmail", "com", "luc...@gmail.com" in a single pass. I have already changed JFlex to split the email id into separate words (lucene, gmail, com). But we need to do phrase search which will not be efficie
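
To make the single-pass idea concrete, here is a minimal sketch in plain Java of emitting both the whole address and its parts. It does not use the Lucene API: the class EmailTokenSketch and its Token type are hypothetical, and the position convention (whole address first, first part stacked on the same position with increment 0) is only one possible answer to the position question raised in the replies. In Lucene itself, a custom TokenFilter would express the same thing through PositionIncrementAttribute.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration (not Lucene API): one pass over an email id
// yields the whole address plus its component terms.
final class EmailTokenSketch {
    // A term and its position increment; 0 means "same position as the
    // previous token", mirroring how stacked tokens are usually modelled.
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "/" + positionIncrement;
        }
    }

    static List<Token> tokenize(String email) {
        String whole = email.trim();
        List<Token> out = new ArrayList<>();
        // The whole address occupies the first position.
        out.add(new Token(whole, 1));
        boolean firstPart = true;
        // Split on '@' and '.', skipping empty fragments.
        for (String part : whole.split("[@.]")) {
            if (part.isEmpty()) {
                continue;
            }
            // The first part shares the whole address's position;
            // later parts advance by one position each.
            out.add(new Token(part, firstPart ? 0 : 1));
            firstPart = false;
        }
        return out;
    }
}
```

With this convention, tokenize("user@example.com") yields user@example.com/1, user/0, example/1, com/1, so a term query on the whole address and a phrase query on the parts can both match the same field.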