Re: indexing multiple email addresses in one field

Phil Whelan Thu, 30 Jul 2009 11:40:16 -0700

On Thu, Jul 30, 2009 at 11:22 AM, Matthew Hall
<mh...@informatics.jax.org> wrote:
>
> 1. Sure, just have an analyzer that splits on all non letter characters.
> 2. Phrase queries keep the order intact.  (And yes, the positional 
> information for the terms is kept, which is what allows span queries to work)
>
> So searching on the following "foo bar com" will match f...@bar.com but not 
> b...@foo.com


Thanks, I really appreciate your help with this. That's great to know.
Can I take this a little further...

If I analyze "f...@bar.com b...@foo.com c...@bar.foo", I get
"foo bar com bar foo com com bar foo". A phrase query can then match
across the boundary between two addresses, e.g. f...@com.com, which is
not one of the emails. So perhaps I need a different way of delimiting
the emails.
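
To make the problem concrete, here is a minimal sketch in plain Python. It does not use Lucene; `phrase_match` is a hypothetical helper that only mimics what a phrase query does over consecutive token positions, and the token stream is the one quoted above.

```python
def phrase_match(tokens, phrase):
    """Return True if `phrase` occurs as a run of consecutive tokens."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))


# Token stream from the message, after splitting on non-letter characters.
stream = "foo bar com bar foo com com bar foo".split()

# Tokens of a real address in the field: matches, as intended.
print(phrase_match(stream, ["foo", "bar", "com"]))

# Tokens that straddle two addresses: also matches, which is the false positive.
print(phrase_match(stream, ["foo", "com", "com"]))
```

Both calls report a match, because nothing in the flat token stream marks where one address ends and the next begins.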

Has anyone done anything similar? I can imagine that one option would
be to filter the returned docs based on the original content of the
string I'm analyzing. Does Lucene do this for me?
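
The filtering idea could look roughly like the sketch below. This is plain Python, not a Lucene API: `post_filter`, the `to` field name, and the sample addresses are all hypothetical, and it assumes the original field value is available for each hit as a whitespace-separated list of addresses.

```python
def split_addresses(raw):
    """Split the original field value into lowercased whole addresses."""
    return raw.lower().split()


def post_filter(candidates, wanted):
    """Keep only hits whose original field contains `wanted` as a whole address."""
    wanted = wanted.lower()
    return [doc for doc in candidates if wanted in split_addresses(doc["to"])]


# Hypothetical hits for the phrase "alice example org". Doc 2 matches the
# phrase only because its tokens straddle two addresses.
candidates = [
    {"id": 1, "to": "alice@example.org bob@example.net"},
    {"id": 2, "to": "bob@alice.example org@example.net"},
]

kept = post_filter(candidates, "alice@example.org")
print([doc["id"] for doc in kept])
```

The trade-off is that every candidate has to be loaded and re-checked after the search, so this only pays off when the phrase query already narrows the result set.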

Thanks,
Phil

