Are you using StandardAnalyzer in 3.1+? You may want to use ClassicAnalyzer instead. I can't see where in the 3.5 javadocs it says that email addresses are recognized, but it does sound vaguely familiar.
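
To illustrate the difference, here is a rough sketch in plain Java. It does not use Lucene; the class and method names are made up for illustration, and the address is a hypothetical one. As far as I recall, from 3.1 onward StandardAnalyzer follows the Unicode word-break rules (UAX#29) and no longer keeps an email address as one token, while ClassicAnalyzer keeps the pre-3.1 behaviour. The first method only approximates the word-break style (split at '@' and '+', keep a '.' that sits between letters or digits), and the second approximates keeping the whole address.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the two tokenization styles; NOT Lucene code.
public class EmailTokenSketch {

    // Approximates word-break tokenization: letters and digits form tokens,
    // a '.' is kept only between two letters/digits, and any other
    // character (such as '@' or '+') ends the current token.
    public static List<String> wordBreakTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            boolean interiorDot = c == '.'
                    && current.length() > 0
                    && i + 1 < text.length()
                    && Character.isLetterOrDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || interiorDot) {
                current.append(Character.toLowerCase(c));
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Approximates the older style: a whitespace-separated chunk that
    // contains '@' is kept as a single lowercased token.
    public static List<String> wholeAddressTokens(String text) {
        List<String> tokens = new ArrayList<String>();
        for (String chunk : text.trim().split("\\s+")) {
            if (chunk.isEmpty()) {
                continue;
            }
            if (chunk.indexOf('@') > 0) {
                tokens.add(chunk.toLowerCase());
            } else {
                tokens.addAll(wordBreakTokens(chunk));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String to = "jane.doe@example.com"; // hypothetical address
        System.out.println(wordBreakTokens(to));    // [jane.doe, example.com]
        System.out.println(wholeAddressTokens(to)); // [jane.doe@example.com]
    }
}
```

With the first style, a term query on the local part or on the domain matches, which fits the extra hits you describe. With the second, only the full address matches, so a query for just the user name would miss. If part of your index was written with the old behaviour, that might explain the to:chubbard case, but that is a guess on my part.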

--
Ian.

On Thu, Feb 16, 2012 at 5:18 PM, Charlie Hubbard <charlie.hubb...@gmail.com> wrote:
> This is a pretty simple question to answer, but I have customers asking me
> how this is supposed to work and I'm having trouble explaining it. I have
> an app that indexes emails so there are plenty of email addresses in there.
> Reading the StandardAnalyzer javadoc it says it "recognizes" email
> addresses when it is creating the token list. What tokens will it produce
> exactly? What I'm seeing when I perform searches is the email address
> looks like it's being tokenized into its parts. Searching by an email
> address like:
>
> to:charlie.hubb...@gmail.com
>
> pulls back more hits that haven't been addressed to
> charlie.hubb...@gmail.com. Other messages with gmail.com in them are
> returned. If I use the following:
>
> to:charlie.hubbard
>
> in them. It also finds gmail.com, and other domains. And I can search for
> strings like
>
> to:"charlie.hubb...@gmail.com"
>
> it will pull back only emails addressed to that address. Further proof it
> seems to tokenize the parts of an email is if I search for a very specific
> email address like:
>
> to:"charlie.hubbard+sometag"
>
> That will pull back only emails addressed to that email, but it's not a
> full email address. Which leads me to think it will parse parts of the
> email addresses. Can someone explain this a little more?
>
> I'm having trouble with some emails that can't be pulled back using the
> username like searching for to:chubbard where the email was addressed to
> chubb...@somedomain.com, but it fails to show up in the search results. I
> can't explain why that's happening. In all of my tests I can't reproduce
> it and I think I might have to reindex everything because this was an index
> built with 2.4 and I upgraded to 3.1 so I'm worried it might be corrupted.
>
> Thoughts?