I think you mean extracting email addresses from a document, right?

I'd probably find each `@` and scan outward in both directions until I hit
a POSIX punct or space character. But that will probably give some false
matches, so it's really hard to find 100% of the email addresses in
arbitrary text. The reason is that a valid address can contain many
different kinds of characters. According to RFC 822, even a space is valid
inside a quoted local part. So finding every valid address is tough.
In the *trivial case* an address is delimited by whitespace. So find the @
first, then scan backward and forward to the nearest space. That catches the
most common addresses. A regex pattern like
[^[:space:]<@]+@[^[:space:]>]+ would suffice.
But keep in mind that it only works for the trivial cases, not the special ones.
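
As a rough sketch of the trivial-case approach, something like this should
work with preg_match_all(). The function name extractEmails() is just a
name I made up, the addresses are placeholders in the same shape as your
example, and dropping duplicates is my own assumption about what you want:

```php
<?php
// Sketch: pull address-like tokens out of free text with the pattern above.
// extractEmails() is a hypothetical helper name, not a built-in PHP function.
function extractEmails(string $text): array
{
    // One or more characters that are not whitespace, '<' or '@',
    // then '@', then everything up to the next whitespace or '>'.
    $pattern = '/[^[:space:]<@]+@[^[:space:]>]+/';

    // preg_match_all() returns false on error; treat that as "nothing found".
    if (preg_match_all($pattern, $text, $matches) === false) {
        return [];
    }

    // $matches[0] holds the full matches; drop duplicates and reindex.
    return array_values(array_unique($matches[0]));
}

// Placeholder addresses, formatted like the example in the question.
$input = '"Will Alex" <alex@example.com>;"Moita Zact" <zact@example.com>;';

echo implode("\n", extractEmails($input)), "\n";
```

Because the part after the @ runs until the next whitespace or '>', an
address that is not wrapped in angle brackets can pick up trailing
punctuation such as ';' or '.', so you may want to trim the results.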
A simple regular expression can't cover the special cases. Here is a
regular expression that matches RFC 822 addresses in full:
http://ex-parrot.com/~pdw/Mail-RFC822-Address.html


More information can be found at
http://stackoverflow.com/questions/201323/using-a-regular-expression-to-validate-an-email-address


On Sat, Jan 26, 2013 at 10:24 PM, Tedd Sperling <t...@sperling.com> wrote:

> Hi gang:
>
> I thought I had a function to strip emails from a document, but I can't
> find it.
>
> So, before I start writing a common script, do any of you have a simple
> script to do this?
>
> Here's an example of the problem:
>
> Before:
>
> "Will Alex" <ale...@cit.msu.edu>;"Moita Zact" <za...@cit.msu.edu>;"Bob
> Arms" <ar...@cit.msu.edu>;"Meia Terms" <term...@cit.msu.edu>;
>
> After:
>
> ale...@cit.msu.edu
> za...@cit.msu.edu
> ar...@cit.msu.edu
> term...@cit.msu.edu
>
> Cheers,
>
> tedd
>
>
> _____________________
> t...@sperling.com
> http://sperling.com


-- 
Shiplu.Mokadd.im
ImgSign.com | A dynamic signature machine
Innovation distinguishes between follower and leader
