On May 3, 2010, at 9:06 AM, andrew cooke wrote:
Hi,
The latest Lepl release includes an implementation of RFC 3696 - the
RFC that describes how best to validate email addresses and HTTP
URLs. For more information please see http://www.acooke.org/lepl/rfc3696.html
Lepl's main page is http://www.acooke.org/lepl
Because Lepl compiles to regular expressions wherever possible, the
library is quite fast - in testing I was seeing about 1ms needed to
validate a URL.
Please bear in mind that this is the very first release of this
module, so it may have some bugs... If you find any problems contact
me and I'll fix them ASAP.
Thanks, Andrew, for contributing that to the open source community.
FYI, Fourthought's PyXML has a module called uri.py that contains
regexes for URL validation. I've run over a million URLs (harvested from
the Internet) through their code. I can't say I checked each and every
result, but I never saw anything that would lead me to believe it was
misbehaving.
It might be interesting to compare the results of running a large list
of URLs through your code and theirs.
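
Something along these lines might do for the comparison. It's only a
rough sketch: it assumes each library can be wrapped in a function that
takes a URL string and returns True or False. The two validators below
are hypothetical stand-ins I made up for illustration, not Lepl's or
uri.py's actual API.

```python
from urllib.parse import urlparse


def compare_validators(urls, validate_a, validate_b):
    """Return the URLs on which the two validators disagree.

    Each entry in the result is a tuple (url, verdict_a, verdict_b).
    """
    disagreements = []
    for url in urls:
        verdict_a = bool(validate_a(url))
        verdict_b = bool(validate_b(url))
        if verdict_a != verdict_b:
            disagreements.append((url, verdict_a, verdict_b))
    return disagreements


# Hypothetical stand-ins: replace these with thin wrappers around the
# real libraries being compared.
def strict_validator(url):
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def lenient_validator(url):
    return "://" in url


if __name__ == "__main__":
    sample = [
        "http://www.acooke.org/lepl",
        "ftp://example.com/file",
        "not a url",
    ]
    for url, a, b in compare_validators(sample, strict_validator, lenient_validator):
        print(f"{url!r}: strict={a} lenient={b}")
```

Only the disagreements are collected, so with a million URLs you'd have
a short list to inspect by hand rather than the full set of results.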
Good luck
Philip
--
http://mail.python.org/mailman/listinfo/python-list