> FYI, Fourthought's PyXML has a module called uri.py that contains
> regexes for URL validation. I've run over a million URLs (harvested from
> the Internet) through their code. I can't say I checked each and every
> result, but I never saw anything that would lead me to believe it was
> misbehaving.
>
> It might be interesting to compare the results of running a large list
> of URLs through your code and theirs.
>
> Good luck
> Philip

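A minimal sketch of that kind of comparison, assuming both validators can be wrapped as functions that take a URL string and return a truth value. validator_a and validator_b below are hypothetical stand-ins, not the code discussed in this thread; the sample list is made up for illustration.

```python
import re
from urllib.parse import urlsplit

# Hypothetical stand-in validators; swap in the two real ones being compared.
_SIMPLE_URL = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*://[^\s/?#]+[^\s]*$")


def validator_a(url):
    """Stand-in: accept if urlsplit finds both a scheme and a network location."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False
    return bool(parts.scheme and parts.netloc)


def validator_b(url):
    """Stand-in: accept if a deliberately simple regex matches the whole string."""
    return _SIMPLE_URL.match(url) is not None


def disagreements(urls, first, second):
    """Return (url, first_result, second_result) for each URL where the results differ."""
    diffs = []
    for url in urls:
        a, b = bool(first(url)), bool(second(url))
        if a != b:
            diffs.append((url, a, b))
    return diffs


if __name__ == "__main__":
    # Illustrative sample only; a real run would read a large harvested list.
    sample = [
        "http://www.python.org/",
        "ftp://ftp.example.com/pub/file.txt",
        "http://exa mple.com/",
        "not a url",
    ]
    for url, a, b in disagreements(sample, validator_a, validator_b):
        print(url, a, b)
```

Only the URLs on which the two validators disagree are reported, so a list of a million entries reduces to the handful worth checking by hand.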
It's getting a set of URLs that's the main problem. I've tested it with the URL examples in RFC 3696, and with a few extra ones that test particular issues, but when I looked around I couldn't find any obvious public list of URLs for general testing. Could I use your list?

Also, the same goes for email addresses...

Cheers,
Andrew