I'm trying to throw out URLs that contain any invalid characters, like '@'. According to http://www.ietf.org/rfc/rfc1738.txt :

    Thus, only alphanumerics, the special characters "$-_.+!*'(),", and
    reserved characters used for their reserved purposes may be used
    unencoded within a URL.

I'd like to throw out a URL like 'http://jncicancerspectrum.oupjournals.org/cgi/content/full/jnci;91/3/252' (even though this one works perfectly fine. Go figure.). I've tried:

    if ($url =~ /^[^A-Za-z0-9$-_.+!*'(),]+$/) {
        # If there are any invalid URL characters in the string.
        # Remember, special regex characters lose their meaning inside [].
        print "Invalid character in URL at line $.: $url\n";
        next;
    }

According to my Camel, special regex characters are supposed to lose their special meaning inside []. Yet that obviously isn't true for '-', which separates the start and end of a range. I thought the fourth '-', in '$-_', was probably being taken as a range, so I tried to escape it by preceding it with a backslash or with '\Q', but both gave strange warnings about uninitialized values in concatenation.

Any suggestions? Thanks for your help and thoughts.

-Kevin Zembower

-----
E. Kevin Zembower
Unix Administrator
Johns Hopkins University/Center for Communications Programs
111 Market Place, Suite 310
Baltimore, MD 21202
410-659-6139
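[Editor's note: a minimal sketch of one possible fix, not necessarily what the list eventually recommended. It assumes two changes to the attempt above: the '-' is escaped (or it could be placed first or last in the class) so '$-_' is not read as a range, and the anchors and '+' are dropped, since /^[^...]+$/ only matches when *every* character is invalid, while a bare /[^...]/ means "contains at least one invalid character". The reserved characters here (;/?:@=&) are an assumption drawn from RFC 1738's "reserved purposes" clause; without them, every URL would be flagged because of its '/' and ':'.]

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Return true if $url contains any character outside the RFC 1738
    # unreserved set ($-_.+!*'(),), alphanumerics, '%' (for escapes),
    # and the reserved characters ;/?:@=&.
    # '$' and '-' are backslash-escaped so they are taken literally
    # inside the character class.
    sub has_invalid_chars {
        my ($url) = @_;
        return $url =~ /[^A-Za-z0-9\$\-_.+!*'(),;\/?:\@=&%]/;
    }

    for my $url (
        'http://jncicancerspectrum.oupjournals.org/cgi/content/full/jnci;91/3/252',
        'http://example.com/bad url',   # contains a space
    ) {
        print has_invalid_chars($url) ? "invalid: $url\n" : "ok: $url\n";
    }

Note that with ';' treated as an allowed reserved character, the jnci URL above is *not* flagged; to reject it as you intended, you would have to remove ';' from the class, at the cost of also rejecting other legitimate URLs that use semicolons.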
I'd like to throw out a URL like 'http://jncicancerspectrum.oupjournals.org/cgi/content/full/jnci;91/3/252' (even though this one works perfectly fine. Go figure.). I've tried: if ($url =~ /^[^A-Za-z0-9$-_.+!*'(),]+$/) { #if there are any invalid URL characters in the string # Remember, special regex characters lose their meaning inside [] print "Invalid character in URL at line $.: $url\n"; next; } According to my Camel, special regex characters are supposed to lose their special functioning inside []. Yet, that obviously isn't true for '-' used to separate the start and end of a range. I thought the fourth '-' at '$-' was probably indicating a range, so I tried to escape it by preceding it with a backslash or '\Q' but both gave strange errors about uninitiated strings in concatenations. Any suggestions? Thanks for your help and thoughts. -Kevin Zembower ----- E. Kevin Zembower Unix Administrator Johns Hopkins University/Center for Communications Programs 111 Market Place, Suite 310 Baltimore, MD 21202 410-659-6139 -- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] <http://learn.perl.org/> <http://learn.perl.org/first-response>