[ Please type your reply below the quoted part of the message you
reply to. ]

Colin Johnstone wrote:
Jim wrote:
Colin Johnstone wrote:

I firstly need to remove any invalid characters (including spaces)

$filename = 'News & events';
$filename =~ s/[^\w\&%'[EMAIL PROTECTED](\)&_\^\+,\.=\[\]]//g;

then convert it to lowercase.

$filename =~ s/[^A-Za-z]//g;   # use only alpha chars
print lc($filename);           # convert to lower case

# or if you want to strip out non printing (control) chars:
$filename = " News \r \n \x01 even ts";
$filename =~ s/[[:cntrl:]]//g;
$filename =~ s/\s//g;

Rather than re-invent the wheel, I would prefer it if you could fix this regex. I believe it covers all invalid characters one would encounter.

s/[^\w\&%'[EMAIL PROTECTED](\)&_\^\+,\.=\[\]]//g;

In this case, re-inventing the wheel, as you put it, is much more convenient than using that regex as the starting-point for solving your problem.

I would then use it as a general purpose regex for validating
filenames.

The main problem is that your approach, i.e. trying to identify "invalid characters", is not the best choice. Generally it is advisable to do it the other way around: Decide which characters you want to accept, and remove the rest.
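
A minimal sketch of that approach, assuming that lowercase letters, digits, underscore, hyphen and dot are the characters you want to accept (adjust the list to your needs). The sub name clean_filename is made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: lowercase the name, then remove every character
# that is not on the explicit list of accepted characters.
sub clean_filename {
    my ($name) = @_;
    $name = lc $name;              # convert to lowercase first
    $name =~ s/[^a-z0-9_.-]//g;    # strip anything not accepted (spaces too)
    return $name;
}

print clean_filename('News & events'), "\n";   # prints "newsevents"
```

Since the character class is negated, control characters and whitespace are removed as well, so no separate [[:cntrl:]] or \s pass is needed.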

Drop the regex you "found on the web" and start listening to the
suggestions given.

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
