Brandon McCaig <bamcc...@gmail.com> writes:

> On Thu, Feb 3, 2011 at 4:59 PM, Harry Putnam <rea...@newsguy.com> wrote:
>> May I ask how that formulation serves the purpose better?  Is it
>> processed more easily or quicker in that formulation as against the
>> one I posted?
>>
>> Or does mine leave too many possibilities for poor results?
>
> Yours just wasn't very precise:
>
>> if ( !/^.*\.[bjgtp][gimnps][gfadp]$/) {
>
> Instead of specifying valid extensions, you're specifying valid
> first-characters, valid second-characters, and valid third-characters
> for the file extension. Pick any random character from each bracket
> expression (i.e., '[expression]') and you can generate file names that
> would match and shouldn't. For example:
>
> foo.jga
> bar.ppp
> baz.gma
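
A minimal sketch, not part of the original exchange, that checks the point quoted above. The three bracket expressions allow 5 x 6 x 5 = 150 three-letter combinations, of which only seven are the intended extensions. The sub names `loose_match` and `strict_match` and the sample name photo.jpg are made up for illustration; the two patterns are copied from the thread.

```perl
use strict;
use warnings;

# Pattern from the earlier post: one character class per extension position.
sub loose_match {
    my ($name) = @_;
    return $name =~ /^.*\.[bjgtp][gimnps][gfadp]$/ ? 1 : 0;
}

# Pattern from the reply: an explicit list of whole extensions.
sub strict_match {
    my ($name) = @_;
    return $name =~ m(.+\.(bmp|gif|jpg|png|psd|tga|tif)$) ? 1 : 0;
}

# The first three names are the bogus examples from the quoted text;
# photo.jpg is a hypothetical valid name added for comparison.
for my $name (qw(foo.jga bar.ppp baz.gma photo.jpg)) {
    printf "%-10s loose=%d strict=%d\n",
        $name, loose_match($name), strict_match($name);
}
```

The loose pattern accepts all four names, while the explicit list accepts only photo.jpg.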
[...]

> In addition to the stricter matching rules, it's also rather more easy
> to read, IMHO. Looking at my regular expression, it should be obvious
> to any programmer and many computer users that it's matching image
> file extensions. Yours appears much more random to a human. It might
> suggest that it's matching file extensions, but it certainly doesn't
> communicate well which ones it is supposed to match.

Nicely put... and thanks for the effort to explain it.

One further question.  In your formulation shown below:

,----
| unless($filename =~ m(.+\.(bmp|gif|jpg|png|psd|tga|tif)$))
| {
|     print STDERR "The filename {$filename} has an unsupported extension. Skipping...";
|     next;
| }
`----

I see that specifying the exact possible extensions is good, but I
don't really see what the `m' does there.  I'm not that well informed
on all the incantations of perl regex, but does that not anticipate
filenames spanning multiple lines?

My take on `m' (from perlre) is that it basically changes the meaning
of ^ and $ from the usual start of string and end of string to the
start and end of any line anywhere in a multi-line string.

Further, it seems that the use of parens as regex delimiters (at least
in this instance) is somewhat confusing when it's thrown in with at
least 2 other uses of parens, making 3 different kinds of use in one
clause.  They're used to enclose arguments to a function, to enclose
the alternation, and to delimit the regex.

And finally, the `.+' at the start of the regex seems to allow names
such as $$.psd or ##.psd or %.psd and the like, kind of undoing your
effort to enforce strict extensions by allowing weird and even
unusable names on the other end.

Would `\w+' have served better, or am I really missing the boat all
the way round?
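
A small sketch, not part of the original exchange, touching the three questions above. As far as perlop and perlre describe it, the `m` in front of the pattern is the match operator, which only lets you pick the delimiters; the multi-line behaviour belongs to the separate `/m` modifier that goes after the closing delimiter. All sub names and sample strings here are made up for illustration.

```perl
use strict;
use warnings;

# 1. Leading m: same regex, different delimiters. The regex itself is unchanged.
sub ext_ok_parens {
    my ($f) = @_;
    return $f =~ m(.+\.(bmp|gif|jpg|png|psd|tga|tif)$) ? 1 : 0;
}
sub ext_ok_braces {
    my ($f) = @_;
    return $f =~ m{.+\.(bmp|gif|jpg|png|psd|tga|tif)$} ? 1 : 0;
}
sub ext_ok_slashes {
    my ($f) = @_;
    return $f =~ /.+\.(bmp|gif|jpg|png|psd|tga|tif)$/ ? 1 : 0;
}

# 2. Trailing /m: this is what changes the meaning of ^ and $.
sub png_at_string_end {
    my ($text) = @_;
    return $text =~ /\.png$/ ? 1 : 0;     # $ matches only at the end of the string
}
sub png_at_any_line_end {
    my ($text) = @_;
    return $text =~ /\.png$/m ? 1 : 0;    # $ also matches before an embedded newline
}

# 3. \w+ anchored with ^ rejects names like '$$.psd', but it also rejects
#    ordinary names containing '-' or spaces, so it is a trade-off.
sub name_ok_word {
    my ($f) = @_;
    return $f =~ /^\w+\.(?:bmp|gif|jpg|png|psd|tga|tif)$/ ? 1 : 0;
}

my $listing = "first.txt\nsecond.png\nthird.txt";
print "delimiters agree: ",
    join(' ', map { $_->('photo.jpg') } \&ext_ok_parens, \&ext_ok_braces, \&ext_ok_slashes), "\n";
print "without /m: ", png_at_string_end($listing),   "\n";
print "with /m:    ", png_at_any_line_end($listing), "\n";
for my $name ('$$.psd', 'holiday_2011.jpg', 'my-photo.jpg') {
    printf "%-18s .+ => %d   ^\\w+ => %d\n", $name, ext_ok_parens($name), name_ok_word($name);
}
```

Bracketing delimiters such as `m{...}` avoid stacking a third kind of paren into the same clause. The `^` in the last pattern matters: without it, `\w+` could match only the tail of a name such as a$b.psd and the odd characters would still get through.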