Re: testing on email characters

Chas Owens Tue, 12 Jun 2001 10:56:02 -0700
On 12 Jun 2001 18:45:23 +0200, Jos Boumans wrote:
> Please, if you try and flame posts, get your facts straight.
> 
> 1st:     - is a range operator, and hence needs not be escaped when it's not
> indicating a range, ie, at the beginning or end of a []
>           so this regex is not 'wrong'. feel free to try it.

I am thoroughly ashamed of myself for not checking before posting.  I am
sorry.
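
For anyone else who wants to check before posting, here is a minimal
sketch of the point about the hyphen. The helper name has_match and the
three sample classes are my own, made up for illustration; only the rule
itself (a - at the start or end of [] is literal) comes from the thread.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns 1 if $str matches the regex $re, else 0.
sub has_match {
    my ($str, $re) = @_;
    return $str =~ $re ? 1 : 0;
}

my $trailing = qr/[a.-]/;   # hyphen last in the class: a literal '-'
my $leading  = qr/[-a.]/;   # hyphen first in the class: a literal '-'
my $range    = qr/[a-c]/;   # hyphen between two characters: the range a..c

# '-' should match only the first two classes, 'b' only the range.
for my $s ('-', 'b') {
    printf "%-3s trailing=%d leading=%d range=%d\n", "'$s'",
        has_match($s, $trailing),
        has_match($s, $leading),
        has_match($s, $range);
}
```

Putting the hyphen last (or first) avoids the backslash, which is
arguably easier on the eyes than [a.\-].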

> 2nd:    the regex is purposely written verbose, seeing this is a newbie list and we
> want to teach them, not scare them away by hard to follow code

Since I started reading _Learning Perl_ a month and a half ago, just
finished reading up on some of the stranger stuff in _Programming Perl_
(2nd edition; if only I had known the 3rd was coming out!), and have barely
begun Conway's _Object Oriented Perl_, I consider myself to be a newbie.
I want to be scared by new and strange things.  It encourages me to hit
the docs and figure out what is going on.  That said, I don't think it is
wrong to give simple answers to simple questions; I just believe that
the simple answers (the ones they can use immediately) should be
followed up by scarier, tighter answers.  This method solves the first
problem (How do I do this?) and gives them (hopefully, although I fudged
it this time by fscking up the - bit) more to think about for the
future.

>           on that note: ^\w might confuse, since, when used inside a range ( [] ),
> means something totally different then outside it... hence, for comprehension i
> teach \W instead
>           (for those wondering: outside a range, ^ means 'match beginning of
> string', inside a range it means 'not')
> 
> so the short version would be:
> if (/[\W.-/]){ print "illegal string" }

Correct me if I am wrong, but I don't think [^\w.-] is the same as
[\W.-].  It is my understanding (and I checked the docs on this one
<grin />) that /[\W.-]/ is the union of [^a-zA-Z0-9_] and [.-], which,
when translated to English, means match any non-word character, period,
or hyphen.  As opposed to /[^\w.-]/, which means, again in English, match
any character that is not a word character, period, or hyphen.

<code>
#!/usr/bin/perl -w

use strict;

$_ = "this_should.pass";

print "testing ($_) with ", '/[\W.-]/ = ';
if (/[\W.-]/)  {                      #fixed typo (]/ instead of /]) 
        print "illegal string\n";
} else { 
        print "Good!\n";
}

print "testing ($_) with ", '/[^\w.-]/ = ';
if (/[^\w.-]/) { 
        print "you cheated!\n";
} else {
        print "Good!\n";
}
</code>

<output>
testing (this_should.pass) with /[\W.-]/ = illegal string
testing (this_should.pass) with /[^\w.-]/ = Good!
</output>

This output would seem to say that /[\W.-]/ is not equal to /[^\w.-]/.
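
To see why, it helps to print the character that actually triggers the
match. This is only a sketch; the helper first_offender is a name I made
up for illustration, and the test string is the one from the script
above.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: returns the first character of $str that matches
# the character class $re, or undef when nothing matches.
sub first_offender {
    my ($str, $re) = @_;
    return $str =~ /($re)/ ? $1 : undef;
}

my $str = "this_should.pass";

# Try the "positive" class and the negated class against the same string.
for my $re (qr/[\W.-]/, qr/[^\w.-]/) {
    my $hit = first_offender($str, $re);
    print $re, ": ", (defined $hit ? "matched '$hit'" : "no match"), "\n";
}
```

The first class trips on the period, because the period is listed in
the class as something to match.  The negated class lists it as
something to allow, so nothing in the string offends it.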

> 
> the verbose version would be an explicit check of every character, as described
> below
> 
> hope this clears things up,
> 
> Jos Boumans
> 
> PS html tags end </tag>

Not for empty elements (i.e. <br> and <hr>) in the latest version, XHTML.
Since XML is now HTML's base (replacing SGML), a bare <br> is illegal.
You must say <br></br> or <br />.

> 
<snip />
--
Today is Pungenday, the 17th day of Confusion in the YOLD 3167
Or not.