Re: $1 $2 var confusion

Rob Dixon Sat, 12 May 2007 17:00:41 -0700

Steve Bertrand wrote:

John W. Krahn wrote:

Mumia W. wrote:

That happens because the match variables ($1, $2, ...) are only changed
when a regular expression matches; otherwise, they are left alone.


In the first case, "$2 !~ /domain\.com/" succeeds but does not capture
anything, so the numbered match variables are unset.
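
Both behaviours described above can be seen in a small sketch. The address
and the variable names ($after_failed, $after_plain) are made up for
illustration and are not from the original post:

```perl
use strict;
use warnings;

my $email = 'user@example.com';    # hypothetical address

# A successful match with two capture groups sets $1 and $2.
$email =~ /(.*)\@(.*)/ or die "no match";
my ($user, $domain) = ($1, $2);

# A failed match leaves $1 and $2 exactly as they were.
my $failed       = 'no at sign here' =~ /(.*)\@(.*)/;
my $after_failed = $1;             # still 'user'

# A successful match with no capture groups resets them.
my $plain       = $domain =~ /example\.com/;
my $after_plain = $1;              # now undef
```

Copying $1 and $2 into named variables straight after a successful match,
as done here, avoids depending on whichever regex happened to run last.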

Your situation reinforces the rule that you should always test if the
match succeeded before you attempt to use the match variables:

    my $email = '[EMAIL PROTECTED]';
    my @f = (undef, $email =~ /(.*)\@(.*)/);


Why did you put undef in there?  It serves no useful purpose other than making
the code harder to understand for beginners.


Wow...powerful statement.

To be honest, I got what I needed before I really paid attention to the
above part as per Rob and Tom's replies, but after re-reading, I agree.

In the above, do I assume correctly (without time to test for myself)
that 'undef' in this case undefines any instance of $1? (or $N for that
matter)?


No, it has no effect on $1. I thought it would cause confusion! The statement
simply assigns a list to @f. The first element of the list is undef, and the
rest is the result of applying the regex to $email, so it's the same as

 my @f = (undef);
 push @f, $email =~ /(.*)\@(.*)/;

and simply offsets the captured results by one. As I said, I can see no reason
to have written it this way unless Mumia wanted $f[1] to correspond to $1 and
$f[2] to $2.

      my @f = $email =~ /(.*)\@(.*)/;

    (@f > 1) && ($f[2] =~ /domain\.com/ ?
        print "$f[1]\n" : print "var 2 is bad\n" );

The test "@f > 1" is my way of testing if the match succeeded.

The rvalue conditional operator should use the returned value:
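
One way to read that advice, as a sketch and not necessarily the code from
the original message: let ?: pick the value and print once, instead of
using it to choose between two print calls. The address and the $message
variable are assumptions made for illustration:

```perl
use strict;
use warnings;

my $email = 'user@domain.com';              # hypothetical address
my @f = (undef, $email =~ /(.*)\@(.*)/);    # $f[1], $f[2] line up with $1, $2

# A failed match returns an empty list, leaving only the undef in @f,
# so "@f > 1" is true only when the match succeeded.
my $message = (@f > 1 && $f[2] =~ /domain\.com/)
    ? "$f[1]\n"
    : "var 2 is bad\n";

print $message;
```

Unlike the quoted code, this version also prints "var 2 is bad" when the
match fails outright, which may or may not be what is wanted.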


Honestly, I hate to say I'm a beginner, but relative to others here I
won't beg to differ. Without having to spend time reading the ?: method
(which I never use as of yet anyway), here is how I would do it now, so
I would understand it, and so would my staff who are not programmers
whatsoever, and who may have to understand it lest I get hit by a bus. I
include comments as I would if a non-programmer had to read it:

# Get the username portion, and the domain portion that we
# must verify from the input the user types in

my ($username, $domain) = split (/\@/, $email);
{... do verification}


You probably want

 my ($username, $domain) = split /\@/, $email, 2;

otherwise something like '[EMAIL PROTECTED]@nonsense' would pass your test.
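
A short sketch of the difference the limit makes. The input string and the
variable names are made up for illustration:

```perl
use strict;
use warnings;

my $email = 'user@domain.com@nonsense';    # hypothetical malformed input

# Without a limit, the third field is silently discarded.
my ($user_a, $domain_a) = split /\@/, $email;

# With a limit of 2, everything after the first @ stays in the second field.
my ($user_b, $domain_b) = split /\@/, $email, 2;

print "no limit: $domain_a\n";
print "limit 2:  $domain_b\n";
```

With the limit in place, a later check on the domain sees the stray
'@nonsense' and can reject the address.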

Now that I've started a controversy, can I ask if the following method
is correct and accepted practice if I only care about the username portion?

I use the following example often instead of split()ing, and then
breaking apart an array. Note this is a simple example, it's more handy
for me in circumstances where I may be fed an array with numerous slices:

my $username = (split (/\@/, $email))[0];


Here, I would prefer

 my ($user) = $email =~ /([EMAIL PROTECTED])/;

(find all the characters from the beginning of the string that aren't at signs)
as split() here implicitly generates a list of substrings by splitting $email
at the at signs, and you then throw all but one of those substrings away. In
practice the overhead of doing this is negligible, but to my mind it's a little
ugly and not descriptive of the problem. It would be going a little far to say
that it's unacceptable practice though.
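
The pattern in the line above appears to have been mangled by the archive's
address obfuscation. As a sketch only, here is one pattern that fits the
description (an assumption, not necessarily what was originally posted),
next to the split() form for comparison:

```perl
use strict;
use warnings;

my $email = 'user@domain.com';    # hypothetical address

# Capture the leading run of characters that are not at signs.
my ($user) = $email =~ /^([^\@]+)/;

# The split() form from the question gives the same result here.
my $via_split = (split /\@/, $email)[0];

print "$user\n";
```

The parentheses around $user put the match in list context, so $user gets
the captured text, or undef if $email starts with an at sign or is empty.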

Again, I have to say that the speed of the feedback was great today :)
Rob, I appreciate your input, and Tom, I don't know if you helped
Randal write the books, but it's especially exciting to see yourself
and the author of several books I own and have read active on the list.


You're more than welcome :)

Rob

