On Tue, 7 Jun 2005, Cy Kurtz wrote:

> OK ... Remember you asked for it.

Right. Because without sufficient context, it's impossible to give an
adequate answer to a wildly open-ended question. Make sense?

> I have at least a dozen files that I want to update. I want to do
> this:
>
> [EMAIL PROTECTED] somedirectory]$ perl -pi~ -e 
> 's/./officers-gasenate.html/http://www.legis.state.ga.us/cgi-bin/peo_list.pl?List=stsenatedl/'
>  ./contactus.html

That won't work. This gets reduced to

    s/./officers-gasenate.html/

Which matches a dot /./ -- which is a metacharacter meaning "match any
single character except a newline" -- and replaces it with the string
officers-gasenate.html

In other words, it will turn this string --

    abc

-- into this

    officers-gasenate.htmlbc

-- which isn't at all what you meant :-) (Only the first character is
replaced, because there is no /g modifier; with /g, every character
would be.)

Further, everything after that third forward-slash is not part of the
substitution at all. Perl tries to parse it as modifiers and more code,
which will probably (read: definitely) produce a syntax error.
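
A quick way to check what the reduced substitution does is to run it on
a short string, with and without the /g modifier. This is only a sketch;
the variable names are made up for illustration and are not from the
original command.

```perl
use strict;
use warnings;

# Illustrative input; any short string would do.
my $first_only = 'abc';
my $every_char = 'abc';

# An unescaped dot matches any single character except a newline.
# Without /g, only the first match is replaced.
$first_only =~ s/./officers-gasenate.html/;

# With /g, every character is replaced in turn.
$every_char =~ s/./officers-gasenate.html/g;

print "$first_only\n";
print "$every_char\n";
```

Either way, the literal text "./officers-gasenate.html" is never what
gets matched.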

I see two things that are worth changing here.

 * you shouldn't be using forward-slashes as the regex delimiter
 * you should be escaping metacharacters like the dot

Thus, the regex should be something like this (only the pattern side
needs the escapes; the replacement side is treated as an ordinary
string):

    s|\./officers-gasenate\.html|http://www.legis.state.ga.us/cgi-bin/peo_list.pl?List=stsenatedl|

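That's a bit unwieldy; you can break it up for clarity --

    # single quotes keep the strings literal; \Q...\E escapes the
    # metacharacters in $old when it is used as a pattern
    my $old = './officers-gasenate.html';
    my $new = 'http://www.legis.state.ga.us/cgi-bin/peo_list.pl?List=stsenatedl';
    s/\Q$old\E/$new/;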

-- but for a command-line one-liner, that's probably overkill.
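
If you want to try the substitution before pointing it at real files,
something like the following sketch may help. The sample line of HTML is
an assumption (I haven't seen your contactus.html), and \Q...\E is used
here as one way of escaping the dots rather than writing the backslashes
by hand.

```perl
use strict;
use warnings;

# Hypothetical line standing in for the contents of contactus.html.
my $line = qq{<a href="./officers-gasenate.html">Senate officers</a>\n};

my $old = './officers-gasenate.html';
my $new = 'http://www.legis.state.ga.us/cgi-bin/peo_list.pl?List=stsenatedl';

# \Q...\E escapes regex metacharacters in $old at match time, so the
# dots match only literal dots. The replacement side is a plain string.
(my $fixed = $line) =~ s/\Q$old\E/$new/;

print $fixed;
```

Because $old and $new are interpolated variables, the slashes inside
them don't clash with the / delimiters.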



Note though that it's standard to point out here that HTML is
notoriously difficult to get right with regular expressions. If all
you're doing is changing the href target of known anchor tags in a
limited set of files that you have control over, it's probably fine to
solve it this way, but if the HTML is at all complicated -- that is, if
it has any inconsistencies at all, broken tags, etc -- you're much
better off solving this kind of problem with a parser module from CPAN.

There are a lot of them to choose from, depending on your needs, and
almost any of them is a better choice than doing this kind of thing by
hand with regular expressions: it's easier, faster, and more robust.

Keep it in mind if this problem starts getting more complicated...



-- 
Chris Devers
