Re: Regular expression for extracting hrefs from HTML file

Chas. Owens Sun, 03 Feb 2008 17:25:52 -0800

On Feb 3, 2008 8:50 AM, R (Chandra) Chandrasekhar <[EMAIL PROTECTED]> wrote:
> Dear Folks,
>
> I am trying to construct a regular expression to extract strings having the
> structure
>
> <a href="http://...">
>
> from HTML files, as part of learning regexes. I have used the script below to
> do this:
snip


Part of learning to use regexes is learning when not to use them.
This is a job for an HTML parser, not a single regex.  Offhand, I
would say that your problem is that the anchor tag is spread across
more than one line and your script appears to be reading the file one
line at a time.  This is but one corner case you will have to deal
with (which is why you should use a parser instead).
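
As a rough illustration of the difference, here is a minimal sketch. It
assumes the HTML::LinkExtor module (part of the HTML-Parser distribution on
CPAN) is installed. The sample HTML, the simplified pattern, and the names
hrefs_by_line and hrefs_by_parser are made up for this example and are not
taken from your script.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;    # from the HTML-Parser distribution on CPAN (assumed installed)

# Hypothetical sample input: the first anchor tag is split across two lines.
my $html = <<'HTML';
<p>See <a
   href="http://example.com/one">one</a> and
<a href="http://example.com/two">two</a>.</p>
HTML

# Line-at-a-time matching with a deliberately simple regex. It cannot see
# a tag whose "<a" and "href" sit on different lines.
sub hrefs_by_line {
    my ($text) = @_;
    my @found;
    for my $line (split /\n/, $text) {
        push @found, $1 while $line =~ /<a\s+href="([^"]+)"/gi;
    }
    return @found;
}

# Parser-based extraction: the parser tokenizes the whole document, so
# line breaks inside a tag do not matter.
sub hrefs_by_parser {
    my ($text) = @_;
    my $parser = HTML::LinkExtor->new;    # no callback: links are collected internally
    $parser->parse($text);
    $parser->eof;
    my @found;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;        # each entry is [ tag, attr => url, ... ]
        push @found, $attr{href} if $tag eq 'a' && defined $attr{href};
    }
    return @found;
}

print "line by line: $_\n" for hrefs_by_line($html);
print "parser:       $_\n" for hrefs_by_parser($html);
```

With this sample input the line-by-line version reports only
http://example.com/two, while the parser version reports both links.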

