On Fri, May 03, 2013 at 09:59:49PM +1000, Edward and Erica Heim wrote:
> Hi all,

Hello,

> I'm using  LWP::UserAgent to access a website. One of the methods
> returns HTML data e.g.
> 
> my $data = $response->content;
> 
> I.e. $data contains the HTML content. I want to be able to parse it
> line by line e.g.
> 
> foreach (split /pattern/, $data) {
>     my $line = $_;
> ......
> 
> If I print $data, I can see the individual lines of the HTML data
> but I'm not clear on the "pattern" that I should use in split or if
> there is a better way to do this.
> 
> I understand that there are packages to parse HTML code but this is
> also a learning exercise for me.

You haven't explained what you're trying to do, so it's
impossible for any of us to give you a specific answer. There is
no single correct "pattern" here. It depends entirely on what the
input data can be and what you're trying to accomplish.

What is it that you want to learn? If you want to learn how to
process HTML data, then you don't want to use split to do it.
That approach is flawed and error-prone (there is no /pattern/
that will work on all possible HTML inputs). You're far better
off learning to use one of the HTML parsing modules, because that
will give you a reliable mechanism to process HTML data.
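
As a rough sketch of what the parser route can look like, here
is HTML::Parser (a CPAN module, not core Perl, though it is
normally installed alongside LWP) collecting the href of every
link. The sample markup in $data is made up for illustration and
stands in for $response->content. I'm only guessing that links
are what you care about. Note that the second <a> tag is split
across two lines, which is exactly the kind of input that breaks
line-by-line processing.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Made-up sample input standing in for $response->content.
my $data = <<'HTML';
<html><body>
<p>First <a href="/one">link</a></p>
<p>Second <a
   href="/two">link</a></p>
</body></html>
HTML

my @links;

# The start_h handler is called once for every opening tag.
# The argspec 'tagname, attr' asks for the lowercased tag name
# and a hash reference of the tag's attributes.
my $parser = HTML::Parser->new(
    api_version => 3,
    start_h     => [
        sub {
            my ($tagname, $attr) = @_;
            push @links, $attr->{href}
                if $tagname eq 'a' && defined $attr->{href};
        },
        'tagname, attr',
    ],
);

$parser->parse($data);
$parser->eof;    # signal end of input so the parser flushes its buffer

print "$_\n" for @links;
```

The parser reports both links even though one tag spans a line
break. Other modules build a whole document tree instead of
firing callbacks; which one fits depends on what you're after.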

If you just want to learn how to use split, then by all means go
for it. Just don't expect to get meaningful results when
processing unpredictable HTML data. :)
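
If all you want is the text one line at a time, a minimal sketch
looks like the following, assuming "line" means text separated by
newlines, optionally preceded by a carriage return. The sample
string is invented and stands in for $response->content. This
only cuts the text at line breaks. It knows nothing about where
tags begin or end.

```perl
use strict;
use warnings;

# Invented sample input with a mix of CRLF and LF line endings.
my $data = "<html>\r\n<body>\n<p>Hello</p>\n</body>\n</html>\n";

# Split on LF, tolerating an optional CR before it. With no
# limit argument, split drops trailing empty fields, so the
# final newline does not produce an extra empty line.
my @lines = split /\r?\n/, $data;

my $n = 0;
foreach my $line (@lines) {
    $n++;
    print "$n: $line\n";
}
```

Naming the loop variable directly (foreach my $line ...) saves
the separate copy from $_.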

Give us more information if you want help.

Regards,


-- 
Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bamccaig.com/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
