On Fri, May 03, 2013 at 09:59:49PM +1000, Edward and Erica Heim wrote:
> Hi all,

Hello,

> I'm using LWP::UserAgent to access a website. One of the methods
> returns HTML data e.g.
>
> my $data = $response->content;
>
> I.e. $data contains the HTML content. I want to be able to parse it
> line by line e.g.
>
> foreach (split /pattern/, $data) {
>     my $line = $_;
>     ......
>
> If I print $data, I can see the individual lines of the HTML data
> but I'm not clear on the "pattern" that I should use in split or if
> there is a better way to do this.
>
> I understand that there are packages to parse HTML code but this is
> also a learning exercise for me.

You haven't explained what it is you're trying to do so it's
impossible for any of us to help you. There is no correct
"pattern" here. It depends entirely on what the input data can
be, and what you're trying to accomplish.

What is it that you want to learn? If you want to learn how to
process HTML data then you don't want to use split to do it. It's
a flawed and error prone way (there is no /pattern/ that will
work on all possible HTML inputs). You're far better off learning
to use one of the HTML parsing modules because that will give you
a reliable mechanism to process HTML data.

If you just want to learn how to use split then by all means go
for it. Just don't expect to get meaningful results when
processing unpredictable HTML data. :)

Give us more information if you want help.

Regards,


--
Brandon McCaig <bamcc...@gmail.com> <bamcc...@castopulence.org>
Castopulence Software <https://www.castopulence.org/>
Blog <http://www.bamccaig.com/>
perl -E '$_=q{V zrna gur orfg jvgu jung V fnl. }.
q{Vg qbrfa'\''g nyjnlf fbhaq gung jnl.};
tr/A-Ma-mN-Zn-z/N-Zn-zA-Ma-m/;say'
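
P.S. To illustrate the point about split, here is a minimal sketch, assuming the goal is only to walk the text line by line. The sample string is a hypothetical stand-in for $response->content, and /\r?\n/ is just one reasonable choice of pattern, not something taken from the original question.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $response->content.
# Note that the <a> tag is spread over two physical lines.
my $data = "<p>Hello</p>\n<a\n  href=\"http://example.com/\">link</a>\n";

# Split on LF or CRLF line endings. This yields lines of text,
# not HTML elements; trailing empty fields are dropped by split.
my @lines = split /\r?\n/, $data;

for my $line (@lines) {
    print "line: $line\n";
}
```

The second and third lines each hold only a fragment of the <a> element, so any per-line processing sees broken markup. That is the kind of input where a real HTML parsing module is the more reliable tool.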