Re: Regex again..

Michael Rasmussen Sat, 14 Apr 2012 09:45:33 -0700

On Sat, Apr 14, 2012 at 07:05:54PM +0300, Shlomi Fish wrote:
> Hi Somu,
> 
> On Sat, 14 Apr 2012 21:01:03 +0530
> Somu <som....@gmail.com> wrote:
> 
> > OK. Can I ask "WHY?"
> > Why can't it be done using regex? Isn't an HTML file just another long
> > string with more, but similar, special characters??
> > 
> 
> First of all, I should note that you appear to be replying to the wrong
> messages, which breaks the flow of the thread. Otherwise, please read the
> links which I gave you:

I did; he may or may not have, but ...
They all say not to do it, without giving the "WHY". The closest is
http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
with "It's a solved problem" being the "WHY" given.

Well, that's not totally fair of me.

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
Does start:
You can't parse [X]HTML with regex. Because HTML can't be parsed by
regex. Regex is not a tool that can be used to correctly parse HTML.
...
Regular expressions are a tool that is insufficiently sophisticated to
understand the constructs employed by HTML.

Though the humor in the rest of the post masks that essential statement.

Somu, regex to HTML parsing is like:
screwdriver to nail
butter knife to screw
mid sized car to coal transport
bicycle to 3,000 km journey to be completed in 48 hours
meat to a vegetarian
hair brush to can of paint

To a greater or lesser degree you might try to use one for the purpose,
but it's not the right tool for the job.
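
To make the "WHY" a little more concrete, here is a minimal sketch of what
goes wrong once tags nest. It is in Python rather than Perl only so that it
can lean on a standard-library HTML parser; the sample string, the DivText
class, and the variable names are made up for illustration and are not from
any of the linked posts. It compares a naive non-greedy pattern with an
event-based parser that tracks open tags on a stack.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample input: one <div> nested inside another.
HTML = "<div>outer <div>inner</div> tail</div>"

# Naive regex attempt: non-greedy match from <div> up to the next </div>.
# The pattern has no notion of nesting depth, so it pairs the OUTER opening
# tag with the INNER closing tag and cuts the outer element short.
naive = re.findall(r"<div>(.*?)</div>", HTML, re.S)


class DivText(HTMLParser):
    """Collect the text of every <div>, tracking nesting with a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []   # one text buffer per currently open <div>
        self.result = []  # text of finished divs, in the order they close

    def handle_starttag(self, tag, attrs):
        if tag == "div":
            self.stack.append([])

    def handle_data(self, data):
        # Text belongs to every <div> that is open at this point.
        for buf in self.stack:
            buf.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self.stack:
            self.result.append("".join(self.stack.pop()))


parser = DivText()
parser.feed(HTML)
parser.close()
parsed = parser.result

print(naive)   # ['outer <div>inner']
print(parsed)  # ['inner', 'outer inner tail']
```

The regex returns a fragment that is neither the inner nor the outer element,
while the parser gets both right because it remembers how many tags are open.
Making the pattern greedy only moves the problem: it then breaks on two
sibling divs instead.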

--
Michael Rasmussen, Portland Oregon
Other Adventures: http://www.jamhome.us/ or http://westy.saunter.us/
Fortune Cookie Fortune du courrier:
By being willing to be a bad artist, you have a chance to BE an artist,
and perhaps over time, a very good one
~ Julia Cameron

s/artist/what you want to be/

