On Sat, Apr 14, 2012 at 07:05:54PM +0300, Shlomi Fish wrote:
> Hi Somu,
> 
> On Sat, 14 Apr 2012 21:01:03 +0530
> Somu <som....@gmail.com> wrote:
> 
> > OK. Can i ask "WHY?"
> > Why can't it be done using regex. Isn't a html file just another long
> > string with more, but similar special characters??
> > 
> 
> first of all I should note that you appear to be replying to the wrong 
> messages
> which breaks the flow of the thread. Otherwise, please read the links which I
> gave you:

I did; he may or may not have, but ...
They all say not to do it without giving the "WHY".  The closest is
  http://www.codinghorror.com/blog/2009/11/parsing-html-the-cthulhu-way.html
with "It's a solved problem" being the "WHY" given.

Well, that's not totally fair of me. 

http://stackoverflow.com/questions/1732348/regex-match-open-tags-except-xhtml-self-contained-tags
Does start:
 You can't parse [X]HTML with regex. Because HTML can't be parsed by
 regex. Regex is not a tool that can be used to correctly parse HTML. 
   ...
 Regular expressions are a tool that is insufficiently sophisticated to
 understand the constructs employed by HTML.

Though the humor in the rest of the post masks that essential statement.

Somu, regex to HTML parsing is like:
  screwdriver to nail
  butter knife to screw
  mid sized car to coal transport
  bicycle to 3,000 km journey to be completed in 48 hours
  meat to a vegetarian
  hair brush to can of paint

To a greater or lesser degree you might try to use one for the purpose,
but it's not the right tool for the job.
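
To make the "WHY" a bit more concrete, here is a small sketch of one way a
regex goes wrong on HTML.  It is written in Python only because its standard
library ships an HTML parser; on the Perl side a module such as HTML::Parser
plays the same role.  The sample string, the pattern, and the names SAMPLE,
NAIVE, LinkCollector, links_by_regex and links_by_parser are all made up for
illustration, and this is one failure case, not a complete argument.

```python
import re
from html.parser import HTMLParser

# Hypothetical input: a commented-out link, then a live link whose
# title attribute happens to contain a '>' character.
SAMPLE = ('<!-- <a href="old.html">gone</a> -->'
          '<a title="x > y" href="new.html">here</a>')

# A naive pattern of the kind people usually reach for first.
NAIVE = re.compile(r'<a\s[^>]*href="([^"]*)"')


def links_by_regex(text):
    """Return every href the naive pattern finds in the raw text."""
    return NAIVE.findall(text)


class LinkCollector(HTMLParser):
    """Collect href values from <a> start tags seen by the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href":
                    self.links.append(value)


def links_by_parser(text):
    """Return hrefs of <a> tags that are actually part of the document."""
    collector = LinkCollector()
    collector.feed(text)
    collector.close()
    return collector.links


print(links_by_regex(SAMPLE))   # ['old.html']
print(links_by_parser(SAMPLE))  # ['new.html']
```

The regex reports the link inside the comment, which is not part of the page,
and misses the real one because the '>' inside the quoted title ends its
[^>]* run too early.  You can patch the pattern for each such case, but the
parser already knows about comments and quoted attributes, which is what
"It's a solved problem" is getting at.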

-- 
            Michael Rasmussen, Portland Oregon  
      Other Adventures: http://www.jamhome.us/ or http://westy.saunter.us/
Fortune Cookie Fortune du courrier:
By being willing to be a bad artist, you have a chance to BE an artist, 
and perhaps over time, a very good one 
    ~ Julia Cameron

s/artist/what you want to be/
