And why not post an example of your catch to illustrate it for the
benefit of the list?
Because I was busy and I knew you would do it ;-)
Hee hee, yeah true enough :)
But if you know "this exact block of HTML", how about:
my @strings = ( "string 1", "string 2", ... );
Because most likely the string he is trying to grab will be changing;
why else would he be trying to parse them out?
Right. So in fact you don't know "this exact block of HTML". Now,
based on what we do know your solution will probably work. But we don't
really know too much. What if "this exact block of HTML" contained
<p>h<!--</p>-->a</p>
for example? Yeah, I know, that'll never happen.
No, it probably would happen, and with the regex as is you'd get 'h<!--'
in that case.
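
For the benefit of the list, here is a minimal sketch of that failure. The
regex itself is not quoted in this part of the thread, so the non-greedy
pattern below is an assumption about what was posted, and first_paragraph()
is a made-up helper name:

```perl
use strict;
use warnings;

# Assumed pattern (not quoted in this thread): a non-greedy capture
# from the first <p> up to the nearest </p>.
sub first_paragraph {
    my ($html) = @_;
    return $html =~ m{<p>(.*?)</p>}s ? $1 : undef;
}

# The awkward input from above: a </p> hidden inside an HTML comment.
my $tricky = '<p>h<!--</p>-->a</p>';
my $got    = first_paragraph($tricky);

# The regex knows nothing about comments, so it stops at the
# commented-out </p> and never reaches the real one.
print "captured: $got\n";    # captured: h<!--
```

The match stops at the first literal "</p>" it sees, commented out or not,
which is why a real HTML parser is the safer choice once the input is not
fully under your control.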
The thing is the OP seemed to be sure that the <p></p> would be one pair
on a single line (according to readmind())
That's where his posting a complete model of his needs would have been
infinitely more workable and would have avoided such wanton recommendations ;)
So I guess the moral is that it will work as expected assuming you can
guarantee that your <p></p> pair will always be on a line by itself with
no other <p> in it :)
Otherwise, you need to use some HTML parsing module, since it will get
complex quickly.
Or how about a solution involving "links -dump" ?
ATTN casual readers: *That is the worst idea ever* don't do it!
I'm not sure it's quite that bad. I might have suggested using Java.
LOL, nice, I like you Paul you're a funny guy!
a) it's not Perl, it's a system command
b) it's not portable by any means (what if "links" is not in their
path? what if "links" isn't even installed? what if "links" should have
been "lynx"? what if the -dump flag on your OS's links needs to be --grab
on their OS's links, etc. etc.?)
c) how does that help you get the string between the p tags in any
useful form? (i.e. you still have to get that data out of the output of
that command)
d) Hypothetical unknown behavior: what if it creates a temp file and
is unable to remove it and it gets run a million times? Now you've
potentially filled up the user's quota, potentially filled up a
partition, etc. etc.
But what if chucking the output into a file does exactly what you want?
Then chuck it to a file (that was one reason I labeled it
"hypothetical"), the point is that if a single regex or existing module
will do what you need, use it instead of unnecessarily using external
stuff that would *still* most likely need to be parsed into a usable
data structure and could introduce all sorts of unknown problems.
Slavish adherence to portability concerns shouldn't get in the way of
your getting your job done. (perlfaq5)
I wouldn't call slavish the use of a built-in tool (regex or modules)
instead of an external call with all its potential problems, especially
if it's not just a one-time command.
And all because you didn't use Perl's most fundamental tool, regexes, or
one of the zillions of HTML parsing modules to get what you want into a
data structure that is native to the script you want to use the data in.
In general parsing HTML with a regular expression is going to bite you.
Absolutely, but the OP insisted it was consistently like that and was so
vague about it that the regex would do it, as far as I understood his needs.
That's also why I kept mentioning modules to do the parsing for you if it
was anything but super basic.
You might find situations where it works, and I've even done so myself
(with XML rather than HTML) but I don't think anyone could call it
robust whilst keeping a straight face.
You're right, I tried it and now have the giggles, thanks a lot Paul! :)