Omega -1911 wrote:
Hello all,

I am trying to parse calendar events for an RSS feed into variables. Can
someone help with building the following regex or point me in the direction
of some good examples? Thanks in advance.

Here is what I have tried (I don't know much about complex regexes, as you
can see):
$mystring =~ /.+(<p><li><b>)(\w+) (<FONT COLOR=\"\#990000\">)(\w+)(\[Ref
\#(\d+\])(.+)$/);


Here is a sample string:
<p><li><b> DATE <FONT COLOR="#990000">TITLE</FONT></b> EVENT <a href="
http://www.mysite.com"target="_new";>www.mysite.com</a> [Ref #67579]</li>

What I would like to pull out is the TITLE && EVENT information. The sample
string is the format for each event. Any takers on this? Again, thanks for
any help.

Hi Dave

Regexes are notoriously poor at processing HTML, so rather than using them to
extract the information it would be better to use one of the dedicated HTML
parsing modules.
My preference is HTML::TreeBuilder, which builds a structure of HTML::Element
objects to represent the original document. From that it is easy to extract the
parts you need according to their context.
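
As a rough sketch of that approach, and assuming every record looks like the
sample string above (TITLE inside the FONT element, EVENT as the first bare
text directly inside the <li>), something along these lines might work. The
helper names trim and extract_events are made up for illustration, and the
sample has been joined onto one line:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# The sample record from the question, joined onto a single line.
my $html = q{<p><li><b> DATE <FONT COLOR="#990000">TITLE</FONT></b> EVENT <a href="http://www.mysite.com"target="_new";>www.mysite.com</a> [Ref #67579]</li>};

# Hypothetical helper: strip leading and trailing whitespace.
sub trim {
    my ($s) = @_;
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

# Hypothetical helper: return one { title, event } hash per <li> record.
sub extract_events {
    my ($content) = @_;
    my $tree = HTML::TreeBuilder->new_from_content($content);
    my @events;

    for my $li ($tree->look_down(_tag => 'li')) {
        # Assumption: the title is the text of the first <font> in the record.
        my $font = $li->look_down(_tag => 'font') or next;

        # Assumption: the event is the first non-blank text node that is a
        # direct child of the <li> (child elements are refs, text is not).
        my ($event) = grep { !ref($_) && /\S/ } $li->content_list;

        push @events, {
            title => trim($font->as_text),
            event => trim($event // ''),
        };
    }

    $tree->delete;    # free the tree once we are done with it
    return @events;
}

print "$_->{title}: $_->{event}\n" for extract_events($html);
```

Because the lookup goes by element and position rather than by exact markup,
small changes in attributes or spacing should not break it, although real
records may well differ from the single sample shown.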

Can you let us have a URL for the information so that we can help you a little
better? Or at least an example with several records that you need to process.

Rob

