Re: Scan data for XML invalid characters and parse articles

2002-02-13 Thread John
Here is what I ended up with - this is a chunk from a much bigger script. Suggestions gladly accepted. Haven't fixed the special characters yet.

    my ( $date, $p, @articles ) = ();
    if ( ! defined( $p = HTML::TokeParser->new( $html ))) {
        localError( "Unable to parse $html : $!" );
    }
    my ( $

Re: Scan data for XML invalid characters and parse articles

2002-02-13 Thread John
At Wednesday, 13 February 2002, "Brett W. McCoy" wrote: > >Don't use regex to pull apart HTML, it'll be more trouble than it's worth. Are you sure about this, or am I still going about this wrong? I haven't tried this yet; I haven't even gotten to the articles. This had been a really simple regex to

Re: Scan data for XML invalid characters and parse articles

2002-02-13 Thread Morbus Iff
>I have a scalar variable containing HTML that needs to be converted >to XML. It's not the best HTML so it has invalid characters (like >smart quotes, 1/2 character, etc.). I need to determine if these >characters exist in the data and throw an error if they do. What >is the best way to do

Re: Scan data for XML invalid characters and parse articles

2002-02-13 Thread Brett W. McCoy
On Wed, 13 Feb 2002, John wrote: > I have a scalar variable containing HTML that needs to be converted > to XML. It's not the best HTML so it has invalid characters (like > smart quotes, 1/2 character, etc.). I need to determine if these > characters exist in the data and throw an error if they

Re: Scan data for XML invalid characters and parse articles

2002-02-13 Thread Adam Turoff
On Wed, Feb 13, 2002 at 08:40:14AM -0800, John wrote: > I have a scalar variable containing HTML that needs to be converted > to XML. It's not the best HTML so it has invalid characters (like > smart quotes, 1/2 character, etc.). I need to determine if these > characters exist in the data and
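
For readers following the original question (detect problem characters in a scalar and raise an error), here is a minimal sketch in core Perl. It is not taken from any reply in the thread. The helper name `find_suspect_chars` and the sample string are hypothetical, and the rule it applies is an assumption: anything other than tab, LF, CR, and printable ASCII (0x20-0x7E) is reported. That is deliberately stricter than XML 1.0 itself, but it catches Windows-1252 smart quotes (0x91-0x94) and the 1/2 character (0xBD), which usually break a parser when the bytes do not match the declared encoding.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns a list of
# [offset, ordinal] pairs, one for each character that is not tab, LF, CR,
# or printable ASCII (0x20-0x7E). An empty list means the text looks clean
# under this (assumed, deliberately strict) rule.
sub find_suspect_chars {
    my ($text) = @_;
    my @hits;
    while ( $text =~ /([^\x09\x0A\x0D\x20-\x7E])/g ) {
        # pos() is the offset just past the one-character match
        push @hits, [ pos($text) - 1, ord($1) ];
    }
    return @hits;
}

# Sample input: smart quotes and a 1/2 character as single bytes
my $html = "<p>\x93quoted\x94 and \xBD cup</p>";
my @hits = find_suspect_chars($html);
for my $hit (@hits) {
    warn sprintf( "suspect character 0x%02X at offset %d\n",
        $hit->[1], $hit->[0] );
}
```

To throw an error instead of warning, replace the loop with something like `die "invalid characters found\n" if @hits;`. Reporting offsets first makes it easier to see which characters need a replacement or an entity before tightening the check.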