On Wednesday, 13 February 2002, "Brett W. McCoy" <bmccoy@chapelperilous.net> wrote:
> >Don't use regex to pull apart HTML, it'll be more trouble than it's worth.

Are you sure about this, or am I still going about this wrong? I haven't tried this yet; I haven't even gotten to the articles. This had been a really simple regex to extract the date:

    use HTML::TokeParser;
    use Data::Dumper;

    # Declare $p outside the if so it is still in scope for the loop below.
    my $p = HTML::TokeParser->new( $html );
    if ( ! defined $p ) {
        localError( "Unable to parse $html : $!" );
    }

    # Each token is an array reference, so its fields are read with ->[n].
    while ( my $token = $p->get_token() ) {
        if ( $token->[0] eq 'C' && $token->[1] =~ m#<!-- begin header date --># ) {
            while ( my $token = $p->get_token() ) {
                if ( $token->[0] eq "T" ) {
                    $date .= $token->[1];    # text
                } elsif ( $token->[0] eq "S" ) {
                    $date .= $token->[4];    # original text of the start tag
                } elsif ( $token->[0] eq "E" ) {
                    $date .= $token->[2];    # original text of the end tag
                } elsif ( $token->[0] eq "C" && $token->[1] =~ m#<!-- end header date --># ) {
                    last;
                } else {
                    localError( "$token->[0] : unrecognized HTML Token Type : <PRE>" . Dumper( $token ) . "</PRE>" );
                }
            }
        }
    }
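
For comparison, here is a stripped-down sketch of the same idea that can be run on its own. It assumes the page is already in a string (passed to HTML::TokeParser as a scalar reference instead of a file name) and that only the text between the two marker comments is wanted, with the tags dropped. The helper name text_between_comments and the sample markup are made up for illustration; they are not from the real page.

```perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the text found between two marker comments.
sub text_between_comments {
    my ( $html, $begin, $end ) = @_;

    # A scalar reference tells TokeParser to parse the string itself
    # rather than treat it as a file name.
    my $p = HTML::TokeParser->new( \$html )
        or die "Unable to parse document: $!";

    my ( $inside, $out ) = ( 0, '' );
    while ( my $token = $p->get_token ) {
        my $type = $token->[0];    # every token is an array reference

        if ( $type eq 'C' ) {
            if    ( !$inside && $token->[1] =~ /\Q$begin\E/ ) { $inside = 1; }
            elsif ( $inside  && $token->[1] =~ /\Q$end\E/ )   { last; }
            next;
        }
        next unless $inside;

        # Keep only text tokens; start and end tags between the markers are skipped.
        $out .= $token->[1] if $type eq 'T';
    }
    return $out;
}

# Made-up sample markup, only to exercise the helper.
my $sample = '<td><!-- begin header date --><b>13 February 2002</b>'
           . '<!-- end header date --></td>';
print text_between_comments( $sample, 'begin header date', 'end header date' ), "\n";
```

Skipping the "S" and "E" tokens is a choice made for this sketch; appending $token->[4] and $token->[2] instead would keep the original markup in the result, as in the code above.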