I get a space in my editor output window, but when I run it from a cmd
window, I get the other character. (This is under Windows 2000 and Perl 5.8.0.)

"John W. Krahn" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> David Eason wrote:
> >
> > John W. Krahn wrote:
> > > According to HTML::Entities
> > >
> > >  # Some extra Latin 1 chars that are listed in the HTML3.2 draft
> > > (21-May-96)
> > >  copy   => '©',  # copyright sign
> > >  reg    => '®',  # registered sign
> > >  nbsp   => "\240", # non breaking space
> >
> > Thanks, John, I had no idea where to look. I didn't know a non-breaking
> > space was an actual character, I thought it was just a directive to the
> > browser.
>
> AFAIK it is.
>
> > I have corrected the code below accordingly and it prints "line
> > 1line 3" as desired.
>
> FWIW on my computer "\240" prints a "space". :-)
>
> > use strict;
> > use warnings;
> > use HTML::TokeParser;
> >
> > my $p = HTML::TokeParser->new(*DATA) or die "Can't open: $!";
> > while (my $tag = $p->get_tag())
> > {
> >   if ($tag->[0] eq "dd")
> >   {
> >     my $text = $p->get_trimmed_text();
> >     $text =~ s/^[\s\240]*(.*?)[\s\240]*$/$1/;
>
> If you are going to do that then you might as well call get_text and do
> all the trimming yourself.
>
>     my $text = $p->get_text();
>     for ( $text ) {
>         s/^[\s\240]+//;
>         s/[\s\240]+$//;
>         s/[\s\240]+/ /g;
>     }
>
> >     print "$text";
> >   }
> > }
> >
> > __DATA__
> >
> > <DD>line 1</DD>
> > <DD> </DD>
> > <DD>line 3</DD>
>
>
> John
> --
> use Perl;
> program
> fulfillment
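
For anyone who wants to try the trimming suggested in the quoted reply without HTML::TokeParser, here is a minimal sketch on plain strings. The helper name trim_nbsp and the sample strings are made up for illustration and are not from the thread. As for the editor/cmd difference, a likely (unconfirmed) explanation is the code page: byte 0xA0 is a non-breaking space in Latin-1, but in the OEM code pages a cmd console commonly uses (437, 850) the same byte displays as an accented "a".

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): trim leading/trailing
# whitespace and collapse internal runs to one space, treating byte
# 0xA0 ("\240", the Latin-1 non-breaking space) as whitespace too.
sub trim_nbsp {
    my ($text) = @_;
    for ($text) {            # alias $_ to $text so s/// edits it in place
        s/^[\s\240]+//;      # strip leading whitespace and NBSPs
        s/[\s\240]+$//;      # strip trailing whitespace and NBSPs
        s/[\s\240]+/ /g;     # collapse internal runs to a single space
    }
    return $text;
}

# Made-up sample input: the second string is a lone non-breaking space.
for my $raw ("  line 1 ", "\240", " line\240\240 3\n") {
    my $clean = trim_nbsp($raw);
    print "[$clean]\n" if length $clean;   # skip entries that trim to nothing
}
# prints:
# [line 1]
# [line 3]
```

The brackets are only there to make stray characters visible in whichever window the output lands in.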