Thank you all for your help.

-----Original Message-----
From: Toby Stuart [mailto:[EMAIL PROTECTED]]
Sent: Thursday, January 23, 2003 13:44
To: 'simran'
Cc: '[EMAIL PROTECTED]'
Subject: RE: FW: Removing HTML Tags

> > > > -----Original Message-----
> > > > From: Johnstone, Colin [mailto:[EMAIL PROTECTED]]
> > > > Sent: Thursday, January 23, 2003 12:29 PM
> > > > To: '[EMAIL PROTECTED]'
> > > > Subject: Removing HTML Tags
> > > >
> > > > Gidday all,
> > > >
> > > > When using our CMS (Interwoven Teamsite) I want to remove
> > > > from any textarea any html tags that I don't want content
> > > > contributors to use. One in particular is the font tag. Can
> > > > one use a regex to remove these?
> > > >
> > > > I guess I'm looking for a regex to remove anything between the
> > > > font tags, e.g. <font> and </font>. Of course there could be
> > > > any number of attributes in the opening font tag.
> > > >
> > > > Any help appreciated
> > > > Thanking you in anticipation
> > > >
> > > > Colin Johnstone
> > > >
> > > > [Toby wrote]
> > > use strict;
> > > use warnings;
> > >
> > > my @remove_tags = qw(i b font);
> > >
> > > my $html = 'Some text. <i>italics</i> <b>bold</b> <font
> > > class="myclass"
> > > size="2">Hi There</font> <font>ABC
> > > </font> <h1>Hi</h1>';
> > >
> > > foreach my $tag (@remove_tags)
> > > {
> > >     $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gs;
> >
> > # Actually this is probably a bit better :)
> > $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gsim;
> >
> > [simran wrote]
> does it make sense to use the 's' and 'm' modifiers
> together... doesn't
> 's' mean treat the text as a "single line" and 'm' mean "treat it as
> multiple lines"! ?
>

m = Treats the string as multiple lines: ^ and $ match at the start and
    end of each line within the string, rather than only at the start
    and end of the whole string
s = Allows '.' to match a newline char

I don't need to use both. It's a bad habit. The 'm' modifier is not
necessary in this case, because the pattern contains no ^ or $ anchors.
Observe the following:

use strict;
use warnings;

my @remove_tags = qw(i b font);

# This html contains nested tags and
# some tags span multiple lines
my $html = 'Some text.
<i>italics</i> <b>bold</b> <font class="myclass" size="2"><i>Hi There</i></font> <font>ABC
</font> <h1>Hi</h1>';

foreach my $tag (@remove_tags)
{
    $html =~ s!<$tag.*?>(.*?)</$tag>!$1!gis;
    #$html =~ s!<$tag.*?>(.*?)</$tag>!$1!gim;
}

print $html;

--
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
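
For anyone reading this thread later, here is a minimal sketch of the 's' versus 'm' point, plus one possible tightening of the pattern. It is not Toby's code: the helper name strip_tags and the sample strings are made up for illustration. The assumption behind the tightening is that a short tag name such as 'b' should not also match <br> or <body>, so the sketch adds \b after the tag name and uses [^>]* instead of .*? for the attributes. A regex like this is still only a rough filter; for untrusted or messy HTML a real parser is probably the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): remove the named tags but keep
# the text they enclose.
sub strip_tags {
    my ($html, @tags) = @_;
    foreach my $tag (@tags) {
        # \b stops "b" from also matching <br> or <body>.
        # [^>]* keeps the attribute match inside one tag (it may span newlines).
        # /s lets (.*?) cross newlines, /i ignores case, /g repeats the match.
        $html =~ s!<\Q$tag\E\b[^>]*>(.*?)</\Q$tag\E\s*>!$1!gis;
    }
    return $html;
}

my $sample = "Line<br>break <b>bold</b> <font\n size=\"2\">Hi\nThere</font>";
print strip_tags($sample, qw(b font)), "\n";

# Without /s, '.' cannot cross the newline, so nothing is replaced.
(my $no_s = "<font>ABC\n</font>") =~ s!<font\b[^>]*>(.*?)</font>!$1!gi;
print $no_s eq "<font>ABC\n</font>" ? "unchanged without /s\n" : "changed\n";

# /m only affects the ^ and $ anchors: with it, ^ matches at every line start.
my @with_m    = "a\nb" =~ /^(\w)/mg;   # ("a", "b")
my @without_m = "a\nb" =~ /^(\w)/g;    # ("a")
print scalar(@with_m), " vs ", scalar(@without_m), "\n";   # prints "2 vs 1"
```

Since the tag-stripping pattern has no ^ or $ in it, adding 'm' would change nothing there, which is the point made in the reply above; the two modifiers are independent and can be combined when a pattern needs both behaviours.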