Re: Re: Bug#203498: ITP: decss -- utility for stripping CSS tags from [Brian Nelson <[EMAIL PROTECTED]>, Wed, Jul 30, 2003 at 09:07:19AM -0700, <[EMAIL PROTECTED]>]
> "I wrote a small utility called "DeCSS" that strips Cascading Style
> Sheet tags from an HTML document. Yes, agreed, that's pretty much
> USELESS, but what the fuck. Maybe somebody wants to do that."
>
> Why the hell should this be packaged for Debian?

If you have an HTML page generated by M$ Word and want to extract only
the HTML part, you can either remove tons of useless CSS by hand or use
such a utility...

However: It is essentially a Perl-5-liner with glue code:

    $content =~ s%<link.*?rel=\"stylesheet\".*?>%%mg;  # Strip stylesheet links
    $content =~ s%<style>.*?</style>%%mg;              # Strip <style> blocks
    $content =~ s%style=\".*?\"%%mg;                   # Strip style attributes
    $content =~ s%class=\".*?\"%%mg;                   # Strip class attributes
    $content =~ s%id=\".*?\"%%mg;                      # Strip id attributes

Doesn't this remove strings looking like CSS from ordinary text, too?

Christoph
-- 
Christoph Berg <[EMAIL PROTECTED]>, http://www.df7cb.de
Wohnheim D, 2405, Universität des Saarlandes, 0681/9657944
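
The question above can be checked with a small sketch. It wraps the five quoted substitutions in a helper and runs them on a string where the attribute-like text sits in the body of a paragraph rather than inside a tag. The helper name strip_css and the sample input are made up for illustration and are not taken from the decss source.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: applies the five substitutions quoted above, unchanged.
sub strip_css {
    my ($content) = @_;
    $content =~ s%<link.*?rel=\"stylesheet\".*?>%%mg;  # Strip stylesheet links
    $content =~ s%<style>.*?</style>%%mg;              # Strip <style> blocks
    $content =~ s%style=\".*?\"%%mg;                   # Strip style attributes
    $content =~ s%class=\".*?\"%%mg;                   # Strip class attributes
    $content =~ s%id=\".*?\"%%mg;                      # Strip id attributes
    return $content;
}

# Made-up input: id="..." and class="..." appear as ordinary text,
# not as attributes of any tag.
my $text = '<p>Write id="main" or class="x" in your markup.</p>';

print strip_css($text), "\n";
# prints: <p>Write  or  in your markup.</p>
```

Under these assumptions the answer appears to be yes: the patterns are not anchored to a tag, so anything in running text that has the shape of a style, class, or id attribute is removed as well.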