The following module was proposed for inclusion in the Module List:

  modid:       HTML::Strip
  DSLIP:       RdcOp
  description: Efficiently removes HTML markup from text
  userid:      KILINRAX (Alex Bowley)
  chapterid:   15 (World_Wide_Web_HTML_HTTP_CGI)
  communities:
  similar:

  rationale:

    Whilst the module quite happily strips SGML/XML-like markup from
    text as well as HTML, I believe it should exist in the HTML
    namespace simply because I can envisage no circumstances under
    which someone would want to remove SGML/XML markup wholesale,
    whereas stripping extraneous HTML markup is occasionally very
    desirable. A common application is preparing HTML snippets for
    indexing by a search engine.

    As this module is written in bare-minimum C, it tends to be about
    7 times faster than using regular expressions to do the same
    thing.

  enteredby:   KILINRAX (Alex Bowley)
  enteredon:   Wed Aug 13 14:56:59 2003 GMT

The resulting entry would be:

HTML::
::Strip           RdcOp Efficiently removes HTML markup from text    KILINRAX

Thanks for registering,
--
The PAUSE

PS: The following links are only valid for module list maintainers:

Registration form with editing capabilities:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=be300000_9f6897b4844009e5&SUBMIT_pause99_add_mod_preview=1
Immediate (one click) registration:
  https://pause.perl.org/pause/authenquery?ACTION=add_mod&USERID=be300000_9f6897b4844009e5&SUBMIT_pause99_add_mod_insertit=1