Re: Trimming X/HTML files

2005-07-31 Thread Thomas SMETS
The regular expression to remove scripts from an HTML/XHTML file is simple enough, but it raises a major performance issue. The following regular expression: r'()' takes ages to complete in Python on a simple HTML file: more than 3 minutes of CPU
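The pattern quoted in the message did not survive archiving, so the following is only a minimal sketch of one common way to strip script blocks and comments with Python's `re` module. The two patterns and the helper name `strip_scripts_and_comments` are assumptions for illustration, not the regex from the post. Each pattern uses a single lazy quantifier rather than nested repetition, which is the usual way to avoid heavy backtracking.

```python
import re

# Assumed patterns (not the ones from the original message).
# A single lazy ".*?" with DOTALL spans newlines without nested quantifiers.
SCRIPT_RE = re.compile(r'<script\b[^>]*>.*?</script\s*>', re.IGNORECASE | re.DOTALL)
COMMENT_RE = re.compile(r'<!--.*?-->', re.DOTALL)


def strip_scripts_and_comments(html):
    """Return html with <script> blocks and <!-- --> comments removed."""
    # Scripts go first, so comment markers inside a script body disappear with it.
    html = SCRIPT_RE.sub('', html)
    return COMMENT_RE.sub('', html)


if __name__ == "__main__":
    sample = '<html><!-- note --><script type="text/javascript">var a = 1;</script><p>Hi</p></html>'
    print(strip_scripts_and_comments(sample))
```

Regular expressions cannot handle every edge case of real-world HTML (for example, a literal `</script>` inside a string), so a proper parser may be the safer choice for untrusted input.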

Re: Trimming X/HTML files

2005-07-28 Thread Walter Dörwald
Thomas SMETS wrote:
> Dear,
>
> I need to parse XHTML/HTML files in all ways:
> - Removing comments and JavaScript is a first issue
> - Retrieving the list of fields to submit is my next item (todo)
>
> Any idea where I could fin

Trimming X/HTML files

2005-07-27 Thread Thomas SMETS
Dear,

I need to parse XHTML/HTML files in all ways:
- Removing comments and JavaScript is a first issue
- Retrieving the list of fields to submit is my next item (todo)

Any idea where I could find this ready-made?

\T,
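For the second item in the question (listing the fields a form would submit), one possible approach is the standard library's `html.parser` module. This is a rough sketch assuming a current Python 3; the class name `FormFieldCollector` and the helper `list_form_fields` are hypothetical names introduced here, and the sketch only records each field's `name` and `value` attributes.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect (name, value) pairs from input, select, and textarea tags."""

    FIELD_TAGS = {"input", "select", "textarea"}

    def __init__(self):
        super().__init__()
        self.fields = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in self.FIELD_TAGS:
            attr_map = dict(attrs)
            name = attr_map.get("name")
            if name:  # fields without a name are not submitted
                self.fields.append((name, attr_map.get("value")))


def list_form_fields(html):
    """Hypothetical helper: return the named form fields found in html."""
    collector = FormFieldCollector()
    collector.feed(html)
    collector.close()
    return collector.fields
```

Self-closing tags such as `<input ... />` are also picked up, because the default `handle_startendtag` delegates to `handle_starttag`.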