On Thu, Feb 7, 2008 at 7:11 PM, Shaun Laughey <[EMAIL PROTECTED]> wrote:
>
> Hi,
> I have used Beautiful Soup for parsing html.
> It works very nicely and I didn't see much of an issue with speed in
> parsing several hundred html files every hour or so.
> I also rolled my own using various regexes
On 07/02/2008, Jon Ribbens <[EMAIL PROTECTED]> wrote:
> On Thu, Feb 07, 2008 at 05:50:37PM +, Michael Sparks wrote:
> > > The code at
> > > http://www.voidspace.org.uk/python/weblog/arch_d7_2005_04_23.shtml#e35
> > > is wrong, for example.
> >
> > That's because it whitelists a collection of ta
On Thu, Feb 07, 2008 at 05:50:37PM +, Michael Sparks wrote:
> > The code at
> > http://www.voidspace.org.uk/python/weblog/arch_d7_2005_04_23.shtml#e35
> > is wrong, for example.
>
> That's because it whitelists a collection of tags but doesn't whitelist
> specific attributes, I presume.
That
Hi Michael,
Michael Sparks wrote:
> Just a quick Q for people: what's your favourite way (preferably a library :)
> of allowing a subset of HTML tags through? I can think of 1/2 dozen different
> ways of doing this, but I'm sure there's a preferred approach for some...
>
> Thanks in advance :-)
Michael Sparks wrote:
> On Thursday 07 February 2008 15:48:46 Jon Ribbens wrote:
>
>> The code at
>> http://www.voidspace.org.uk/python/weblog/arch_d7_2005_04_23.shtml#e35
>> is wrong, for example.
>>
>
> That's because it whitelists a collection of tags but doesn't whitelist
> specific at
Hi,
Just a quick suggestion, I've used Strip-o-Gram in the past and found
it to be pretty good.
http://zope.org/Members/chrisw/StripOGram/readme
--
Jon
On Feb 7, 2008 2:35 PM, Michael Sparks <[EMAIL PROTECTED]> wrote:
> Hi,
>
>
> Just a quick Q for people: what's your favourite way (preferably
On Thursday 07 February 2008 15:48:46 Jon Ribbens wrote:
> Be aware that if you are doing this for security reasons (e.g. to
> prevent cross-site scripting),
It is for that reason, essentially.
> it is very hard to get right.
Indeed, that's why I thought I'd find out what everyone else actually
Jon Ribbens wrote:
> On Thu, Feb 07, 2008 at 02:35:29PM +, Michael Sparks wrote:
>
>> Just a quick Q for people: what's your favourite way (preferably a library
>> :)
>> of allowing a subset of HTML tags through? I can think of 1/2 dozen
>> different
>> ways of doing this, but I'm sure t
On Thu, Feb 07, 2008 at 02:35:29PM +, Michael Sparks wrote:
> Just a quick Q for people: what's your favourite way (preferably a library :)
> of allowing a subset of HTML tags through? I can think of 1/2 dozen different
> ways of doing this, but I'm sure there's a preferred approach for some.
If you're not bothered about speed, BeautifulSoup can catch, remove and
replace arbitrary HTML tags in a document.
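
For anyone following the thread, here is a rough sketch of the tag-plus-attribute whitelisting idea discussed above. It uses the standard library's html.parser rather than BeautifulSoup, purely so it runs without extra installs. The ALLOWED table, SAFE_SCHEMES, WhitelistFilter and sanitize are made-up names for illustration, not anything from the libraries mentioned in this thread. As noted elsewhere in the thread, this is very hard to get right, so treat it as a starting point and not a vetted XSS filter.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist (an assumption for this sketch):
# tag name -> set of attribute names permitted on that tag.
ALLOWED = {
    "a": {"href"},
    "b": set(),
    "i": set(),
    "em": set(),
    "strong": set(),
    "p": set(),
}

# Hypothetical list of URL prefixes accepted in href values.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class WhitelistFilter(HTMLParser):
    """Re-emit only whitelisted tags and attributes; escape all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # drop the tag itself; any text inside is still emitted
        kept = []
        for name, value in attrs:
            # Attributes are whitelisted per tag, not just the tags.
            if name not in ALLOWED[tag] or value is None:
                continue
            # Reject href values that do not start with a known-safe scheme.
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # Text (including the body of a dropped <script>) is escaped,
        # so it can only ever appear as inert text in the output.
        self.out.append(escape(data, quote=False))


def sanitize(html_text):
    parser = WhitelistFilter()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<b onclick="x()">hi</b><script>alert(1)</script>'))
```

The output is rebuilt from scratch instead of editing the input in place, so anything the parser does not recognise as a whitelisted tag or attribute never reaches the result. The sketch does not balance unclosed tags, which a real filter would probably want to handle.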
On Thu, Feb 7, 2008 at 2:35 PM, Michael Sparks <[EMAIL PROTECTED]> wrote:
> Hi,
>
>
> Just a quick Q for people: what's your favourite way (preferably a library
> :)
> of allowing a
Michael Sparks wrote:
> Hi,
>
>
> Just a quick Q for people: what's your favourite way (preferably a library :)
> of allowing a subset of HTML tags through? I can think of 1/2 dozen different
> ways of doing this, but I'm sure there's a preferred approach for some...
>
> Thanks in advance :-)
>
>
Hi,
Just a quick Q for people: what's your favourite way (preferably a library :)
of allowing a subset of HTML tags through? I can think of 1/2 dozen different
ways of doing this, but I'm sure there's a preferred approach for some...
Thanks in advance :-)
Michael.
--
http://yeoldeclue.com/bl