re pattern for matching JS/CSS

2006-12-15 Thread i80and
I'm working on a program to remove tags from an HTML document, leaving just the content, but I want to do it simply. I've finished a system to remove simple tags, but I also want all CSS and JS to be removed. What re pattern could I use to do that? I've tried '' but that didn't work properly. I'm fai
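A minimal sketch of one way to approach the question above, not taken from the thread itself. The names SCRIPT_STYLE_RE, TAG_RE and strip_markup are hypothetical, and the patterns assume reasonably well-formed markup where every script or style element has a closing tag. Regular expressions are fragile on real-world HTML, so a proper parser may be the safer choice for anything beyond simple pages.

```python
import re

# Assumed pattern (not from the original post): match a whole <script> or
# <style> element, including its contents. DOTALL lets "." span newlines,
# and the lazy ".*?" stops at the first matching closing tag.
SCRIPT_STYLE_RE = re.compile(
    r"<(script|style)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)

# Assumed pattern for the "simple tags" the post mentions.
TAG_RE = re.compile(r"<[^>]+>")


def strip_markup(html_text):
    """Remove script/style blocks first, then any remaining tags."""
    without_blocks = SCRIPT_STYLE_RE.sub("", html_text)
    return TAG_RE.sub("", without_blocks)
```

Removing the script and style blocks before the generic tag pass matters: otherwise the tags would be stripped and the JS or CSS source would be left behind as if it were content.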

Re: cxfrozen linux binaries run on FreeBSD?

2006-12-15 Thread i80and
I haven't personally used freeze (Kubuntu doesn't seem to install it with the python debs), but based on what I know of it, it generates makefiles. I'm not a make expert, but if FreeBSD has the GNU tools, freeze's output _should_ compile on FreeBSD. On Dec 15, 5:52 am, robert <[EMAIL PROT

Re: Having problems with urlparser concatenation

2006-11-09 Thread i80and
Thank you! Fixed my problem perfectly! Gabriel Genellina wrote: > At Thursday 9/11/2006 20:23, i80and wrote: > > >I'm working on a basic web spider, and I'm having problems with the > >urlparser. > >[...] > > SpliceStart = Website.find('&

Having problems with urlparser concatenation

2006-11-09 Thread i80and
I'm working on a basic web spider, and I'm having problems with the urlparser. This is the affected function: -- def FindLinks(Website): WebsiteLen = len(Website)+1 CurrentLink = '' i = 0 SpliceStart = 0 SpliceEnd = 0

Re: Character encoding

2006-11-07 Thread i80and
I would suggest using string.replace. Simply replace '&nbsp;' with ' ' each time it occurs. It doesn't take too much code. On Nov 7, 1:34 pm, "mp" <[EMAIL PROTECTED]> wrote: > I have html document titles with characters like >,  , and > ‡. How do I decode a string with these values in Python? >
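A short sketch of the decoding asked about in the quoted question, under the assumption that the titles contain HTML character references such as &gt;, &nbsp; and numeric forms like &#8225;. It uses html.unescape from the standard library, which needs Python 3.4 or later and so was not available when this thread was written. The function name decode_title is hypothetical.

```python
import html


def decode_title(title):
    """Decode HTML character references, then normalise non-breaking spaces."""
    # html.unescape handles named (&gt;) and numeric (&#8225;) references.
    text = html.unescape(title)
    # &nbsp; decodes to U+00A0; swap it for a plain space, as the reply suggests.
    return text.replace("\xa0", " ")
```

Compared with one replace call per entity, this covers every reference in a single pass, and the explicit replace is only needed for the non-breaking space.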