Re: Converting HTML to ASCII

2005-02-27 Thread Thomas Dickey
Grant Edwards <[EMAIL PROTECTED]> wrote: > First, make it work. Then make it work right. Then worry > about how fast it is. > "Premature optimization..." That could be - but then again, most of the comments I've seen for that particular issue are for rather old releases. >> It seems to use a

Re: Converting HTML to ASCII

2005-02-26 Thread Grant Edwards
On 2005-02-26, Paul Rubin wrote: > Jorgen Grahn <[EMAIL PROTECTED]> writes: >> You should probably do what some other poster suggested -- download >> lynx or some other text-only browser and make your code execute it >> in -dump mode to get the text-formatted html. You'll get that >> working in an

Re: Converting HTML to ASCII

2005-02-26 Thread Jorgen Grahn
On 26 Feb 2005 02:36:31 -0800, Paul Rubin wrote: > Jorgen Grahn <[EMAIL PROTECTED]> writes: >> You should probably do what some other poster suggested -- download >> lynx or some other text-only browser and make your code execute it >> in -dump mode to get the text-formatted html. You'll get tha

Re: Converting HTML to ASCII

2005-02-26 Thread Mike Meyer
Michael Spencer <[EMAIL PROTECTED]> writes: > Mike Meyer wrote: > >> It also fails on tags with a ">" in a string in the tag. That's >> well-formed but ill-used HTML. >> True enough...however, it doesn't fail too horribly: > >>> striptags("""the text""") > "'>the text" > >>> De

Re: Converting HTML to ASCII

2005-02-26 Thread Paul Rubin
Jorgen Grahn <[EMAIL PROTECTED]> writes: > You should probably do what some other poster suggested -- download > lynx or some other text-only browser and make your code execute it > in -dump mode to get the text-formatted html. You'll get that > working in an hour or so, and then you can see if you
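
The lynx suggestion quoted above can be sketched in Python roughly as follows. This is only an illustration, not code from the thread: the helper names lynx_dump_command and html_to_text are hypothetical, and apart from -dump (the mode named in the message) the flags -nolist, -force_html and -stdin are an assumed choice of standard lynx options. It also assumes lynx is installed and on PATH.

```python
import shutil
import subprocess


def lynx_dump_command(source="-stdin"):
    """Build the argument list for a lynx text dump.

    `source` is a file name or URL, or "-stdin" to read the HTML from
    standard input. Only -dump comes from the thread; the other flags
    are assumptions made for this sketch.
    """
    return ["lynx", "-dump", "-nolist", "-force_html", source]


def html_to_text(html, timeout=30):
    """Pipe an HTML string through `lynx -dump` and return the rendered text."""
    if shutil.which("lynx") is None:
        raise RuntimeError("lynx is not installed or not on PATH")
    result = subprocess.run(
        lynx_dump_command(),
        input=html,           # HTML goes to lynx on standard input
        capture_output=True,  # collect the formatted text from stdout
        text=True,
        timeout=timeout,
        check=True,           # raise CalledProcessError if lynx fails
    )
    return result.stdout
```

Calling html_to_text("<p>Hello <b>world</b></p>") should return the page as lynx would lay it out on a terminal. Since lynx does its own error recovery on malformed markup, this route also covers the "badly formed HTML" requirement without any parsing code on the Python side.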

Re: Converting HTML to ASCII

2005-02-25 Thread Kent Johnson
gf gf wrote: Hi. I'm looking for a Python lib to convert HTML to ASCII. You might find these threads on comp.lang.python interesting: http://tinyurl.com/5zmpn http://tinyurl.com/6mxmb Kent

Re: Converting HTML to ASCII

2005-02-25 Thread Michael Spencer
Mike Meyer wrote: It also fails on tags with a ">" in a string in the tag. That's well-formed but ill-used HTML. True enough...however, it doesn't fail too horribly: >>> striptags("""the text""") "'>the text" >>> and I think that case could be rectified rather easily, by stripping an
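
The message above is cut off before the proposed fix, so that fix is not reproduced here. As a different route to the same problem, one way to sidestep the ">" inside a quoted attribute is to let a real tokenizer find the tag boundaries. The sketch below uses html.parser from the Python 3 standard library, which is not the tool discussed in the thread; TextExtractor and extract_text are hypothetical names, and the sample tag is an assumed input of the kind Mike Meyer describes.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes only; tags and comments are ignored."""

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for every run of text between tags.
        self.chunks.append(data)


def extract_text(html):
    """Return the concatenated text content of an HTML fragment."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


# Assumed example input: a ">" inside a quoted attribute value.
print(extract_text("<a title='>'>the text</a>"))  # prints: the text
```

Note that this sketch keeps the contents of script and style elements as ordinary text; a fuller version would track those tags in handle_starttag and handle_endtag and skip the data in between.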

Re: Converting HTML to ASCII

2005-02-25 Thread Jorgen Grahn
On Fri, 25 Feb 2005 10:51:47 -0800 (PST), gf gf <[EMAIL PROTECTED]> wrote: > Hans, > > Thanks for the tip. I took a look at Beautiful Soup, > and it looked like it was a framework to parse HTML. This is my understanding, too. > I'm not really interested in going through it tag by > tag - just t

Re: Converting HTML to ASCII

2005-02-25 Thread Mike Meyer
Michael Spencer <[EMAIL PROTECTED]> writes: > gf gf wrote: >> [wants to extract ASCII from badly-formed HTML and thinks BeautifulSoup is >> too complex] > > You haven't specified what you mean by "extracting" ASCII, but I'll > assume that you want to start by eliminating html tags and comments, >

Re: Converting HTML to ASCII

2005-02-25 Thread Michael Spencer
gf gf wrote: [wants to extract ASCII from badly-formed HTML and thinks BeautifulSoup is too complex] You haven't specified what you mean by "extracting" ASCII, but I'll assume that you want to start by eliminating html tags and comments, which is easy enough with a couple of regular expressions:
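
The code from this message is cut off in the listing, so the following is not Michael Spencer's original. It is a minimal sketch of the general idea he describes (one regular expression for comments, one for tags), with hypothetical names COMMENT_RE, TAG_RE and strip_markup.

```python
import re

# Hypothetical patterns for illustration; not the code from the original post.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)  # comments, possibly multi-line
TAG_RE = re.compile(r"<[^>]*>")                    # anything from "<" to the next ">"


def strip_markup(html):
    """Remove comments first, then anything that looks like a tag."""
    without_comments = COMMENT_RE.sub("", html)
    return TAG_RE.sub("", without_comments)


print(strip_markup("<p>Hello <!-- hidden --><b>world</b></p>"))  # prints: Hello world
```

Comments are removed before tags so that markup inside a comment does not leave fragments behind. Because TAG_RE stops at the first ">", a tag with ">" inside a quoted attribute is cut short, and entities such as &amp; are left untouched.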

Converting HTML to ASCII

2005-02-25 Thread gf gf
Hans, Thanks for the tip. I took a look at Beautiful Soup, and it looked like it was a framework to parse HTML. I'm not really interested in going through it tag by tag - just to get it converted to ASCII. How can I do this with B. Soup? --Thanks PS William - thanks for the reference to lynx,

Re: Converting HTML to ASCII

2005-02-24 Thread HC Hörsch
Try Beautiful Soup! 1) Be able to handle badly formed, or illegal, HTML, as best as possible. From the description: "It won't choke if you give it ill-formed markup: it'll just give you access to a correspondingly ill-formed data structure." Can anyone direct me to something which could help me

Re: Converting HTML to ASCII

2005-02-24 Thread William Park
gf gf <[EMAIL PROTECTED]> wrote: > Hi. I'm looking for a Python lib to convert HTML to > ASCII. Of course, a quick Google search showed > several options (although, I must say, less than I > would expect, considering how easy this is to do in > *other* languages... :| ), but, I have 2 requirement

Converting HTML to ASCII

2005-02-24 Thread gf gf
Hi. I'm looking for a Python lib to convert HTML to ASCII. Of course, a quick Google search showed several options (although, I must say, less than I would expect, considering how easy this is to do in *other* languages... :| ), but, I have 2 requirements, which none of them seem to meet: 1) Be