bruce wrote:
> simon...
>
> the '&nbsp;' is not to be seen/viewed as text/ascii.. it's a representation
> of a hex 'u\xa0' if i recall...

Did you not see this part of the post that you're replying to?

> 'nbsp': '\xa0',

My point was not that '\xa0' is an ascii character... It was that your
initial request was very misleading:

"i'm running into a problem where i'm seeing non-ascii chars in the
parsing i'm doing. in looking through various docs, i can't find
functions to remove/restrict strings to valid ascii chars."

That's why you got three different answers to the wrong question.

You weren't "seeing non-ascii chars" at all. You were seeing ascii
representations of html entities that, in the case of '&nbsp;', happen
to represent non-ascii values.

>
> i'm looking to remove or replace the instances with a ' ' (space)

Simplicity:

s.replace('&nbsp;', ' ')

~Simon

"You keep using that word. I do not think it means what you think it means."
 -Inigo Montoya, "The Princess Bride"

>
> -bruce
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] Behalf
> Of Simon Forman
> Sent: Monday, July 03, 2006 7:17 PM
> To: python-list@python.org
> Subject: Re: ascii character - removing chars from string
>
>
> bruce wrote:
> > hi...
> >
> > update. i'm getting back html, and i'm getting strings like "&nbsp;foo&nbsp;"
> > which is valid HTML as the '&nbsp;' is a space.
>
> &, n, b, s, p, ;  Those are all ascii characters.
>
> > i need a way of stripping/removing the '&nbsp;' from the string
> >
> > the &nbsp; needs to be treated as a single char...
> >
> > text = "foo cat&nbsp;"
> >
> > ie ok_text = strip(text)
> >
> > ok_text = "foo cat"
>
> Do you really want to remove those html entities? Or would you rather
> convert them back into the actual text they represent? Do you just
> want to deal with '&nbsp;'s? Or maybe the other possible entities that
> might appear also?
>
> Check out htmlentitydefs.entitydefs (see
> http://docs.python.org/lib/module-htmlentitydefs.html) it's kind of
> ugly looking so maybe use pprint to print it:
>
> >>> import htmlentitydefs, pprint
> >>> pprint.pprint(htmlentitydefs.entitydefs)
> {'AElig': 'Æ',
>  'Aacute': 'Á',
>  'Acirc': 'Â',
> .
> .
> .
>  'nbsp': '\xa0',
> .
> .
> .
> etc...
>
>
> HTH,
> ~Simon
>
> "You keep using that word. I do not think it means what you think it
> means."
>  -Inigo Montoya, "The Princess Bride"
>
> --
> http://mail.python.org/mailman/listinfo/python-list
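
For the "convert them back into the actual text they represent" route
raised above, here is a minimal sketch rather than a definitive
implementation. It assumes Python 3, where the htmlentitydefs module is
named html.entities. The function replace_entities and its nbsp_as_space
flag are hypothetical names made up for this example. It only handles
named entities such as '&nbsp;' (not numeric references like '&#160;'),
and it leaves unknown names untouched.

```python
import re
from html.entities import name2codepoint  # Python 3 name for htmlentitydefs

# Matches named entities such as "&nbsp;" or "&amp;".
ENTITY_RE = re.compile(r"&(\w+);")


def replace_entities(text, nbsp_as_space=True):
    """Replace named HTML entities in text (hypothetical helper).

    With nbsp_as_space=True, '&nbsp;' becomes a plain ASCII space
    instead of the non-ascii character '\xa0'.
    """
    def _sub(match):
        name = match.group(1)
        if name == "nbsp" and nbsp_as_space:
            return " "
        codepoint = name2codepoint.get(name)
        if codepoint is None:
            # Not a known entity name: leave the text as it was.
            return match.group(0)
        return chr(codepoint)

    return ENTITY_RE.sub(_sub, text)


# The example from the original question: entities in, plain text out.
cleaned = replace_entities("&nbsp;foo&nbsp;cat&nbsp;").strip()
```

If converting everything, including numeric references, is what's
wanted, the standard library's html.unescape() covers the general case,
though it turns '&nbsp;' into '\xa0' rather than a plain space.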