Many thanks, Steve. This is good information, and I think it should work fine. I was doing a string.replace in a cleanData() method with the following characters, but I don't know if that would have covered everything. The list contains all the control characters that I really know about in normal use. Checking ord(c) < 32 sounds like a much better and more comprehensive way to go. So I guess that instead of string.replace, I should do a ... for char in ... loop and evaluate each character, correct? Or is there a better way of eliminating these other than reading the string character by character?
'\a','\b','\e','\f','\n','\r','\t','\v','|'

Regards,
David

On Monday, October 17, 2005, at 06:04 AM, Steve Holden wrote:

> David Pratt wrote:
>> I am working with a text format that advises to strip any ascii control
>> characters (0 - 30) as part of parsing data and also the ascii pipe
>> character (124) from the data. I think many of these characters are
>> from a different time. Since I have never seen most of these characters
>> in text I am not sure how these first 30 control characters are all
>> represented (other than say tab (\t), newline(\n), line return(\r) ) so
>> what should I do to remove these characters if they are ever
>> encountered. Many thanks.
>
> You will find the ord() function useful: control characters all have
> ord(c) < 32.
>
> You can also use the chr() function to return a character whose ord() is
> a specific value, and you can use hex escapes to include arbitrary
> control characters in string literals:
>
>     myString = "\x00\x01\x02"
>
> regards
>  Steve
> --
> Steve Holden +44 150 684 7255 +1 800 494 3119
> Holden Web LLC www.holdenweb.com
> PyCon TX 2006 www.python.org/pycon/
>
> --
> http://mail.python.org/mailman/listinfo/python-list
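
For what it's worth, here is a rough sketch of what the ord(c) < 32 check might look like in place of the chain of string.replace calls. The function names clean_data and clean_data_translate and the DELETE_TABLE constant are made up for illustration, and the sketch assumes that every character below 32 (including tab, newline and carriage return) plus the pipe should be dropped. It does not touch DEL (127), which ord(c) < 32 does not cover.

```python
def clean_data(text):
    """Drop ASCII control characters (ord < 32) and the pipe character."""
    # Keep a character only if it is not a control character and not '|'.
    return "".join(c for c in text if ord(c) >= 32 and c != "|")


# Alternative without an explicit per-character test: build a deletion
# table once and let str.translate do the work. Mapping a code point to
# None tells translate to delete that character.
DELETE_TABLE = {code: None for code in list(range(32)) + [ord("|")]}


def clean_data_translate(text):
    """Same result as clean_data, using a precomputed translation table."""
    return text.translate(DELETE_TABLE)


if __name__ == "__main__":
    sample = "a\tb|c\x00d\n"
    print(repr(clean_data(sample)))
    print(repr(clean_data_translate(sample)))
```

The translate version is probably the better choice when the same cleanup runs over many strings, since the table is built only once.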