On Nov 30, 7:39 am, Durand <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I've got this weird problem where in some strings, parts of the string are in
> hexadecimal, or that's what I think they are. I'm not exactly sure... I get
> something like this: 's\x08 \x08Test!' from parsing a log file. From what I
> found on the internet, x08 is the backspace character but I'm still not sure.

What you have is the output of the repr() function, which gives an
unambiguous representation of the string in printable ASCII, with the
extra bonus that it's a valid Python string constant that can be used in
code to produce exactly the same value.

What your example means is: the string contains 's', a backspace, a
space, and another backspace, followed by 'Test!'. Try this at the Python
interactive prompt:

| >>> q = 's\x08 \x08Test!'
| >>> len(q)
| 9

Note there are only 4 characters in front of 'Test!'.

| >>> q
| 's\x08 \x08Test!'

What you have looks like very raw keyboard input:
s oops space oops T e etc

> Anyway, I need to clean up this string to get rid of any hexadecimal
> characters so that it just looks like 'Test!'. Are there any functions to do
> this?

Pardon the pedantry, but you don't need to "get rid of any hexadecimal
characters" ... hexadecimal characters are 0123456789ABCDEFabcdef :-)

I guess that what you would like to do is simulate the keyboard
processing of backspaces:

| >>> def unbs(strg):
| ...     stack = []
| ...     for c in strg:
| ...         if c == '\x08':
| ...             # backspace: discard the previous character, if any
| ...             if stack:
| ...                 stack.pop()
| ...         else:
| ...             stack.append(c)
| ...     return ''.join(stack)
| ...
| >>> unbs(q)
| 'Test!'

BTW, '\b' means the same as '\x08'; saves keystrokes when testing.

| >>> unbs('abc\b\b\bxyz!\b')
| 'xyz'

HTH,
John
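
P.S. For comparison, here is a minimal sketch of the other reading of the question, namely simply deleting the control characters rather than acting on them. The name strip_control is made up for illustration, and the cut-off (anything below 0x20, plus DEL) is an assumption about what counts as unwanted.

```python
def strip_control(text):
    """Drop ASCII control characters (code points below 0x20, and DEL 0x7f).

    Unlike simulating backspaces, this does not remove the characters
    that the backspaces were meant to erase.
    """
    return ''.join(c for c in text if ord(c) >= 0x20 and ord(c) != 0x7f)


q = 's\x08 \x08Test!'
print(repr(strip_control(q)))  # prints 's Test!'
```

Note that the 's' and the space survive here, so this gives 's Test!' rather than 'Test!'. If the log really is raw keyboard input, simulating the backspaces is probably closer to what was wanted.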