On Dec 26, 8:53 am, Stef Mientki <stef.mien...@gmail.com> wrote:
> Steven D'Aprano wrote:
> > On Thu, 25 Dec 2008 11:00:18 +0100, Stef Mientki wrote:
>
> >> hello,
>
> >> Is there a function to remove escape characters from a string ?
> >> (preferable all escape characters except "\n").
>
> > Can you explain what you mean? I can think of at least four alternatives:
>
> I have the following kind of strings,
> the funny "þ" is ASCII character 254, used as a separator character

ASCII ends at 127. Just refer to it as chr(254).

>
> [FSM]
> Counts = "1þ11þ16" ==> 1,11,16
> Init1 = "1þ\BCtrl" ==> 1,Ctrl
> State5 = "8þ\BJUMP_COMPL\b\n>PCWrite = 1\n>PCSource = 10"
> ==> 8, JUMP_COMPL\n>PCWrite = 1\n>PCSource = 10

After making those substitutions, what are you going to do with it?
Split it up into fields using the csv module or stuff.split(",") or
some other DIY method? Is there a possibility that whoever "designed"
that data format used chr(254) as a separator because the data fields
contained "," sometimes and so "," could not be used as a separator?

> Seeing and testing all your answers, with great solutions that I've
> never seen before,

As far as str methods and built-ins that work on str objects are
concerned, there is no corpus of secret knowledge known only to a
cabal of wizards; it's all in the manual, and you don't need special
magical spectacles to see it :-)

> knowing nothing of escape sequences (I'm a windows guy ;-)

Why do you think that whether or not you are a "windows guy" is
relevant to knowing anything about escape sequences?

> I now see that the characters I need to remove, like \B and \b are
> not "official" escape sequences.

\b *is* an "official" escape sequence, just like \n; see below:

| >>> x = '\b'; print len(x), repr(x)
| 1 '\x08'
| >>> x = r'\b'; print len(x), repr(x)
| 2 '\\b'
| >>> x = '\B'; print len(x), repr(x)
| 2 '\\B'
| >>> x = r'\B'; print len(x), repr(x)
| 2 '\\B'

> So in this case the best (easiest to understand) method is a few replace
> statements:
> s = s.replace ( '\b', '' ).replace( '\B', '' )

It's probable that \b and \B are both TWO-byte sequences, in which
case you should use r'\b' so that it does what you want it to do, and
use r'\B' for consistency.

--
http://mail.python.org/mailman/listinfo/python-list
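
A minimal sketch of the advice above, written in Python 3 syntax. It assumes that \b, \B and \n occur in the data as literal backslash-plus-letter pairs (two characters each) and that fields are separated by chr(254). The names SEP and clean_fields are made up for illustration and do not come from the original post.

```python
# Hypothetical helper names; only str.split and str.replace are real APIs.
SEP = chr(254)  # the separator character; not ASCII, which ends at 127


def clean_fields(value):
    # Split on the separator first, so commas inside a field stay intact.
    fields = value.split(SEP)
    # Raw strings: r'\b' is backslash + 'b' (2 chars), whereas '\b' is a
    # single backspace character (1 char) and would never match the data.
    return [f.replace(r'\b', '').replace(r'\B', '') for f in fields]


# Assumption: this mirrors the State5 value as it is stored in the file,
# with a literal backslash before each B, b and n.
state5 = '8' + SEP + r'\BJUMP_COMPL\b\n>PCWrite = 1\n>PCSource = 10'

print(len('\b'), len(r'\b'))   # one-character escape vs two-character text
print(clean_fields(state5))
```

Splitting on chr(254) directly, rather than replacing it with "," first, avoids breaking any field that happens to contain a comma. The literal \n pairs are left untouched, as requested in the original question.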