On Sat, 15 Mar 2008 12:09:19 -0400, Tom Stambaugh wrote:

> I'm still confused about this, even after days of hacking at it. It's time I
> asked for help. I understand that each of you knows more about Python,
> Javascript, unicode, and programming than me, and I understand that each of
> you has a higher SAT score than me. So please try and be gentle with your
> responses.
>
> I use simplejson to serialize html strings that the server is delivering to
> a browser. Since the apostrophe is a string terminator in javascript, I need
> to escape any apostrophe embedded in the html.
>
> Just to be clear, the specific unicode character I'm struggling with is
> described in Python as:
> u'\N{APOSTROPHE}'}. It has a standardized utf-8 value (according to, for
> example, http://www.fileformat.info/info/unicode/char/0027/index.htm) of
> 0x27.
>
> This can be expressed in several common ways:
> hex: 0x27
> Python literal: u"\u0027"
>
> Suppose I start with some test string that contains an embedded
> apostrophe -- for example: u" ' ". I believe that the appropriate json
> serialization of this is (presented as a list to eliminate notation
> ambiguities):
>
> ['"', ' ', ' ', ' ', '\\', '\\', '0', '0', '2', '7', ' ', ' ', ' ', '"']
>
> This is a 14-character utf-8 serialization of the above test string.
>
> I know I can brute-force this, using something like the following:
> def encode(aRawString):
>     aReplacement = ''.join(['\\', '0', '0', '2', '7'])
>     aCookedString = aRawString.replace("'", aReplacement)
>     answer = simplejson.dumps(aCookedString)
>     return answer
>
> I can't even make mailers let me *TYPE* a string literal for the replacement
> string without trying to turn it into an HTML link!
>
> Anyway, I know that my "encode" function works, but it pains me to add that
> "replace" call before *EVERY* invocation of the simplejson.dumps() method.
> The reason I upgraded to 1.7.4 was to get the c-level speedup routine now
> offered by simplejson -- yet the need to do this apostrophe escaping seems
> to negate this advantage! Is there perhaps some combination of dumps keyword
> arguments, python encode()/str() magic, or something similar that
> accomplishes this same result?
>
> What is the highest-performance way to get simplejson to emit the desired
> serialization of the given test string?

Somehow I don't get what you are after. The apostrophe doesn't have to be
escaped at all if double quotes (") are used to delimit the string. If
apostrophes (') are used as delimiters, then \' is a correct escape in a
JavaScript string literal. What is the problem with that?

Ciao,
	Marc 'BlackJack' Rintsch
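
To illustrate the two cases above, here is a minimal sketch. It assumes the standard library json module, which descends from simplejson, so dumps() should behave the same way for this input. The helper names to_js_double_quoted and to_js_single_quoted are made up for this example and are not part of either library.

```python
import json  # assumption: stands in for simplejson, whose dumps() it mirrors


def to_js_double_quoted(text):
    """Serialize text as a JSON string; dumps() always uses double quotes."""
    return json.dumps(text)


def to_js_single_quoted(text):
    """Hypothetical helper: build a single-quoted JavaScript string literal."""
    body = json.dumps(text)[1:-1]      # drop the surrounding double quotes
    body = body.replace("'", "\\'")    # escape each apostrophe as \'
    return "'" + body + "'"


if __name__ == "__main__":
    sample = " ' "
    print(to_js_double_quoted(sample))  # apostrophe is left untouched
    print(to_js_single_quoted(sample))  # apostrophe is written as \'
```

In the double-quoted form no extra replace() call is needed before dumps(), so the C speedup is not undone by a Python-level pass over every string. Note that the single-quoted result is a JavaScript literal, not valid JSON, since JSON has no \' escape.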