On 12 Dec 2013, at 4:16 PM, peter <peterchutchin...@gmail.com> wrote:
> I have a Word document that I output as a '.mht' file, i.e. a 'single file
> web page'.
>
> I can put sections of this into a string field in a database and then display
> the field through a view, and the formatting in the Word document is
> preserved.
>
> Here is a line from the file that I read into web2py and insert into a field
> in a database:
>
> <p class=3DStyle7
> style=3D'line-height:11.5pt;mso-line-height-rule:exactly'><span lang=3DEN-US
> style=3D'font-family:"Adobe Garamond","serif";mso-bidi-font-family: "Adobe
> Garamond"'>‘One Lettuce Does Not a Salad Make’ is similar to Jones’ story
> .......
>
> Everything works fine except that the apostrophes in the text disappear.
>
> When I display the field on the screen, there are no apostrophes. If I 'view
> source', it is as above, but without the apostrophes before One, after Make
> and after Jones.
>
> Clearly this is an encoding problem. If I read the .mht file into TextPad,
> the apostrophes appear, and TextPad says the file is 'ANSI'. The question is
> how do I read the file in such a way as to correctly encode the apostrophes?
>
> I have tried various encodings including 'locale.getpreferredencoding()'.
>
> Does anyone know how to solve this problem?

Your email headers suggest that the string (at least in the email) is encoded as windows-1252, where the curly apostrophes are the single bytes 0x91 and 0x92. So if s is your encoded string, you might try s.decode('cp1252').encode('utf8'), assuming that UTF-8 is OK for output.
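
A minimal sketch of the decode/encode suggestion, assuming the text really is windows-1252 and that you have it as a byte string. The function name cp1252_to_utf8 and the sample bytes are made up for illustration; they are not taken from your file.

```python
def cp1252_to_utf8(raw):
    """Re-encode a windows-1252 byte string as UTF-8 bytes.

    Assumption: `raw` holds bytes that are actually windows-1252.
    If the real encoding differs, the output will be wrong.
    """
    return raw.decode('cp1252').encode('utf8')


# Hypothetical sample: in windows-1252, 0x91 and 0x92 are the
# left and right curly single quotes.
sample = b"\x91One Lettuce Does Not a Salad Make\x92 is similar to Jones\x92 story"

converted = cp1252_to_utf8(sample)   # UTF-8 bytes, safe to store or render
text = converted.decode('utf8')      # text form, with the curly quotes intact
```

If this works on a sample line, the same conversion could be applied to the whole file content before it is inserted into the database field.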