Re: not quite 1252

Serge Orlov Wed, 26 Apr 2006 15:56:44 -0700

Anton Vredegoor wrote:
> I'm trying to import text from an open office document (save as .sxw and
> read the data from content.xml inside the sxw-archive using
> elementtree and such tools).
>
> The encoding that gives me the least problems seems to be cp1252,
> however it's not completely perfect because there are still characters
> in it like \93 or \94. Has anyone handled this before? I'd rather not
> reinvent the wheel and start translating strings 'by hand'.


I extracted content.xml from a test file and the header is:
<?xml version="1.0" encoding="UTF-8"?>

So any XML library should handle it just fine, with no need for you to
guess the encoding.
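
A minimal sketch of what that might look like, assuming a Python where
ElementTree ships in the standard library as xml.etree.ElementTree. The
helper names read_content_xml and make_sample_sxw are made up for this
example, and the tiny in-memory archive only stands in for a real .sxw
file. The point is that the raw bytes go to the parser undecoded, so it
can pick up the encoding from the XML declaration itself:

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def read_content_xml(sxw_file):
    """Return the parsed root element of content.xml from an .sxw archive.

    The bytes are handed to the parser as-is; it reads the encoding from
    the XML declaration, so there is no manual decode step.
    """
    with zipfile.ZipFile(sxw_file) as archive:
        raw = archive.read("content.xml")
    return ET.fromstring(raw)


def make_sample_sxw():
    """Build a tiny stand-in archive in memory (hypothetical test data)."""
    xml_text = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<doc><p>\u201cquoted\u201d</p></doc>"
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("content.xml", xml_text.encode("utf-8"))
    buffer.seek(0)
    return buffer


root = read_content_xml(make_sample_sxw())
text = root.find("p").text  # a str with real curly quotes, not raw bytes
```

With a real document you would pass the path of the .sxw file instead
of the in-memory buffer. The element names in an actual content.xml are
namespaced, so the find() call above is only illustrative.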

-- 
http://mail.python.org/mailman/listinfo/python-list
