On 2013-04-18, Conway M wrote:

> Günter, thanks for your response.
> The conf.py did not have a source_encoding specified. So I assume it would
> just default to 'utf-8-sig'. Even explicitly specifying the encoding as
> 'utf-8-sig' produced the same error.

Do you have non-ASCII chars in conf.py? Otherwise, specifying the source
encoding of conf.py is not necessary. In any case, the source encoding of
conf.py does not influence how Sphinx decodes the rst input files.

> The snippet in the rst document that is causing the error is (also
> specified in the original post):
> *data = 'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'*

Interestingly, this does not contain any non-ASCII characters, so it should
pass without problems!

> The complete rst document can be found
> here<https://raw.github.com/pydata/pandas/master/doc/source/io.rst>.

Here I see that this is part of an "ipython" directive. It is used similarly
to "code" or "code-block", so I assume it should

* treat the content as "literal", i.e. without special meaning for
  characters like the backslash
* parse the content for syntax highlighting.

Maybe the extension defining "ipython" does not get this right and converts
\x.. to non-ASCII characters.

> The resulting html should look like
> this<http://pandas.pydata.org/pandas-docs/dev/io.html#dealing-with-unicode-data>.

Yes indeed, the whole block contains Unicode characters not present in the
input::

  Out[1054]:
        word  length
  0  Träumen       7
  1    Grüße       5

So the problem seems to lie with ipython or its interface to Sphinx rather
than with Sphinx itself.

(BTW, the example "Träumen" never appears alone in this form in German:
it is either the verb "träumen" (with a small t) capitalized at the
beginning of a sentence, as in "Träumen werde ich." (Dream, I will.), or
the dative plural of the noun "Traum", as in "in meinen Träumen" (in my
dreams).)

> One thing that I just realized is that other developers who have built the
> docs have built them exclusively on a Linux box. However, I am working off
> an Ubuntu 12.04 virtual machine running on Windows 7.
> So I'm not entirely convinced that the input file is broken; it might
> be a platform-dependent issue.

From the other posts I learned that the issue could be solved with a
locale setting.

Günter
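
P.S. A minimal sketch of the point about the snippet, assuming Python 3 (where string literals are Unicode; under Python 2 the literal would be a byte string instead). The name `rst_source` and the use of `exec` are only illustrative stand-ins for "something executes the directive content"; this is not how the "ipython" directive is actually implemented. It shows that the text in the rst file is pure ASCII, while the value produced by executing it is not:

```python
# The directive content as it appears in the .rst file. A raw string keeps
# the backslashes literal, mirroring the characters stored on disk.
# (rst_source is a hypothetical name used only for this illustration.)
rst_source = r"data = 'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'"

# Every character of the source text is ASCII.
source_is_ascii = all(ord(ch) < 128 for ch in rst_source)

# Executing the line makes Python interpret the \x.. escapes, so the
# resulting value contains non-ASCII characters.
namespace = {}
exec(rst_source, namespace)
data = namespace["data"]
data_is_ascii = all(ord(ch) < 128 for ch in data)

# Writing that value through an ASCII-only codec fails, which is the kind
# of error an ASCII ("C") locale can trigger on output.
try:
    data.encode("ascii")
    ascii_encode_failed = False
except UnicodeEncodeError:
    ascii_encode_failed = True

print(source_is_ascii, data_is_ascii, ascii_encode_failed)
```

If the build runs under a locale whose encoding cannot represent these characters, any step that writes the executed output may fail even though the rst input itself is clean, which would be consistent with a locale setting resolving the issue.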
