Lothar, that is correct.  I would like to see those 4 characters in the 
html document.  When you say "escaping", is there a way to mark a line in the 
rst document so that it is selectively not interpreted as UTF-8?
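
For reference, here is a minimal Python 3 sketch of the distinction being discussed (the original build ran on Python 2.7, so this is an illustration rather than a reproduction of the Sphinx code path). It assumes the rst file contains the raw byte 0xe4 rather than the four ASCII characters \xe4; the variable names are made up for the example.

```python
# Raw Latin-1 bytes: 0xe4 is "ä" in Latin-1, but here it is not valid UTF-8.
raw = b"Tr\xe4umen"

try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    # 0xe4 announces a multi-byte sequence, but the next byte ("u") cannot continue it.
    error = exc

# Escaped form: a backslash followed by "xe4", i.e. four plain ASCII characters.
escaped = rb"Tr\xe4umen"
escaped_text = escaped.decode("utf-8")  # decodes without error

# The same raw bytes decode cleanly when read as Latin-1 instead.
latin1_text = raw.decode("latin-1")

print(error)
print(escaped_text)
print(latin1_text)
```

If the file really holds the four-character escaped form, it should decode as UTF-8 without complaint, which suggests the file on disk contains the raw byte.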

On Thursday, April 18, 2013 12:15:03 PM UTC-5, Lothar Braun wrote:
>
> I’m not a specialist on this, but
>
> looking at your html, I think you would like to see those 4 
> characters \xe4 in your html document, whereas Sphinx treats this as the 
> UTF-8 form of one (non-existent) Unicode character.
>
>  
>
> If that is correct, I would try some escaping to bypass the parsing of 
> this sequence as utf-8.
>
> Lothar
>
>  
>
> *From:* [email protected] [mailto:[email protected]] *On behalf of* Conway M
> *Sent:* Thursday, April 18, 2013 17:39
> *To:* [email protected]
> *Cc:* [email protected]
> *Subject:* [sphinx-users] Re: SphinxError: Can't decode unicode within a 
> doc
>
>  
>
> Günter, thanks for your response.
>
>  
>
> The conf.py did not have a source_encoding specified.  So I assume it 
> would just default to 'utf-8-sig'.  Even explicitly specifying the encoding 
> as 'utf-8-sig' produced the same error.
>
>  
>
> The snippet in the rst document that is causing the error is (also 
> specified in the original post):
>
>  
>
> *data = 'word,length\nTr\xe4umen,7\nGr\xfc\xdfe,5'*
>
>  
>
> The complete rst document can be found 
> here<https://raw.github.com/pydata/pandas/master/doc/source/io.rst>. 
>  The resulting html should look like 
> this<http://pandas.pydata.org/pandas-docs/dev/io.html#dealing-with-unicode-data>.
>  
>  
>
>  
>
> One thing that I just realized is that the other developers who have built 
> the docs have done so exclusively on a Linux box.  However, I am working on 
> an Ubuntu 12.04 virtual machine running on Windows 7.  So I'm not entirely 
> convinced that the input file is broken; it might be a platform-dependent 
> issue.
>
>  
>
>  
>
>
>
> On Thursday, April 18, 2013 2:06:43 AM UTC-5, Guenter Milde wrote:
>
> On 2013-04-17, Conway M wrote: 
>
>
> > I am trying to compile the docs of Pandas 
> > <https://github.com/pydata/pandas>but I am unable to get Sphinx to 
> > compile a document with some unicode.  Is there some flag I need to 
> > specify to let Sphinx correctly build documents with unicode in them? 
>
> The default input encoding is 'utf8', so if your rst document is 
> utf8-encoded, it should be OK. 
>
> If not, please post more details (used encoding, docutils settings). 
> A minimal example (the part of the input file that caused the error) may 
> help further. 
>
> > In this case, I don't want Sphinx to decode the text. 
>
> Docutils/Sphinx will always decode the input into a "unicode" instance 
> and encode the output. All inner processing is done on "unicode" (or 
> derived) objects. 
>
> ... 
>
> >>   File "/usr/local/lib/python2.7/dist-packages/sphinx/environment.py", line 609, in read_doc 
> >>     raise SphinxError(str(err)) 
> >> SphinxError: 'utf8' codec can't decode byte 0xe4 in position 36: invalid continuation byte 
> >> > /usr/local/lib/python2.7/dist-packages/sphinx/environment.py(609)read_doc() 
> >> -> raise SphinxError(str(err)) 
> >> (Pdb) 
>
> It looks like the input file is either broken or not in utf8 encoding 
> (which one, then?). 
>
> It looks like the input decoding is not done by docutils.io, but by the 
> Sphinx "wrapper" - this means you must tell Sphinx about the correct 
> "source_encoding" 
> http://sphinx-doc.org/config.html#confval-source_encoding.   
> Setting the Docutils config setting "input-encoding" 
> http://docutils.sourceforge.net/docs/user/config.html#input-encoding will 
> not help. 
>
> Günter 
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "sphinx-users" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at http://groups.google.com/group/sphinx-users?hl=en.
> For more options, visit https://groups.google.com/groups/opt_out.
>  
>  
>
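
Following up on Günter's point that the input file may be broken or not UTF-8, a small Python 3 sketch like the one below could show which lines fail to decode. The function name find_invalid_utf8 and the sample bytes are assumptions made for illustration; nothing here is part of Sphinx or Docutils.

```python
def find_invalid_utf8(data: bytes):
    """Return (line_number, offset_in_line, byte_value) for each line
    that is not valid UTF-8. Only the first bad byte per line is reported."""
    problems = []
    for lineno, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as exc:
            problems.append((lineno, exc.start, line[exc.start]))
    return problems


# Illustrative sample: line 2 holds a raw Latin-1 byte, line 3 is valid UTF-8.
sample = b"plain ascii line\nTr\xe4umen\nGr\xc3\xbc\xc3\x9fe\n"
problems = find_invalid_utf8(sample)

for lineno, offset, value in problems:
    print("line %d, offset %d: byte 0x%02x" % (lineno, offset, value))
```

To run it against a real file, read the file in binary mode and pass the bytes to the function. If it reports 0xe4, 0xfc, or 0xdf, the file is most likely Latin-1 (or a similar single-byte encoding) rather than UTF-8.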
