New submission from STINNER Victor <victor.stin...@haypocalc.com>:

Many C functions take a bytes argument (char* type), but its encoding is not 
documented. It would not be a problem if the encoding were always the same, but 
it is not. Examples:
 - format of PyUnicode_FromFormat() should be encoded as ISO-8859-1
 - filename of PyParser_ASTFromString() should be encoded as utf-8
 - filename of PyErr_SetFromErrnoWithFilename() should be encoded to the 
filesystem encoding (with strict error handler, and not surrogateescape)
 - 's' argument of PyParser_ASTFromString() should be encoded as utf-8 if 
PyPARSE_IGNORE_COOKIE flag is set, otherwise the parser checks for #coding:xxx 
cookie (if there is no cookie, utf-8 is used)
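
To illustrate why the missing documentation matters, here is a minimal Python 
sketch (not part of the attached patch). It shows that one non-ASCII name 
produces different bytes under the encodings listed above, and how the strict 
and surrogateescape error handlers differ. The sample name and the stray byte 
are hypothetical, chosen only for the demonstration.

```python
import sys

# Hypothetical sample name with one non-ASCII character (U+00E9).
name = "caf\u00e9.py"

# The bytes a caller would pass under two of the conventions above.
as_latin1 = name.encode("iso-8859-1")
as_utf8 = name.encode("utf-8")
print(as_latin1)
print(as_utf8)

# The filesystem encoding depends on the platform and locale.
print(sys.getfilesystemencoding())

# Bytes read with the wrong encoding either give a different string...
mojibake = as_utf8.decode("iso-8859-1")

# ...or fail outright.
try:
    as_latin1.decode("utf-8")
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# surrogateescape maps an undecodable byte to a lone surrogate, which
# round-trips with surrogateescape but is rejected by the strict handler.
raw = b"\xff"  # hypothetical undecodable byte
escaped = raw.decode("utf-8", "surrogateescape")
round_trip = escaped.encode("utf-8", "surrogateescape")
try:
    escaped.encode("utf-8", "strict")
    strict_rejects_surrogate = False
except UnicodeEncodeError:
    strict_rejects_surrogate = True
```

A caller who guesses the wrong encoding for a char* argument gets either a 
silently different string or an error, which is why each function should state 
its expectation.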

The attached patch is an attempt to document most low-level functions. I chose 
to add the names of function arguments to the headers because I consider that a 
header can serve as quick documentation. I only touched .c files to change 
argument names.

It is hard to get the right encoding, so I cannot guarantee that my patch is 
correct. My patch is just a draft.

I don't know if "encoded to utf-8" is the right expression. Or should it be 
"decoded as utf-8"?
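
As a point of reference for the wording question, here is a minimal Python 
sketch of the two directions, using the str and bytes methods. The sample 
string is hypothetical, and this is one reading of the terminology rather than 
a settled convention for the C API documentation.

```python
# Hypothetical sample text with one non-ASCII character.
text = "caf\u00e9"

# Encoding goes from text (str) to bytes: what a caller does
# before passing a char* argument.
encoded = text.encode("utf-8")

# Decoding goes from bytes back to text: what the called function
# does with the char* argument it receives.
decoded = encoded.decode("utf-8")

print(encoded)
print(decoded == text)
```

Under this reading, the argument is "encoded to utf-8" from the caller's point 
of view and "decoded from utf-8" from the function's point of view.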

----------
assignee: d...@python
components: Documentation, Interpreter Core, Unicode
files: encodings.patch
keywords: patch
messages: 115339
nosy: d...@python, haypo
priority: normal
severity: normal
status: open
title: Document the encoding of functions bytes arguments of the C API
versions: Python 3.2
Added file: http://bugs.python.org/file18705/encodings.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue9738>
_______________________________________