Jonas Jelten added the comment:

Martin, I think the most intuitive and easiest way to work with strings in
C is plain char arrays.

Since the elements of main()'s argv are already char*, most programmers
probably just go with char* and the encoding just works.
This is because the encoding only comes into play at the user input
software (Xorg, keyboard input) and at the user output (your terminal
emulator, the GUI, ...).
Whatever data your program receives, the encoding only matters to the
software that actually displays it, which has to select the correct visual
representation.
Requiring a conversion to wide chars just increases the interface complexity
and adds data transformations that are entirely unnecessary with UTF-8.
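
To illustrate the point about char* passing UTF-8 through untouched, here is a
minimal sketch in plain C. The helper names copy_bytes and utf8_code_points are
hypothetical and made up for this example; the second one assumes its input is
valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy a NUL-terminated string byte for byte.
 * No decoding happens, so UTF-8 input comes out as the same UTF-8.
 * Returns the number of bytes copied (excluding the terminator).
 * Note: truncation could cut a multi-byte sequence; this sketch
 * does not handle that case. */
static size_t copy_bytes(char *dst, size_t dst_size, const char *src)
{
    size_t n = strlen(src);            /* length in bytes, not characters */
    if (dst_size == 0)
        return 0;
    if (n >= dst_size)
        n = dst_size - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: count code points by skipping UTF-8
 * continuation bytes (bit pattern 10xxxxxx). Assumes valid UTF-8. */
static size_t utf8_code_points(const char *s)
{
    size_t count = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

The program only needs to look inside the bytes when it cares about characters
(as utf8_code_points does); storing, copying and forwarding work on the raw
bytes alone.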

What I'd really like to see in CPython is that the internal storage (and the 
way it's exposed in the C-API) is just raw bytes (=> char*).

This allows super-easy integration into C projects, which most likely just
use char as their string type (see the doc example mentioned earlier).

PEP 393 states: "(..) the specification chooses UTF-8 as the recommended way of 
exposing strings to C code."

And for that, I think using char instead of wchar_t is a better solution for 
interface developers.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue22108>
_______________________________________