New submission from Terry J. Reedy <tjre...@udel.edu>:

The 3.2.2 doc for compile() says "The filename argument should give the file 
from which the code was read; pass some recognizable value if it wasn’t read 
from a file ('<string>' is commonly used)."

I am not sure what 'recognizable' is supposed to mean, but as I understand it, 
it would be user-specific and any string containing a fake 'filename' should be 
accepted and attached to the output code object as the .co_filename attribute. 
(At least on Windows.)
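
A minimal sketch of the expectation described above, assuming a made-up 
filename '<my-fake-source>' (any label would do). It only illustrates that the 
string passed as 'filename' ends up on the code object; it is not taken from 
the compile() documentation.

```python
# Compile a snippet that never came from a file, under a made-up label.
# '<my-fake-source>' is a hypothetical name chosen for this illustration.
source = "x = 1 + 1"
code = compile(source, "<my-fake-source>", "exec")

# The label is attached to the resulting code object.
print(code.co_filename)

# The code object runs normally; no real file is involved.
namespace = {}
exec(code, namespace)
print(namespace["x"])
```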

In fact, compile() has a hidden restriction: it encodes 'filename' with the 
local filesystem encoding. It tosses the bytes result (at least on Windows) but 
lets a UnicodeEncodeError terminate compilation. The effect is to add an 
undocumented and spurious dependency to code that has nothing to do with real 
files or the local machine.

In #10114, msg118845, Victor Stinner justified this with 
"co_filename attribute is used to display the traceback: Python opens the 
related file, read the source code line and display it."
If the filename is fake, it cannot do that. (Perhaps the doc should warn users 
to make sure that fake filenames do not match any possibly real filenames ;-). 
The traceback mechanism could ignore UnicodeEncodeErrors just as well as it now 
ignores IO(?)Errors when open('fakename') does not work.
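
To illustrate the point about fake filenames, here is a small sketch, assuming 
a made-up name '<no-such-file>' that matches no real file. Formatting the 
traceback still succeeds and reports the fake name, even though the source line 
cannot be read from disk.

```python
import traceback

# '<no-such-file>' is a hypothetical label; nothing on disk has this name.
code = compile("1 / 0", "<no-such-file>", "exec")

try:
    exec(code, {})
except ZeroDivisionError:
    # The traceback machinery cannot open the "file", but it still
    # produces a report that names it.
    text = traceback.format_exc()

print(text)
```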

Victor continues "On Windows, co_filename is directly used because Windows 
accepts unicode for filenames." This is not true in that on at least some 
Windows versions, compile() tries to encode with the mbcs codec, which in turn uses the 
hidden local codepage. I believe that for most or all codepages, this will even 
raise errors for some valid Unicode filenames.
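
The codepage limitation can be sketched portably. The mbcs codec exists only on 
Windows, so this example assumes cp1252 as a stand-in for a typical Western 
codepage; the helper name 'encodable' and the sample filenames are hypothetical.

```python
def encodable(name, codec):
    """Return True if 'name' can be encoded with 'codec' in strict mode."""
    try:
        name.encode(codec)
    except UnicodeEncodeError:
        return False
    return True

# cp1252 is an assumed stand-in for the local Windows codepage behind mbcs.
print(encodable("script.py", "cp1252"))   # plain ASCII name
print(encodable("中文.py", "cp1252"))      # Chinese name, outside cp1252
print(encodable("中文.py", "utf-8"))       # same name with a full Unicode codec
```

A name that is valid Unicode can therefore fail under one codepage and succeed 
under another, which is the machine dependency described above.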

I do not know whether the stored .co_filename attribute type for *nix is str, 
as on Windows, or bytes. If the latter, the doc should say so.
If compile() continues to filter fake filenames, which I oppose, the doc should 
also say so and say what it does.

This issue came up on python-list when someone used a Chinese filename and mbcs 
rejected it.

----------
components: Interpreter Core
messages: 151034
nosy: terry.reedy
priority: normal
severity: normal
stage: test needed
status: open
title: compile() should not encode 'filename' (at least on Windows)
type: behavior
versions: Python 3.2, Python 3.3

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue13758>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
