[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-15 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: I wrote in msg56419: > It is not perfect, since the extra function calls in the codecs module > cause test_profile and test_doctest to fail. How was this resolved?

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-15 Thread Christian Heimes
Christian Heimes added the comment: >> It is not perfect, since the extra function calls in the codecs module >> cause test_profile and test_doctest to fail. > > How was this resolved? It's not resolved yet.

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: Committed revision 58466. Fingers crossed. -- resolution: -> accepted status: open -> closed

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: > > This looks promising. I'm working on the freeze issue. Once I get that > > working I'll check this in. Thanks Alexandre and Christian for all > > your hard work!!! > > You're welcome. Does the patch qualify me for Misc/ACKS? :) Yes, and also Alexandre. :-)

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: I found two minor bugs in the fix. In Modules/posixmodule.c the tmpnam() and tempnam() methods return a PyString instance. Please change lines 5373 and 5431 to use PyUnicode_DecodeFSDefault(). Index: Modules/posixmodule.c ===
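
For readers following along from the Python side: the request above is that file names returned by the C layer arrive as unicode strings decoded with the file system encoding. A minimal sketch of the same idea, assuming a later Python 3 release where os.fsdecode() is available (it is not part of the patch discussed here, and the byte string below is a made-up example rather than real tmpnam() output):

```python
import os

# Hypothetical raw bytes, standing in for a name returned by the OS.
raw_name = b"/tmp/example_tmpfile"

# os.fsdecode() decodes bytes using the file system encoding,
# which is the behaviour the comment asks tmpnam()/tempnam() to adopt.
name = os.fsdecode(raw_name)

print(type(name).__name__, name)
```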

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: > This looks promising. I'm working on the freeze issue. Once I get that > working I'll check this in. Thanks Alexandre and Christian for all > your hard work!!! You're welcome. Does the patch qualify me for Misc/ACKS? :) I'm going to work on the basestring p

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: >> UnicodeDecodeError: 'utf8' codec can't decode bytes in position >> 1428-1430: invalid data > > I can't reproduce this. Can you open a separate issue? It breaks for me with the same error message on Ubuntu Linux, i386, UCS-4 build and locale de_DE.UTF-8

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: > The problem is the imp module, which modulefinder uses, does not > detect the encoding of the files from the mode-line. This causes > TextIOWrapper to crash when it tries to read modules using an encoding > other than ASCII or UTF-8. Here is an example: > > >>
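
To illustrate what "detecting the encoding from the mode-line" means, here is a small sketch using tokenize.detect_encoding() from the standard library of later Python 3 releases. It is not the fix that was applied to imp or modulefinder in this issue, and the source bytes are an invented example:

```python
import io
import tokenize

# Invented module source: a coding cookie followed by a Latin-1 encoded string.
source = b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"

# detect_encoding() reads at most two lines and looks for a coding cookie.
encoding, _first_lines = tokenize.detect_encoding(io.BytesIO(source).readline)

# Decoding with the detected encoding succeeds, whereas a plain UTF-8
# decode of these bytes would raise UnicodeDecodeError.
text = source.decode(encoding)
print(encoding)
```

Reading the file with a fixed UTF-8 decoder instead is what produces the kind of UnicodeDecodeError reported elsewhere in this thread.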

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: This looks promising. I'm working on the freeze issue. Once I get that working I'll check this in. Thanks Alexandre and Christian for all your hard work!!!

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: Changes since updated_file_fsenc-5.patch: * Fix for hard coded FS default encoding on Apple and Windows * Added two notes to unicode_default_encoding and Py_FileSystemDefaultEncoding

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: Alexandre Vassalotti wrote: > Alexandre Vassalotti added the comment: > > I thought of another way to implement PyUnicode_DecodeFSDefault. If > Py_FileSystemDefaultEncoding is set, decode with the codecs module, > otherwise use UTF-8 + replace. This works beca

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: I thought of another way to implement PyUnicode_DecodeFSDefault. If Py_FileSystemDefaultEncoding is set, decode with the codecs module, otherwise use UTF-8 + replace. This works because when Py_FileSystemDefaultEncoding is initialized at the end of Py_Initi
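
The two-branch strategy described above can be sketched in Python roughly as follows. This is only an illustration of the control flow, not the C implementation of PyUnicode_DecodeFSDefault; the function name decode_fs_default and its fs_encoding parameter are hypothetical stand-ins for the C function and for Py_FileSystemDefaultEncoding:

```python
from typing import Optional


def decode_fs_default(data: bytes, fs_encoding: Optional[str] = None) -> str:
    """Decode a file name, falling back to UTF-8 with replacement."""
    if fs_encoding is not None:
        # Encoding is known: go through the regular codec machinery.
        return data.decode(fs_encoding)
    # Encoding not set yet (early start-up): never fail, replace bad bytes.
    return data.decode("utf-8", errors="replace")


print(decode_fs_default(b"caf\xe9", "latin-1"))
print(decode_fs_default(b"caf\xe9"))
```

The fallback branch trades exactness for robustness: undecodable bytes become U+FFFD instead of raising an error during interpreter start-up.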

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: > I have a question for Alexandre related to frozen.c -- why is there a > mode line with an encoding involved in freezing hello.py? For some reason which I don't know about, freeze.py tries to read all the modules accessible from sys.path: # collect a

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: Only a few modules are involved in the bootstrap. The filename is mostly used to display in the traceback. There is already a fallback in the traceback-printing code that tries to look through sys.path for a file matching the module if it can't open the filenam

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Christian Heimes
Christian Heimes added the comment: Alexandre Vassalotti wrote: > That isn't true. My mangler does exactly the same thing as your > original one. > > However, I forgot to add Py_CHARMASK to the calls of tolower() and > isalnum() which would cause problems on platforms with signed char. I wasn't

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: Christian wrote: > Alexandre's mangle loop doesn't do the same job as mine. Chars like _ > and - aren't removed from the encoding name and the if clauses don't > catch for example UTF-8 or ISO-8859-1 only UTF8 or ISO8859-1. That isn't true. My mangler doe

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-14 Thread Guido van Rossum
Guido van Rossum added the comment: OK, in the spirit of delegation I'll leave this for you and Alexandre to work out more. If you're stuck, post to the list so others can jump in.

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Christian Heimes
Christian Heimes added the comment: Guido van Rossum wrote: > Guido van Rossum added the comment: > > Crys, is this OK with you? Alexandre's mangle loop doesn't do the same job as mine. Chars like _ and - aren't removed from the encoding name and the if clauses don't catch for example UTF-8 or

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Guido van Rossum added the comment: Crys, is this OK with you? On 10/13/07, Alexandre Vassalotti <[EMAIL PROTECTED]> wrote: > > Alexandre Vassalotti added the comment: > > Guido wrote: > > I figured out the problem -- it came from marshalled old code objects. > > If you throw away all .pyc files

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: > Remove the PyString type check on 'filename' and 'name' in PyCode_New. Oops. I removed one of the ! in the checks by mistake.

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: Guido wrote: > I figured out the problem -- it came from marshalled old code objects. > If you throw away all .pyc files the problem goes away. You can also > get rid of the similar checks for the 'name' argument -- this should > just be a PyUnicode too.

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Guido van Rossum added the comment: Well, you could ensure that by checking that you haven't reached the end of the mangling buffer. That will have the added advantage that when the input is something silly like 32 spaces followed by "utf-8" it will still be mangled correctly. The slight extra
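
A rough Python rendering of the bounded mangling loop under discussion, offered as a sketch rather than the code from the patch. It assumes the mangler lowercases the name and drops every non-alphanumeric character, and that the limit applies to the output buffer rather than the input; the name normalize_encoding_name and the max_len default of 31 are illustrative choices:

```python
def normalize_encoding_name(name: str, max_len: int = 31) -> str:
    """Lowercase an encoding name and drop separators, bounded by max_len."""
    out = []
    for ch in name:
        if len(out) >= max_len:
            # Output buffer is full: stop instead of overrunning it.
            break
        if ch.isalnum():
            out.append(ch.lower())
        # Spaces, '-', '_' and other punctuation are skipped entirely.
    return "".join(out)


print(normalize_encoding_name(" " * 32 + "utf-8"))
print(normalize_encoding_name("ISO-8859-1"))
```

Because the bound is checked against what has been written rather than what has been read, leading padding does not eat into the limit, which is the advantage pointed out in the comment above.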

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Christian Heimes
Christian Heimes added the comment: Guido van Rossum wrote: > - Why copy the default encoding before mangling it? With a little extra > care you will only have to copy it once. Now I remember why I added the strncpy() call plus encoding[31] = '\0'. I wanted to make sure that the code doesn't b

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Guido van Rossum added the comment: > > Oh. Hm. I still wish that PyCode_New() could just insist that the > > filename argument is a PyUnicode instance. Why can't it? Perhaps the > > caller should be fixed instead? > I'll try. I figured out the problem -- it came from marshalled old code object

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Christian Heimes
Christian Heimes added the comment: Guido van Rossum wrote: > But that function is a terrible example; it was done that way because > an earlier version of the function *did* allow using the errors > argument and I wanted to make sure to catch all calls that were still > passing an errors value.

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Guido van Rossum added the comment: On 10/13/07, Christian Heimes <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: > > - Why add an 'errors' argument to the function when it's a fatal error > > to use it? > > I wanted the signature of the method be equal to the other methods > PyUnicode_Decod

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Christian Heimes
Christian Heimes added the comment: Guido van Rossum wrote: > - You added a removal of hotshot from setup.py to the patch; but that's > been checked in in the mean time. Oh, the change shouldn't make it into the patch. I guess I forgot a svn revert on setup.py > - Why add an 'errors' argument t

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: Guido wrote: > Why copy the default encoding before mangling it? With a little > extra care you will only have to copy it once. Also, consider not > mangling at all, but assuming the encoding comes in a canonical form > -- several other functions assume

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Alexandre Vassalotti
Changes by Alexandre Vassalotti:

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Alexandre Vassalotti
Alexandre Vassalotti added the comment: I found a few problems in your patch. In PyCode_New() the type check for the filename argument is incorrect: --- Objects/codeobject.c (revision 58412) +++ Objects/codeobject.c (working copy) @@ -59,7 +59,7 @@ freevars == NULL || !

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Guido van Rossum added the comment: Couple of nits: - You added a removal of hotshot from setup.py to the patch; but that's been checked in in the mean time. - Why add an 'errors' argument to the function when it's a fatal error to use it? - Using 0 to autodetect the length is scary. Normally

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-13 Thread Guido van Rossum
Changes by Guido van Rossum: -- assignee: -> gvanrossum nosy: +gvanrossum

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-12 Thread Martin v. Löwis
Changes by Martin v. Löwis: -- keywords: +patch

[issue1272] Decode __file__ and co_filename to unicode using fs default

2007-10-12 Thread Christian Heimes
New submission from Christian Heimes: I'm sending the patch in for review. -- components: Interpreter Core files: py3k_file_fsenc2.patch messages: 56374 nosy: tiran severity: normal status: open title: Decode __file__ and co_filename to unicode using fs default versions: Python 3.0