[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Brett Cannon
Brett Cannon <[EMAIL PROTECTED]> added the comment: Committed in r66209. -- resolution: -> accepted status: open -> closed ___ Python tracker <[EMAIL PROTECTED]> ___ _

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Hye-Shik Chang
Hye-Shik Chang <[EMAIL PROTECTED]> added the comment: pitrou, that's because Python source code can't be correctly tokenized when it's encoded in few odd encodings like iso-2022 or shift-jis which utilizes \, (, ) and " as second byte of two-byte character sequence. For example, '\x81\\' is HO

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Benjamin Peterson
Benjamin Peterson <[EMAIL PROTECTED]> added the comment: The patch also looks pretty harmless to me. :) -- nosy: +benjamin.peterson ___ Python tracker <[EMAIL PROTECTED]> ___ _

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-09-03 Thread Antoine Pitrou
Antoine Pitrou <[EMAIL PROTECTED]> added the comment: I don't understand the whole decoding machinery in the tokenizer, but the patch looks ok to me. (tested in debug mode under Linux and Windows) -- nosy: +pitrou ___ Python tracker <[EMAIL PROTECTED]

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-21 Thread Brett Cannon
Changes by Brett Cannon <[EMAIL PROTECTED]>: -- keywords: +needs review ___ Python tracker <[EMAIL PROTECTED]> ___ ___ Python-bugs-list

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-21 Thread Brett Cannon
Changes by Brett Cannon <[EMAIL PROTECTED]>: -- priority: critical -> release blocker ___ Python tracker <[EMAIL PROTECTED]> ___ ___ Pyt

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon <[EMAIL PROTECTED]> added the comment: Attached is a patch that fixes where the error occurs. By opening the file by either file name or file descriptor, the problem goes away. Once this patch is accepted then PyErr_Occurred() should be added to all uses of PyTokenizer_FindEncoding()

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon <[EMAIL PROTECTED]> added the comment: Turns out that the NULL return value can signal an error that manifests itself as SyntaxError("encoding problem: with BOM") thanks to the lack of tok->filename being set in Parser/tokenizer.c:fp_setreadl() which is called by check_coding_spec()

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
Brett Cannon <[EMAIL PROTECTED]> added the comment: I have not bothered to check if this exists in 2.6, but I don't see why it would be any different. -- type: -> behavior ___ Python tracker <[EMAIL PROTECTED]>

[issue3594] PyTokenizer_FindEncoding() never succeeds

2008-08-18 Thread Brett Cannon
New submission from Brett Cannon <[EMAIL PROTECTED]>: Turns out that PyTokenizer_FindEncoding() never properly succeeds because the tok_state used by it does not have tok->filename set, which is an error condition in the tokenizer. This error has been masked by the one place the function is used,