[issue20574] Implement incremental decoder for cp65001

STINNER Victor Sun, 09 Feb 2014 05:19:05 -0800

New submission from STINNER Victor:

(Follow-up to issues #20538 and #20571.) The attached patch implements 
incremental decoders for multibyte code pages (on Windows), especially for 
CP_UTF8, aka "cp65001" in Python.


Code pages 932, 936, 949, 950 and 1361 have had an incremental decoder 
since:
---
changeset:   38817:549c547700af
branch:      legacy-trunk
user:        Martin v. Löwis <mar...@v.loewis.de>
date:        Wed Jun 14 05:21:04 2006 +0000
files:       Doc/api/concrete.tex Include/unicodeobject.h Lib/encodings/mbcs.py 
Misc/NEWS Modules/_codecsmodule.c Objects/unicodeobject.c
description:
Patch #1455898: Incremental mode for "mbcs" codec.
---

Python currently uses IsDBCSLeadByteEx():
http://msdn.microsoft.com/en-us/library/windows/desktop/dd318667%28v=vs.85%29.aspx

And CharPrevA():
http://msdn.microsoft.com/en-us/library/windows/desktop/ms647471%28v=vs.85%29.aspx

But IsDBCSLeadByteEx() only supports code pages 932, 936, 949, 950 and 1361.
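
For UTF-8, the same question ("does the buffer end in the middle of a 
character?") can be answered by looking at the last few bytes directly. The 
sketch below is only an illustration of that idea in Python, not the code from 
the attached patch; the helper name incomplete_utf8_tail is hypothetical, and 
it assumes the input is otherwise well-formed UTF-8.

```python
def incomplete_utf8_tail(data: bytes) -> int:
    """Return how many trailing bytes belong to an unfinished UTF-8 sequence.

    Hypothetical helper for illustration; assumes well-formed UTF-8 input.
    """
    # A UTF-8 sequence is at most 4 bytes long, so an incomplete one can
    # span at most the last 3 bytes of the buffer.
    for back in range(1, min(3, len(data)) + 1):
        byte = data[-back]
        if byte & 0xC0 == 0x80:
            # Continuation byte (10xxxxxx): keep looking for the lead byte.
            continue
        # Lead byte (or ASCII): work out how long the sequence should be.
        if byte >= 0xF0:
            need = 4
        elif byte >= 0xE0:
            need = 3
        elif byte >= 0xC0:
            need = 2
        else:
            need = 1
        # The tail is incomplete if fewer bytes are present than required.
        return back if back < need else 0
    return 0


euro = "\u20ac".encode("utf-8")           # three bytes: E2 82 AC
tail_full = incomplete_utf8_tail(euro)      # complete character
tail_cut = incomplete_utf8_tail(euro[:2])   # last byte missing
```

An incremental decoder could hold back that many bytes and prepend them to the 
next chunk before decoding.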

Python has supported code page 65001 (codec "cp65001") since Python 3.3. New 
tests on incremental decoders were added in Python 3.4: I added a skip for 
cp65001 since it was not supported (#20571). This issue implements the 
incremental decoder and so removes the skip.
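
To show what an incremental decoder has to handle, here is a minimal sketch 
that feeds a byte string to a decoder one byte at a time. It uses the "utf-8" 
codec as a portable stand-in, on the assumption that it decodes valid input the 
same way as "cp65001", which is only available on Windows. It is not taken from 
the new tests.

```python
import codecs

# Assumption: "utf-8" stands in for the Windows-only "cp65001" codec.
decoder = codecs.getincrementaldecoder("utf-8")()

text = "h\u00e9llo \u20ac"      # contains a 2-byte and a 3-byte character
data = text.encode("utf-8")

# Feed the input one byte at a time, so every multibyte character is split
# across several decode() calls.
parts = [decoder.decode(bytes([byte])) for byte in data]

# final=True tells the decoder that no more input follows, so leftover
# incomplete bytes would be reported as an error instead of being buffered.
parts.append(decoder.decode(b"", final=True))

result = "".join(parts)
```

The decoder returns an empty string while a character is still incomplete and 
emits the character once its last byte arrives.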

I would prefer to wait for Python 3.5 rather than rush this new feature in 
after 3.4 beta 3. cp65001 is mostly used for output (sys.stdout/sys.stderr) on 
Windows, not for input.

----------
files: incremental_cp_utf8.patch
keywords: patch
messages: 210759
nosy: haypo, larry, loewis, serhiy.storchaka
priority: normal
severity: normal
status: open
title: Implement incremental decoder for cp65001
type: enhancement
versions: Python 3.5
Added file: http://bugs.python.org/file34008/incremental_cp_utf8.patch

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20574>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
