[issue21872] LZMA library sometimes fails to decompress a file

2019-06-02 Thread Ma Lin
Change by Ma Lin : -- nosy: +Ma Lin ___ Python tracker <https://bugs.python.org/issue21872> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.pyth

[issue21872] LZMA library sometimes fails to decompress a file

2019-06-04 Thread Ma Lin
Ma Lin added the comment: fix-bug.diff fixes this bug, I will submit a PR after thoroughly understanding the problem. -- keywords: +patch Added file: https://bugs.python.org/file48391/fix-bug.diff ___ Python tracker <https://bugs.python.

[issue37188] Creating a ctypes array of an element with size zero causes "Fatal Python error: Floating point exception"

2019-06-08 Thread Ma Lin
Ma Lin added the comment: > 3.7/3.8 are done 3.7 and master (3.9) are done, 3.8 was missed. -- nosy: +Ma Lin ___ Python tracker <https://bugs.python.org/issu

[issue21872] LZMA library sometimes fails to decompress a file

2019-06-13 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +13910 stage: -> patch review pull_request: https://github.com/python/cpython/pull/14048 ___ Python tracker <https://bugs.python.org/issu

[issue21872] LZMA library sometimes fails to decompress a file

2019-06-13 Thread Ma Lin
Ma Lin added the comment: I wrote a review guide in PR 14048. -- versions: +Python 3.8, Python 3.9 -Python 2.7, Python 3.4, Python 3.5, Python 3.6 ___ Python tracker <https://bugs.python.org/issue21

[issue35360] Update SQLite to 3.26 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +14017 stage: -> patch review pull_request: https://github.com/python/cpython/pull/14179 ___ Python tracker <https://bugs.python.org/issu

[issue35360] Update SQLite to 3.26 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +14018 pull_request: https://github.com/python/cpython/pull/14180 ___ Python tracker <https://bugs.python.org/issue35

[issue35360] Update SQLite to 3.28 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Ma Lin added the comment: PR 14179 is for Windows build PR 14180 is for Mac OS X build Both update to Sqlite 3.28.0 -- title: Update SQLite to 3.26 in Windows and macOS installer builds -> Update SQLite to 3.28 in Windows and macOS installer bui

[issue35360] Update SQLite to 3.28 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +14019 pull_request: https://github.com/python/cpython/pull/14182 ___ Python tracker <https://bugs.python.org/issue35

[issue35360] Update SQLite to 3.28 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +14020 pull_request: https://github.com/python/cpython/pull/14183 ___ Python tracker <https://bugs.python.org/issue35

[issue35360] Update SQLite to 3.28 in Windows and macOS installer builds

2019-06-17 Thread Ma Lin
Ma Lin added the comment: 2.7 branch: PR 14182 is for Windows build PR 14183 is for Mac OS X build -- ___ Python tracker <https://bugs.python.org/issue35

[issue21872] LZMA library sometimes fails to decompress a file

2019-06-18 Thread Ma Lin
Ma Lin added the comment: I investigated this problem. Here are the toggle conditions: - The format is FORMAT_ALONE, the legacy .lzma container format. - The file's header records "Uncompressed Size". - The file doesn't have an "End of Payload Marker"

[issue21872] LZMA library sometimes fails to decompress a file

2019-06-18 Thread Ma Lin
Ma Lin added the comment: toggle conditions -> trigger conditions -- ___ Python tracker <https://bugs.python.org/issue21872> ___ ___ Python-bugs-list mai

[issue37457] python3.7 re.split() module bug

2019-06-30 Thread Ma Lin
Ma Lin added the comment: Try this pattern: >>> re.split(r"\s+", text) ['Some', 'File', 'Num10', 'example.txt'] IMO the 3.7 behavior is more reasonable; it fixes a bug (issue25054). -- nosy: +Ma Lin
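The suggested pattern can be tried with a short sketch. The original `text` is not shown in the message, so the sample string below is an assumption chosen to reproduce the quoted result.

```python
import re

# Hypothetical input: the real `text` from the report is not shown above.
text = "Some  File\tNum10   example.txt"

# r"\s+" requires at least one whitespace character, so the pattern can
# never match the empty string and no empty-match splitting is involved.
parts = re.split(r"\s+", text)
print(parts)
```

Because `\s+` cannot match an empty string, this pattern should behave the same before and after the 3.7 change to empty-match handling.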

[issue17535] IDLE: Add an option to show line numbers along the left side of the editor window, and have it enabled by default.

2019-07-03 Thread Ma Lin
Ma Lin added the comment: I tried PR 14030 today. By default, the fgcolor is black. It looks like a black belt always on the left side, which feels a bit oppressive to me. Of course, the fgcolor can be changed. -- nosy: +Ma Lin Added file: https://bugs.python.org/file48455/1.png

[issue37527] Timestamp conversion on windows fails with timestamps close to EPOCH

2019-07-09 Thread Ma Lin
Ma Lin added the comment: Looks like a similar problem to issue29097. >>> from datetime import datetime >>> d = datetime.fromtimestamp(1) >>> d.timestamp() Traceback (most recent call last): File "<stdin>", line 1, in <module> OSError: [Errno 22] Invalid argume
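A minimal sketch of one way to sidestep the failing local-time conversion, assuming a UTC-aware datetime is acceptable. This is an illustration only, not the fix discussed in the issue. For aware datetimes, `timestamp()` is computed arithmetically from the epoch instead of going through the platform's local-time functions.

```python
from datetime import datetime, timezone

# Build an aware datetime in UTC instead of a naive local one.
d = datetime.fromtimestamp(1, tz=timezone.utc)

# For aware datetimes this is (d - epoch).total_seconds().
ts = d.timestamp()
print(d.isoformat(), ts)
```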

[issue33408] Enable AF_UNIX support in Windows

2019-07-10 Thread Ma Lin
Ma Lin added the comment: Have you upgraded to a build SDK that supports AF_UNIX? Also, the AF_UNIX flag should be removed at runtime on systems older than Windows 10 1804, see issue32394. -- nosy: +Ma Lin

[issue33408] Enable AF_UNIX support in Windows

2019-07-10 Thread Ma Lin
Ma Lin added the comment: It would be nice to investigate how AF_UNIX is used in Python code on GitHub: https://github.com/search?l=Python&q=AF_UNIX&type=Code If adding this flag will break a lot of code, due to the lack of datagram support, maybe we should not add AF_UNIX

[issue33408] Enable AF_UNIX support in Windows

2019-08-01 Thread Ma Lin
Ma Lin added the comment: The current AF_UNIX address family of Windows10 doesn't support datagram, adding this flag may break some cross-platform code: https://github.com/osbuild/osbuild/blob/9371eb9eaa3d0a7cab876eb4c7b70f519dfbd915/osbuild/__init__.py#L253 https://github.com/wat

[issue37752] Redundant Py_CHARMASK called in normalizestring(codecs.c)

2019-08-03 Thread Ma Lin
Ma Lin added the comment: Search for "Py_CHARMASK" in the Python source code; there are more than a dozen Py_CHARMASK calls that can be deleted: https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Python/mystrtoul.c#L102 https://github.com/python/cp

[issue37752] Redundant Py_CHARMASK called in normalizestring(codecs.c)

2019-08-03 Thread Ma Lin
Ma Lin added the comment: Or remove Py_CHARMASK in the Py_ISxxx/Py_TOLOWER/Py_TOUPPER macros? Sometimes `c` is already an unsigned char, so Py_CHARMASK is not necessary in these cases.

[issue37752] Redundant Py_CHARMASK called in some files

2019-08-06 Thread Ma Lin
Ma Lin added the comment: VC2017 optimizes multiple `((unsigned char)((c) & 0xff))` to a single `movzx` operation, maybe other compilers do it as well. If so, there will be no performance change.

[issue37774] Micro-optimize vectorcall using PY_LIKELY

2019-08-07 Thread Ma Lin
Ma Lin added the comment: How about writing a suggestion on when to use them in the comment? For example: > You should use it only in cases when the likeliest branch is > very very very likely, or when the unlikeliest branch is very > very very unlikely. from https://kernelnewbie

[issue18236] str.isspace should use the Unicode White_Space property

2019-08-07 Thread Ma Lin
Ma Lin added the comment: Greg, could you try this code after your patch? >>> import re >>> re.match(r'\s', '\x1e') # <- before patch -- nosy: +Ma Lin ___ Python tra

[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-08-11 Thread Ma Lin
Ma Lin added the comment: How about using a hybrid implementation for `int`? For example, on a 64-bit platform, if (an integer >=-9223372036854775808 and <=9223372036854775807), use a native `signed long` to represent it. People mostly use +-* operations, maybe using native int is faster

[issue37812] Make implicit returns explicit in longobject.c (in CHECK_SMALL_INT)

2019-08-11 Thread Ma Lin
Ma Lin added the comment: I sent it twice, but it doesn't appear in the Python-Ideas list. I will try to post to Python-Dev tomorrow.

[issue35228] Index search in CHM help crashes viewer

2018-12-01 Thread Ma Lin
Ma Lin added the comment: I have suffered from this problem for more than a year. Here is a solution: before compiling the chm, modify like this: --- D:\Python-3.7.1\Doc\build\htmlhelp\python371.hhp Sun Dec 02 13:12:37 2018 +++ D:\fix_crash\python371.hhp Sun Dec 02 13:05:57 2018 @@ -1,6 +1,6

[issue35482] python372rc1.chm is ill

2018-12-13 Thread Ma Lin
New submission from Ma Lin : python372rc1.chm can't be opened, it seems the compiling is not successful. Compare to python371.chm, the file size reduced a lot: python371.chm 8,534,435 bytes python372rc1.chm 5,766,102 bytes Some files in chm are missing, see attached pic

[issue35482] python372rc1.chm is ill

2018-12-13 Thread Ma Lin
Change by Ma Lin : Added file: https://bugs.python.org/file47989/372rc1_chm_files.png ___ Python tracker <https://bugs.python.org/issue35482> ___ ___ Python-bugs-list m

[issue35482] python372rc1.chm is ill

2018-12-13 Thread Ma Lin
Change by Ma Lin : Added file: https://bugs.python.org/file47990/371_compile_progress.png ___ Python tracker <https://bugs.python.org/issue35482> ___ ___ Python-bug

[issue35482] python372rc1.chm is ill

2018-12-13 Thread Ma Lin
Change by Ma Lin : Added file: https://bugs.python.org/file47991/372rc1_compile_progress.png ___ Python tracker <https://bugs.python.org/issue35482> ___ ___ Python-bug

[issue35482] can't open python368rc1.chm and python372rc1.chm

2018-12-14 Thread Ma Lin
Ma Lin added the comment: python368rc1.chm has the same problem. I did a git bisect. On 3.6 branch, e825b4e1a9bbe1d4c561f4cbbe6857653ef13a15 is the first bad commit On 3.7 branch, 9a75b8470a2e0de5406edcabba140f023c99c6a9 is the first bad commit -- title: python372rc1.chm is ill

[issue35482] can't open python368rc1.chm and python372rc1.chm

2018-12-14 Thread Ma Lin
Ma Lin added the comment: These first bad commits come from issue35054 -- ___ Python tracker <https://bugs.python.org/issue35482> ___ ___ Python-bugs-list mailin

[issue35482] can't open python368rc1.chm and python372rc1.chm

2018-12-17 Thread Ma Lin
Ma Lin added the comment: > I guess the next step is to try taking out those new index entries to figure > out which one causes the problem. That is a very large amount of work; I would suggest reverting them. chm is a successful product, but it is out of support, and there is no replacement yet,

[issue35482] can't open python368rc1.chm and python372rc1.chm

2018-12-18 Thread Ma Lin
Ma Lin added the comment: amazing, you did find it. -- ___ Python tracker <https://bugs.python.org/issue35482> ___ ___ Python-bugs-list mailing list Unsub

[issue35482] can't open python368rc1.chm and python372rc1.chm

2018-12-18 Thread Ma Lin
Ma Lin added the comment: &#x27; comes from html.escape(s, quote=True) https://github.com/python/cpython/blob/4a9ee26750aa8cb37b5072b2bb4dd328819febb4/Lib/html/__init__.py#L24 Of course, it's not a bug. It would be better to patch in Sphinx, I will do it at
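The escaping behaviour of `html.escape` referred to here can be checked directly. A small sketch follows; the sample string is an arbitrary assumption, not taken from the Sphinx index entries.

```python
import html

sample = "it's"  # hypothetical text containing an apostrophe

# With quote=True (the default), both " and ' are escaped.
escaped = html.escape(sample, quote=True)

# With quote=False, only &, < and > are escaped.
unquoted = html.escape(sample, quote=False)

print(escaped)
print(unquoted)
```

The apostrophe becomes a numeric character reference only when `quote=True`, which is consistent with the comment that the entity originates from this call.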

[issue35228] Index search in CHM help crashes viewer

2018-12-25 Thread Ma Lin
Ma Lin added the comment: I solved this thoroughly: Format disk C: and install a clean Windows 10. Don't forget to backup important files in C:\Users\\ folder. -- ___ Python tracker <https://bugs.python.org/is

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
New submission from Ma Lin : Please see the PR -- messages: 332857 nosy: Ma Lin priority: normal severity: normal status: open title: remove redundant code in unicode_hash(PyObject *self) ___ Python tracker <https://bugs.python.org/issue35

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +10783 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35636> ___ ___ Python-

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch pull_requests: +10783, 10784 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch, patch pull_requests: +10783, 10784, 10785 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
Ma Lin added the comment: Every non-empty str will be checked twice at present. -- components: +Interpreter Core type: -> enhancement versions: +Python 3.8 ___ Python tracker <https://bugs.python.org/issu

[issue35636] remove redundant code in unicode_hash(PyObject *self)

2019-01-01 Thread Ma Lin
Change by Ma Lin : -- versions: +Python 3.6, Python 3.7 ___ Python tracker <https://bugs.python.org/issue35636> ___ ___ Python-bugs-list mailing list Unsub

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: This redundancy has existed since Python 3.4 or earlier. -- title: remove redundant code in unicode_hash(PyObject *self) -> remove redundant check in unicode_hash(PyObject *self) type: enhancement -> performance versions: +Python 3.4, Pyth

[issue35639] Lowecasing Unicode Characters

2019-01-02 Thread Ma Lin
Ma Lin added the comment: please read this discussion https://bugs.python.org/issue17252 behavior in Python 3.2- is correct for Turkish users. behavior in Python 3.3+ is correct for non-Turkish users. -- nosy: +Ma Lin ___ Python tracker <ht

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: Thanks for the review. I don't know why bytes and str generate the same hash value for an ASCII sequence. >>> hash('abc') == hash(b'abc') True This may bring some hash collisions, d
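The observation can be reproduced with a short sketch. The equal hash is a CPython implementation detail, not a language guarantee; the dict contents are hypothetical.

```python
# In CPython, an ASCII-only str and the equivalent bytes hash identically
# within one process (implementation detail, may differ elsewhere).
same_hash = hash('abc') == hash(b'abc')

# Equal hashes do not make the keys equal: str and bytes never compare
# equal, so both keys live side by side in the same dict.
cache = {b'[a-z]': 'bytes pattern', '[a-z]': 'str pattern'}

print(same_hash, len(cache))
```

The cost is at most an extra probe on lookup when both kinds of key land in the same table, which matches the later remark that the effect is trivial on the whole.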

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: > I'd advise against changing the hash function without a very good reason. You > never know how much code relies on it in one way or another. ok, maybe this can be changed in Python 4.0 -- ___ Python trac

[issue35636] remove redundant check in unicode_hash(PyObject *self)

2019-01-02 Thread Ma Lin
Ma Lin added the comment: One scenario is caching regular expressions: b'[a-z]' and '[a-z]' may exist in the same dict. Anyway, it's trivial on the whole.

[issue35696] remove unnecessary operation in long_compare()

2019-01-09 Thread Ma Lin
New submission from Ma Lin : static int long_compare(PyLongObject *a, PyLongObject *b) { } This function in /Objects/longobject.c is used to compare two PyLongObject's value. We only need the sign, converting to -1 or +1 is not necessary. -- messages: 333293 nosy: M

[issue35696] remove unnecessary operation in long_compare()

2019-01-09 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +10974 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35696> ___ ___ Python-

[issue35696] remove unnecessary operation in long_compare()

2019-01-09 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch pull_requests: +10974, 10975 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue35696] remove unnecessary operation in long_compare()

2019-01-09 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch, patch pull_requests: +10974, 10975, 10976 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin
Ma Lin added the comment: Simplifying the test case, it seems the `state` is not reset properly. Python 3.6.8 (tags/v3.6.8:3c6b436a57, Dec 24 2018, 00:16:47) >>> import re >>> re.findall(r"(?=(<\w+>)(<\w+>)?)", "") [('', ''),

[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch pull_requests: +11162, 11163 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +11162 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue34294> ___ ___ Python-

[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch, patch pull_requests: +11162, 11163, 11164 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue34294] re.finditer and lookahead bug

2019-01-13 Thread Ma Lin
Ma Lin added the comment: I tried to fix it, feel free to create a new PR if you don't want this one. PR11546 has a small question, should `state->data_stack` be dealloced as well? FYI, function `state_reset(SRE_STATE* state)` in file `_sre.c`: https://github.com/python/cpyt

[issue35779] Print friendly version message in REPL

2019-01-18 Thread Ma Lin
New submission from Ma Lin : The current version message in the REPL is too complicated for an official release. Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32 Type "help", "copyright", "credits" or "license"

[issue35779] Print friendly version message in REPL

2019-01-19 Thread Ma Lin
Ma Lin added the comment: It's interesting to see a Python 1.5.2 from April 1999 :) Thanks for your opinion, let me explain: We only print the simplified message in official binary releases; any Linux/private builds still use the current message. We know enough information about off

[issue34294] re.finditer and lookahead bug

2019-01-19 Thread Ma Lin
Ma Lin added the comment: Serhiy Storchaka lost his sight. Please stop any work and rest, because your left eye will have more burden, and your mental burden will make it worse. Go to hospital ASAP. If any other core developer want to review this patch, I would like to give a detailed

[issue34294] re module: wrong capturing groups

2019-01-21 Thread Ma Lin
Ma Lin added the comment: The original post's bug was introduced in Python 3.7.0. While investigating the code, I found another bug in capturing groups. This bug has existed since very early versions. The regex module doesn't have this bug. Python 3.4.4 (v3.4.4:737efcadf5a6, Dec 20 2015, 19:28:

[issue35779] Print friendly version message in REPL

2019-01-22 Thread Ma Lin
Ma Lin added the comment: Hi, thanks for your replies. To be honest, the reason is I feel it's a bit ugly; this line is very long at REPL startup. And the information is not very clear [1]. I'm not strongly pushing this idea, just raising my feeling, take it easy. :) Yes,

[issue35859] Capture behavior depends on the order of an alternation

2019-01-30 Thread Ma Lin
Change by Ma Lin : -- nosy: +Ma Lin ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.pyth

[issue35859] Capture behavior depends on the order of an alternation

2019-01-30 Thread Ma Lin
Ma Lin added the comment: You can `#define VERBOSE` in file `_sre.c`, it will print the engine's actual actions: |02FAC684|02FC7402|MARK 0 ... |02FAC6BC|02FC7401|MARK 1 On my computer, 02FC7400 points to "ab", 02FC7401 points to 'b' in "ab", 02FC7402 point

[issue35859] Capture behavior depends on the order of an alternation

2019-02-04 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +11699 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-

[issue35859] Capture behavior depends on the order of an alternation

2019-02-04 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch pull_requests: +11699, 11700 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue35859] Capture behavior depends on the order of an alternation

2019-02-04 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch, patch, patch pull_requests: +11699, 11700, 11701 stage: -> patch review ___ Python tracker <https://bugs.python.org/issu

[issue35859] Capture behavior depends on the order of an alternation

2019-02-09 Thread Ma Lin
Ma Lin added the comment: For a capture group, the state->mark[] array stores its begin and end: begin: state->mark[(group_number-1)*2] end: state->mark[(group_number-1)*2+1] So state->mark[0] is the begin of the first capture group. state->mark[1] is the end of the
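The index arithmetic described above can be sketched in a few lines. `mark_indices` is a hypothetical helper written for illustration; it is not part of `_sre.c` or the `re` module.

```python
def mark_indices(group_number):
    """Return the (begin, end) slots in a mark array for a 1-based group.

    Hypothetical helper mirroring the layout described above:
    begin = (group_number - 1) * 2, end = begin + 1.
    """
    if group_number < 1:
        raise ValueError("capture groups are numbered from 1")
    begin = (group_number - 1) * 2
    return begin, begin + 1


for group in (1, 2, 3):
    print(group, mark_indices(group))
```

Under this layout each group owns two adjacent slots, so a mark array for n groups needs 2*n entries.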

[issue33376] [pysqlite] Duplicate rows can be returned after rolling back a transaction

2019-02-15 Thread Ma Lin
Change by Ma Lin : -- nosy: +Ma Lin ___ Python tracker <https://bugs.python.org/issue33376> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.pyth

[issue23689] Memory leak in Modules/sre_lib.h

2019-02-18 Thread Ma Lin
Ma Lin added the comment: Try to allocate SRE_REPEAT on state's stack, the performance has not changed significantly. It passes the other tests, except this one (test_stack_overflow): https://github.com/python/cpython/blob/v3.8.0a1/Lib/test/test_re.py#L1225-L1230 I'll try to fix

[issue23689] Memory leak in Modules/sre_lib.h

2019-02-18 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +11951 ___ Python tracker <https://bugs.python.org/issue23689> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue35859] Capture behavior depends on the order of an alternation

2019-02-22 Thread Ma Lin
Ma Lin added the comment: A bug harvest, see PR11756; maybe sre has more bugs. Those bugs have existed since Python 2. Any ideas from regular expression experts?

[issue26744] print() function hangs on MS-Windows 10

2017-03-09 Thread Ma Lin
Ma Lin added the comment: This is an invalid issue, very sorry for wasting your time! I especially apologize to Stinner. After enabling `QuickEdit Mode`, clicking the console suspends the program. -- resolution: -> not a bug stage: -> resolved status: open ->

[issue24821] The optimization of string search can cause pessimization

2017-03-29 Thread Ma Lin
Changes by Ma Lin : -- nosy: +Ma Lin ___ Python tracker <http://bugs.python.org/issue24821> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.pyth

[issue29990] Range checking in GB18030 decoder

2017-04-04 Thread Ma Lin
New submission from Ma Lin: This issue is split from issue24117, that issue became a soup of small issues, so I'm going to close it. For 4-byte GB18030 sequence, the legal range is: 0x81-0xFE for the 1st byte 0x30-0x39 for the 2nd byte 0x81-0xFE for the 3rd byte 0x30-0x39 for the 4th
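The four byte ranges listed above translate into a simple check. This is a sketch with a hypothetical helper name; it validates only the byte ranges and says nothing about whether the sequence maps to an assigned code point.

```python
def is_legal_gb18030_4byte(seq: bytes) -> bool:
    """Check the byte ranges of a 4-byte GB18030 sequence.

    Hypothetical helper: only the ranges from the message are tested,
    not the mapping to a code point.
    """
    if len(seq) != 4:
        return False
    b1, b2, b3, b4 = seq
    return (0x81 <= b1 <= 0xFE and 0x30 <= b2 <= 0x39 and
            0x81 <= b3 <= 0xFE and 0x30 <= b4 <= 0x39)


print(is_legal_gb18030_4byte(b"\x81\x30\x81\x30"))  # all bytes in range
print(is_legal_gb18030_4byte(b"\x81\x3a\x81\x30"))  # 2nd byte out of range
```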

[issue24117] Wrong range checking in GB18030 decoder.

2017-04-04 Thread Ma Lin
Changes by Ma Lin : -- stage: patch review -> resolved status: open -> closed ___ Python tracker <http://bugs.python.org/issue24117> ___ ___ Python-bugs-

[issue29990] Range checking in GB18030 decoder

2017-04-04 Thread Ma Lin
Changes by Ma Lin : -- pull_requests: +1171 ___ Python tracker <http://bugs.python.org/issue29990> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue30003] Remove hz codec

2017-04-05 Thread Ma Lin
New submission from Ma Lin: hz is a Simplified Chinese codec, available in Python since around 2004. However, the hz encoder has a serious bug: it forgets to escape ~ >>> 'hi~'.encode('hz') b'hi~'  # the correct output should be b'hi~~' As a resul
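The escaping rule at stake can be sketched for the ASCII-only case. `hz_escape_ascii` is a hypothetical helper written for illustration, not the codec's implementation; it covers only the tilde rule from the message and does not handle the GB2312 shift sequences.

```python
def hz_escape_ascii(text: str) -> bytes:
    """Encode ASCII-only text the way the message says HZ should.

    Hypothetical helper: a literal '~' is the HZ escape character,
    so it has to be doubled to '~~' in the output.
    """
    data = text.encode("ascii")  # raises UnicodeEncodeError on non-ASCII
    return data.replace(b"~", b"~~")


print(hz_escape_ascii("hi~"))
```

The output of the built-in `'hi~'.encode('hz')` depends on the Python version in use, so it is deliberately not compared here.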

[issue29990] Range checking in GB18030 decoder

2017-04-05 Thread Ma Lin
Ma Lin added the comment: > except 0x80 (€) I suppose the English edition is not the final release of GB18030-2000. At the end of official Chinese edition of GB18030-2005, listed the difference between GB18030-2000 and GB18030-2005 clearly, it doesn't mention 0x80 (€), so GB18030-200

[issue29990] Range checking in GB18030 decoder

2017-04-06 Thread Ma Lin
Ma Lin added the comment: This is a very trivial bug; it's hard to imagine a scenario in which someone tries to decode those 8630 illegal 4-byte sequences with the GB18030 decoder. And I think this bug can't lead to security vulnerabilities. As far as I can see, GB2312/GBK/GB18030 codecs a

[issue30003] Remove hz codec

2017-04-06 Thread Ma Lin
Ma Lin added the comment: I tried to fix this two years ago, here is the patch (not merged): http://bugs.python.org/review/24117/diff/14803/Modules/cjkcodecs/_codecs_cn.c But later, I thought it's a good opportunity to remove this codec, this serious bug indicates that almost no one is

[issue30003] Remove hz codec

2017-04-07 Thread Ma Lin
Ma Lin added the comment: From my subjective feelings, probably no old archives still exist, but I can't assert it. That's why I suggest removing it, or at least not fixing it. Ah, let's slow down the pace, this bug has existed for over a decade, we don't need to sol

[issue24117] Wrong range checking in GB18030 decoder.

2017-04-07 Thread Ma Lin
Ma Lin added the comment: I closed this issue, because it involved too many things. 1, for GB18030 decoder bug, see issue29990. 2, for hz encoder bug, see issue30003. 3, for problem in Traditional Chinese codecs, please create a new issue

[issue36101] remove non-ascii characters in docstring

2019-02-24 Thread Ma Lin
New submission from Ma Lin : replace ’(\u2019) with '(\x27) -- assignee: docs@python components: Documentation messages: 336468 nosy: Ma Lin, docs@python priority: normal severity: normal status: open title: remove non-ascii characters in docstring versions: Python 3.7, Pytho

[issue36101] remove non-ascii characters in docstring

2019-02-24 Thread Ma Lin
Change by Ma Lin : -- keywords: +patch pull_requests: +12049 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue36101> ___ ___ Python-

[issue36101] remove non-ascii characters in docstring

2019-02-24 Thread Ma Lin
Ma Lin added the comment: only 3.8 branch has those non-ascii characters. -- versions: -Python 3.7 ___ Python tracker <https://bugs.python.org/issue36

[issue35859] Capture behavior depends on the order of an alternation

2019-03-01 Thread Ma Lin
Ma Lin added the comment: The PR11756 is prepared. I force-pushed the patch in four steps, hope you can review it easier: https://github.com/python/cpython/pull/11756/commits 🔴 Step 1, test-cases Show the wrong behaviors before this fix, the corresponding test-case will be updated in next

[issue35859] Capture behavior depends on the order of an alternation

2019-03-01 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12137 ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue36158] Regex search behaves differently in list comprehension

2019-03-02 Thread Ma Lin
Ma Lin added the comment: Just a reminder: the pattern r'"{1}' is the same as r'"', meaning " repeated 1 time. -- nosy: +Ma Lin
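The equivalence can be confirmed with a short sketch. The sample string is an assumption; the input from the original report is not shown here.

```python
import re

sample = 'say "hi" and "bye"'  # hypothetical input

# {1} means "exactly one repetition", so it adds nothing to the pattern.
with_quantifier = re.findall(r'"{1}', sample)
without_quantifier = re.findall(r'"', sample)

print(with_quantifier == without_quantifier, len(without_quantifier))
```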

[issue35859] Capture behavior depends on the order of an alternation

2019-03-04 Thread Ma Lin
Ma Lin added the comment: Found another bug in re: >>> re.match(r'(?:.*?\b(?=(\t)|(x))x)*', 'a\txa\tx').groups() ('\t', 'x') Expected result: (None, 'x') PHP 7.3.2 NULL, "x" Java 11.0.2 "\

[issue23689] Memory leak in Modules/sre_lib.h

2019-03-04 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12158 ___ Python tracker <https://bugs.python.org/issue23689> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue23689] Memory leak in Modules/sre_lib.h

2019-03-04 Thread Ma Lin
Ma Lin added the comment: PR11926 (closed) tried to allocate SRE_REPEAT on state's stack. It's feasible, but messes up the code in sre_lib.h, and reduces performance a little (roughly 6% slower), so I gave up this solution. PR12160 uses a memory pool, this solution doesn't m

[issue35859] Capture behavior depends on the order of an alternation

2019-03-12 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12269 ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue35859] Capture behavior depends on the order of an alternation

2019-03-12 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12270 ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue35859] Capture behavior depends on the order of an alternation

2019-03-12 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12271 ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue35859] Capture behavior depends on the order of an alternation

2019-03-12 Thread Ma Lin
Change by Ma Lin : Added file: https://bugs.python.org/file48204/t.py ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsub

[issue35859] Capture behavior depends on the order of an alternation

2019-03-12 Thread Ma Lin
Ma Lin added the comment: > Could you please create and run some microbenchmarks to measure > possible performance penalty of additional MARK_PUSHes? I am > especially interested in worst cases. Besides the worst case, I prepared two solutions. Solution_A (PR12288): Fix the bugs, I

[issue36357] Build 32bit Python on Windows with SSE2 instruction set

2019-03-18 Thread Ma Lin
New submission from Ma Lin : On Windows, it seems 32-bit builds (3.7.2/3.8.0a2) don't use SSE2 sufficiently. I tested on the 3.8 branch: python38.dll only uses XMM registers 28 times. The official build is the same. After enabling this option, python38.dll uses XMM registers 11,704 times.

[issue35859] Capture behavior depends on the order of an alternation

2019-03-18 Thread Ma Lin
Change by Ma Lin : -- pull_requests: +12381 ___ Python tracker <https://bugs.python.org/issue35859> ___ ___ Python-bugs-list mailing list Unsubscribe:

[issue35859] Capture behavior depends on the order of an alternation

2019-03-18 Thread Ma Lin
Ma Lin added the comment: I guess PR12427 is mature enough for review, I have been working on it these days. You may review these commits one by one, commit message is review guide. https://github.com/python/cpython/pull/12427/commits Maybe you will need two or three days to understand it
