[issue36226] multipart/related header causes false positive StartBoundaryNotFoundDefect and MultipartInvariantViolationDefect
tzickel added the comment: It should be noted that this causes a big headache for users of requests / urllib3 / etc., as those libraries emit a logging warning on every multipart response because of this bug, which can send people off debugging perfectly valid code: https://github.com/urllib3/urllib3/issues/800 https://github.com/psf/requests/issues/3001 https://github.com/diyan/pywinrm/issues/269 https://github.com/jborean93/pypsrp/issues/39 https://github.com/home-assistant/home-assistant/pull/17042 https://github.com/Azure/azure-storage-python/issues/167 and others -- nosy: +tzickel ___ Python tracker <https://bugs.python.org/issue36226> ___
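To make the report above concrete, a minimal repro sketch (assuming only the headers of a multipart response are fed to the email parser, which is what urllib3 does):

    import email

    # Headers only, no body follows -- the situation when an HTTP client
    # parses just the response headers of a multipart/related response.
    raw = b'Content-Type: multipart/related; boundary="boundary_example"\r\n\r\n'
    msg = email.message_from_bytes(raw)
    # On affected versions this lists StartBoundaryNotFoundDefect and
    # MultipartInvariantViolationDefect even though nothing is wrong:
    print(msg.defects)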
[issue38066] Hide internal asyncio.Stream methods
tzickel added the comment: The documentation needs to scrub these methods as well; for example, https://docs.python.org/3/library/asyncio-stream.html#asyncio.StreamReader.at_eof still mentions them. -- nosy: +tzickel ___ Python tracker <https://bugs.python.org/issue38066> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
New submission from tzickel : bpo-36051 added an optimization that releases the GIL under certain conditions when joining bytes, but it missed a critical path. If the number of items being joined is less than or equal to NB_STATIC_BUFFERS (10), then static_buffers will be used to hold the buffers. https://github.com/python/cpython/blob/5b66ec166b81c8a77286da2c0d17be3579c3069a/Objects/stringlib/join.h#L54 But the decision whether to release the GIL (drop_gil) does not take this into consideration, so the GIL might be released and another thread is then free to enter the same code path and hijack the static_buffers for its own use, causing a race condition. A decision should be made whether the optimization should avoid the static buffers in this case (although that choice happens early in the code...) or should never drop the GIL when the static buffers are in use (which might make the optimization not worthwhile, since the GIL-drop decision is based on the length of the data to join, not the number of items). -- messages: 364288 nosy: tzickel priority: normal severity: normal status: open title: A race condition with GIL releasing exists in stringlib_bytes_join versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue39974> ___
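A hedged sketch of how one might try to exercise the suspected path (not a confirmed reproducer, and note that the report was later withdrawn below): several threads concurrently joining ten or fewer large buffer-protocol objects, so the static_buffers path and the size-based GIL-drop heuristic could both apply.

    import threading

    def hammer():
        # <= 10 items so the static_buffers path is taken; large buffers so
        # the size heuristic for dropping the GIL could trigger.
        parts = [bytearray(b"x" * (1 << 20)) for _ in range(10)]
        expected = b"-".join(bytes(p) for p in parts)
        for _ in range(100):
            assert b"-".join(parts) == expected

    threads = [threading.Thread(target=hammer) for _ in range(8)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()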
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
Change by tzickel : -- nosy: +bmerry, inada.naoki ___ Python tracker <https://bugs.python.org/issue39974> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
tzickel added the comment: Also, at line https://github.com/python/cpython/blob/d07d9f4c43bc85a77021bcc7d77643f8ebb605cf/Objects/stringlib/join.h#L85 perhaps add a check that the backing object is really mutable (Py_buffer.readonly)? -- ___ Python tracker <https://bugs.python.org/issue39974> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
tzickel added the comment: Also, semi-related (not sure where to discuss it): would a good .join() optimization be to add an optional length parameter, like .join(iterable, length=10)? In that code path it could skip the call to PySequence_Fast (which converts a non-list input to a list) and the pre-pass that computes the allocation size, and instead copy directly from the input until length items are consumed, raising an exception only if the iterable runs out early. There are places where you know how many items you expect to join (or you want to join just part of the input). -- ___ Python tracker <https://bugs.python.org/issue39974> ___
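A pure-Python illustration of the proposed, hypothetical semantics (it cannot show the allocation savings, only what the extra parameter would mean):

    from itertools import islice

    def join_with_length(sep, iterable, length):
        # Hypothetical .join(iterable, length=n): consume exactly n items
        # and join them; complain only if the input runs out early.
        items = list(islice(iterable, length))
        if len(items) < length:
            raise ValueError("iterable yielded fewer than %d items" % length)
        return sep.join(items)

    print(join_with_length(b"-", iter([b"a", b"b", b"c", b"d"]), 3))  # b'a-b-c'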
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
tzickel added the comment: My mistake... -- resolution: -> not a bug ___ Python tracker <https://bugs.python.org/issue39974> ___
[issue39974] A race condition with GIL releasing exists in stringlib_bytes_join
tzickel added the comment: Regarding getting the buffer and releasing the GIL: if that is wrong, why not fix the other places in the code that do it, like https://github.com/python/cpython/blob/611836a69a7a98bb106b4d315ed76a1e17266f4f/Modules/posixmodule.c#L9619 ? There the GIL is released, the syscall might block, and iov contains pointers into the buffers. -- ___ Python tracker <https://bugs.python.org/issue39974> ___
[issue40007] An attempt to make asyncio.transport.writelines (selector) use Scatter I/O
New submission from tzickel : I have code that tries to be smart and prepare data to be chunked efficiently before sending, so I was happy to read about https://docs.python.org/3/library/asyncio-protocol.html#asyncio.WriteTransport.writelines, only to see that it simply does: self.write(b''.join(lines)) So I've attempted to write a version that uses sendmsg (scatter I/O) instead (will be attached in a PR). What I've learnt is:
1. It's hard to benchmark (if someone has a good way of checking whether it's worth it, feel free to add one).
2. sendmsg has an OS limit on how many items can be passed in one call. If the user does not call writer.drain(), there might be too many items in the buffer; in that case I concat them, as sketched after this message (that might be an expensive operation, but it should not be the normal case).
3. socket.socket.sendmsg can accept any bytes-like iterable, but os.writev can only accept sequences. Is that a bug?
4. This is for the selector stream socket for now.
-- components: asyncio messages: 364565 nosy: asvetlov, tzickel, yselivanov priority: normal pull_requests: 18416 severity: normal status: open title: An attempt to make asyncio.transport.writelines (selector) use Scatter I/O type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue40007> ___
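A rough sketch of the fallback described in point 2 (the names are made up, the real transport code is more involved, and the per-call limit is OS-dependent):

    import os
    import socket

    try:
        IOV_MAX = os.sysconf("SC_IOV_MAX")  # not available on every OS
    except (AttributeError, ValueError, OSError):
        IOV_MAX = 1024

    def send_vectored(sock, chunks):
        # Scatter-gather send; if the buffered chunks exceed what one
        # sendmsg call may carry, concatenate them (the possibly expensive
        # path mentioned above).
        chunks = list(chunks)
        if len(chunks) > IOV_MAX:
            chunks = [b"".join(chunks)]
        return sock.sendmsg(chunks)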
[issue40007] An attempt to make asyncio.transport.writelines (selector) use Scatter I/O
tzickel added the comment: BTW, if wanted, a much simpler PR can be made, where writelines simply calls sendmsg on the input if no buffer exists, and otherwise concatenates and behaves like the current code base. -- ___ Python tracker <https://bugs.python.org/issue40007> ___
[issue40023] os.writev and socket.sendmsg return value are not ideal
New submission from tzickel : os.writev and socket.sendmsg accept an iterable, but the return value is the number of bytes sent. That is not helpful, as the user has to write manual code to figure out which part of the iterable was not sent. I propose to make a version of the functions where:
1. The return value is an iterable of the leftovers (including, possibly, one memoryview into an item that was only partly sent); a sketch follows this message.
2. There is a small quirk where writev accepts only sequences but sendmsg accepts any iterable, which makes them behave differently for no good reason.
3. Do we want a sendmsgall, like socket's sendall, which doesn't give up until everything is sent?
4. Today, using writev / sendmsg in a fully compliant way requires checking that the number of input items in the iterable does not go over IOV_MAX; maybe the Python version of the functions should handle this automatically (and if it overflows, return the extra in the leftovers)?
Should this be the current functions with an optional argument (return_leftovers) or new functions altogether? -- components: Library (Lib) messages: 364651 nosy: larry, tzickel priority: normal severity: normal status: open title: os.writev and socket.sendmsg return value are not ideal type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue40023> ___
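A pure-Python sketch of the leftovers idea from point 1 (a hypothetical helper, not an existing API): given the buffers and the byte count the OS reported, produce the unsent remainder.

    def leftovers(buffers, sent):
        # Skip fully sent buffers; a partially sent one comes back as a
        # memoryview into the original object, followed by everything
        # that was not sent at all.
        rest = []
        it = iter(buffers)
        for buf in it:
            n = len(buf)
            if sent >= n:
                sent -= n
                continue
            rest.append(memoryview(buf)[sent:])
            rest.extend(it)
            break
        return rest

    # leftovers([b"abc", b"def"], 4) -> [memoryview over b"ef"]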
[issue40085] Argument parsing option c should accept int between -128 to 255 ?
New submission from tzickel : I converted some code from Python to the C API and was surprised that the code stopped working. Basically, the "c" parsing option allows 1-char bytes or bytearray inputs and converts them to a C char. But just as indexing a bytes object returns an int, this option should accept ints as well, i.e. b't'[0] == 116. Not sure if it should be limited to 0..255 or -128..127. -- components: C API messages: 365139 nosy: tzickel priority: normal severity: normal status: open title: Argument parsing option c should accept int between -128 to 255 ? type: enhancement versions: Python 3.9 ___ Python tracker <https://bugs.python.org/issue40085> ___
[issue40262] SSL recv_into requires the object to implement __len__ unlike socket one
New submission from tzickel : I am writing this as a bug, as I have an object which implements the buffer protocol but not __len__. SSL's recv_into seems to require the buffer object to implement __len__, unlike the socket recv_into, which uses the buffer protocol's length. Here is the socket.recv_into implementation: https://github.com/python/cpython/blob/402e1cdb132f384e4dcde7a3d7ec7ea1fc7ab527/Modules/socketmodule.c#L3556 As you can see, the length is optional, and if not given, it is taken from the buffer protocol's length. But here is the SSL recv_into implementation: https://github.com/python/cpython/blob/master/Lib/ssl.py#L1233 If the length is not given, it calls the __len__ of the object itself (not its buffer protocol length). -- assignee: christian.heimes components: SSL messages: 366257 nosy: christian.heimes, tzickel priority: normal severity: normal status: open title: SSL recv_into requires the object to implement __len__ unlike socket one type: behavior versions: Python 3.6, Python 3.7, Python 3.8, Python 3.9 ___ Python tracker <https://bugs.python.org/issue40262> ___
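Roughly, the Python-level path looks like this (paraphrased from the linked Lib/ssl.py, not the verbatim source; note that even the truth test on the buffer goes through __len__, while the C socket path asks the buffer protocol for the length when no nbytes is given):

    # Paraphrase of ssl.SSLSocket.recv_into:
    def recv_into(sslsock, buffer, nbytes=None, flags=0):
        if buffer and (nbytes is None):   # the truth test also needs __len__
            nbytes = len(buffer)          # requires __len__ on the object
        elif nbytes is None:
            nbytes = 1024
        return sslsock.read(nbytes, buffer)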
[issue30581] os.cpu_count() returns wrong number of processors on system with > 64 logical processors
tzickel added the comment: One should be careful with this modification because of the Windows definition of process groups. For example, if multi-threaded code thinks that by reading the value of the new os.cpu_count() it can use all the cores returned, by default it cannot, as on Windows a process can by default run only in a single process group (which is how it worked before). We can see such code built into the Python stdlib itself: https://github.com/python/cpython/blob/bc61315377056fe362b744d9c44e17cd3178ce54/Lib/concurrent/futures/thread.py#L102 I think even .NET still uses the old way that Python used until now: https://github.com/dotnet/corefx/blob/aaaffdf7b8330846f6832f43700fbcc060460c9f/src/System.Runtime.Extensions/src/System/Environment.Windows.cs#L71 Although multiprocessing-based Python code might actually get a boost from this (since different processes can get scheduled to different groups): https://msdn.microsoft.com/en-us/library/windows/desktop/dd405503(v=vs.85).aspx -- nosy: +tzickel ___ Python tracker <http://bugs.python.org/issue30581> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: A. It would be nice to add a test for this. B. Now that Pool cleans up properly, any of its functions which return another object (like imap's IMapIterator) need to hold a reference to the Pool, so it won't get cleaned up before the computation finishes. -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: Here is something quick I did to check if it works (it works), but I'm not fluent in multiprocessing code, so if I'm missing something or doing something wrong feel free to tell me: https://github.com/tzickel/cpython/commit/ec63a43706f3bf615ab7ed30fb095607f6101e26 -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue35378] multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
tzickel added the comment: It's important to note that before those PRs, that code would leak the Pool instance until the process ends (once per call). https://github.com/python/cpython/compare/master...tzickel:fix34172 is my proposed fix (till I get it into a PR). -- ___ Python tracker <https://bugs.python.org/issue35378> ___
[issue35378] multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
tzickel added the comment: I don't mind. I think my code is ready for review, but I'm not versed in this, so if you think you have something better, feel free to open a PR, or tell me if I should submit mine and you can comment on it: https://github.com/python/cpython/compare/master...tzickel:fix34172 -- ___ Python tracker <https://bugs.python.org/issue35378> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: The previous posts here touch on all these subjects:
A. The documentation explicitly says: "When the pool object is garbage collected terminate() will be called immediately." (That held until a code refactor 9 years ago introduced this bug.)
B. A large amount of code was developed for this technique: https://github.com/python/cpython/blob/master/Lib/multiprocessing/util.py#L147 (till almost the end of the file).
C. The reason I opened this bug was that I was called to see why a long-running process crashed after a while, and found out it leaked tons of subprocesses / pool._cache memory.
D. The quoted code will currently leak lots of subprocesses on each invocation...
I, too, think we should push for the said fix. -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: Reverting the code will cause another class of problems, like the reason I fixed it. Programs written like the example Pablo gave (and like those I've seen) will quietly leak child processes, file descriptors (for the pipes) and memory; to a varying degree this may go undetected, or be detected in the end by a big error or crash. Also, in some ResourceWarning cases (if not all), the resources do get closed in the end (as with sockets); here, without this patch, you cannot implicitly reclaim the resources (because there is a Thread involved), which I think is a high bar for the user to think about. You can also enable multiprocessing's debug logging to see how the code behaves with and without the fix: https://stackoverflow.com/a/1353037 I also agree with Pablo that there is code in the stdlib that holds references between child and parent. There is also code that has circular references (for example Python 2's OrderedDict), and that is OK as well (not that this is the situation here). -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue35378] multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
tzickel added the comment: https://bugs.python.org/issue35267 -- ___ Python tracker <https://bugs.python.org/issue35378> ___
[issue35378] multiprocessing.Pool.imaps iterators do not maintain alive the multiprocessing.Pool objects
tzickel added the comment: +1 -- ___ Python tracker <https://bugs.python.org/issue35378> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: OK, this issue has been biting me a few more times in production, so for now I've added the environment variable PYTHONDONTWRITEBYTECODE, which resolves it (but it's a hack). I'm sure I am not the only one with it (recall that this is happening in a complex setup where I run Python on Windows via a network drive hosted on NetApp, and it runs thousands of times from many machines). So, I've been looking for a simple way to show you it's a major fault in Python 2's import I/O error checking, and found that new strace has a fault injection capability :)

In this demo I'll be running under Debian sid (it has an strace recent enough for fault injection, and the latest Python 2); you can use docker if you don't have it. This example is on one of Python's init modules, but of course it can happen on any .py file. On my mac, I'm running:

user$ docker run -it --cap-add SYS_PTRACE debian:sid

The cap-add is needed for strace to run.

root@2dcc36934ea6:/# apt-get update && apt-get install -y strace python
root@2dcc36934ea6:/# python
Python 2.7.14 (default, Sep 17 2017, 18:50:44) [GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

Python works just fine.

root@2dcc36934ea6:/# strace -P /usr/lib/python2.7/sre_parse.pyc -P /usr/lib/python2.7/sre_parse.py -e trace=read -e fault=read python
read(7, 0x55c3ac2ad900, 4096) = -1 EPERM (Operation not permitted) (INJECTED)
...
read(6, 0x55c3ac2cb680, 4096) = -1 EPERM (Operation not permitted) (INJECTED)
...
Traceback (most recent call last):
  File "/usr/lib/python2.7/site.py", line 554, in <module>
    main()
  ...
  File "/usr/lib/python2.7/sre_compile.py", line 572, in compile
    p = sre_parse.parse(p, flags)
AttributeError: 'module' object has no attribute 'parse'
+++ exited with 1 +++

This command simply causes the python process to fail the read syscalls on the files sre_parse.py and sre_parse.pyc (the .pyc, BTW, already existed from a previous run). This should be OK, since it can't read a required module from disk.

root@2dcc36934ea6:/# python
Traceback (most recent call last):
  File "/usr/lib/python2.7/site.py", line 554, in <module>
    main()
  ...
  File "/usr/lib/python2.7/sre_compile.py", line 572, in compile
    p = sre_parse.parse(p, flags)
AttributeError: 'module' object has no attribute 'parse'

This is already bad: python does not work anymore, now even without an I/O error :(

root@2dcc36934ea6:/# ls -l /usr/lib/python2.7/sre_parse.pyc
-rw-r--r-- 1 root root 118 Oct 21 09:20 /usr/lib/python2.7/sre_parse.pyc

If we check, we see that the previous python instance, the one with the I/O error, created an empty but bytecode-valid sre_parse.pyc (you can verify by dis.dis'ing it and seeing it's an empty code object); this is the crux of the bug.

root@2dcc36934ea6:/# rm /usr/lib/python2.7/sre_parse.pyc

Let's delete the bad .pyc file.

root@2dcc36934ea6:/# python
Python 2.7.14 (default, Sep 17 2017, 18:50:44) [GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

yey, python works now again

root@2dcc36934ea6:/# ls -l /usr/lib/python2.7/sre_parse.pyc
-rw-r--r-- 1 root root 21084 Oct 21 09:20 /usr/lib/python2.7/sre_parse.pyc

We can see that the .pyc file now has a much bigger size (21084 bytes compared to 118 before).

root@2dcc36934ea6:/# strace -P /usr/lib/python2.7/sre_parse.pyc -P /usr/lib/python2.7/sre_parse.py -e trace=read -e fault=read python -B
read(7, 0x55ceb72a7900, 4096) = -1 EPERM (Operation not permitted) (INJECTED)
...
read(6, 0x55ceb72c5680, 4096) = -1 EPERM (Operation not permitted) (INJECTED)
...
Traceback (most recent call last):
  File "/usr/lib/python2.7/site.py", line 554, in <module>
    main()
  ...
AttributeError: 'module' object has no attribute 'parse'
+++ exited with 1 +++

We can now try this issue with python -B, which should not try to create .pyc files.

root@2dcc36934ea6:/# python
Python 2.7.14 (default, Sep 17 2017, 18:50:44) [GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> exit()

yey, python still works (hopefully, if this is a network I/O error, it will stop occurring on a future run with a watchdog for a server app)

A less likely variant (but possible) is that if the .pyc does not exist and you get an I/O error while importing the .py, it will produce a bad .pyc file. You can try it:

root@2dcc36934ea6:/# rm /usr/lib/python2.7/sre_parse.pyc
root@2dcc36934ea6:/# strace -P /usr/lib/python2.7/sre_parse.py -e trace=read -e fault=read python ro
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Ignore the hash at the start of each shell prompt (it's output from docker, and not related to Python commits). BTW, I forgot to mention: of course, when doing the fault injection on the .py files, the error is bad as well. It should be an I/O error, but instead it claims the module is empty: AttributeError: 'module' object has no attribute 'parse' (The patch actually fixes that.) -- ___ Python tracker <https://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Added a script to check if the bug exists (provided you have an updated strace, 4.15 or above).

Without the patch:

# ./import_io_check.sh
strace: Requested path 'tmp.py' resolved into '/root/tmp.py'
read(3, 0x55fc3a71cc50, 4096) = -1 ENOSYS (Function not implemented) (INJECTED)
read(3, 0x55fc3a71cc50, 4096) = -1 ENOSYS (Function not implemented) (INJECTED)
Traceback (most recent call last):
  File "", line 1, in
ImportError: No module named py
+++ exited with 1 +++
Traceback (most recent call last):
  File "", line 1, in
ImportError: No module named py
Bug exists, an incorrect .pyc has been produced

With the patch:

# PYTHON=Python-2.7.14-with-patch/python ./import_io_check.sh
strace: Requested path 'tmp.py' resolved into '/root/tmp.py'
read(3, 0x55a8ff7d3020, 4096) = -1 ENOSYS (Function not implemented) (INJECTED)
read(3, 0x55a8ff7d3020, 4096) = -1 ENOSYS (Function not implemented) (INJECTED)
Traceback (most recent call last):
  File "", line 1, in
  File "tmp.py", line 1
    ^
SyntaxError: unexpected EOF while parsing
+++ exited with 1 +++
Script finished successfully

-- nosy: +brett.cannon Added file: https://bugs.python.org/file47229/import_io_check.sh ___ Python tracker <https://bugs.python.org/issue25083> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
New submission from tzickel : The multiprocessing.Pool documentation says "When the pool object is garbage collected terminate() will be called immediately.": https://docs.python.org/3.7/library/multiprocessing.html#multiprocessing.pool.Pool.terminate
A. This does not happen; creating a Pool, deleting it and collecting the garbage does not call terminate.
B. The documentation for Pool itself does not specify that it has a context manager (but the examples show it).
C. This bug is in both Python 3 and 2.
-- components: Library (Lib) messages: 322028 nosy: tzickel priority: normal severity: normal status: open title: multiprocessing.Pool and ThreadPool leak resources after being deleted type: behavior versions: Python 2.7, Python 3.7 ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment:

>>> from multiprocessing import Pool
>>> import gc
>>> a = Pool(10)
>>> del a
>>> gc.collect()
0
>>>

After this, there are still Process (Pool) or Dummy (ThreadPool) objects left behind, and big _cache data (if you did something with the pool), which lingers till the process dies. You are correct on the other issue (I'm using and reading the Python 2 documentation, which does not have that...). -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: But alas that does not work... -- nosy: +davin, pitrou ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: What other object in the standard lib leaks resources when deleted in CPython? Even that documentation says the garbage collector will eventually destroy it, just like here... I think this is an implementation bug. -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: I think I've found the code bug causing the leak: https://github.com/python/cpython/blob/caa331d492acc67d8f4edd16542cebfabbbe1e79/Lib/multiprocessing/pool.py#L180 There is a circular reference between the Pool object and the self._worker_handler Thread object (the Pool is also saved in the running thread's frame locals, which prevents it from being garbage collected). -- ___ Python tracker <https://bugs.python.org/issue34172> ___
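A quick way to observe this on an affected interpreter (the weakref stays alive even after del and a full collection):

    import gc
    import weakref
    from multiprocessing.pool import ThreadPool

    p = ThreadPool(2)
    wr = weakref.ref(p)
    del p
    gc.collect()
    # On an affected interpreter this still prints the pool: the running
    # _handle_workers thread's frame references it, so it never dies.
    print(wr())
    leaked = wr()
    if leaked is not None:
        leaked.terminate()
        leaked.join()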
[issue25083] Python can sometimes create incorrect .pyc files
Change by tzickel : -- pull_requests: +7971 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue25083> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Change by tzickel : -- keywords: +patch pull_requests: +7972 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34300] gcc 7.3 causes a warning when compiling getpath.c in python 2.7
New submission from tzickel : When compiling the 2.7 branch on Ubuntu 18.04, I get this warning:

gcc -pthread -c -fno-strict-aliasing -g -O2 -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -DPy_BUILD_CORE -DPYTHONPATH='":plat-linux2:lib-tk:lib-old"' \
-DPREFIX='"/usr/local"' \
-DEXEC_PREFIX='"/usr/local"' \
-DVERSION='"2.7"' \
-DVPATH='""' \
-o Modules/getpath.o ./Modules/getpath.c
In file included from /usr/include/string.h:494:0,
                 from Include/Python.h:38,
                 from ./Modules/getpath.c:3:
In function 'strncpy',
    inlined from 'joinpath' at ./Modules/getpath.c:202:5,
    inlined from 'search_for_prefix' at ./Modules/getpath.c:265:9,
    inlined from 'calculate_path' at ./Modules/getpath.c:505:8:
/usr/include/x86_64-linux-gnu/bits/string_fortified.h:106:10: warning: '__builtin_strncpy': specified size between 9223372036854779906 and 18446744073709551615 exceeds maximum object size 9223372036854775807 [-Wstringop-overflow=]
   return __builtin___strncpy_chk (__dest, __src, __len, __bos (__dest));
          ^~

I think it's because the compiler can't reason that Py_FatalError aborts the program, and thus can't prove that strncpy doesn't overflow. Since there are about 3-4 such warnings while building, maybe we should add a manual return after Py_FatalError in joinpath? -- components: Build messages: 322809 nosy: tzickel priority: normal severity: normal status: open title: gcc 7.3 causes a warning when compiling getpath.c in python 2.7 versions: Python 2.7 ___ Python tracker <https://bugs.python.org/issue34300> ___
[issue34300] gcc 7.3 causes a warning when compiling getpath.c in python 2.7
tzickel added the comment: Changing the Py_FatalError prototype to add __attribute__((noreturn)) also stops the warning. -- ___ Python tracker <https://bugs.python.org/issue34300> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: It actually makes tons of sense that while the thread is running, the object representing it is alive. After the thread finishes its work, the object dies.

>>> import time, threading, weakref, gc
>>> t = threading.Thread(target=time.sleep, args=(10,))
>>> wr = weakref.ref(t)
>>> t.start()
>>> del t
>>> gc.collect()
>>> wr()

Wait 10 seconds...

>>> gc.collect()
>>> wr()

The thread is gone (which doesn't happen with the pool). Anyhow, I've submitted a patch on GH to fix the bug that was introduced 9 years ago; feel free to check it. -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
Change by tzickel : -- pull_requests: +9072 ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue34172] multiprocessing.Pool and ThreadPool leak resources after being deleted
tzickel added the comment: It's ok, you only did it twice :) I've submitted a manual 2.7 fix on GH. -- ___ Python tracker <https://bugs.python.org/issue34172> ___
[issue35030] Python 2.7 OrderedDict leaks memory
New submission from tzickel : https://github.com/requests/requests/issues/4553#issuecomment-431514753 It was fixed in Python 3 by using weakrefs, but not backported to Python 2. Also, it might be nice to somehow run that leak test in more places to detect such issues? -- components: Library (Lib) messages: 328133 nosy: rhettinger, tzickel priority: normal severity: normal status: open title: Python 2.7 OrderedDict leaks memory versions: Python 2.7 ___ Python tracker <https://bugs.python.org/issue35030> ___
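A minimal way to see it on Python 2, where each linked-list node holds hard references to its neighbours, so a non-empty OrderedDict can only be reclaimed by the cycle collector:

    # Python 2 sketch; on Python 3 the weakref-based design avoids the cycle.
    from collections import OrderedDict
    import gc

    gc.collect()            # start from a clean slate
    d = OrderedDict(a=1)
    del d
    print(gc.collect())     # non-zero on Python 2: the nodes formed a cycle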
[issue35030] Python 2.7 OrderedDict creates circular references
Change by tzickel : -- keywords: +patch pull_requests: +9344 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35030> ___
[issue35030] Python 2.7 OrderedDict creates circular references
tzickel added the comment: You can see the testing code here: https://github.com/numpy/numpy/blob/eb40e161e2e593762da9c77858343e3720351ce7/numpy/testing/_private/utils.py#L2199 It calls gc.collect at the end and only throws this error if that returns a non-zero value (after clearing the gc before calling the test code). -- ___ Python tracker <https://bugs.python.org/issue35030> ___
[issue35030] Python 2.7 OrderedDict creates circular references
tzickel added the comment: I see, so basically this would be a problem only if the root object had a __del__ method, and then the GC wouldn't reclaim it? -- ___ Python tracker <https://bugs.python.org/issue35030> ___
[issue35030] Python 2.7 OrderedDict creates circular references
tzickel added the comment: Is this commit interesting? It has fewer lines, is simpler, creates no cycles to collect, and in my limited benchmark it seems faster than the current implementation: https://github.com/tzickel/cpython/commit/7e8b70b67cd1b817182be4dd2285bd136e6b156d -- ___ Python tracker <https://bugs.python.org/issue35030> ___
[issue35030] Python 2.7 OrderedDict creates circular references
tzickel added the comment: Sorry, ignore it. Closed the PR as well. -- ___ Python tracker <https://bugs.python.org/issue35030> ___
[issue3243] Support iterable bodies in httplib
Change by tzickel : -- pull_requests: +9540 ___ Python tracker <https://bugs.python.org/issue3243> ___
[issue35117] set.discard should return True or False based on if element existed.
New submission from tzickel : Sometimes you want to do something based on whether the item existed before removal. Instead of first checking if it exists, then removing it and doing something, it would be nice if the function returned True or False based on whether the element existed. -- components: Interpreter Core messages: 328938 nosy: tzickel priority: normal severity: normal status: open title: set.discard should return True or False based on if element existed. type: enhancement versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue35117> ___
[issue3243] Support iterable bodies in httplib
tzickel added the comment: This patch was opened for 2.7 but never applied there? https://github.com/python/cpython/pull/10226 This causes a bug with the requests HTTP library (and others, as well as httplib itself) when you want to send an iterable object as POST data (in a non-chunked way); it works in Python 3 but not in 2, and this affects behaviour and performance... -- nosy: +tzickel ___ Python tracker <https://bugs.python.org/issue3243> ___
[issue35117] set.discard should return True or False based on if element existed.
tzickel added the comment: I would think that .discard is the set equivalent of dict's .pop (instead of wasting time checking once and then removing; also, in a set the data is the data, there is no value to fetch). Even the standard lib makes heavy use of dict.pop(key, None) to avoid throwing an exception (instead of omitting the default and catching the exception). What sets .discard apart from the other mutating APIs is that it does not throw an exception. -- ___ Python tracker <https://bugs.python.org/issue35117> ___
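The analogy in code; the set line is the hypothetical part:

    d = {"k": 1}
    s = {"k"}

    value = d.pop("k", None)   # existing API: no exception, and the return
    if value is not None:      # value tells you the key was there
        print("dict had it")

    # proposed (hypothetical) set analogue:
    #   existed = s.discard("k")
    # today it takes two steps instead:
    if "k" in s:
        s.discard("k")
        print("set had it")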
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
New submission from tzickel : There is a TODO in the code about this: https://github.com/python/cpython/blob/e42b705188271da108de42b55d9344642170aa2b/Modules/_io/iobase.c#L909 -- components: IO messages: 329629 nosy: tzickel priority: normal severity: normal status: open title: Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read type: performance versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue35210> ___
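In Python terms, the TODO amounts to replacing the first shape below with the second (a sketch only; the real change is in C, where the shrink is a resize of the bytes object rather than a second copy):

    def read_current(raw, size):
        b = bytearray(size)
        n = raw.readinto(b)
        if n is None:
            return None
        return bytes(b[:n])   # second allocation plus a copy

    def read_proposed(raw, size):
        b = bytearray(size)
        n = raw.readinto(b)
        if n is None:
            return None
        del b[n:]             # stands in for the C-level realloc/resize
        return bytes(b)       # the C version would hand back the buffer itself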
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
Change by tzickel : -- keywords: +patch pull_requests: +9726 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue35210> ___
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
Change by tzickel : -- nosy: +benjamin.peterson, stutzbach ___ Python tracker <https://bugs.python.org/issue35210> ___
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
tzickel added the comment: How is that different from the situation today? The bytearray passed to readinto() is deleted before the function ends. This revision simply changes 2 mallocs and a memcpy into 1 malloc and a potential realloc. -- ___ Python tracker <https://bugs.python.org/issue35210> ___
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
tzickel added the comment: I think that if someone tries that, this code will raise an exception at the resize part (since the reference count will be higher than one). A check can be added to fall back to the previous behaviour in that case; if it's a required check, I can add it. -- ___ Python tracker <https://bugs.python.org/issue35210> ___
[issue35210] Use bytes + memoryview + resize instead of bytesarray + array in io.RawIOBase.read
tzickel added the comment: Ahh, very interesting discussion. BTW, how is this code different from https://github.com/python/cpython/blame/50ff02b43145f33f8e28ffbfcc6a9d15c4749a64/Modules/_io/bufferedio.c which does exactly the same thing? (i.e. the memoryview can leak there as well) -- ___ Python tracker <https://bugs.python.org/issue35210> ___
[issue28824] os.environ should preserve the case of the OS keys ?
New submission from tzickel: On Windows, python's os.environ currently handles case sensitivity differently than the OS. While it's true that the OS is case-insensitive, it does preserve the case that you first set a variable with. For example:

C:\Users\user>set aSD=Blah
C:\Users\user>set asd
aSD=Blah

But in python:

>>> import os
>>> 'aSD' in os.environ.keys()
False

Today, as more people pass environment variables to processes, it's better to behave as the OS does. Basically, I think that os.environ (both in 2.7 and 3) should preserve the case as well (for when you need to access / iterate over the keys or set a key), but ignore it when you get a key. https://github.com/python/cpython/blob/b82a5a65caa5b0f0efccaf2bbea94f1eba19a54d/Lib/os.py#L733 -- components: Windows messages: 281906 nosy: larry, loewis, paul.moore, steve.dower, tim.golden, tzickel, zach.ware priority: normal severity: normal status: open title: os.environ should preserve the case of the OS keys ? versions: Python 2.7, Python 3.7 ___ Python tracker <http://bugs.python.org/issue28824> ___
[issue28824] os.environ should preserve the case of the OS keys ?
tzickel added the comment: My issue is that somebody wants to pass a few dict-like environment variables as prefix_key=value, but wants to preserve the case of the key for usage in Python, so the .keys() space needs to be enumerated. A workaround for this issue is importing nt and using nt.environ, which preserves the case. -- ___ Python tracker <http://bugs.python.org/issue28824> ___
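The workaround from the last sentence, spelled out (Windows-only; 'prefix_' stands in for whatever key prefix is actually used):

    import nt  # the raw Windows environment, before os.environ upper-cases keys

    original_case = {k: v for k, v in nt.environ.items()
                     if k.lower().startswith('prefix_')}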
[issue28824] os.environ should preserve the case of the OS keys ?
tzickel added the comment: Steve, I've checked in Python 3.5.2, and os.environ.keys() still uppercases everything when scanning (for my use case). Has it changed since then? -- ___ Python tracker <http://bugs.python.org/issue28824> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: any chance for 2.6.12 ? 4-line patch. -- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Sorry Brett, of course I meant the upcoming 2.7.12 -- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue27743] Python 2 has a wrong artificial limit on the amount of memory that can be allocated in ctypes
New submission from tzickel: Python 2 has a wrong artificial limit on the amount of memory that can be allocated in ctypes via sequence repeating (i.e. using create_string_buffer or c_char * n). The problem is practical on 64-bit Windows, when running 64-bit python, since on that platform sys.maxint is 2GB; sys.maxsize is as large as on other platforms, but trying to allocate more than 2GB of memory results in a different exception than on other platforms (where sys.maxint == sys.maxsize):

Python 2.7.11 (v2.7.11:6d1b6a68f775, Dec 5 2015, 20:40:30) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys, ctypes
>>> ctypes.c_char * (sys.maxint + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: class must define a '_length_' attribute, which must be a positive integer
>>> ctypes.create_string_buffer(sys.maxint + 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "c:\Python27-64\lib\ctypes\__init__.py", line 65, in create_string_buffer
    buftype = c_char * init
AttributeError: class must define a '_length_' attribute, which must be a positive integer

On other platforms you get:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: cannot fit 'long' into an index-sized integer

Thus, to allocate more than 2GB, you need to use other methods (numpy or something else). From my reading of the code, I assume the bug is this line: https://github.com/python/cpython/blob/2.7/Modules/_ctypes/_ctypes.c#L1388 where the code checks whether _length_ is an integer (PyInt_Check). As soon as the number is bigger than sys.maxint it's a long, and while it's a valid size for the platform (< sys.maxsize), it borks there. Since this seems like an artificial limit, I think it should be fixed; it's practical to allocate sizes like these on 64-bit systems for some applications. Python 3 has no issue, since it has no int type :) -- components: ctypes messages: 272510 nosy: Bob, Christoph Sarnowski, Patrick Stewart, doko, jpe, larry, mark.dickinson, mattip, meador.inge, python-dev, rkuska, steve.dower, tzickel, vinay.sajip priority: normal severity: normal status: open title: Python 2 has a wrong artificial limit on the amount of memory that can be allocated in ctypes type: behavior versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue27743> ___
[issue25083] Python can sometimes create incorrect .pyc files
New submission from tzickel: I had a non-reproducible issue occur a few times in which python 2.7.9 would produce .pyc files with empty code objects on a network drive under Windows. The .pyc files might have been created due to intermittent network errors that are hard to reproduce reliably. The incorrect .pyc files would overwrite the previous correct .pyc files in the same place. An incorrect .pyc is a valid file, but instead of holding the compiled code object of the original .py file, it holds the code object of an empty .py file. Python then goes on using the incorrect .pyc file until it is manually deleted. These peculiar .pyc files got me thinking about how cpython can produce such an incorrect .pyc file instead of failing. The main issue here is that the getc function returns EOF both on end-of-file and on file error. It seems that if the tokenizer starts reading the file stream and gets an EOF directly, it does not check whether that resulted from actually reading an empty file or from a file error, and happily returns an empty AST, which is then compiled to a bad, empty-code .pyc instead of the process aborting because of the file error. -- messages: 250556 nosy: tzickel priority: normal severity: normal status: open title: Python can sometimes create incorrect .pyc files type: behavior versions: Python 2.7 ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: You are not looking at the correct code; the function you are pointing to, check_compiled_module, is run to check the existing .pyc (it is a good question why the .pyc is overwritten, but that is a secondary issue, which, as I've said, I cannot reproduce on demand). I am talking about the code which creates a new (and incorrect) .pyc in parse_source_module: https://hg.python.org/cpython/file/2.7/Python/import.c#l861 That ends up calling Py_UniversalNewlineFgets: https://hg.python.org/cpython/file/2.7/Objects/fileobject.c#l2749 You can see that this function will return NULL if it gets an EOF because of a file error; the tokenizer which calls it then cannot know whether it got NULL because of EOF or a file error, and will compile the AST and generate an incorrect .pyc file. -- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
Changes by tzickel : -- nosy: +brett.cannon, meador.inge ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: As for the "example" .pyc: just create an empty 0-byte .py file and compile it; that is the same .pyc that was created on my system (except in my case the .py is not empty). Just so people don't have to trace the code like I did, here is the traceback of the primary issue. Remember that my hypothesis is that fopen returns a FILE stream that returns EOF on the first getc because of an I/O error, not because the file is empty:

--> GETC is called, and gets EOF on the first try, and thus Py_UniversalNewlineFgets returns NULL
* frame #0: 0x000109fe4c44 Python`Py_UniversalNewlineFgets
  frame #1: 0x000109fc972c Python`decoding_fgets + 321
  frame #2: 0x000109fc9262 Python`tok_nextc + 918
  frame #3: 0x000109fc830e Python`PyTokenizer_Get + 171
  frame #4: 0x000109fc5853 Python`parsetok + 128
  frame #5: 0x00010a066748 Python`PyParser_ASTFromFile + 109

-> Now load_source_module has an empty AST which will get compiled to an empty code module
-> This code is optimized, so we don't see the parse_source_module call here, but write_compiled_module will be called afterwards with the empty AST tree instead of aborting...

  frame #6: 0x00010a05e3b5 Python`load_source_module + 671
  frame #7: 0x00010a05f003 Python`import_submodule + 270
  frame #8: 0x00010a05ebc6 Python`load_next + 284
  frame #9: 0x00010a05cb5d Python`PyImport_ImportModuleLevel + 453
  frame #10: 0x00010a042641 Python`builtin___import__ + 135

-- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Not sure why nobody has responded yet, but I have gone ahead and made a patch for the problem against 2.7 HEAD. It would be great if someone with more understanding of python's source could say whether this is the optimal place to do the ferror check. I am able to see that this patch fixes the issue (now the import fails on EOF with an error, instead of producing an empty valid AST and an empty code object .pyc file). Unfortunately, testing that the patch fixes the issue currently involves LD_PRELOADing a dynamic library which hooks __srget (because that's what the getc macro in Py_UniversalNewlineFgets uses on POSIX systems) to return EOF, and ferror to return a non-zero result. If the code for the hooking dynamic library is needed, please tell me (I can't figure out how to make an automated test for it).

shell$ cat a.py
print 'hi'

Before fixing python:

shell$ ls a.py*
a.py
shell$ DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=libblah.dylib python
Python 2.7.10 (default, Jul 13 2015, 12:05:58)
[GCC 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['BORK_IO_TEST'] = 'a'  # this activates the EOF / ferror
>>> import a  # this should print 'hi' or fail, but does not...
>>> a
>>> exit()
shell$ ls a.py*
a.py a.pyc

You can see that it accepted a.py as an empty file and a bad a.pyc was created.

After the patch:

shell$ ls a.py*
a.py
shell$ DYLD_FORCE_FLAT_NAMESPACE=1 DYLD_INSERT_LIBRARIES=libblah.dylib ./python.exe
Python 2.7.10+ (2.7:f6125114b55f+, Sep 18 2015, 19:18:34)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.72)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import os
>>> os.environ['BORK_IO_TEST'] = 'a'
>>> import a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "a.py", line 1
    ^
SyntaxError: unexpected EOF while parsing
>>> a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
>>> exit()
shell$ ls a.py*
a.py

Now the import failed, and of course no empty-code .pyc file was created. -- keywords: +patch nosy: +benjamin.peterson Added file: http://bugs.python.org/file40505/ferror_check_in_tokenizer.patch ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Although I haven't reviewed the python 3.5 code, I've put a breakpoint on calls to "ferror" in the debugger, and it seems that python 3 does not check the file status on import either... -- nosy: +eric.snow, ncoghlan ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: TL;DR: Python 2 forgot to do I/O error checking when reading .py files from disk. In some rare situations this can bite Python in the ass and cause it to bork .pyc files. I checked Python 3; it checks the I/O in a different / better way. The next Python 2.7 is out in 1.5 months, and I really want this fix to get in. Did I forget to nosy some developer who can help out? -- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment:
1. You are correct, the issue I am talking about is in parsing source files (although because python caches them as .pyc it's a worse situation).
2. The example you give is EINTR handling (which is mostly about retrying I/O operations interrupted by signals); the real I/O error checking in that get_line is, I believe, in the ferror check right after it. It might be nice to have EINTR checking (and retry) when reading the source file as well, but that is a secondary issue.
3. As for your recommendation to change Py_UniversalNewlineFgets: you can see that its documentation says "Note that we need no error handling: fgets() treats error and eof identically.", it seems to be a low-level function that does not involve any python machinery like exception handling, and with its current signature it can't return an error (it simply returns the buffer, or NULL if nothing was read).
4. As for why I put the check in that position: basically there are a few I/O paths besides Py_UniversalNewlineFgets, such as codec decoding in fp_readl (via decoding_fgets), that can fail on I/O as well. Looking at the code again (not tested), maybe the check can actually be moved to the end of decoding_fgets in tokenizer.c, i.e. if there is an ferror on tok->fp at the end of decoding_fgets, return error_ret(tok) there; but then extra eyes are needed to be sure that no other code path can have an I/O error. I am not an expert on the layout of the tokenizer (I read it mostly to figure out this bug), so if that's better, it can be moved, I guess.
-- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue25562] Python 2 & 3 don't allow the user to disable ctypes SEH in windows
New submission from tzickel: On Windows, there is a mechanism called SEH that allows C/C++ programs to catch OS exceptions (such as divide by zero, page faults, etc.). Python's ctypes module for some reason forcibly wraps all ctypes FFI calls with a special SEH handler that converts those exceptions to Python exceptions. For the UNIX people: think of it as python installing a signal handler, without you asking (or being able to remove it), whenever FFI functions are called. The main issue with this is that when you want to debug why a DLL behaves badly and you want a process dump (or to catch the stack trace inside the DLL), you can't without attaching a debugger and catching first-chance exceptions (because the ctypes SEH handling masks the issue). My proposal is to have, both in python 2 and in python 3, an option to selectively call an FFI function with or without the SEH wrapper. Here is the SEH wrapping (as you can see, it's not optional at runtime): https://github.com/python/cpython/blob/master/Modules/_ctypes/callproc.c#L806 -- components: ctypes messages: 254143 nosy: amaury.forgeotdarc, belopolsky, meador.inge, tzickel priority: normal severity: normal status: open title: Python 2 & 3 don't allow the user to disable ctypes SEH in windows type: behavior versions: Python 2.7, Python 3.6 ___ Python tracker <http://bugs.python.org/issue25562> ___
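A small demonstration of the masking on Windows (faulting on purpose; on a platform without the wrapper this would instead crash where a debugger or a dump could catch it):

    import ctypes

    # Writing through a NULL pointer: the SEH wrapper turns the access
    # violation into a Python exception instead of letting it fault.
    try:
        ctypes.memset(0, 0, 1)
    except Exception as e:   # WindowsError: access violation writing 0x0...
        print(e)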
[issue25083] Python can sometimes create incorrect .pyc files
tzickel added the comment: Meador Inge, any other questions regarding the issue? I can't believe 2.7.11 is coming out soon and nobody is taking this issue seriously enough... -- ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue26003] Issues with PyEval_InitThreads and PyGILState_Ensure
New submission from tzickel: A few issues regarding threads:
A. (Python 2 & 3) The documentation (https://docs.python.org/3/c-api/init.html) about initializing the GIL/threading system does not specify that calling PyEval_InitThreads actually binds the calling thread as the main_thread in ceval.c, meaning that this thread will be in charge of handling Py_AddPendingCall calls until the process goes down, and if it ends/dies, they won't be handled anymore. This ceval.c main_thread is, BTW, different from the one in signalmodule.c, which is bound to the thread that called Py_InitializeEx. Maybe it makes sense for both main_threads to be the same one and initialized at the same time (even without a GIL existing)?
B. (Python 3) Besides the bad documentation regarding this, Python 3.4 (issue #19576) actually hid the call to PyEval_InitThreads inside PyGILState_Ensure. Without careful attention and knowledge by the programmer, this can cause a short-lived thread created in C to bind ceval.c's main_thread, and when that thread dies, main_thread will never be changed again. The reason this is important is that beforehand the programmer needed to think about PyEval_InitThreads; now it's hidden and not even mentioned in the documentation.
C. (Python 2 & 3) The PyEval_InitThreads documentation says: "It is not safe to call this function when it is unknown which thread (if any) currently has the global interpreter lock." Shouldn't the documentation then mention that PyGILState_Ensure now calls it? Also, I believe the reason this quote exists is a potential race condition between thread A, which might be running code in PyEval_EvalFrameEx (before PyEval_InitThreads is called, and thus is not GIL-aware), and thread B, which calls PyEval_InitThreads, then PyGILState_Ensure, then runs Python code while thread A is still running Python code as well. I think the implications (the race condition) should be explained more clearly in the documentation. There might also be a way to make a PyEval_InitThreads variant which overcomes this race condition: basically, use Py_AddPendingCall with a C function that calls PyEval_InitThreads and notifies the calling thread when it's done. This way we can be sure that the GIL is taken by one thread while all the others are blocked. (Maybe a signal should be sent as well, in case the main_thread is blocked on an I/O operation.)
D. (Python 2) If the main_thread finishes its job while other Python threads are still alive, signal handling isn't processed anymore (an example will be added as a file).
-- components: Interpreter Core files: signalexample.py messages: 257425 nosy: tzickel priority: normal severity: normal status: open title: Issues with PyEval_InitThreads and PyGILState_Ensure versions: Python 2.7, Python 3.6 Added file: http://bugs.python.org/file41485/signalexample.py ___ Python tracker <http://bugs.python.org/issue26003> ___
[issue26003] Issues with PyEval_InitThreads and PyGILState_Ensure
Changes by tzickel : -- nosy: +pitrou ___ Python tracker <http://bugs.python.org/issue26003> ___
[issue19576] "Non-Python created threads" documentation doesn't mention PyEval_InitThreads()
tzickel added the comment: I think that the documentation regarding PyGILState_Ensure and PyEval_InitThreads should be clarified better; details are written up in issue #26003. -- nosy: +tzickel ___ Python tracker <http://bugs.python.org/issue19576> ___
[issue25083] Python can sometimes create incorrect .pyc files
Changes by tzickel : -- nosy: +serhiy.storchaka ___ Python tracker <http://bugs.python.org/issue25083> ___
[issue21878] wsgi.simple_server's wsgi.input read/readline waits forever in certain circumstances
tzickel added the comment: Just encountered this issue as well. It's not related to newlines, but to not supporting HTTP persistent connections (wsgi.input is the socket's I/O directly, and if the client uses a persistent connection, then .read() will block forever). A simple solution is to use a saner wsgi server (gevent works nicely). Here is their implementation of the socket I/O wrapper class (Input), with its read/readline functions: https://github.com/gevent/gevent/blob/a65501a1270c1763e9de336a9c3cf52081223ff6/gevent/pywsgi.py#L303 -- nosy: +tzickel ___ Python tracker <http://bugs.python.org/issue21878> ___
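For comparison, the gist of what a persistent-connection-safe wrapper does (a minimal sketch modeled on the idea behind the linked gevent Input class, not its actual code):

    class BoundedInput(object):
        """Cap reads at Content-Length so .read() on a keep-alive
        connection cannot block forever on bytes that will never come."""

        def __init__(self, rfile, content_length):
            self.rfile = rfile
            self.remaining = content_length

        def read(self, size=None):
            if self.remaining <= 0:
                return b''
            if size is None or size > self.remaining:
                size = self.remaining
            data = self.rfile.read(size)
            self.remaining -= len(data)
            return data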