[issue46896] add support for watching writes to selecting dictionaries
New submission from Carl Meyer:

CPython extensions providing optimized execution of Python bytecode (e.g. the Cinder JIT), or even CPython itself (e.g. the faster-cpython project), may wish to inline-cache access to frequently-read and rarely-changed namespaces, e.g. module globals. Rather than requiring a dict version guard on every cached read, the best-performing way to do this is to mark the dictionary as "watched" and set a callback on writes to watched dictionaries. This optimizes the cached-read fast path at a small cost to the (relatively infrequent and usually less perf-sensitive) write path.

We have an implementation of this in Cinder (https://docs.google.com/document/d/1l8I-FDE1xrIShm9eSNJqsGmY_VanMDX5-aK_gujhYBI/edit#heading=h.n2fcxgq6ypwl), used already by the Cinder JIT and its specializing interpreter. We would like to make the Cinder JIT available as a third-party extension to CPython (https://docs.google.com/document/d/1l8I-FDE1xrIShm9eSNJqsGmY_VanMDX5-aK_gujhYBI/), and so we are interested in adding dict watchers to core CPython.

The intention in this issue is not to add any specific optimization or cache (yet); just the ability to mark a dictionary as "watched" and set a write callback. The callback will be global, not per-dictionary (no extra function pointer stored in every dict). CPython will track only one global callback; it is a well-behaved client's responsibility to check whether a callback is already set when setting a new one, and to daisy-chain to the previous callback if so. Given that multiple clients may mark dictionaries as watched, a dict watcher callback may receive events for dictionaries that were marked as watched by other clients, and should handle this gracefully. There is no provision in the API for "un-watching" a watched dictionary; such an API could not be used safely in the face of potentially multiple dict-watching clients.

The Cinder implementation marks dictionaries as watched using the least-significant bit of the dictionary version (so the version increments by 2); this also avoids any additional memory usage for marking a dict as watched.

Initial proposed API, comments welcome:

```
// Mark given dictionary as "watched" (global callback will be called if it is modified)
void PyDict_Watch(PyObject* dict);

// Check if given dictionary is already watched
int PyDict_IsWatched(PyObject* dict);

typedef enum {
    PYDICT_EVENT_CLEARED,
    PYDICT_EVENT_DEALLOCED,
    PYDICT_EVENT_MODIFIED
} PyDict_WatchEvent;

// Callback to be invoked when a watched dict is cleared, dealloced, or modified.
// In the clear/dealloc case, key and new_value will be NULL. Otherwise, new_value
// will be the new value for key, or NULL if key is being deleted.
typedef void(*PyDict_WatchCallback)(PyDict_WatchEvent event, PyObject* dict,
                                    PyObject* key, PyObject* new_value);

// Set new global watch callback; supply NULL to clear callback
void PyDict_SetWatchCallback(PyDict_WatchCallback callback);

// Get existing global watch callback
PyDict_WatchCallback PyDict_GetWatchCallback(void);
```

The callback will be called immediately before the modification to the dict takes effect, so the callback will also have access to the prior state of the dict.
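To make the daisy-chaining contract concrete, here is a minimal sketch of a well-behaved client, assuming the proposed API above lands as written (the `my_`/`prev_` names and the two helper stubs are hypothetical, not part of the proposal):

```
#include <Python.h>

/* Callback that was installed before ours, if any; we daisy-chain to it. */
static PyDict_WatchCallback prev_callback = NULL;

/* Hypothetical helpers: a real client would consult its own records of
   which dicts it asked to watch, and invalidate its own caches. */
static int dict_is_ours(PyObject *dict) { (void)dict; return 0; }
static void invalidate_caches(PyObject *dict, PyObject *key) { (void)dict; (void)key; }

static void
my_callback(PyDict_WatchEvent event, PyObject *dict,
            PyObject *key, PyObject *new_value)
{
    /* Other clients may also mark dicts as watched; ignore events for
       dicts we did not mark ourselves. */
    if (dict_is_ours(dict)) {
        invalidate_caches(dict, key);
    }
    /* Always forward the event so other watchers keep working. */
    if (prev_callback != NULL) {
        prev_callback(event, dict, key, new_value);
    }
}

static void
install_watcher(void)
{
    prev_callback = PyDict_GetWatchCallback();
    PyDict_SetWatchCallback(my_callback);
}
```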
--
components: C API
messages: 414307
nosy: carljm, dino.viehland, itamaro
priority: normal
severity: normal
status: open
title: add support for watching writes to selecting dictionaries
versions: Python 3.11
[issue46896] add support for watching writes to selected dictionaries
Change by Carl Meyer:

--
title: add support for watching writes to selecting dictionaries -> add support for watching writes to selected dictionaries
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Thanks gps! Working on a PR and will collect pyperformance data as well. We haven't observed any issues in Cinder with the callback still being called at shutdown, but if there are problems with that, it should be possible to just have CPython clear the callback at shutdown time.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

> Could we (or others) end up with unguarded stale caches if some buggy
> extension forgets to chain the calls correctly?

Yes. I can really go either way on this. I initially opted for simplicity in the core support at the cost of asking a bit more of clients, on the theory that a) there are lots of ways for a buggy C extension to cause crashes with bad use of the C API, and b) I don't expect there to be very many extensions using this API. But it's also true that the consequences of a mistake here could be hard to debug (and easily blamed on the wrong place), and there might turn out to be more clients for dict-watching than I expect! If the consensus is to prefer CPython tracking an array of callbacks instead, we can try that.

> When you say "only one global callback": does that mean per-interpreter, or
> per-process?

Good question! The currently proposed API suggests per-process, but it's not a question I've given a lot of thought to yet; open to suggestions. It seems like in general the preference is to avoid global state and instead tie things to an interpreter instance? I'll need to do a bit of research to understand exactly how that would affect the implementation. It doesn't seem like it should be a problem, though it might make the write-time lookup to see if we have a callback a bit slower.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Thanks for the feedback!

> Why so coarse?

Simplicity of implementation is a strong advantage, all else equal :) And the coarse version is a) at least somewhat proven as useful and usable already by Cinder / the Cinder JIT, and b) clearly doable without introducing memory or noticeable CPU overhead to unwatched dicts. Do you have thoughts about how you'd do a more granular version without overhead?

> Getting a notification for every change of a global in a module is likely
> to make the use of global variables extremely expensive.

It's possible. We haven't ever observed this as an issue in practice, but we may just not have observed enough workloads with heavy writes to globals. I'd like to verify this problem with a real representative benchmark before making design decisions based on it, though. Calling a callback that is uninterested in a particular key doesn't need to be super-expensive if the callback is reasonably written, and this expense would occur only on the write path, for cases where the `global` keyword is used to rebind a global. I don't think it's common for idiomatic Python code to write to globals in perf-sensitive paths. Let's see how this shows up in pyperformance, if we try running it with all module globals dicts watched.

> For example, we could just tag the low bit of any pointer in a dictionary's
> values that we want to be notified of changes to

Would you want to tag the value, or the key? If the value, does that mean that if the value is changed it would revert to unwatched unless you explicitly watched the new value? I'm a bit concerned about the performance overhead this would create for use of dicts outside the write path, e.g. the need to mask off the watch bit of returned value pointers on lookup.

> What happens if a watched dictionary is modified in a callback?

It may be best to document that this isn't supported; it shouldn't be necessary or advisable for the intended uses of dict watching. That said, I think it should work fine if the callback can handle re-entrancy and doesn't create infinite recursion. Otherwise, I think it's a case of "you broke it, you get to keep all the pieces."

> How do you plan to implement this? Steal a bit from `ma_version_tag`?

We currently steal the low bit from the version tag in Cinder; my plan was to keep that approach.

> You'd probably need a PEP to replace PEP 509, but I think this may need a
> PEP anyway.

I'd prefer to avoid coupling this to removal of the version tag; then we get into issues of backward compatibility that this proposal otherwise avoids. I don't think the current proposal is of a scope or level of user impact that should require a PEP, but I'm happy to write one if needed.
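For concreteness, the version-tag trick described here could look roughly like this (a sketch against CPython's non-limited C API as of 3.11, where `ma_version_tag` is the PEP 509 field on `PyDictObject`; this is not the actual Cinder code):

```
#include <Python.h>

#define DICT_WATCHED_BIT ((uint64_t)1)

/* Sketch: mark a dict as watched by setting the low bit of its PEP 509
   version tag, so no extra memory is needed per dict. */
static void
dict_mark_watched(PyDictObject *mp)
{
    mp->ma_version_tag |= DICT_WATCHED_BIT;
}

static int
dict_is_watched(PyDictObject *mp)
{
    return (mp->ma_version_tag & DICT_WATCHED_BIT) != 0;
}

/* On mutation, the write path would bump the version by 2 instead of 1,
   preserving the watched bit, and invoke the global callback (if any)
   whenever the bit is set. */
```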
[issue13049] distutils2 should not allow packages
New submission from Carl Meyer:

As discussed at http://groups.google.com/group/the-fellowship-of-the-packaging/browse_frm/thread/3b7a8ddd307d1020, distutils2 should not allow a distribution to install files into a top-level package that is already installed from a different distribution.

--
assignee: tarek
components: Distutils2
messages: 144542
nosy: alexis, carljm, eric.araujo, tarek
priority: normal
severity: normal
status: open
title: distutils2 should not allow packages
type: behavior
[issue12405] packaging does not record/remove directories it creates
Carl Meyer added the comment:

> Carl: Can you tell us how pip removes directories?

In short: pip would _love_ to have directories recorded as well as files, exactly as Vinay has proposed. We don't have that info (even the distutils --record option currently doesn't record directories, thus installed-files.txt doesn't contain directories), so we are reduced to some nasty things like referring to top-level.txt in order to avoid lots of empty directories hanging about (which in itself was the subject of recent controversy re Twisted's custom namespace packages implementation).

Please, let's have directories recorded in RECORD; and yes, if a directory would have been created but already existed, it should also be recorded (so that shared directories are in the RECORD file for both/all of the sharing distributions).
[issue12405] packaging does not record/remove directories it creates
Carl Meyer added the comment:

> This is what I proposed earlier: we'd need to record all directories that
> would have been created, but I'm not sure if it will be possible. For
> example, if one uses --prefix /tmp/usr and pysetup install creates /tmp/usr,
> /tmp/usr/lib, /tmp/usr/lib/python2.7, /tmp/usr/lib/python2.7/site-packages,
> /tmp/usr/lib/python2.7/site-packages/spam and
> /tmp/usr/lib/python2.7/site-packages/Spam-0.1.dist-info, then when pysetup
> removes Spam, should packaging remove only the package and dist-info
> directories, or also the site-packages, python2.7, lib and usr directories?

I think it would make sense to draw a distinction between "creating the prefix directories (including site-packages)" and "creating the distribution-specific directories within the prefix directories," and to record only the latter in RECORD for the given installed distribution. If I use --prefix and install some things, and then uninstall them, I would not consider it a bug to find the empty site-packages directory still remaining under that prefix. (In fact, I'd be surprised if it were removed.)

> Okay, so I will champion a patch to PEP 376.

Thank you!
[issue13304] test_site assumes that site.ENABLE_USER_SITE is True
New submission from Carl Meyer:

If the test suite is run with PYTHONNOUSERSITE=true, the test_s_option test in test_site fails, because it implicitly assumes that site.ENABLE_USER_SITE is True and that site.USER_SITE should unconditionally be in sys.path. This is a practical problem in the reference implementation for PEP 404, as the tests should pass when run from within a virtual environment, but a system-isolated virtual environment disables user-site (i.e. has the same effect as PYTHONNOUSERSITE).

I think the correct fix here is to conditionally skip that test if site.ENABLE_USER_SITE is not True. I also think the module-level conditional check at the top of the file (which, if site.USER_SITE does not exist, creates site.USER_SITE and calls site.addsitedir() on it) should only run if site.ENABLE_USER_SITE is True.

--
components: Tests
messages: 146722
nosy: carljm
priority: normal
severity: normal
status: open
title: test_site assumes that site.ENABLE_USER_SITE is True
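A minimal sketch of the proposed conditional skip, simplified from the real test_site structure (the subprocess assertion is illustrative, not the actual test body):

```
import subprocess
import sys
import site
import unittest


class UserSiteTests(unittest.TestCase):
    @unittest.skipUnless(site.ENABLE_USER_SITE,
                         "user site-packages are disabled")
    def test_s_option(self):
        # USER_SITE belongs on sys.path only when user-site is enabled;
        # under PYTHONNOUSERSITE (or an isolated venv) the test is skipped.
        code = "import sys; sys.exit(%r not in sys.path)" % site.USER_SITE
        rc = subprocess.call([sys.executable, "-c", code])
        self.assertEqual(rc, 0)


if __name__ == "__main__":
    unittest.main()
```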
[issue13304] test_site assumes that site.ENABLE_USER_SITE is True
Carl Meyer added the comment:

Added a patch implementing my proposed fix.

--
hgrepos: +87
[issue13304] test_site assumes that site.ENABLE_USER_SITE is True
Changes by Carl Meyer:

--
keywords: +patch
Added file: http://bugs.python.org/file23575/cea40c2d7323.diff
[issue13304] test_site assumes that site.ENABLE_USER_SITE is True
Changes by Carl Meyer:

Removed file: http://bugs.python.org/file23575/cea40c2d7323.diff
[issue13304] test_site assumes that site.ENABLE_USER_SITE is True
Changes by Carl Meyer:

Added file: http://bugs.python.org/file23576/d851c64c745a.diff
[issue11574] TextIOWrapper: Unicode Fallback Encoding on Python 3.3
Carl Meyer added the comment:

Here's an example real-world case where the only solution I could find was to simply avoid non-ASCII characters entirely (which is obviously not a real solution): https://github.com/pypa/virtualenv/issues/201#issuecomment-3145690

distutils/distribute require long_description to be a string, not bytes (so it can rfc822-escape it, and use string methods to do so), but do not explicitly set an output encoding when writing egg-info. This means that a developer has the choice to either a) break installation of their package on any system with an ASCII default locale, or b) not use any non-ASCII characters in long_description.

One might say, "OK, this is a bug in distutils/distribute; it should explicitly specify UTF-8 encoding when writing egg-info." But if this is a sensible thing for distutils/distribute to do, regardless of user locale, why would it not be equally sensible for Python itself to have the default output encoding always be UTF-8 (with the ability for a developer who wants to support arbitrary user locales to explicitly do so)?

--
nosy: +carljm
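The distutils-side fix would amount to passing an explicit encoding when writing the metadata; a minimal sketch of the pattern (the file name and description text are placeholders):

```
# Without encoding=..., open() falls back to the locale's preferred
# encoding, which may be ASCII; writing non-ASCII text then raises
# UnicodeEncodeError. An explicit encoding is locale-independent.
long_description = "Descripci\u00f3n con acentos"

with open("PKG-INFO", "w", encoding="utf-8") as f:
    f.write(long_description)
```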
[issue8668] Packaging: add a 'develop' command
Carl Meyer added the comment:

Can someone post a link here to the page of use cases that Michael just reviewed? I think the link came through on the Fellowship mailing list, but I'm not quickly finding it...
[issue8668] Packaging: add a 'develop' command
Carl Meyer added the comment:

On 07/11/2011 09:17 AM, Michael Mulich wrote:
> * Cases 2, 3, 5 and 6 are strongly related. I'd suggest you condense them
> into a single use case. I agree with case 2 and 6 most, but have questions:
> ** Why wouldn't one simply use a virtualenv?

I don't know. I don't consider case 3 useful, because I don't consider "I don't want to use a virtualenv" (without some clearer technical justification) to be a prejudice the develop feature needs to support; especially if supporting it essentially means re-implementing a less-capable version of virtualenv within the develop command.

> -- Case 5 touches on this topic, but if we are installing in-place, who
> cares if we can place a development package in the global site-packages
> directory?

Several of these stories make the assumption that even the "in-place" installation will require placing a file in the installation location (a .pth file, if we follow the current setuptools implementation strategy). I think this is probably true, given the requirements in case 6 (which I agree with). So if you want an in-place install that's globally accessible, you'd need write access to global site-packages.

> ** After the package has been installed in-place (using the develop
> command), how does one identify it as an in-development project (or in
> development mode)?
> -- Case 3 and 6 touch on this topic (case 3 is a little vague at this
> time), but don't explain what type of action is intended. So if we install
> in-place (aka develop), how does the Python interpreter find the package?
> Are we using PYTHONPATH at this point (which would contradict a
> requirement in case 6)?

These use cases (probably intentionally) don't touch on specific implementation strategies, but as I mentioned there's an implicit assumption that a .pth file is the most likely strategy.

> * Case 4 is a bit unclear. Is Carl, the actor, pulling unreleased remote
> changes (hg pull --update) for these mercurial server plugins and then
> running the develop command on them?

Right, although the requirement for that story is that you don't have to re-run the develop command after every pull; if you develop-install it once, you can simply pull more code changes in and they'll immediately be available. I've added a line to that story to make it more clear.

> * Case 1 is good and very clear, but I'd consider it a feature rather than
> required. Perhaps it should not be focused on first (priority). Thoughts?

I agree that's a second-level feature (or, perhaps more accurately, a bug in the existing setuptools feature that I was hoping could be addressed in the d2 version), not a primary requirement.
[issue12279] Add build_distinfo command to packaging
Carl Meyer added the comment:

You guys are more familiar with the codebase than I am, but it seems to me that the RECORD file should clearly either be not present or be empty when metadata has been built but not yet installed. I don't really think the "invalid PEP 376" issue is a problem: PEP 376 describes the metadata for installed distributions; it has nothing to say about built metadata for a distribution which has not yet been installed.

For purposes of the develop command, if a .pth file is used to implement develop, then ideally when develop is run a RECORD file would be added containing only the path to that .pth file, as that's the only file that has actually been installed (and the only one that should be removed if the develop-installed package is uninstalled).

--
nosy: +carljm
[issue12279] Add build_distinfo command to packaging
Carl Meyer added the comment:

>> I don't really think the "invalid PEP 376" issue is a problem: PEP
>> 376 describes the metadata for installed distributions; it has
>> nothing to say about built metadata for a distribution which has not
>> yet been installed.
> The problem is that develop is a kind of install.

Right, I was simply referring to "build_distinfo" leaving it empty/missing; I'd want "develop" to add a (very short) RECORD file as specified below.

>> For purposes of the develop command, if a pth file is used to
>> implement develop, then ideally when develop is run a RECORD file
>> would be added containing only the path to that pth file, as that's
>> the only file that has actually been installed
> Yeah!

>> (and the only one that should be removed if the develop-installed
>> package is uninstalled).
> Are you saying that such a RECORD file would allow any installer compatible
> with PEP 376 to undo a develop install? Clever!

Yeah, that's the idea. I don't see any actual use case for having all of the Python modules etc. included in the RECORD file for a develop-install, because they haven't been installed anywhere: what we really want to know is "what has been placed in the installation location that we need to keep track of."
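For illustration, the entire RECORD file for such a develop install might hold a single entry, following PEP 376's path,hash,size layout (the path is hypothetical, and the hash/size fields are left as placeholders):

```
/usr/lib/python2.7/site-packages/Spam-0.1.pth,<md5-hex>,<size-in-bytes>
```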
[issue8668] Packaging: add a 'develop' command
Carl Meyer added the comment:

> Ah, higery's code already has an answer for me: it writes *two* paths in
> the .pth file, one to the build dir (so that .dist-info is found) and one
> to the modules root (for modules, built in place). Anyone sees a problem
> with that? (For example huge sys.path.)
>
> In this scheme, when Python modules are edited, changes are visible
> instantly, when C modules are edited, a call to build_ext is required, and
> when the metadata is edited, build_distinfo is required. Does that sound
> good?

That sounds reasonable to me. I'm not worried about that making sys.path too long: whatever we do, we aren't going to challenge buildout in that department ;-)
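As a sketch, the two-path .pth file described above might look like this for a project checked out at /home/carl/src/spam (paths hypothetical; each line of a .pth file is simply appended to sys.path at startup):

```
/home/carl/src/spam/build
/home/carl/src/spam
```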
[issue8668] Packaging: add a 'develop' command
Carl Meyer added the comment:

> I've reviewed the last patch. It looks like the code only installs
> to the global site-packages, and there is no support to install to
> the user site-packages or to another arbitrary location.
>
> On Windows, normal users seem to be able to write to the global
> site-packages (see #12260), but on other OSes with a proper rights
> model that won't do. Luckily, PEP 370 brings us user
> site-packages (currently poorly documented, see #8617 and #10745),
> but only for 2.6, 2.7 and 3.x. It looks like Tarek is ready to drop
> 2.4 compatibility for distutils2, so the question is: what to do
> under 2.5?
>
> Generally, I don't see why develop could not install to any
> directory. We want a default invocation without options to Just
> Work™, finding a writable directory already on sys.path and writing
> into it, but that doesn't exclude letting the user do what they
> want.

I don't see why the installation-location-finding for develop should be any different than for a normal "pysetup install". Does "pysetup install" install to global site-packages by default, or try to find somewhere it can install without additional privileges? Whatever it does by default, develop should do the same. If "develop" can install to arbitrary locations, then "install" should be able to as well (though I don't really see the value in "arbitrary locations", since you then have to set up PYTHONPATH manually anyway). There is no reason for them to have different features in this area; it just adds confusion. Certainly "develop" should support PEP 370, ideally with the same command-line flag as a regular install.
[issue8668] Packaging: add a 'develop' command
Carl Meyer added the comment:

> Éric Araujo added the comment:
>
> [Carl]
>> there's an implicit assumption that a .pth file is the most likely
>> strategy.
> If you have other ideas, please share them.

No, I think that's the most promising strategy. The "implicit assumption" comment was not criticism, just explanation for Michael.
[issue9878] Avoid parsing pyconfig.h and Makefile by autogenerating extension module
Changes by Carl Meyer:

--
nosy: +carljm
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
New submission from Carl Meyer:

If python is run with the -S flag, that declares the intent of the user not to have site-specific additions to sys.path. However, some code in that process may have a legitimate need for a function defined in site.py, for instance addsitedir. But the act of importing site.py, as a side effect, adds the standard site-specific directories to sys.path. python -S would be more useful and reliable if it prevented importing site from automatically making the sys.path additions. There is no loss of flexibility here, as user code could still explicitly call site.main() to achieve all of the current side effects of "import site".

The fix is a one-liner, and is in the linked hg repository.

--
components: Library (Lib)
hgrepos: 4
messages: 131281
nosy: carljm
priority: normal
severity: normal
status: open
title: "python -S" should be robust against e.g. "from site import addsitedir"
type: behavior
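A sketch of what such a one-liner guard looks like at the bottom of Lib/site.py (paraphrased excerpt, not the exact patch; main() is site.py's existing path-setup entry point):

```
import sys

# site.py's import-time side effects live in main(); guarding the call
# on the -S flag lets "from site import addsitedir" work without
# silently adding the site-specific directories to sys.path.
if not sys.flags.no_site:
    main()
```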
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
Changes by Carl Meyer:

--
keywords: +patch
Added file: http://bugs.python.org/file21274/87df1d37c88e.diff
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
Carl Meyer added the comment:

Adding a test is easier said than done. The behavior change here depends on python being run with -S. Currently test_site skips itself if the test suite is run with -S, and if I remove that skip it crashes under -S. Options as I see it:

1. Declare this one-liner correct by inspection. It doesn't break any existing tests.

2. Add a new test file (test_no_site.py?) that only runs with -S and tests that importing something from site doesn't trigger sys.path additions. This seems like the most reasonable test, but I'm not sure how useful it is, since I doubt most people ever try running the test suite with -S.

3. Make the fix more complicated such that it uses an intermediary variable which can be mocked (unlike sys.flags.no_site, which is read-only), and then add a test which mocks this variable, temporarily removes "site" from sys.modules, tries importing it again, and checks whether main() is called. This creates a complex test which is highly coupled to the implementation in site.py, but would be run under normal conditions (without -S).

Which option do you prefer?
[issue11598] missing afxres.h error when building bdist_wininst in Visual Studio 2008 Express
New submission from Carl Meyer:

By opening up pcbuild.sln in VS2008 Express, I was able to successfully build python and pythonw, but when I tried to build bdist_wininst it failed with "Fatal Error RC1015: cannot open include file afxres.h".

Googling turned up a number of comments about how this file is part of MFC, which is really not supposed to be used with VS2008. The recommended "fix" that seemed to work for most people online was to replace "afxres.h" with "windows.h" in the rc file. I did this in PC/bdist_wininst/install.rc, and then it failed with a different error about a missing IDC_STATIC token.

I have very little experience with Windows, so it's entirely possible I'm just doing something wrong, but I was asked in #python-dev to file a bug here.

--
components: Build, Windows
messages: 131351
nosy: carljm
priority: normal
severity: normal
status: open
title: missing afxres.h error when building bdist_wininst in Visual Studio 2008 Express
versions: Python 3.3
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
Carl Meyer added the comment:

Added documentation to Doc/library/site.rst and Misc/NEWS.

--
hgrepos: +5
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
Changes by Carl Meyer:

Added file: http://bugs.python.org/file21327/ebe5760afa08.diff
[issue11591] "python -S" should be robust against e.g. "from site import addsitedir"
Carl Meyer added the comment:

> Did you have to manually click “Create Patch” to make roundup generate it?

Yes; the first time too.

> Did you try first to click on the button of the existing repo before adding
> a new repo entry?

That would probably have worked fine. The "Remote hg repo" field was just empty when I made my latest comment, so I filled it in again. I wasn't sure if it would duplicate, or be smart enough to tell they were the same repo, or what. I guess it duplicated :/
[issue6087] distutils.sysconfig.get_python_lib gives surprising result when used with a Python build
Changes by Carl Meyer:

--
nosy: +carljm
[issue11868] Minor word-choice improvement in devguide "lifecycle of a patch" opening paragraph
New submission from Carl Meyer:

The opening paragraph of the "lifecycle of a patch" devguide page contains a confusing parenthetical aside implying that an "svn-like" workflow would mean never *saving* anything to your working copy and using "hg diff" to generate a patch. This is obviously wrong given the usual meaning of "save": if you never save anything to your working copy, "hg diff" will be empty. Patch attached with proposed alternative wording.

--
components: Devguide
files: svn-like-wording.diff
keywords: patch
messages: 133978
nosy: carljm
priority: normal
severity: normal
status: open
title: Minor word-choice improvement in devguide "lifecycle of a patch" opening paragraph
versions: 3rd party
Added file: http://bugs.python.org/file21707/svn-like-wording.diff
[issue2244] urllib and urllib2 decode userinfo multiple times
New submission from Carl Meyer:

Both urllib and urllib2 call urllib.unquote() multiple times on data in the userinfo section of an FTP URL. One call occurs at the end of the urllib.splituser() function. In urllib, the other call appears in URLOpener.open_ftp(). In urllib2, the other two occur in FTPHandler.ftp_open() and Request.get_host().

The effect of this is that if the userinfo section of an FTP URL needs to contain a literal % sign followed by two digits, the % sign must be double-encoded as %2525 (for urllib) or triple-encoded as %252525 (for urllib2) in order for the URL to be accessed. The proper behavior would be to only ever unquote a given data segment once. The "URI: Generic Syntax" RFC (RFC 3986, http://gbiv.com/protocols/uri/rfc/rfc3986.html) addresses this very issue in section 2.4 (When to Encode or Decode):

"Implementations must not percent-encode or decode the same string more than once, as decoding an already decoded string might lead to misinterpreting a percent data octet as the beginning of a percent-encoding, or vice versa in the case of percent-encoding an already percent-encoded string."

The solution would be to standardize where in urllib and urllib2 the unquoting happens, and then make sure it happens nowhere else. I'm not familiar enough with the libraries to know where it should be removed without possibly breaking other behavior. It seems that just removing the map/unquote call in urllib.splituser() would fix the problem in urllib. I would guess the call in urllib2's Request.get_host() should also be removed, as the RFC referenced above says clearly that only individual data segments of the URL should be decoded, not larger portions that might contain delimiters (: and @).

I've attached a patchset for these suggested changes. Very superficial testing suggests that the patch doesn't break anything obvious, but I make no guarantees.

--
components: Library (Lib)
files: urllib-issue.patch
keywords: patch
messages: 63324
nosy: carljm
severity: normal
status: open
title: urllib and urllib2 decode userinfo multiple times
type: behavior
versions: Python 2.5
Added file: http://bugs.python.org/file9621/urllib-issue.patch
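The double-decoding failure mode is easy to demonstrate (shown with Python 3's urllib.parse.unquote for convenience; the report concerns Python 2's urllib.unquote, which decodes the same way):

```
from urllib.parse import unquote

# A password containing the literal text "%42" (a % sign followed by
# two digits), correctly percent-encoded once for use in a URL:
encoded = "pa%2542ss"

once = unquote(encoded)   # 'pa%42ss' -- correct after one decode
twice = unquote(once)     # 'paBss'   -- '%42' misread as an escape for 'B'
print(once, twice)
```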
[issue46896] add support for watching writes to selected dictionaries
Change by Carl Meyer:

--
keywords: +patch
pull_requests: +29891
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/31787
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Draft PR is up for consideration. Perf data in https://gist.github.com/carljm/987a7032ed851a5fe145524128bdb67a

Overall it seems like the base implementation is perf-neutral; maybe a slight impact on the pickle benchmarks? With all module globals dicts (uselessly) watched, there are a few more benchmarks with small regressions, but also some with small improvements (just noise, I guess?); overall still pretty close to neutral. Comments welcome!
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Hi Dennis, thanks for the questions!

> A curiosity: have you considered watching dict keys rather than whole dicts?

There's a bit of discussion of this above. A core requirement is to avoid any memory overhead and minimize CPU overhead on unwatched dicts. Additional memory overhead seems like a nonstarter, given the sheer number of dict objects that can exist in a large Python system. The CPU overhead for unwatched dicts in the current PR consists of a single added `testb` and `jne` (for checking if the dict is watched), in the write path only; I think that's effectively the minimum possible. It's not clear to me how to implement per-key watching under this constraint.

One option Brandt mentioned above is to steal the low bit of a `PyObject` pointer; in theory we could do this on `me_key` to implement per-key watching with no memory overhead. But then we are adding bit-masking overhead on every dict read and write. I think we really want the implementation here to be zero-overhead in the dict read path. Open to suggestions if I've missed a good option here!

> That way, changing global values would not have to de-optimize, only adding
> new global keys would.
> Indexing into dict values array wouldn't be as efficient as embedding direct
> jump targets in JIT-generated machine code, but as long as we're not doing
> that, maybe watching the keys is a happy medium?

But we are doing that, in the Cinder JIT. Dict watching here is intentionally exposed for use by extensions, including hopefully in future the Cinder JIT as an installable extension. We burn exact pointer values for module globals into generated JIT code and deopt if they change (we are close to landing a change to code-patch instead of deopting). This is quite a bit more efficient in the hot path than having to go through a layer of indirection.

I don't want to assume too much about how dict watching will be used in future, or go for an implementation that limits its future usefulness. The current PR is quite flexible and can be used to implement a variety of caching strategies.

The main downside of dict-level watching is that a lot of notifications will be fired if code does a lot of globals-rebinding in modules where globals are watched, but this doesn't appear to be a problem in practice, either in our workloads or in pyperformance. It seems likely that a workable strategy, if this ever was observed to be a problem, would be to notice at runtime that globals are being re-bound frequently in a particular module and just stop watching that module's globals.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

> have you considered watching dict keys rather than whole dicts?

Just realized that I misunderstood this suggestion; you don't mean per-key watching necessarily, you just mean _not_ notifying on dict value changes. Now I understand better how that connects to the second part of your comment! But yeah, I don't want this limitation on dict-watching use cases.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Thanks for outlining the use cases. They make sense. The current PR provides a flexible generic API that fully supports all three of those use cases (use cases 2 and 3 are strict subsets of use case 1). Since the callback is called before the dict is modified, all the necessary information is available to the callback to decide whether the event is interesting to it or not. The question is how much of the bookkeeping to classify events as "interesting" or "uninteresting" should be embedded in the core dispatch vs. being handled by the callback.

One reason to prefer keeping this logic in the callback is that with potentially multiple chained callbacks in play, the filtering logic must always exist in the callback, regardless. E.g. if callback A wants to watch only keys-version changes to dict X, but callback B wants to watch all changes to it, events will fire for all changes, and callback A must still disregard "uninteresting" events that it may receive (just like it may receive events for dicts it never asked to watch at all). So providing API for different "levels" of watching means that the "is this event interesting to me" predicate must effectively be duplicated both in the callback and in the watch level chosen.

The proposed rationale for this complexity and duplication is the idea that filtering out uninteresting events at dispatch will provide better performance. But this is hypothetical: it assumes the existence of perf-bottleneck code paths that repeatedly rebind globals. The only benchmark workload with this characteristic that I know of is pystone, and it is not even part of the pyperformance suite, I think precisely because it is not representative of real-world code patterns. And even assuming that we do need to optimize for such code, it's also not obvious that it will be noticeably cheaper in practice to filter on the dispatch side.

It may be more useful to focus on API. If we get the API right, internal implementation details can always be adjusted in future if a different implementation can be shown to be noticeably faster for relevant use cases. And if we get existing API right, we can always add new API if we have to. I don't think anything about the proposed simple API precludes adding `PyDict_WatchKeys` as an additional feature, if it turns out to be necessary.

One modification to the simple proposed API that should improve the performance (and ease of implementation) of use case #2 would be to split the current `PyDict_EVENT_MODIFIED` into two separate event types: `PyDict_EVENT_MODIFIED` and `PyDict_EVENT_NEW_KEY`. Then the callback-side event filtering for use case #2 would just be `event == PyDict_EVENT_NEW_KEY`, instead of requiring a lookup into the dict to see whether the key was previously set or not.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

I've updated the PR to split `PyDict_EVENT_MODIFIED` into separate `PyDict_EVENT_ADDED`, `PyDict_EVENT_MODIFIED`, and `PyDict_EVENT_DELETED` event types. This allows callbacks only interested in e.g. added keys (case #2) to more easily and cheaply skip uninteresting events.
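Under the split event types, a callback interested only in newly added keys (use case #2) can filter cheaply; a sketch, assuming the event names from the updated PR (not yet a shipped CPython API):

```
#include <Python.h>

/* Sketch: a watcher that only cares about keys being added.
   PyDict_WatchEvent and the PyDict_EVENT_* names are the proposed
   API from the PR under review. */
static void
new_key_callback(PyDict_WatchEvent event, PyObject *dict,
                 PyObject *key, PyObject *new_value)
{
    (void)new_value;
    if (event != PyDict_EVENT_ADDED) {
        return;  /* modifications, deletions, clear, dealloc: ignore */
    }
    /* ... update keys-version-style caches for `dict`/`key` here ... */
}
```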
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

> There should not be much of a slowdown for this code when watching `CONST`:

How and when (and based on what data?) would the adaptive interpreter make the decision that for this code sample the key `CONST`, but not the key `var`, should be watched in the module globals dict? It's easy to contrive an example in which it's beneficial to watch one key but not another, but this is practically irrelevant unless it's also feasible for an optimizer to consistently make the right decision about which key(s) to watch.

The code sample also suggests that the module globals dict for a module is being watched while that module's own code object is being executed. In module body execution, writing to globals (vs. reading them) is relatively much more common, compared to any other Python code execution context, and it's much less common for the same global to be read many times. Given this, how frequently would watching module globals dictionaries during module body execution be a net win at all? Certainly cases can be contrived in which it would be, but it seems unlikely that it would be a net win overall. And again, unless the optimizer can reliably (and in advance, since module bodies are executed only once) distinguish the cases where it's a win, it seems the example is not practically relevant.

> Another use of this is to add watch points in debuggers.
> To that end, it would be better if the callback were a Python object.

It is easy to create a C callback that delegates to a Python callable if someone wants to implement this use case, so the vectorcall overhead is paid only when needed. The core API doesn't need to be made more complex for this, and there's no reason to impose any overhead at all on low-level interpreter-optimization use cases.
[issue46896] add support for watching writes to selected dictionaries
Carl Meyer added the comment:

Thanks for the extended example. I think in order for this example to answer the question I asked, a few more assumptions should be made explicit:

1) Either `spam_var` and/or `eggs_var` are frequently re-bound to new values in a hot code path somewhere. (Given the observations above about module-level code, we should assume for a relevant example this takes place in a function that uses `global spam_var` or `global eggs_var` to allow such rebinding.)

2) But `spam_var` and `eggs_var` are not _read_ in any hot code path anywhere, because if they were, then the adaptive interpreter would be just as likely to decide to watch them as it is to watch `EGGS_CONST`, in which case any benefit of per-key watching in this example disappears. (Keep in mind that with possibly multiple watchers around, "unwatching" anything on the dispatch side is never an option, so we can't say that the adaptive interpreter would decide to unwatch the frequently-re-bound keys after it observes them being re-bound. It can always "unwatch" them in the sense of no longer being interested in them in its callback, though.)

It is certainly possible that this case could occur, where some module contains both a frequently-read-but-not-written global and also a global that is re-bound using the `global` keyword in a hot path, but rarely read. But it doesn't seem warranted to pre-emptively add a lot of complexity to the API in order to marginally improve the performance of this quite specific case, unsupported by any benchmark or sample workload demonstrating it.

> This might not be necessary for us right now

I think it's worth keeping in mind that a `PyDict_WatchKey` API can always be added later without disturbing or changing the semantics of the `PyDict_Watch` API added here.
[issue43562] test_ssl.NetworkedTests.test_timeout_connect_ex fails if network is unreachable
New submission from Carl Meyer:

In general it seems the CPython test suite takes care not to fail if the network is unreachable, but `test_timeout_connect_ex` fails because the result code of the connection is checked without any exception being raised that would reach `support.transient_internet`.

--
components: Tests
messages: 389113
nosy: carljm
priority: normal
severity: normal
status: open
title: test_ssl.NetworkedTests.test_timeout_connect_ex fails if network is unreachable
type: behavior
versions: Python 3.10, Python 3.8, Python 3.9
[issue43562] test_ssl.NetworkedTests.test_timeout_connect_ex fails if network is unreachable
Change by Carl Meyer:

--
keywords: +patch
pull_requests: +23697
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/24937
[issue43564] some tests in test_urllib2net fail instead of skipping on unreachable network
New submission from Carl Meyer:

In general it seems the CPython test suite takes care to skip networked tests instead of failing them when the network is unavailable (cf. the `support.transient_internet` test helper). In the case of the 5 FTP tests in `test_urllib2net` (that is, `test_ftp`, `test_ftp_basic`, `test_ftp_default_timeout`, `test_ftp_no_timeout`, and `test_ftp_timeout`), even though they use `support.transient_internet`, they still fail if the network is unavailable.

The reason is that they make calls which end up raising an exception of the form `URLError("ftp error: OSError(101, 'Network is unreachable')")`: the original OSError is flattened into the exception's string message, but is otherwise not in the exception args. This means that `transient_internet` does not detect it as a suppressable exception.

It seems that many uses of `URLError` in urllib pass the original `OSError` directly to `URLError.__init__()`, which means it ends up in `args` and the unwrapping code in `transient_internet` is able to find the original `OSError`. But the ftp code instead directly interpolates the `OSError` into a new message string.

--
components: Tests
messages: 389115
nosy: carljm
priority: normal
severity: normal
status: open
title: some tests in test_urllib2net fail instead of skipping on unreachable network
type: behavior
versions: Python 3.10
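The difference between the two wrapping styles, sketched with urllib.error.URLError (the ftp code's message format is paraphrased from the report):

```
from urllib.error import URLError

err = OSError(101, "Network is unreachable")

# Pattern used elsewhere in urllib: the original OSError rides along
# in args, so test.support.transient_internet can unwrap it and skip.
wrapped_ok = URLError(err)
assert isinstance(wrapped_ok.args[0], OSError)

# Pattern used by the ftp code: the OSError is flattened into the
# message string, leaving nothing for transient_internet to find.
wrapped_bad = URLError("ftp error: %r" % err)
assert isinstance(wrapped_bad.args[0], str)
```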
[issue43564] ftp tests in test_urllib2net fail instead of skipping on unreachable network
Change by Carl Meyer:

--
title: some tests in test_urllib2net fail instead of skipping on unreachable network -> ftp tests in test_urllib2net fail instead of skipping on unreachable network
[issue43564] ftp tests in test_urllib2net fail instead of skipping on unreachable network
Change by Carl Meyer:

--
keywords: +patch
pull_requests: +23699
stage: -> patch review
pull_request: https://github.com/python/cpython/pull/24938
[issue43564] ftp tests in test_urllib2net fail instead of skipping on unreachable network
Carl Meyer added the comment:

Created a PR that fixes this by being more consistent in how urllib wraps network errors. If there are backward-compatibility concerns with this change, another option could be some really ugly regex-matching code in `test.support.transient_internet`.
[issue45384] Accept Final as indicating ClassVar for dataclass
Carl Meyer added the comment:

> Are Final default_factory fields real fields or pseudo-fields? (i.e. are
> they returned by dataclasses.fields()?)

They are real fields, returned by `dataclasses.fields()`.

In my opinion, the behavior change proposed in this bug is a bad idea all around and should not be made, and the inconsistency with PEP 591 should rather be resolved by explicitly specifying the interaction with dataclasses in a modification to the PEP.

Currently the meaning of:

```
@dataclass
class C:
    x: Final[int] = 3
```

is well-defined, intuitive, and implemented consistently both in the runtime and in type checkers. It specifies a dataclass field of type `int`, with a default value of `3` for new instances, which can be overridden with an init arg, but cannot be modified (per type checker; the runtime doesn't enforce Final) after the instance is initialized.

Changing the meaning of the above code to be "a dataclass with no fields, but one final class attribute of value 3" would be a backwards-incompatible change to a less useful and less intuitive behavior. I argue the current behavior is intuitive because in general the type annotation on a dataclass attribute applies to the eventual instance attribute, not to the immediate RHS; this is made very clear by the fact that type checkers happily accept `x: int = dataclasses.field(...)`, which in a non-dataclass context would be a type error. Therefore the Final should similarly be taken to apply to the eventual instance attribute, not to the immediate assignment, and therefore it should not (in the case of dataclasses) imply ClassVar.

I realize that this means that if we want to allow final class attributes on dataclasses, it would require wrapping an explicit ClassVar around Final, which violates the current text of PEP 591. I would suggest this is simply because that PEP did not consider the specific case of dataclasses, and the PEP should be amended to carve out dataclasses specifically.

--
nosy: +carljm
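A quick runnable check of the current runtime behavior described above:

```
from dataclasses import dataclass, fields
from typing import Final


@dataclass
class C:
    x: Final[int] = 3


# x is a real field: it appears in fields() and is settable via __init__.
print([f.name for f in fields(C)])  # ['x']
print(C().x, C(x=5).x)              # 3 5
```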
[issue45384] Accept Final as indicating ClassVar for dataclass
Carl Meyer added the comment:

Good idea to check with the PEP authors.

I don't think allowing both ClassVar and Final in dataclasses requires general intersection types. Neither ClassVar nor Final are real types; they aren't part of the type of the value. They are more like special annotations on a name, which are wrapped around a type as a syntactic convenience. You're right that it would require more than just an amendment to the PEP text, though; it might require changes to type checkers, and it would also require changes to the runtime behavior of the `typing` module to special-case allowing `ClassVar[Final[…]]`. And the downside of this change is that it couldn't be context-sensitive to only be allowed in dataclasses. But I think this isn't a big problem; type checkers could still error on that wrapping in non-dataclass contexts if they want to.

But even if that change can't be made, I think backwards compatibility still precludes changing the interpretation of `x: Final[int] = 3` on a dataclass, and it is more valuable to be able to specify Final instance attributes (fields) than final class attributes on dataclasses.
[issue39428] allow creation of "symtable entry" objects from Python
New submission from Carl Meyer:

Currently the "symtable entry" extension type (PySTEntry_Type) defined in `Python/symtable.c` defines no `tp_new` or `tp_init`, making it impossible to create instances of this type from Python code.

I have a use case for pickling symbol tables (as part of a cache subsystem for a static analyzer), but the inability to create instances of symtable entries from attributes makes this impossible, even with custom pickle support via dispatch_table or copyreg.

If the idea of making instances of this type creatable from Python is accepted in principle, I can submit a PR for it. Thanks!

--
messages: 360522
nosy: carljm
priority: normal
severity: normal
status: open
title: allow creation of "symtable entry" objects from Python
type: enhancement
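A small demonstration of the limitation (the pickling failure is the expected outcome; the exact exception type and message may vary by Python version):

```
import pickle
import symtable

st = symtable.symtable("x = 1", "<example>", "exec")

try:
    pickle.dumps(st)
except Exception as exc:
    # The SymbolTable wrapper holds a "symtable entry" object that has
    # no tp_new/tp_init, so pickle has no way to recreate it.
    print(type(exc).__name__, exc)
```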
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment:

> Anything that is touched by the immortal object will be leaked. This can also
> happen in obscure ways if reference cycles are created.

I think this is simply expected behavior if you choose to create immortal objects, and not really an issue. How could you have an immortal object that doesn't keep its strong references alive?

> this does not fully cover all cases as objects that become tracked by the GC
> after they are modified (for instance, dicts and tuples that only contain
> immutable objects). Those objects will still participate in reference
> counting after they start to be tracked.

I think the last sentence here is not quite right. An immortalized object will never start participating in reference counting again after it is immortalized. There are two cases. If at the time of calling `immortalize_heap()` you have a non-GC-tracked object that is also not reachable from any GC-tracked container, then it will not be immortalized at all, so will be unaffected. This is a side effect of the PR using the GC to find objects to immortalize. If the non-GC-tracked object is reachable from a GC-tracked object (I believe this is by far the more common case), then it will be immortalized. If it later becomes GC-tracked, it will start participating in GC (but the immortal bit causes it to appear to the GC to have a very high reference count, so GC will never collect it or any cycle it is part of), but that will not cause it to start participating in reference counting again.

> if immortal objects are handed to extension modules compiled with the other
> version of the macros, the reference count can be corrupted

I think the word "corrupted" makes this sound worse than it is in practice. What happens is just that the object is still effectively immortal (because the immortal bit is a very high bit), but the copy-on-write benefit is lost for the objects touched by old extensions.

> 1.17x slower on logging_silent or unpickle_pure_python is a very expensive
> price

Agreed. It seems the only way this makes sense is under an ifdef and off by default. CPython does a lot of that for debug features; this might be the first case of doing it for a performance feature?

> I would be more interested by an experiment to move ob_refcnt outside
> PyObject to solve the Copy-on-Write issue

It would certainly be interesting to see results of such an experiment. We haven't tried that for refcounts, but in the work that led to `gc.freeze()` we did try relocating the GC header to a side location. We abandoned that because the memory overhead of adding a single indirection pointer to every PyObject was too large to even consider the option further. I suspect that this memory overhead issue and/or likely cache locality problems will make moving refcounts outside PyObject look much worse for performance than this immortal-instances patch does. -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment: > An immortalized object will never start participating in reference counting > again after it is immortalized. Well, "passed to an extension compiled with no-immortal headers" is an exception to this. But for the "not GC tracked but later becomes GC tracked" case, it will not re-enter reference counting, only the GC. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment:

> This may break the garbage collector algorithm that relies on the balance
> between strong references between objects and its reference count to do the
> calculation of the isolated cycles.

I don't think it really breaks anything. What happens is that the immortal object appears to the GC to have a very large reference count, even after adjusting for within-cycle references. So cycles including an immortal object are always kept alive, which is exactly the behavior one should expect from an immortal object. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment: I think the concerns about "perfect" behavior in corner cases are in general irrelevant here. In the scenarios where this optimization matters, there is no qualitative change that occurs at 100% coverage. Preventing 99% of CoW is 99% as good as preventing 100% :) So the fact that a few objects here and there in special cases could still trigger CoW just doesn't matter; it's still a massive improvement over the status quo. (That said, I wouldn't _mind_ improving the coverage, e.g. if you can suggest a better way to find all heap objects instead of using the GC.) And similarly, gps is right that the concern that immortal objects can keep other objects alive (even via references added after immortalization) is a non-issue in practice. There really is no other behavior one could prefer or expect instead.

> if said objects (isolated and untracked before and now tracked) acquire
> strong references to immortal objects, those objects will be visited when the
> gc starts calculating the isolated cycles and that requires a balanced
> reference count to work.

I'm not sure what you mean here by "balanced ref count" or by "work" :) What will happen anytime an immortal object gets into the GC, for any reason, is that the GC will "subtract" cyclic references and see that the immortal object still has a large refcount even after that adjustment, and so it will keep the immortal object and any cycle it is part of alive. This behavior is correct and should be fully expected; nothing breaks. It doesn't matter at all to the GC that this large refcount is "fictional," and it doesn't break the GC algorithm; it results only in the desired behavior of maintaining immortality of immortal objects. It is perhaps slightly weird that this behavior falls out of the immortal bit being a high bit rather than being more explicit. I did do some experimentation with trying to explicitly prevent immortal instances from ever entering GC, but it turned out to be hard to do that in an efficient way. And motivation to do it is low, because there's nothing wrong with the behavior in the existing PR. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
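A toy numeric illustration of the subtract-and-check behavior described above (the bit position and the function are illustrative models, not CPython's actual implementation):

```
IMMORTAL_BIT = 1 << 62  # illustrative high bit

def kept_by_gc(refcount, refs_from_within_cycle):
    # the collector keeps anything whose adjusted count stays positive
    return refcount - refs_from_within_cycle > 0

print(kept_by_gc(2, 2))                 # False: ordinary isolated cycle
print(kept_by_gc(IMMORTAL_BIT | 2, 2))  # True: immortal, always kept
```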
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment:

> Is it a common use case to load big data and then fork to use preloaded data?

A lot of the "big data" in question here is simply lots of Python module/class/code objects resulting from importing lots of Python modules. And yes, this "pre-fork" model is extremely common for serving Python web applications; it is the way most Python web application servers work. We already have an example in this thread of another large Python web application (YouTube) that had similar needs and considered a similar approach. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
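A minimal sketch of that pre-fork model (POSIX-only; the list here is a stand-in for the module/class/code objects a real server would import):

```
import os

DATA = [str(i) for i in range(100_000)]  # "loaded" once, in the parent

workers = []
for _ in range(2):
    pid = os.fork()
    if pid == 0:
        # Worker: merely reading DATA updates refcounts, dirtying the
        # shared pages and defeating copy-on-write -- the problem the
        # immortal-instances patch addresses.
        _ = sum(len(s) for s in DATA)
        os._exit(0)
    workers.append(pid)

for pid in workers:
    os.waitpid(pid, 0)
```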
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment:

> I would be interested to hear the answer to Antoine's question which is
> basically: why not using the multiprocessing fork server?

Concretely, because for a long time we have used the uWSGI application server and it manages forking worker processes (among other things), and AFAIK nobody has yet proposed trying to replace that with something built around the multiprocessing module. I'm actually not aware of any popular Python WSGI application server built on top of the multiprocessing module (but some may exist). What problem do you have in mind that the fork server would solve? How is it related to this issue? I looked at the docs and don't see that it does anything to help sharing Python objects' memory between forked processes without CoW. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40255] Fixing Copy on Writes from reference counting
Carl Meyer added the comment: Makes sense. Yes, caution is required about what code runs before fork, but forkserver's solution for that would be a non-starter for us, since it would ensure that we can share basically no memory at all between worker processes. -- ___ Python tracker <https://bugs.python.org/issue40255> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Carl Meyer added the comment: I volunteered in the python-dev thread to write a patch to the docs clarifying future status of lib2to3; happy to include the PendingDeprecationWarning as well. Re linking to alternatives, we want to make sure we link to alternatives that are committed to updating to support newer Python versions' syntax. This definitely includes LibCST; I can inquire with the parso maintainer about whether it also includes parso. In future it could also include a third-party-maintained copy of lib2to3, if someone picks that up. -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Change by Carl Meyer : -- pull_requests: +18987 pull_request: https://github.com/python/cpython/pull/19663 ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Carl Meyer added the comment: I opened a PR. It deprecates the lib2to3 library to discourage future use of it for Python 3, but not the 2to3 tool. This of course means that the lib2to3 module will in practice stick around in the stdlib as long as 2to3 is still bundled with Python. It seems like the idea in this issue is to deprecate and remove both. I'm not sure what we typically do to deprecate a command-line utility bundled with Python. Given that warnings are silent by default, the deprecation warning for lib2to3 won't be visible to users of 2to3. Should I add something to its `--help` output? Or something more aggressive: an unconditionally-printed warning? -- ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
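For reference, a hedged sketch of the import-time warning in question (the message text is approximated, not quoted from the PR):

```
# at the top of Lib/lib2to3/__init__.py
import warnings

warnings.warn(
    "lib2to3 package is deprecated and may not be able to parse "
    "Python 3.10+",
    PendingDeprecationWarning,
    stacklevel=2,
)
```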
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Carl Meyer added the comment: @gregory.p.smith What do you think about the question I raised above about how to make this deprecation visible to users of the 2to3 CLI tool, assuming the plan is to remove both? -- ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Carl Meyer added the comment: Right, although I think it still makes sense to link both LibCST and parso, since they provide different levels of abstraction that are suitable for different types of tools. (E.g. I would rather write an auto-formatter on top of parso, because LibCST's careful parsing and assignment of whitespace would mostly just get in the way, but I'd rather write any kind of refactoring tooling on top of LibCST.) Another tool that escaped my mind when writing the PR and should probably be linked also is Baron/RedBaron (https://github.com/PyCQA/redbaron); at 457 stars it is slightly more popular than LibCST (but it's also been around a lot longer). -- ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue40360] Deprecate lib2to3 (and 2to3) for future removal
Carl Meyer added the comment:

> Could you please add a what's new entry for this change?

The committed change already included an entry in NEWS. Is a "What's New" entry something different?

> I don't understand why there is a PendingDeprecationWarning and not a
> DeprecationWarning.

Purely because I was following gps' recommendation in the first comment on this issue. Getting rid of PendingDeprecationWarning seems like an orthogonal decision; if it happens, this can trivially be upgraded to DeprecationWarning as part of a removal sweep. -- ___ Python tracker <https://bugs.python.org/issue40360> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue14444] Virtualenv not portable from Python 2.7.2 to 2.7.3 (os.urandom missing)
Carl Meyer added the comment: Alternatively, the conditional definition of urandom in os.py (removed in http://hg.python.org/cpython/rev/a0f43f4481e0#l7.1) could be reintroduced, allowing the new stdlib to be used with older interpreters. (Thanks to Dave Malcolm for pointing this out.) This seems like perhaps a reasonable concession to backwards compatibility for a bugfix release. -- ___ Python tracker <http://bugs.python.org/issue14444> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
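A hedged, standalone reconstruction of the kind of conditional fallback described above (inside os.py itself it would be written without the import and the final assignment; POSIX is assumed):

```
import os

if not hasattr(os, "urandom"):
    def urandom(n):
        """Fallback: return n random bytes from /dev/urandom."""
        with open("/dev/urandom", "rb") as f:
            data = f.read(n)
        if len(data) != n:
            raise OSError("short read from /dev/urandom")
        return data
    os.urandom = urandom
```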
[issue14444] Virtualenv not portable from Python 2.7.2 to 2.7.3 (os.urandom missing)
Carl Meyer added the comment: There's no question that this is a case of virtualenv allowing users to do something that's not supported. Nonetheless, virtualenv is very widely used, and in practice it does not break "more often". This, however, will break for lots of users, and those users will (wrongly) perceive the breakage to be caused by a Python security bugfix, not by virtualenv. I think the purity argument is completely correct, but this certainly seems like a situation where practicality might nonetheless beat purity. -- ___ Python tracker <http://bugs.python.org/issue14444> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue14444] Virtualenv not portable from Python 2.7.2 to 2.7.3 (os.urandom missing)
Carl Meyer added the comment: I'd been thinking the "escape the security fix" argument didn't apply, because the security fix requires opt-in anyway and the -R flag would fail immediately on a non-updated virtualenv. But there is also the environment variable. It is quite possible that someone could update their system Python, set PYTHONHASHSEED and think they are protected from the hash collision vulnerability, but not be, because they are running in a virtualenv. That is a strong argument for letting this break and forcing the update. -- ___ Python tracker <http://bugs.python.org/issue14444> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21156] Consider moving importlib.abc.InspectLoader.source_to_code() to importlib.abc.Loader
Carl Meyer added the comment: Making `source_to_code` a staticmethod on the `InspectLoader` abc but not in the `importlib.machinery` implementation causes awkwardness for anyone trying to inherit `SourceFileLoader` and override `source_to_code` in typechecked code, since typeshed assumes that `SourceFileLoader` actually implements the `importlib.abc.FileLoader` interface. Given the ABC registration, it seems that `importlib.machinery.SourceFileLoader` should in fact implement the `importlib.abc.FileLoader` interface. Should we make `SourceFileLoader.source_to_code` a staticmethod also? If so, I can file a separate bug for that. -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue21156> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
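A sketch of the subclassing pattern that runs into this inconsistency (the override body and compile flags are illustrative):

```
from importlib.machinery import SourceFileLoader

class MyLoader(SourceFileLoader):
    # typeshed models the ABC's staticmethod, while the runtime
    # SourceFileLoader implements a plain method, so one or the other
    # override style gets flagged as incompatible by a type checker
    # (the mismatch described above).
    @staticmethod
    def source_to_code(data, path, *, _optimize=-1):
        return compile(data, path, "exec",
                       dont_inherit=True, optimize=_optimize)
```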
[issue30533] missing feature in inspect module: getmembers_static
New submission from Carl Meyer: The inspect module contains a getattr_static() function, for accessing an arbitrary attribute on a Python object without risking descriptor or __getattr__ code execution. This is useful for introspection tools that don't want to trigger any side effects. The inspect module also contains a getmembers() function, which returns a list of (name, value) pairs for all the object's members. This function could also be very useful to introspection tools, except that internally it uses normal getattr, and thus reintroduces the risk of arbitrary code execution. It would be useful to have an equivalent to getmembers() that is descriptor-safe. This could be done either by introducing a new getmembers_static(), or possibly by adding a `getattr` optional keyword argument to getmembers(), which would take a getattr-equivalent callable to use in fetching attributes from the object. (The latter option might render some internal assumptions of getmembers() incorrect; needs experimentation.) -- components: Library (Lib) messages: 294876 nosy: carljm priority: normal severity: normal status: open title: missing feature in inspect module: getmembers_static type: enhancement versions: Python 3.7 ___ Python tracker <http://bugs.python.org/issue30533> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
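A minimal sketch of the descriptor-safe variant, built on the existing `inspect.getattr_static` (the name `getmembers_static` is the proposal's; the body is an illustrative assumption, not an implementation plan):

```
import inspect

def getmembers_static(obj, predicate=None):
    """Like inspect.getmembers, but never triggers descriptor code."""
    results = []
    for name in dir(obj):
        try:
            value = inspect.getattr_static(obj, name)
        except AttributeError:
            continue
        if predicate is None or predicate(value):
            results.append((name, value))
    results.sort(key=lambda pair: pair[0])
    return results
```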
[issue31033] Add argument to .cancel() of Task and Future
Change by Carl Meyer : -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue31033> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
New submission from Carl Meyer : We would like to set an environment variable that would cause Python to read and write `__pycache__` directories from a separate location on the filesystem (outside the source code tree). We have two reasons for this:

1. In our development setup (with a webserver running in a container on the dev-tree code), the `__pycache__` directories end up root-owned, and managing permissions on them so that they don't disrupt VCS operations on the code repo is untenable. (Currently we use PYTHONDONTWRITEBYTECODE as a workaround, but we have enough code that this costs us multiple seconds of developer time on every restart; we'd like to take advantage of cached bytecode without requiring that it pollute the code tree.)

2. In addition to just _having_ cached bytecode, we'd like to keep it on a ramdisk to minimize filesystem overhead.

Proposal: a `PYTHON_BYTECODE_PATH` environment variable. If set, `source_from_cache` and `cache_from_source` in `importlib._bootstrap_external` will respect it, creating a directory tree under that prefix that mirrors the source tree. -- messages: 316518 nosy: brett.cannon, carljm, eric.snow, lukasz.langa, ncoghlan priority: normal severity: normal status: open title: Environment variable to set alternate location for pycache tree type: enhancement ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
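A sketch of the proposed path mapping (the variable and the mirroring scheme come from the proposal above; the exact filename layout is an assumption):

```
import os
import sys

def cache_from_source(path, prefix):
    """Map pkg/mod.py -> <prefix>/<abs source path>/mod.<tag>.pyc (sketch)."""
    tag = sys.implementation.cache_tag  # e.g. 'cpython-37'
    base, _ = os.path.splitext(os.path.abspath(path))
    return os.path.join(prefix, base.lstrip(os.sep) + "." + tag + ".pyc")

print(cache_from_source("pkg/mod.py", "/dev/shm/pycache"))
```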
[issue33501] split existing optimization levels into granular options
New submission from Carl Meyer : It doesn't make sense for e.g. docstring-stripping to necessarily imply assert-stripping. These are totally separate options, useful for separate reasons, but currently tied together in the `-O` option. This is not just a theoretical problem; at work we must strip docstrings in production for memory reasons, but we would prefer not to strip asserts. In fact we currently lint against use of `assert` because it is stripped in production, and we replace it with our own assertion function, which is less efficient and also integrates poorly with mypy's type binder. A better option would be to enable each of these separate optimizations with a separate command-line flag (probably a string tag passed to a single flag, e.g. `-o strip_docstrings`). PYC filename generation will also need to include all individually-enabled optimization string tags as part of the filename. For backwards-compatibility, the existing `-O` flags should still be supported with the same meaning they currently have; `-O` and the new granular `-o` should be additive. (A version of this was previously proposed as a minor part of PEP 511.) Please let me know if this proposal is of sufficient complexity that a PEP is needed instead of just an issue. -- messages: 316531 nosy: brett.cannon, carljm, eric.snow, lukasz.langa, ncoghlan, vstinner priority: normal severity: normal status: open title: split existing optimization levels into granular options type: enhancement ___ Python tracker <https://bugs.python.org/issue33501> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
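The coupling described above is observable today through `compile()`'s `optimize` parameter, which mirrors the -O levels (1 strips asserts; 2 additionally strips docstrings; there is no level that strips only docstrings):

```
# optimize=1 (-O): the assert is stripped, so this raises nothing
exec(compile("assert False", "<s>", "exec", optimize=1))

# optimize=2 (-OO): docstrings are stripped along with asserts
ns = {}
exec(compile('def f():\n    "doc"\n', "<s>", "exec", optimize=2), ns)
print(ns["f"].__doc__)  # None
```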
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment:

> I don't think it makes sense to try to make cached_property itself work
> implicitly with both normal attributes and slot entries - instead,
> cached_property can handle the common case as simply and efficiently as
> possible, and the cached_slot case can be either handled separately or else
> not at all.

So it sounds like the current approach here is good to move forward? If I update the patch during the PyCon sprints, could we merge it? -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
Carl Meyer added the comment: Per vstinner, Python prefers not to have underscores in environment variable names, for historical reasons. So I'm using `PYTHONBYTECODEPATH` as the env var. Other open questions:

1) Does there need to be a corresponding CLI flag, or is env-var-only sufficient?

2) Is it OK to check the environ every time, or do we need to cache its value in a `sys` flag at startup?

Will push an initial version for review that has no CLI flag nor `sys` attribute. -- ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
Change by Carl Meyer : -- keywords: +patch pull_requests: +6517 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
Carl Meyer added the comment: Environment variable seems to make a bit more sense for this, since it's not per-invocation; there's no point writing bytecode cache to a particular location unless the next invocation reads the cache from there. Our use case includes a webserver process that embeds Python; I'm not sure if we could pass a CLI arg to it or not. Python has lots of precedent for similar environment variables (e.g. `PYTHONHOME`, `PYTHONDONTWRITEBYTECODE`, `PYTHONPATH`, etc). Compared to those, `PYTHONBYTECODEPATH` is pretty much harmless if it "leaks" to an unintended process. I asked Brett Cannon in the sprints if I should add a CLI flag in addition to the env var; he suggested it wasn't worth it. I'm not opposed to adding the CLI flag, but I think removing the env var option would be a mistake. -- ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
Carl Meyer added the comment:

> a system-wide environment variable

Environment variables aren't system-wide; they are per-process (though they can be inherited by child processes). -- ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment:

> a way to invalidate or clear the cache

This is already supported by the simple implementation in the patch; it's spelled `del obj.the_cached_property`.

> mock patching the underlying function for testing

This is easy to do with the current implementation: you can replace the cached-property descriptor on the class with `mock.patch`.

> consistency between multiple cached properties cached in different threads

The patch attached here is already thread-safe and will be consistent between threads.

> inability to run the method through a debugger

If you `s` in the Python debugger on a line where a property or cached property is accessed, you will step into the decorated method. I've done this often, so I'm not sure what the issue would be here.

> moving the cache valued from an instance variables to an external weakref
> dictionary

This would be a totally different descriptor that doesn't share much implementation with the one proposed here, so I don't see how providing the common version inhibits anyone from writing something different they need for their case.

> The basic recipe is simple so there isn't much of a value add by putting this
> in the standard library.

It's simple once you understand what it does, but it's quite subtle in the way it relies on the priority order of instance-dict attributes vs non-data descriptors. My experience over the past decade is different from yours; I've found that the simple `cached_property` proposed here is widely and frequently useful (basically it should be the preferred approach anytime an object which is intended to be immutable after construction has some calculated properties based on its other attributes), and additional complexity is rarely if ever needed. I think the wide usage of the version proposed here (without extensions) in code in the wild bears this out. Likely a main reason there hasn't been a stronger push to include this in the standard library sooner is that so many people are just using it from `django.utils.functional.cached_property` today. -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
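For readers unfamiliar with that subtlety, a minimal single-threaded sketch of the recipe (the patch attached to this issue additionally takes a lock; this version omits it):

```
class cached_property:
    def __init__(self, func):
        self.func = func
        self.attrname = None

    def __set_name__(self, owner, name):
        self.attrname = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        value = self.func(instance)
        # Because this is a non-data descriptor, the instance-dict
        # entry written here shadows it on every later lookup; and
        # `del obj.attr` clears the cache by removing that entry.
        instance.__dict__[self.attrname] = value
        return value
```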
[issue33499] Environment variable to set alternate location for pycache tree
Carl Meyer added the comment: Can we have a named -X option that also takes a parameter? I don't see any existing examples of that. This option needs to take the path where bytecode should be written. Are there strong use-cases for having a CLI arg for this? I don't mind doing the implementation work if there are, but right now I'm struggling to think of any case where it would be better to run `python -C /tmp/bytecode` than `PYTHONBYTECODEPATH=/tmp/bytecode python`. Our existing "takes a path" env variables (`PYTHONHOME` and `PYTHONPATH`) do not have CLI equivalents. -- ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33499] Environment variable to set alternate location for pycache tree
Carl Meyer added the comment: Cool, thanks for the pointer on -X. PR is updated with `-X bytecode_path=PATH`; don't think it's critical to have it, but it wasn't that hard to add. -- ___ Python tracker <https://bugs.python.org/issue33499> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21145] Add the @cached_property decorator
Change by Carl Meyer : -- pull_requests: +6636 stage: -> patch review ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment: Sent a PR with the patch. Nick, I tried your `__set_name__` proposal to get an earlier error in case of an object with slots, but it has the downside that Python seems to always raise a new chained exception if `__set_name__` raises any exception. So instead of getting a clear error, you get an opaque one about "error raised when calling __set_name__ on...", and you have to scroll up to see the real error message. I felt that this was too much usability regression and not worth the benefit of raising the error sooner. Let me know if you feel otherwise. -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33577] remove wrapping of __set_name__ exceptions in RuntimeError
New submission from Carl Meyer : Per Nick Coghlan in discussion on issue21145: "I think it would make sense to remove the exception wrapping from the __set_name__ calls - I don't think we're improving the ease of understanding the tracebacks by converting everything to a generic RuntimeError, and we're hurting the UX of descriptor validation cases like this one." https://github.com/python/cpython/blob/master/Objects/typeobject.c#L7263 -- components: Interpreter Core messages: 317099 nosy: carljm priority: normal severity: normal status: open title: remove wrapping of __set_name__ exceptions in RuntimeError versions: Python 3.8 ___ Python tracker <https://bugs.python.org/issue33577> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
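The wrapping being discussed, demonstrated (behavior as of the versions contemporary to this issue; the descriptor and its error are illustrative):

```
class Desc:
    def __set_name__(self, owner, name):
        raise TypeError("Desc cannot be used on this owner")

try:
    class C:
        d = Desc()
except Exception as exc:
    # With the wrapping: RuntimeError("Error calling __set_name__ on
    # 'Desc' instance 'd' in 'C'") chained from the TypeError; were
    # the wrapping removed, the TypeError itself would propagate.
    print(type(exc).__name__, "-", exc)
    print("cause:", exc.__cause__)
```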
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment: Makes sense to me. Sounds like a separate issue and PR; I filed issue33577 and will work on a patch. -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33577] remove wrapping of __set_name__ exceptions in RuntimeError
Carl Meyer added the comment: Oops, duplicate of issue33576. -- resolution: -> duplicate stage: -> resolved status: open -> closed ___ Python tracker <https://bugs.python.org/issue33577> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment: Oops, never mind; closed mine as dupe. -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33576] Remove exception wrapping from __set_name__ calls
Change by Carl Meyer : -- keywords: +patch pull_requests: +6637 stage: needs patch -> patch review ___ Python tracker <https://bugs.python.org/issue33576> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33576] Remove exception wrapping from __set_name__ calls
Carl Meyer added the comment: Nick, I think the reason this exception wrapping was added is that the stack trace for these exceptions is currently a bit lacking. The "caller" for the `__set_name__` function is the `class` line for the class containing the descriptors. For exceptions raised _inside_ `__set_name__` this is kind of OK, although if a class has multiple instances of the same descriptor class on it, it doesn't give you an obvious clue which instance raised the exception (though you can probably figure it out quickly enough by checking the value of the `name` argument). For cases where the exception is raised at the caller (e.g. a TypeError due to a `__set_name__` method with the wrong signature) it's worse; you get no pointer to either the problematic descriptor instance, or its name, or the class; all you get is the TypeError and the class that contains a broken descriptor. In practice I don't know how much of a problem this is; it doesn't seem likely that it would take too long to narrow down the source of the issue. Let me know what you think. -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue33576> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33576] Remove exception wrapping from __set_name__ calls
Carl Meyer added the comment: Awkwardly, the motivating use case in issue21145 is a TypeError that we wanted to raise within __set_name__, and not have replaced. It feels a little ugly to special-case TypeError this way. I like the _PyErr_TrySetFromCause idea. That function is a bit ugly too, in the way it has to try to sniff out whether an exception has extra state or is safe to copy and add extra context to. But in practice I think the results would be pretty good here. Most of the time you'd get the original exception but with added useful context; occasionally, for some exception types, you might just not get the extra context. But as long as TypeError falls in the former category it would help with the worst case. I'll look at using that in the PR. -- ___ Python tracker <https://bugs.python.org/issue33576> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue21145] Add the @cached_property decorator
Carl Meyer added the comment: Thanks everyone for the thoughtful and careful reviews! Patch is much improved from where it started. And thanks Nick for merging. -- ___ Python tracker <https://bugs.python.org/issue21145> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue34995] functools.cached_property does not maintain the wrapped method's __isabstractmethod__
Carl Meyer added the comment: FWIW, it seems to me (author of `cached_property` patch) that while just using `@property` on the abstract method plus a comment is a reasonable and functional workaround that sacrifices only a small documentation value, there's no reason why `@cached_property` shouldn't propagate this flag in the same way `@property` does. It seems reasonable to me to consider this behavior discrepancy a bug; I'd have fixed it if made aware of it while developing `cached_property`. -- nosy: +carljm ___ Python tracker <https://bugs.python.org/issue34995> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
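The discrepancy, made concrete (the hedged getattr is used because cached_property simply lacks the attribute at the time of this report; requires a Python with functools.cached_property):

```
import abc
import functools

class WithProperty(abc.ABC):
    @property
    @abc.abstractmethod
    def x(self): ...

class WithCachedProperty(abc.ABC):
    @functools.cached_property
    @abc.abstractmethod
    def y(self): ...

print(WithProperty.x.__isabstractmethod__)  # True: property propagates it
print(getattr(WithCachedProperty.y, "__isabstractmethod__", False))  # False
```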
[issue15374] venv environment variable should follow the conventions
Carl Meyer added the comment: Yes, there are a number of third-party utility packages (and many, many e.g. personal custom bash prompts) that check the value of the $VIRTUAL_ENV variable to detect whether one is currently active, and display its name. Unless there's an overriding reason, it would be nice to not require changing all of this code. Certainly not all third-party virtualenv-related tools will be compatible with pyvenv unchanged; for instance tools that create envs will need to use the updated pyvenv API. But there is a lot of code using $VIRTUAL_ENV that won't require any other changes, if we can keep using the same env var. -- ___ Python tracker <http://bugs.python.org/issue15374> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16116] Can not install C extension modules to inside a venv on Python 3.3.0 for Win32
Carl Meyer added the comment: On cursory inspection, I agree that this is precisely what the "if win32" block in `virtualenv_embedded/distutils-init.py` is intended to fix, and it seems to me the correct fix is likely to just make the equivalent fix directly in distutils: change the library_dirs-building code in `distutils.command.build_ext:finalize_options` (under the "if os.name == 'nt'" block) to build the path relative to `sys.base_exec_prefix` rather than `sys.exec_prefix`. -- ___ Python tracker <http://bugs.python.org/issue16116> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16116] Can not install C extension modules to inside a venv on Python 3.3.0 for Win32
Carl Meyer added the comment: (Actually, to match virtualenv's fix it should add the paths based on both exec_prefix and base_exec_prefix, if they are different.) -- ___ Python tracker <http://bugs.python.org/issue16116> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
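A hedged sketch mirroring virtualenv's fix as described in the two messages above (not the actual distutils patch; `libs` is the Windows layout):

```
import os
import sys

# what build_ext.finalize_options could do under os.name == 'nt':
library_dirs = [os.path.join(sys.exec_prefix, "libs")]
if sys.exec_prefix != sys.base_exec_prefix:
    # inside a venv: also search the base installation's libs
    library_dirs.append(os.path.join(sys.base_exec_prefix, "libs"))
print(library_dirs)
```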
[issue16480] pyvenv 3.3 fails to create symlinks for /local/{bin, lib} to /{bin, lib}
Carl Meyer added the comment: Here is the bug filed against virtualenv that led to the addition of the local/ directory: https://github.com/pypa/virtualenv/issues/118 As Vinay pointed out, the original fix was later modified to be friendlier to tools that dislike recursive symlinks. That's about all I know; I don't know if any of this is still needed in 3.3/3.4. -- ___ Python tracker <http://bugs.python.org/issue16480> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue16480] pyvenv 3.3 fails to create symlinks for /local/{bin, lib} to /{bin, lib}
Carl Meyer added the comment: What OS are you on, Marco? It looks to me like pyvenv probably does need the same hack as virtualenv here, to deal with OSes who set posix_local as the default installation scheme. -- ___ Python tracker <http://bugs.python.org/issue16480> ___ ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue19139] In venv, __VENV_NAME__ is the prompt, not the name
Carl Meyer added the comment: Makes sense to me. -- ___ Python tracker <http://bugs.python.org/issue19139> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27715] call-matcher breaks if a method is mocked with spec=True
New submission from Carl Meyer: When constructing call-matchers to match expected vs actual calls, if `spec=True` was used when patching a function, mock attempts to bind the recorded (and expected) call args to the function signature. But if a method was mocked, the signature includes `self` and the recorded call args don't. This can easily lead to a `TypeError`:

```
from unittest.mock import patch

class Foo:
    def bar(self, x):
        return x

with patch.object(Foo, 'bar', spec=True) as mock_bar:
    f = Foo()
    f.bar(7)
    mock_bar.assert_called_once_with(7)
```

The above code worked in mock 1.0, but fails in Python 3.5 and 3.6 tip with this error:

```
TypeError: missing a required argument: 'x'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../mock-method.example.py", line 11, in <module>
    mock_bar.assert_called_once_with(7)
  File "/home/carljm/projects/python/cpython/Lib/unittest/mock.py", line 203, in assert_called_once_with
    return mock.assert_called_once_with(*args, **kwargs)
  File "/home/carljm/projects/python/cpython/Lib/unittest/mock.py", line 822, in assert_called_once_with
    return self.assert_called_with(*args, **kwargs)
  File "/home/carljm/projects/python/cpython/Lib/unittest/mock.py", line 811, in assert_called_with
    raise AssertionError(_error_message()) from cause
AssertionError: Expected call: bar(7)
Actual call: bar(<__main__.Foo object at 0x7fdca80b7550>, 7)
```

If you try to pass in the instance as an expected call arg, the error goes away but it just fails to match:

```
AssertionError: Expected call: bar(<__main__.Foo object at 0x7f5cbab35fd0>, 7)
Actual call: bar(7)
```

So AFAICT there is no way to successfully use `spec=True` when patching a method of a class. Oddly, using `autospec=True` instead of `spec=True` _does_ record the instance as an argument in the recorded call args, meaning that you have to pass it in as an argument to e.g. `assert_called_with`. But in many (most?) cases where you're patching a method of a class, your test doesn't have access to the instance, elsewise you'd likely just patch the instance instead of the class in the first place. I don't see a good reason why `autospec=True` and `spec=True` should differ in this way (if both options are needed, there should be a separate flag to control that behavior; it doesn't seem related to the documented differences between autospec and spec). I do think a) there needs to be some way to record call args to a method and assert against those call args without needing the instance (or resorting to manual assertions against a sliced `call_args`), and b) there should be some way to successfully use `spec=True` when patching a method of a class. -- components: Library (Lib) files: mock-method.example.py messages: 272209 nosy: carljm priority: normal severity: normal status: open title: call-matcher breaks if a method is mocked with spec=True versions: Python 3.5, Python 3.6 Added file: http://bugs.python.org/file44054/mock-method.example.py ___ Python tracker <http://bugs.python.org/issue27715> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27715] call-matcher breaks if a method is mocked with spec=True
Carl Meyer added the comment: (This bug is also present in Python 3.4.4.) -- type: -> crash versions: +Python 3.4 ___ Python tracker <http://bugs.python.org/issue27715> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27715] call-matcher breaks if a method is mocked with spec=True
Carl Meyer added the comment: It seems likely that this regression originated with https://hg.python.org/cpython/rev/b888c9043566/ (can't confirm via bisection as the commits around that time fail to compile for me). -- nosy: +michael.foord, pitrou ___ Python tracker <http://bugs.python.org/issue27715> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27715] call-matcher breaks if a method is mocked with spec=True
Changes by Carl Meyer : Removed file: http://bugs.python.org/file44054/mock-method.example.py ___ Python tracker <http://bugs.python.org/issue27715> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue27715] call-matcher breaks if a method is mocked with spec=True
Carl Meyer added the comment: `hg clean --all` resolved the compilation issues; confirmed that https://hg.python.org/cpython/rev/b888c9043566/ is at fault. Also, the exception trace I provided above looks wrong; it must be from when I was messing about with `autospec=True` or passing in the instance. The actual trace from the sample code in the original report has no mention of the instance:

```
TypeError: missing a required argument: 'x'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "../mock-method.example.py", line 11, in <module>
    mock_bar.assert_called_once_with(7)
  File "/home/carljm/projects/python/cpython/Lib/unittest/mock.py", line 822, in assert_called_once_with
    return self.assert_called_with(*args, **kwargs)
  File "/home/carljm/projects/python/cpython/Lib/unittest/mock.py", line 811, in assert_called_with
    raise AssertionError(_error_message()) from cause
AssertionError: Expected call: bar(7)
Actual call: bar(7)
```

-- ___ Python tracker <http://bugs.python.org/issue27715> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com