Re: [Python-Dev] Questions for the PEP 418: monotonic vs steady, is_adjusted
On 14 April 2012 06:41, Stephen J. Turnbull wrote:
> A clock can be accurate in measuring
> duration even though it is not accurate in measuring the point in
> time. [It's hard to see how the opposite could be true.]

Pedantic point: A clock that is stepped (say, by NTP) is precisely one
that is accurate in measuring the point in time (that's what stepping
is *for*) but not in measuring duration.

Paul.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Questions for the PEP 418: monotonic vs steady, is_adjusted
On Sat, 14 Apr 2012 02:51:09 +0200 Victor Stinner wrote:
>
> time.monotonic() does not fallback to the system clock anymore, it is
> now always monotonic.

Then just call it "monotonic" :-)

> I prefer "steady" over "monotonic" because the steady property is what
> users really expect from a "monotonic" clock. A monotonic but not
> steady clock may be useless.

"steady" is ambiguous IMO. It can only be "steady" in reference to
another clock - but which one ? (real time presumably, but perhaps not,
e.g. if the clock gets suspended on suspend)

Regards

Antoine.
Re: [Python-Dev] Questions for the PEP 418: monotonic vs steady, is_adjusted
>> I prefer "steady" over "monotonic" because the steady property is what
>> users really expect from a "monotonic" clock. A monotonic but not
>> steady clock may be useless.
>
> "steady" is ambiguous IMO. It can only be "steady" in reference to
> another clock - but which one ? (real time presumably, but perhaps not,
> e.g. if the clock gets suspended on suspend)

Yes, real time is the reference when I say that CLOCK_MONOTONIC is
steadier than CLOCK_MONOTONIC_RAW.

I agree that CLOCK_MONOTONIC is not steady from the real time reference
when the system is suspended. CLOCK_BOOTTIME includes suspend time, but
it was only introduced recently in Linux.

Because the "steady" name is controversial, I agree to use the
"monotonic" name. I will complete the section explaining why
time.monotonic() is not called steady :-)

Victor
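A minimal sketch of why the thread separates "point in time" from "duration": it measures elapsed time with time.monotonic(), the function PEP 418 proposes, rather than with the wall clock, which may be stepped. The helper name timed() is an assumption made for illustration and is not part of the PEP.

```python
import time


def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed seconds).

    time.monotonic() cannot go backwards, so the difference of two
    readings is never negative, even if the wall clock is stepped
    (for example by NTP) while func is running.
    """
    start = time.monotonic()
    result = func(*args, **kwargs)
    elapsed = time.monotonic() - start
    return result, elapsed


# Hypothetical usage: time a trivial computation.
total, seconds = timed(sum, [1, 2, 3])
```

The same subtraction done with time.time() could yield a negative or inflated duration if the system clock were adjusted between the two readings.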
Re: [Python-Dev] Compiling Python on Linux with Intel's icc
Thought I'd tie this thread up with a successful method, as I've just
compiled Python-2.7.3 and have got the benchmarks to run slightly faster
than the system Python :D

** First benchmark **

    metabuntu:benchmarks> python perf.py -r -b apps /usr/bin/python ../Python-2.7.3/python
    Running 2to3...
    INFO:root:Running ../Python-2.7.3/python lib/2to3/2to3 -f all lib/2to3_data
    INFO:root:Running `['../Python-2.7.3/python', 'lib/2to3/2to3', '-f', 'all', 'lib/2to3_data']` 5 times
    INFO:root:Running /usr/bin/python lib/2to3/2to3 -f all lib/2to3_data
    INFO:root:Running `['/usr/bin/python', 'lib/2to3/2to3', '-f', 'all', 'lib/2to3_data']` 5 times
    Running html5lib...
    INFO:root:Running ../Python-2.7.3/python performance/bm_html5lib.py -n 1
    INFO:root:Running `['../Python-2.7.3/python', 'performance/bm_html5lib.py', '-n', '1']` 10 times
    INFO:root:Running /usr/bin/python performance/bm_html5lib.py -n 1
    INFO:root:Running `['/usr/bin/python', 'performance/bm_html5lib.py', '-n', '1']` 10 times
    Running rietveld...
    INFO:root:Running ../Python-2.7.3/python performance/bm_rietveld.py -n 100
    INFO:root:Running /usr/bin/python performance/bm_rietveld.py -n 100
    Running spambayes...
    INFO:root:Running ../Python-2.7.3/python performance/bm_spambayes.py -n 100
    INFO:root:Running /usr/bin/python performance/bm_spambayes.py -n 100

    Report on Linux metabuntu 3.0.0-19-server #32-Ubuntu SMP Thu Apr 5 20:05:13 UTC 2012 x86_64 x86_64
    Total CPU cores: 12

    ### html5lib ###
    Min: 8.132508 -> 7.316457: 1.11x faster
    Avg: 8.297318 -> 7.460066: 1.11x faster
    Significant (t=11.15)
    Stddev: 0.21605 -> 0.09843: 2.1950x smaller
    Timeline: http://tinyurl.com/bqql4oa

    ### rietveld ###
    Min: 0.297604 -> 0.276587: 1.08x faster
    Avg: 0.302667 -> 0.279202: 1.08x faster
    Significant (t=37.06)
    Stddev: 0.00529 -> 0.00348: 1.5188x smaller
    Timeline: http://tinyurl.com/brb3dk5

    ### spambayes ###
    Min: 0.152264 -> 0.143518: 1.06x faster
    Avg: 0.156512 -> 0.146559: 1.07x faster
    Significant (t=6.66)
    Stddev: 0.00847 -> 0.01232: 1.4547x larger
    Timeline: http://tinyurl.com/d2dzz6k

    The following not significant results are hidden, use -v to show them:
    2to3.

( I just noticed the date's wrong in the above report... But I did run
that just now, being April 14th 2012, ~1300GMT )

** Required patch **

Only file that breaks compilation is
Modules/_ctypes/libffi/src/x86/ffi64.c

I uploaded a patch to http://bugs.python.org/issue4130 that corrects the
__int128_t issue.

** Compilation method **

I used a two-step compilation process, with Profile-Guided Optimisation.
Relevant environment variables are at the bottom.

In the build directory, make a separate directory for the PGO files.

    mkdir PGO

Then, configure command:-

    CFLAGS="-O3 -fomit-frame-pointer -shared-intel -fpic -prof-gen -prof-dir $PWD/PGO -fp-model strict -no-prec-div -xHost -fomit-frame-pointer" \
    ./configure --with-libm="-limf" --with-libc="-lirc" --with-signal-module --with-cxx-main="icpc" --without-gcc --build=x86_64-linux-intel

Then I ran `make -j9` and `make test`. Running the tests ensures that
(almost) every module is run at least once. As the -prof-gen option was
used, this means that PGO information is written to files in -prof-dir,
when the binaries are running.

To give the code even more rigorous usage, I also ran the benchmark
suite, which generates even more PGO information. The results are
useless though.

Then, need to do a `make clean`, and reconfigure. This time, add "-ipo"
to CFLAGS, enabling inter-procedural optimisation, and change
"-prof-gen" for "-prof-use":-

    CFLAGS="-O3 -fomit-frame-pointer -ipo -shared-intel -fpic -prof-use -prof-dir $PWD/PGO -fp-model strict -no-prec-div -xHost -fomit-frame-pointer" \
    ./configure --with-libm="-limf" --with-libc="-lirc" --with-signal-module --with-cxx-main="icpc" --without-gcc --build=x86_64-linux-intel

Then, of course

    make -j9 && make test

At this point, I produced the above benchmark results.

** Failed test summary **

I'm happy with most of them, except I don't get what the test_gdbm
failure is on about..? I should probably add --enable-curses to the
configure command, and I wouldn't mind getting the network and audio
modules to build, but I can't see any relevant configure options nor
find any missing dependencies. Any suggestions would be appreciated.

    349 tests OK.
    2 tests failed:
        test_cmath test_gdb
    1 test altered the execution environment:
        test_distutils
    37 tests skipped:
        test_aepack test_al test_applesingle test_bsddb test_bsddb185
        test_bsddb3 test_cd test_cl test_codecmaps_cn test_codecmaps_hk
        test_codecmaps_jp test_codecmaps_kr test_codecmaps_tw test_curses
        test_dl test_gl test_imageop test_imgfile test_kqueue
        test_linuxaudiodev test_macos test_macostools test_msilib
        test_ossaudiodev test_scriptpackages test_smtpnet
        test_socketserver test_startfile test_sunaudiodev test_timeout
        test_tk test_ttk_guionly test_urllib2net test_urllibnet
        test_winreg test_winsound test_zipfile6
[Python-Dev] importlib is now bootstrapped (and what that means)
My multi-year project -- started in 2006 according to my blog -- to
rewrite import in pure Python and then bootstrap it into CPython as
*the* implementation of __import__() is finally over (mostly)! Hopefully
I didn't break too much code in the process. =)

Now this is "mostly" finished because the single incompatibility that
importlib has is that it doesn't check the Windows registry for paths to
search since I lack a Windows installation to develop and test on. If
someone can tackle that issue that would be greatly appreciated
(http://bugs.python.org/issue14578).

Next up is how to maintain/develop for all of this. So the Makefile will
regenerate Python/importlib.h whenever Lib/importlib/_bootstrap.py or
Python/freeze_importlib.py changes. So if you make a change to importlib
make sure to get it working and tested before running 'make' again else
you will generate a bad frozen importlib (if you do mess up you can also
revert the changes to importlib.h and re-run make; a perk to having
importlib.h checked in). Otherwise keep in mind that you can't use any
module that isn't a builtin (sys.builtin_module_names) in
importlib._bootstrap since you can't import something w/o import
working. =)

Where does this leave imp and Python/import.c? I want to make imp into
_imp and then implement as much as possible in pure Python (either in
importlib itself or in Lib/imp.py). Once that has happened then more C
code in import.c can be gutted (see http://bugs.python.org/issue13959
for tracking this work which I will start piecemeal shortly).

I have some ideas on how to improve things for import, but I'm going to
do them as separate emails to have separate discussion threads on them
so all of this is easier to follow (e.g. actually following through on
PEP 302 and exposing the import machinery as importers instead of having
anything be implicit, etc.).

And the only outstanding point of contention in all of this is that some
people don't like having freeze_importlib.py in Python/ and instead want
it in Tools/. I didn't leave it in Tools/ as I have always viewed that
Python should build w/o the Tools directory, but maybe the Linux distros
actually ship with it and thus this is an unneeded worry. Plus the
scripts to generate the AST are in Parser so there is precedent for what
I have done.

Anyway, I will write up the What's New entry and double-check the
language spec for updating once all of the potential changes I want to
talk about in other emails have been resolved.
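As a small illustration of the "builtin only" constraint mentioned above, the sketch below checks a module name against sys.builtin_module_names. The helper name is_builtin_module() is an assumption made for this example and is not something importlib provides.

```python
import sys


def is_builtin_module(name):
    """Return True if `name` is compiled into the interpreter itself.

    Only such modules can be used before the import system works,
    which is the restriction described for importlib._bootstrap.
    """
    return name in sys.builtin_module_names


# "sys" is always built in; "json" is an ordinary module found on sys.path.
sys_is_builtin = is_builtin_module("sys")
json_is_builtin = is_builtin_module("json")
```

The exact contents of sys.builtin_module_names vary by platform and build, so a check like this only describes the running interpreter.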
Re: [Python-Dev] importlib is now bootstrapped (and what that means)
On 14.04.2012 20:12, Brett Cannon wrote:
> [...]
> Next up is how to maintain/develop for all of this. So the Makefile will
> regenerate Python/importlib.h whenever Lib/importlib/_bootstrap.py or
> Python/freeze_importlib.py changes. So if you make a change to importlib
> make sure to get it working and tested before running 'make' again else
> you will generate a bad frozen importlib (if you do mess up you can also
> revert the changes to importlib.h and re-run make; a perk to having
> importlib.h checked in). Otherwise keep in mind that you can't use any
> module that isn't a builtin (sys.builtin_module_names) in
> importlib._bootstrap since you can't import something w/o import
> working. =)

We've just now talked on IRC about this regeneration. Since both files --
_bootstrap.py and importlib.h -- are checked in, a make run can try to
regenerate importlib.h. This depends on the timestamps of the two files,
which I don't think Mercurial makes any guarantees about.

We have other instances of this (e.g. the Objects/typeslots.inc file is
generated and checked in), but in the case of importlib, we have to use
the ./python binary for freezing to avoid bytecode incompatibilities,
which obviously is a problem if ./python isn't built yet.

> And the only outstanding point of contention in all of this is that some
> people don't like having freeze_importlib.py in Python/ and instead want
> it in Tools/. I didn't leave it in Tools/ as I have always viewed that
> Python should build w/o the Tools directory, but maybe the Linux distros
> actually ship with it and thus this is an unneeded worry. Plus the
> scripts to generate the AST are in Parser so there is precedent for what
> I have done.

I would have no objections to Python/. There is also e.g.
Objects/typeslots.py.

Georg
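The timestamp dependency raised here can be sketched as the comparison make performs: a target is rebuilt when it is missing or older than any of its sources. The function name needs_regeneration() and the temporary file names below are assumptions for illustration; this is not the actual Makefile rule.

```python
import os
import tempfile


def needs_regeneration(target, sources):
    """Return True if target is missing or older than any source file."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(source) > target_mtime for source in sources)


# Demonstration with throwaway files and explicitly set modification times.
with tempfile.TemporaryDirectory() as tmp:
    source = os.path.join(tmp, "_bootstrap.py")
    target = os.path.join(tmp, "importlib.h")
    for path in (source, target):
        with open(path, "w") as handle:
            handle.write("")
    os.utime(target, (1000, 1000))   # target older than source
    os.utime(source, (2000, 2000))
    stale_when_source_newer = needs_regeneration(target, [source])
    os.utime(target, (3000, 3000))   # target newer than source
    stale_when_target_newer = needs_regeneration(target, [source])
```

Because the decision rests only on modification times, a checkout that writes the two files in an arbitrary order can trigger, or suppress, regeneration regardless of their contents.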
[Python-Dev] making the import machinery explicit
To start off, what I am about to propose was brought up at the PyCon
language summit and the whole room agreed with what I want to do here,
so I honestly don't expect much of an argument (famous last words).

In the "ancient" import.c days, a lot of import's stuff was hidden deep
in the C code and in no way exposed to the user. But with importlib
finishing PEP 302's phase 2 plans of getting import to be properly
refactored to use importers, path hooks, etc., this need no longer be
the case.

So what I propose to do is stop having import have any kind of implicit
machinery. This means sys.meta_path gets a path finder that does the
heavy lifting for import and sys.path_hooks gets a hook which provides a
default finder. As of right now those two pieces of machinery are
entirely implicit in importlib and can't be modified, stopped, etc.

If this happens, what changes? First, more of importlib will get
publicly exposed (e.g. the meta path finder would become public instead
of private like it is along with everything else that is publicly
exposed). Second, import itself technically becomes much simpler since
it really then is about resolving module names, traversing
sys.meta_path, and then handling fromlist w/ everything else coming from
how the path finder and path hook work.

What also changes is that sys.meta_path and sys.path_hooks cannot be
blindly reset w/o blowing out import. I doubt anyone is even touching
those attributes in the common case, and the few that do can easily just
stop wiping out those two lists. If people really care we can do a
warning in 3.3 if they are found to be empty and then fall back to old
semantics, but I honestly don't see this being an issue as
backwards-compatibility would just require being more careful of what
you delete (which I have been warning people to do for years now) which
is a minor code change which falls in line with what goes along with any
new Python version.

And lastly, sticking None in sys.path_importer_cache would no longer
mean "do the implicit thing" and instead would mean the same as
NullImporter does now (which also means import can put None into
sys.path_importer_cache instead of NullImporter): no finder is available
for an entry on sys.path when None is found. Once again, I don't see
anyone explicitly sticking None into sys.path_importer_cache, and if
they are they can easily stick what will be the newly exposed finder in
there instead. The more common case would be people wiping out all
entries of NullImporter so as to have a new sys.path_hooks entry take
effect. That code would instead need to clear out None on top of
NullImporter as well (in Python 3.2 and earlier this would just be a
performance loss, not a semantic change). So this too could change in
Python 3.3 as long as people update their code like they do with any
other new version of Python.

In summary, I want no more magic "behind the curtain" for Python 3.3 and
import: sys.meta_path and sys.path_hooks contain what they should and if
they are emptied then imports will fail and None in
sys.path_importer_cache means "no finder" instead of "use magical,
implicit stuff".
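A minimal sketch of the cache clean-up described above, assuming the proposed meaning of None ("no finder for this path entry"). The function name clear_negative_entries() is hypothetical, and the demonstration runs on a plain dict rather than on the live sys.path_importer_cache so that it has no side effects.

```python
def clear_negative_entries(cache):
    """Remove entries whose value is None and return the removed paths.

    Under the proposal, None records that no path hook could handle the
    entry, so dropping these entries lets a newly registered
    sys.path_hooks entry be consulted for them. Code that must also run
    on older versions would additionally drop NullImporter instances,
    which this sketch does not handle.
    """
    stale = [path for path, finder in cache.items() if finder is None]
    for path in stale:
        del cache[path]
    return stale


# Demonstration on a stand-in mapping (paths and finder are made up).
demo_cache = {"/no/finder/here": None, "/has/finder": object()}
removed = clear_negative_entries(demo_cache)
```

Passing sys.path_importer_cache itself would apply the same clean-up to the running interpreter.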
[Python-Dev] Require loaders set __package__ and __loader__
An open issue in PEP 302 is whether to require __loader__ attributes on
modules. The claimed worry is memory consumption, but considering
importlib and zipimport are already doing this that seems like a red
herring. Requiring it, though, opens the door to people relying on its
existence and thus starting to do things like loading assets with
``__loader__.get_data(path_to_internal_package_file)`` which allows code
to not care how modules are stored (e.g. zip file, sqlite database,
etc.).

What I would like to do is update the PEP to state that loaders are
expected to set __loader__. Now importlib will get updated to do that
implicitly so external code can expect it post-import, but requiring
loaders to set it would mean that code executed during import can rely
on it as well.

As for __package__, PEP 366 states that modules should set it but it
isn't referenced by PEP 302. What I want to do is add a reference and
make it required like __loader__. Importlib already sets it implicitly
post-import, but once again it would be nice to do this pre-import.

To help facilitate both new requirements, I would update the
importlib.util.module_for_loader decorator to set both on a module that
doesn't have them before passing the module down to the decorated
method. That way people already using the decorator don't have to worry
about anything and it is one less detail to have to worry about. I would
also update the docs on importlib.util.set_package and
importlib.util.set_loader to suggest people use
importlib.util.module_for_loader and only use the other two decorators
for backwards-compatibility.
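The decorator behaviour proposed above can be sketched roughly as follows. This is a simplified stand-in, not importlib.util.module_for_loader itself: the names set_import_attrs and DemoLoader are hypothetical, and the decorated method here receives an already created module object plus an is_package flag, both of which are assumptions made to keep the example short.

```python
import functools
import types


def set_import_attrs(load):
    """Fill in __loader__ and __package__ on a module if they are unset."""
    @functools.wraps(load)
    def wrapper(self, module, is_package=False):
        if getattr(module, "__loader__", None) is None:
            module.__loader__ = self
        if getattr(module, "__package__", None) is None:
            name = module.__name__
            # A package is its own __package__; a submodule uses its parent;
            # a top-level module ends up with the empty string (PEP 366).
            module.__package__ = name if is_package else name.rpartition(".")[0]
        return load(self, module)
    return wrapper


class DemoLoader:
    """Hypothetical loader whose load step can rely on both attributes."""

    @set_import_attrs
    def load(self, module):
        return module


loader = DemoLoader()
submodule = loader.load(types.ModuleType("pkg.mod"))
package = loader.load(types.ModuleType("pkg"), is_package=True)
```

Setting the attributes before the wrapped method runs is the point of the proposal: code executed during import can then rely on them, not only code that runs afterwards.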
Re: [Python-Dev] making the import machinery explicit
On Sat, Apr 14, 2012 at 2:03 PM, Brett Cannon wrote:
> [...]
> In summary, I want no more magic "behind the curtain" for Python 3.3 and
> import: sys.meta_path and sys.path_hooks contain what they should and if
> they are emptied then imports will fail and None in sys.path_importer_cache
> means "no finder" instead of "use magical, implicit stuff".

This is great, Brett. About sys.meta_path and sys.path_hooks, I see
only one potential backwards-compatibility problem.

Those implicit hooks were fallbacks, effectively always at the end of
the list, no matter how you manipulated them. Code that appended
onto those lists would now have to insert the importers/finders in the
right way. Otherwise the default hooks would be tried first, which
has a good chance of being the wrong thing.

That concern aside, I'm a big +1 on your proposal.

-eric
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon wrote:
> An open issue in PEP 302 is whether to require __loader__ attributes on
> modules. The claimed worry is memory consumption, but considering importlib
> and zipimport are already doing this that seems like a red herring.
> Requiring it, though, opens the door to people relying on its existence and
> thus starting to do things like loading assets with
> ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
> not care how modules are stored (e.g. zip file, sqlite database, etc.).
>
> What I would like to do is update the PEP to state that loaders are expected
> to set __loader__. Now importlib will get updated to do that implicitly so
> external code can expect it post-import, but requiring loaders to set it
> would mean that code executed during import can rely on it as well.
>
> As for __package__, PEP 366 states that modules should set it but it isn't
> referenced by PEP 302. What I want to do is add a reference and make it
> required like __loader__. Importlib already sets it implicitly post-import,
> but once again it would be nice to do this pre-import.
>
> To help facilitate both new requirements, I would update the
> importlib.util.module_for_loader decorator to set both on a module that
> doesn't have them before passing the module down to the decorated method.
> That way people already using the decorator don't have to worry about
> anything and it is one less detail to have to worry about. I would also
> update the docs on importlib.util.set_package and importlib.util.set_loader
> to suggest people use importlib.util.module_for_loader and only use the
> other two decorators for backwards-compatibility.

+1

-eric
Re: [Python-Dev] making the import machinery explicit
On 14 April 2012 21:03, Brett Cannon wrote:
> So what I propose to do is stop having import have any kind of implicit
> machinery. This means sys.meta_path gets a path finder that does the heavy
> lifting for import and sys.path_hooks gets a hook which provides a default
> finder.

+1 to your proposal. And thanks for all of your work on importlib - it
makes me very happy to see the ideas Just and I thrashed out in PEP 302
come together fully at last.

Paul.
Re: [Python-Dev] making the import machinery explicit
On Sat, Apr 14, 2012 at 17:12, Eric Snow wrote:
> On Sat, Apr 14, 2012 at 2:03 PM, Brett Cannon wrote:
> > [...]
>
> This is great, Brett. About sys.meta_path and sys.path_hooks, I see
> only one potential backwards-compatibility problem.
>
> Those implicit hooks were fallbacks, effectively always at the end of
> the list, no matter how you manipulated them. Code that appended
> onto those lists would now have to insert the importers/finders in the
> right way. Otherwise the default hooks would be tried first, which
> has a good chance of being the wrong thing.
>
> That concern aside, I'm a big +1 on your proposal.

Once again, it's just code that needs updating to run on Python 3.3 so I
don't view it as a concern. Going from list.append() to list.insert()
(even if it's ``list.insert(len(list)-2, hook)``) is not exactly
difficult.
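The append-versus-insert concern can be sketched like this, assuming an interpreter in which the default path finder is present on sys.meta_path as importlib.machinery.PathFinder (as is the case in Python 3.3 and later). NoopFinder and insert_before_path_finder() are hypothetical names, and the demonstration works on a copy of the list so the running import system is left alone.

```python
import importlib.machinery
import sys


class NoopFinder:
    """Hypothetical meta path finder that never claims a module."""

    @classmethod
    def find_spec(cls, name, path=None, target=None):
        return None


def insert_before_path_finder(finder, meta_path):
    """Place finder ahead of the default path finder; append if it is absent.

    Returns the index at which the finder was placed.
    """
    for index, entry in enumerate(meta_path):
        if entry is importlib.machinery.PathFinder:
            meta_path.insert(index, finder)
            return index
    meta_path.append(finder)
    return len(meta_path) - 1


# Work on a copy rather than on sys.meta_path itself.
meta_path = list(sys.meta_path)
position = insert_before_path_finder(NoopFinder, meta_path)
```

Searching for the default entry avoids hard-coding an offset such as len(list)-2, which would silently break if the number of default finders changed.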
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 2:15 PM, Eric Snow wrote:
> On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon wrote:
>> An open issue in PEP 302 is whether to require __loader__ attributes on
>> modules. The claimed worry is memory consumption, but considering importlib
>> and zipimport are already doing this that seems like a red herring.
>> Requiring it, though, opens the door to people relying on its existence and
>> thus starting to do things like loading assets with
>> ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
>> not care how modules are stored (e.g. zip file, sqlite database, etc.).
>>
>> What I would like to do is update the PEP to state that loaders are expected
>> to set __loader__. Now importlib will get updated to do that implicitly so
>> external code can expect it post-import, but requiring loaders to set it
>> would mean that code executed during import can rely on it as well.
>>
>> As for __package__, PEP 366 states that modules should set it but it isn't
>> referenced by PEP 302. What I want to do is add a reference and make it
>> required like __loader__. Importlib already sets it implicitly post-import,
>> but once again it would be nice to do this pre-import.
>>
>> To help facilitate both new requirements, I would update the
>> importlib.util.module_for_loader decorator to set both on a module that
>> doesn't have them before passing the module down to the decorated method.
>> That way people already using the decorator don't have to worry about
>> anything and it is one less detail to have to worry about. I would also
>> update the docs on importlib.util.set_package and importlib.util.set_loader
>> to suggest people use importlib.util.module_for_loader and only use the
>> other two decorators for backwards-compatibility.
>
> +1
Funny, I was just thinking about having a simple standard API that
will let you open files (and list directories) relative to a given
module or package regardless of how the thing is loaded. If we
guarantee that there's always a __loader__ that's a first step, though
I think we may need to do a little more to get people who currently do
things like open(os.path.join(os.path.dirname(__file__),
'some_file_name')) to switch. I was thinking of having a stdlib
function that you give a module/package object, a relative filename,
and optionally a mode ('b' or 't') and returns a stream -- and sibling
functions that return a string or bytes object (depending on what API
the user is using either the stream or the data can be more useful).
What would we call those functions and where would they live?
--
--Guido van Rossum (python.org/~guido)
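A rough sketch of the kind of helper floated in this message, built only on the PEP 302 get_data() loader method. All names here (open_resource, read_resource_bytes, DictLoader, the "virtual" paths) are hypothetical; the thread does not settle on an API, and the in-memory loader merely stands in for storage such as a zip file or a database.

```python
import io
import os
import types


def open_resource(module, relative_name, mode="b"):
    """Return a stream for a file stored alongside `module`.

    mode is 'b' for a binary stream or 't' for a text stream
    (UTF-8 is assumed for text in this sketch).
    """
    if mode not in ("b", "t"):
        raise ValueError("mode must be 'b' or 't'")
    base = os.path.dirname(module.__file__)
    data = module.__loader__.get_data(os.path.join(base, relative_name))
    stream = io.BytesIO(data)
    if mode == "t":
        return io.TextIOWrapper(stream, encoding="utf-8")
    return stream


def read_resource_bytes(module, relative_name):
    """Sibling helper returning the whole resource as bytes."""
    with open_resource(module, relative_name, "b") as stream:
        return stream.read()


class DictLoader:
    """Stand-in loader serving data from a dict instead of the file system."""

    def __init__(self, files):
        self._files = files

    def get_data(self, path):
        return self._files[path]


# Build a fake package object whose data lives only in memory.
package_dir = os.path.join("virtual", "demo_pkg")
demo_pkg = types.ModuleType("demo_pkg")
demo_pkg.__file__ = os.path.join(package_dir, "__init__.py")
demo_pkg.__loader__ = DictLoader(
    {os.path.join(package_dir, "greeting.txt"): b"hello\n"}
)

raw = read_resource_bytes(demo_pkg, "greeting.txt")
text = open_resource(demo_pkg, "greeting.txt", "t").read()
```

The caller never touches the file system directly, which is what lets the same code work however the package is stored, provided __loader__ is guaranteed to exist.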
Re: [Python-Dev] Require loaders set __package__ and __loader__
Am 15.04.2012 00:32, schrieb Guido van Rossum:
> Funny, I was just thinking about having a simple standard API that
> will let you open files (and list directories) relative to a given
> module or package regardless of how the thing is loaded. If we
> guarantee that there's always a __loader__ that's a first step, though
> I think we may need to do a little more to get people who currently do
> things like open(os.path.join(os.path.dirname(__file__),
> 'some_file_name')) to switch. I was thinking of having a stdlib
> function that you give a module/package object, a relative filename,
> and optionally a mode ('b' or 't') and returns a stream -- and sibling
> functions that return a string or bytes object (depending on what API
> the user is using either the stream or the data can be more useful).
> What would we call those functions and where would they live?
pkg_resources has a similar API [1] that supports dotted names.
pkg_resources also does some caching for files that aren't stored on a
local file system (database, ZIP file, you name it). It should be
trivial to support both dotted names and module instances.
Christian
[1]
http://packages.python.org/distribute/pkg_resources.html#resourcemanager-api
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 18:32, Guido van Rossum wrote:
> On Sat, Apr 14, 2012 at 2:15 PM, Eric Snow wrote:
> > On Sat, Apr 14, 2012 at 2:56 PM, Brett Cannon wrote:
> >> An open issue in PEP 302 is whether to require __loader__ attributes on
> >> modules. The claimed worry is memory consumption, but considering importlib
> >> and zipimport are already doing this that seems like a red herring.
> >> Requiring it, though, opens the door to people relying on its existence and
> >> thus starting to do things like loading assets with
> >> ``__loader__.get_data(path_to_internal_package_file)`` which allows code to
> >> not care how modules are stored (e.g. zip file, sqlite database, etc.).
> >>
> >> What I would like to do is update the PEP to state that loaders are expected
> >> to set __loader__. Now importlib will get updated to do that implicitly so
> >> external code can expect it post-import, but requiring loaders to set it
> >> would mean that code executed during import can rely on it as well.
> >>
> >> As for __package__, PEP 366 states that modules should set it but it isn't
> >> referenced by PEP 302. What I want to do is add a reference and make it
> >> required like __loader__. Importlib already sets it implicitly post-import,
> >> but once again it would be nice to do this pre-import.
> >>
> >> To help facilitate both new requirements, I would update the
> >> importlib.util.module_for_loader decorator to set both on a module that
> >> doesn't have them before passing the module down to the decorated method.
> >> That way people already using the decorator don't have to worry about
> >> anything and it is one less detail to have to worry about. I would also
> >> update the docs on importlib.util.set_package and importlib.util.set_loader
> >> to suggest people use importlib.util.module_for_loader and only use the
> >> other two decorators for backwards-compatibility.
> >
> > +1
>
> Funny, I was just thinking about having a simple standard API that
> will let you open files (and list directories) relative to a given
> module or package regardless of how the thing is loaded. If we
> guarantee that there's always a __loader__ that's a first step, though
> I think we may need to do a little more to get people who currently do
> things like open(os.path.join(os.path.dirname(__file__),
> 'some_file_name')) to switch. I was thinking of having a stdlib
> function that you give a module/package object, a relative filename,
> and optionally a mode ('b' or 't') and returns a stream -- and sibling
> functions that return a string or bytes object (depending on what API
> the user is using either the stream or the data can be more useful).
> What would we call those functions and where would they live?
IOW go one level lower than get_data() and return the stream and then just
have helper functions which I guess just exhaust the stream for you to
return bytes or str? Or are you thinking that somehow providing a function
that can get an explicit bytes or str object will be more optimized than
doing something with the stream? Either way you will need new methods on
loaders to make it work more efficiently since loaders only have get_data()
which returns bytes and not a stream object. Plus there is currently no API
for listing the contents of a directory.
As for what to call such functions, I really don't know since they are
essentially abstract functions above the OS which work on whatever storage
backend a module uses.
For where they should live, it depends if you are viewing this as more of a
file abstraction or something that ties into modules. For the former it
seems like shutil or something that dealt with higher order file
manipulation. If it's the latter I would say importlib.util.
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 3:50 PM, Brett Cannon wrote:
> On Sat, Apr 14, 2012 at 18:32, Guido van Rossum wrote:
>> Funny, I was just thinking about having a simple standard API that
>> will let you open files (and list directories) relative to a given
>> module or package regardless of how the thing is loaded. If we
>> guarantee that there's always a __loader__ that's a first step, though
>> I think we may need to do a little more to get people who currently do
>> things like open(os.path.join(os.path.dirname(__file__),
>> 'some_file_name')) to switch. I was thinking of having a stdlib
>> function that you give a module/package object, a relative filename,
>> and optionally a mode ('b' or 't') and returns a stream -- and sibling
>> functions that return a string or bytes object (depending on what API
>> the user is using either the stream or the data can be more useful).
>> What would we call those functions and where would they live?
> IOW go one level lower than get_data() and return the stream and then just
> have helper functions which I guess just exhaust the stream for you to
> return bytes or str? Or are you thinking that somehow providing a function
> that can get an explicit bytes or str object will be more optimized than
> doing something with the stream? Either way you will need new methods on
> loaders to make it work more efficiently since loaders only have get_data()
> which returns bytes and not a stream object. Plus there is currently no API
> for listing the contents of a directory.
Well, if it's a real file, and you need a stream, that's efficient,
and if you need the data, you can read it. But if it comes from a
loader, and you need a stream, you'd have to wrap it in a StringIO
instance. So having two APIs, one to get a stream, and one to get the
data, allows the implementation to be more optimal -- it would be bad
to wrap a StringIO instance around data only so you can read the data
from the stream again...
> As for what to call such functions, I really don't know since they are
> essentially abstract functions above the OS which work on whatever storage
> backend a module uses.
>
> For where they should live, it depends if you are viewing this as more of a
> file abstraction or something that ties into modules. For the former it
> seems like shutil or something that dealt with higher order file
> manipulation. If it's the latter I would say importlib.util.
If pkg_resources is in the stdlib that would be a fine place to put it.
--
--Guido van Rossum (python.org/~guido)
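A rough sketch of the two-API idea argued for in this message: a data function that returns bytes directly, and a stream function that opens the real file when there is one and only wraps loader data as a fallback. The names resource_bytes and resource_stream are hypothetical, the UTF-8 encoding for text mode is an assumption, and io.BytesIO is used for the wrapper because get_data() returns bytes.

```python
import io
import os


def _loader_and_path(module, relname):
    # Prefer __loader__ as discussed in the thread; fall back to the spec's loader.
    loader = getattr(module, "__loader__", None) or module.__spec__.loader
    path = os.path.join(os.path.dirname(module.__file__), relname)
    return loader, path


def resource_bytes(module, relname):
    """Data API: hand back the bytes directly, with no stream wrapper."""
    loader, path = _loader_and_path(module, relname)
    return loader.get_data(path)


def resource_stream(module, relname, mode="b"):
    """Stream API: use the real file when it exists, else wrap the loader's data."""
    loader, path = _loader_and_path(module, relname)
    if os.path.isfile(path):
        raw = open(path, "rb")
    else:
        raw = io.BytesIO(loader.get_data(path))
    if mode == "t":
        # Text mode decodes as UTF-8 (an assumption of this sketch).
        return io.TextIOWrapper(raw, encoding="utf-8")
    return raw
```

With separate entry points, a caller who only wants the data never pays for a stream wrapper, and a caller who wants a stream over a real file never has the whole file read into memory first.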
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 18:41, Christian Heimes wrote:
> Am 15.04.2012 00:32, schrieb Guido van Rossum:
> > Funny, I was just thinking about having a simple standard API that
> > will let you open files (and list directories) relative to a given
> > module or package regardless of how the thing is loaded. If we
> > guarantee that there's always a __loader__ that's a first step, though
> > I think we may need to do a little more to get people who currently do
> > things like open(os.path.join(os.path.dirname(__file__),
> > 'some_file_name')) to switch. I was thinking of having a stdlib
> > function that you give a module/package object, a relative filename,
> > and optionally a mode ('b' or 't') and returns a stream -- and sibling
> > functions that return a string or bytes object (depending on what API
> > the user is using either the stream or the data can be more useful).
> > What would we call those functions and where would they live?
>
> pkg_resources has a similar API [1] that supports dotted names.
> pkg_resources also does some caching for files that aren't stored on a
> local file system (database, ZIP file, you name it). It should be
> trivial to support both dotted names and module instances.
>
>
But that begs the question of whether this API should conflate module
hierarchies with file directories. Are we trying to support reading files
from within packages w/o caring about storage details but still
fundamentally working with files, or are we trying to abstract away the
concept of files and deal more with stored bytes inside packages? For the
former you would essentially want the root package and then simply specify
some file path. But for the latter you would want the module or package
that is next to or containing the data and grab it from there.
And I just realized that we would have to be quite clear that for namespace
packages it is what is in __file__ that people care about, else people
might expect some search to be performed on their behalf. Namespace
packages also dictate that you would want the module closest to the data in
the hierarchy to make sure you went down the right directory (e.g. if you
had the namespace package monty with modules spam and bacon but from
different directories, you really want to make sure you grab the right
module). I would argue that you can only go next to/within
modules/packages; going up would just cause confusion on where you were
grabbing from and going down could be done but makes things a little
messier.
-Brett
> Christian
>
> [1]
>
> http://packages.python.org/distribute/pkg_resources.html#resourcemanager-api
>
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 18:56, Guido van Rossum wrote:
> On Sat, Apr 14, 2012 at 3:50 PM, Brett Cannon wrote:
> > On Sat, Apr 14, 2012 at 18:32, Guido van Rossum wrote:
> >> Funny, I was just thinking about having a simple standard API that
> >> will let you open files (and list directories) relative to a given
> >> module or package regardless of how the thing is loaded. If we
> >> guarantee that there's always a __loader__ that's a first step, though
> >> I think we may need to do a little more to get people who currently do
> >> things like open(os.path.join(os.path.dirname(__file__),
> >> 'some_file_name')) to switch. I was thinking of having a stdlib
> >> function that you give a module/package object, a relative filename,
> >> and optionally a mode ('b' or 't') and returns a stream -- and sibling
> >> functions that return a string or bytes object (depending on what API
> >> the user is using either the stream or the data can be more useful).
> >> What would we call those functions and where would they live?
>
> > IOW go one level lower than get_data() and return the stream and then just
> > have helper functions which I guess just exhaust the stream for you to
> > return bytes or str? Or are you thinking that somehow providing a function
> > that can get an explicit bytes or str object will be more optimized than
> > doing something with the stream? Either way you will need new methods on
> > loaders to make it work more efficiently since loaders only have get_data()
> > which returns bytes and not a stream object. Plus there is currently no API
> > for listing the contents of a directory.
>
> Well, if it's a real file, and you need a stream, that's efficient,
> and if you need the data, you can read it. But if it comes from a
> loader, and you need a stream, you'd have to wrap it in a StringIO
> instance. So having two APIs, one to get a stream, and one to get the
> data, allows the implementation to be more optimal -- it would be bad
> to wrap a StringIO instance around data only so you can read the data
> from the stream again...
>
Right, so loaders would need to grow a stream-returning method, which is
fine and can be done in a backwards-compatible way using io.BytesIO and
io.StringIO.
>
> > As for what to call such functions, I really don't know since they are
> > essentially abstract functions above the OS which work on whatever storage
> > backend a module uses.
> >
> > For where they should live, it depends if you are viewing this as more of a
> > file abstraction or something that ties into modules. For the former it
> > seems like shutil or something that dealt with higher order file
> > manipulation. If it's the latter I would say importlib.util.
>
> if pkg_resources is in the stdlib that would be a fine place to put it.
>
It's not.
-Brett
>
> --
> --Guido van Rossum (python.org/~guido)
>
Re: [Python-Dev] making the import machinery explicit
On Sat, Apr 14, 2012 at 4:16 PM, Brett Cannon wrote:
> Once again, it's just code that needs updating to run on Python 3.3 so I
> don't view it as a concern. Going from list.append() to list.insert() (even
> if it's ``list.insert(len(list)-2, hook)``) is not exactly difficult.
I'm fine with that. It's not a big deal either way, especially with how
few people it directly impacts.
-eric
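A small illustration of the append-versus-insert point, using a plain list of made-up names rather than the real sys.path_hooks. That the last two entries are the default hooks is an assumption taken from the "len(list)-2" remark quoted above.

```python
# Stand-ins for sys.path_hooks entries; the last two play the role of defaults.
hooks = ["existing_hook", "zip_hook", "default_hook"]
new_hook = "my_hook"

# list.insert(index, item): the index comes first, the object second.
# Placing the hook before the final two entries keeps the defaults last.
hooks.insert(len(hooks) - 2, new_hook)
```

After the call the list reads existing_hook, my_hook, zip_hook, default_hook, whereas append() would have put my_hook behind the defaults.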
Re: [Python-Dev] Require loaders set __package__ and __loader__
Am 15.04.2012 00:56, schrieb Guido van Rossum:
> Well, if it's a real file, and you need a stream, that's efficient,
> and if you need the data, you can read it. But if it comes from a
> loader, and you need a stream, you'd have to wrap it in a StringIO
> instance. So having two APIs, one to get a stream, and one to get the
> data, allows the implementation to be more optimal -- it would be bad
> to wrap a StringIO instance around data only so you can read the data
> from the stream again...
We need a third way to access a file. The two methods get_data() and
get_stream() aren't sufficient for libraries that need a real file that
lives on the file system. In order to have real files the loader (or
some other abstraction layer) needs to create a temporary directory for
the current process and clean it up when the process ends. The file is
saved to the temporary directory the first time it's accessed.
The get_file() feature has a neat benefit. Since it transparently
extracts files from the loader, users can ship binary extensions and
shared libraries (dlls) in a ZIP file and use them without too much hassle.
Christian
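One way the get_file() idea described here could look, as a sketch only: a per-process temporary directory is created on first use, each resource is extracted the first time it is requested, and the directory is removed at interpreter exit. The function name comes from the message; the signature, the cache, and the file naming scheme are assumptions of this example.

```python
import atexit
import os
import shutil
import tempfile

_extract_dir = None   # created lazily, once per process
_extracted = {}       # resource path -> extracted filesystem path


def get_file(loader, path):
    """Hypothetical: return a real filesystem path for data held by a loader."""
    global _extract_dir
    if path in _extracted:
        return _extracted[path]
    if _extract_dir is None:
        _extract_dir = tempfile.mkdtemp(prefix="pyres-")
        # Clean up the whole directory when the process ends.
        atexit.register(shutil.rmtree, _extract_dir, ignore_errors=True)
    # Prefix with a counter so equal basenames from different paths do not clash.
    target = os.path.join(
        _extract_dir, "%d-%s" % (len(_extracted), os.path.basename(path))
    )
    with open(target, "wb") as f:
        f.write(loader.get_data(path))
    _extracted[path] = target
    return target
```

A caller would pass the returned path to any library that insists on a filename; a second request for the same resource reuses the extracted copy instead of calling get_data() again.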
Re: [Python-Dev] Require loaders set __package__ and __loader__
On Sat, Apr 14, 2012 at 5:06 PM, Christian Heimes wrote:
> Am 15.04.2012 00:56, schrieb Guido van Rossum:
>> Well, if it's a real file, and you need a stream, that's efficient,
>> and if you need the data, you can read it. But if it comes from a
>> loader, and you need a stream, you'd have to wrap it in a StringIO
>> instance. So having two APIs, one to get a stream, and one to get the
>> data, allows the implementation to be more optimal -- it would be bad
>> to wrap a StringIO instance around data only so you can read the data
>> from the stream again...
>
> We need a third way to access a file. The two methods get_data() and
> get_stream() aren't sufficient for libraries that need a real file that
> lives on the file system. In order to have real files the loader (or
> some other abstraction layer) needs to create a temporary directory for
> the current process and clean it up when the process ends. The file is
> saved to the temporary directory the first time it's accessed.
Hm... Can you give an example of a library that needs a real file?
That sounds like a poorly designed API.
Perhaps you're talking about APIs that take a filename instead of a
stream? Maybe for those it would be best to start getting serious
about a virtual filesystem... (Sorry, probably python-ideas stuff).
> The get_file() feature has a neat benefit. Since it transparently
> extracts files from the loader, users can ship binary extensions and
> shared libraries (dlls) in a ZIP file and use them without too much hassle.
Yeah, DLLs are about the only example I can think of where even a
virtual filesystem doesn't help...
--
--Guido van Rossum (python.org/~guido)
Re: [Python-Dev] making the import machinery explicit
+1! Thanks for pushing this.
On Apr 15, 2012 4:04 AM, "Brett Cannon" wrote:
> To start off, what I am about to propose was brought up at the PyCon
> language summit and the whole room agreed with what I want to do here, so I
> honestly don't expect much of an argument (famous last words).
>
> In the "ancient" import.c days, a lot of import's stuff was hidden deep in
> the C code and in no way exposed to the user. But with importlib finishing
> PEP 302's phase 2 plans of getting import to be properly refactored to use
> importers, path hooks, etc., this need no longer be the case.
>
> So what I propose to do is stop having import have any kind of implicit
> machinery. This means sys.meta_path gets a path finder that does the heavy
> lifting for import and sys.path_hooks gets a hook which provides a default
> finder. As of right now those two pieces of machinery are entirely implicit
> in importlib and can't be modified, stopped, etc.
>
> If this happens, what changes? First, more of importlib will get publicly
> exposed (e.g. the meta path finder would become public instead of private
> like it is along with everything else that is publicly exposed). Second,
> import itself technically becomes much simpler since it really then is
> about resolving module names, traversing sys.meta_path, and then handling
> fromlist w/ everything else coming from how the path finder and path hook
> work.
>
> What also changes is that sys.meta_path and sys.path_hooks cannot be
> blindly reset w/o blowing out import. I doubt anyone is even touching those
> attributes in the common case, and the few that do can easily just stop
> wiping out those two lists. If people really care we can do a warning in
> 3.3 if they are found to be empty and then fall back to old semantics, but
> I honestly don't see this being an issue as backwards-compatibility would
> just require being more careful of what you delete (which I have been
> warning people to do for years now) which is a minor code change which
> falls in line with what goes along with any new Python version.
>
> And lastly, sticking None in sys.path_importer_cache would no longer mean
> "do the implicit thing" and instead would mean the same as NullImporter
> does now (which also means import can put None into sys.path_importer_cache
> instead of NullImporter): no finder is available for an entry on sys.path
> when None is found. Once again, I don't see anyone explicitly sticking None
> into sys.path_importer_cache, and if they are they can easily stick what
> will be the newly exposed finder in there instead. The more common case
> would be people wiping out all entries of NullImporter so as to have a new
> sys.path_hooks entry take effect. That code would instead need to clear out
> None on top of NullImporter as well (in Python 3.2 and earlier this would
> just be a performance loss, not a semantic change). So this too could
> change in Python 3.3 as long as people update their code like they do with
> any other new version of Python.
>
> In summary, I want no more magic "behind the curtain" for Python 3.3 and
> import: sys.meta_path and sys.path_hooks contain what they should and if
> they are emptied then imports will fail and None in sys.path_importer_cache
> means "no finder" instead of "use magical, implicit stuff".
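To make the proposal above a little more concrete, a short sketch that looks at the explicit machinery and shows the kind of cache cleanup the message describes. The helper drop_negative_cache_entries is a made-up name, and it is run on a copy of sys.path_importer_cache so the running interpreter is left alone; what sys.meta_path and sys.path_hooks actually contain depends on the Python version.

```python
import sys


def drop_negative_cache_entries(cache):
    """Remove entries whose value is None ("no finder") from a cache mapping.

    Returns the list of removed keys so the caller can see what was cleared.
    """
    stale = [path for path, finder in cache.items() if finder is None]
    for path in stale:
        del cache[path]
    return stale


# Inspect the machinery that the proposal wants to make explicit.
meta_finders = [getattr(f, "__name__", repr(f)) for f in sys.meta_path]
hook_count = len(sys.path_hooks)

# Work on a copy; clearing the real cache is what code adding a new
# path hook would do so that the hook gets consulted for those entries.
snapshot = dict(sys.path_importer_cache)
removed = drop_negative_cache_entries(snapshot)
```

Under the semantics proposed in the message, emptying sys.meta_path or sys.path_hooks would make imports fail, which is why the sketch only reads those lists.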
Re: [Python-Dev] [Python-checkins] Daily reference leaks (556b9bafdee8): sum=1144
I'm going to guess my bootstrap patch caused most of these. =) test_capi is
now plugged, so I'm going to assume Python/pythonrun.c:import_init() is taken
care of. The real question is where in
http://hg.python.org/cpython/rev/2dd046be2c88 are the other leaks coming
from. Any help would be great as I have been staring at this code for so
long I really don't want to have to go hunting for refleaks right now.
On Sat, Apr 14, 2012 at 23:43, wrote:
> results for 556b9bafdee8 on branch "default"
>
> test_support leaked [-2, 2, 0] references, sum=0
> test_bz2 leaked [-1, -1, -1] references, sum=-3
> test_capi leaked [78, 78, 78] references, sum=234
> test_concurrent_futures leaked [120, 120, 120] references, sum=360
> test_hashlib leaked [-1, -1, -1] references, sum=-3
> test_import leaked [4, 4, 4] references, sum=12
> test_lib2to3 leaked [14, 14, 14] references, sum=42
> test_multiprocessing leaked [149, 149, 150] references, sum=448
> test_runpy leaked [18, 18, 18] references, sum=54
>
> Command line was: ['./python', '-m', 'test.regrtest', '-uall', '-R',
> '3:3:/home/antoine/cpython/refleaks/reflogBFPz19', '-x']
