Re: [Python-Dev] PEP 3147: PYC Repository Directories
> 3. In each top level directory on sys.path, shadow file hierarchy
>    Major Pro: trivial to separate out all cached files
>    Major Con: ??? (I got nuthin')

The major con of this option (and option 2) is an ambiguity about where
to look in the case of packages. In particular for namespace packages
(of the setuptools kind, or the PEP 382 kind), the directory where a
package is found on sys.path can change across Python runs. So when you
run Python several times, and install additional eggs in between, you
get different directories all caching the same pyc files. If you then
uninstall some of the eggs, it may be difficult to find out which pyc
files to delete.

> Note that with option two, creating a bytecode-only zipfile would be
> trivial: just add the __pycache__ directory as the top-level directory
> in the zipfile and leave out everything else (assuming there were no
> data files in the package that were still needed).

I think any scheme that uses directories for pyc files will cause stale
pyc files to accumulate on disk. I therefore think it is important to
never use these in imports automatically - i.e. only ever consider a
file in a __pycache__ directory if you found a .py file earlier. If
that is the policy, then a __pycache__ directory in a zipfile would
have no effect (and rightly so). Instead, to run code from bytecode,
the bytecode files should be on sys.path themselves (probably still
named the same way as they are named inside __pycache__).

Regards,
Martin
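A minimal sketch of the lookup policy described above - treating
__pycache__ strictly as a cache keyed off an existing .py file, so a
stale entry can never satisfy an import on its own. The helper name and
the cache-tag spelling are illustrative assumptions, not the PEP's
final API:

import os

PYC_TAG = "cpython-32"  # illustrative cache tag, not the PEP's final spelling

def cached_bytecode_for(source_path):
    """Return the __pycache__ .pyc for source_path, or None.

    The cache is only consulted when the .py source itself exists,
    so a stale __pycache__ entry is never imported by itself.
    """
    if not os.path.exists(source_path):
        return None  # no source, no import: never trust the cache alone
    directory, filename = os.path.split(source_path)
    base = os.path.splitext(filename)[0]
    pyc = os.path.join(directory, "__pycache__", "%s.%s.pyc" % (base, PYC_TAG))
    return pyc if os.path.exists(pyc) else None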
Re: [Python-Dev] subprocess docs patch
> Any help you could provide would be appreciated.
Please use unified diffs in the future.
I'm -0 on this patch; it still has the negative, cautionary-patronizing
tone ("Do not", "can be tricky", "be mindful"), as if readers are unable
to grasp the description that they just read (and indeed, in the patch,
you claim that readers *are* unable to understand how command lines
work). I know that Raymond Hettinger is very much opposed to this style
of documentation. So I, whose native language is not English, cannot
approve/apply this patch. Try convincing Raymond.
Regards,
Martin
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Hanno Schlichting wrote:
> +1 for a single strategy that is used in all cases. The current
> solution could be phased out across multiple releases, but in the end
> there should be a single approach and no flag. Otherwise some code and
> tools will only support one of the approaches, especially if this is
> seen as something "only a minority of Linux distributions uses".

-1. As far as I can tell, this PEP proposes to solve a specific problem
that Linux distributions have. As they have decent package managers,
this PEP makes their maintainers' lives a lot easier. If implemented, I
believe it would eventually be used by all of them, not just "a
minority". For just about anyone else, I believe the current situation
works perfectly fine and should not be changed.

Personally, I work mainly on Windows, and things I install are placed
in the site-packages directory of the Python version I use. There is no
need to place .pyc files in subdirectories there, as there will only
ever be one Python version involved. Programs I write myself are also
rarely, if ever, run by multiple Python versions. They get run by the
default Python on my system; if I change the default, the .pyc files
get overwritten, which is exactly what I want: I no longer need the old
ones.

As to the issue of a single cache directory per directory versus one
per .py file: a subdirectory per .py file is easier to manipulate
manually; seeing the .py file listed next to the subdirectory
containing its compiled versions makes it somewhat easier to avoid
errors such as deleting the source but not the compiled version.
However, as the use case for this PEP seems to be making life easier
for Linux packagers, a single __pycache__ subdirectory (or whatever the
name would be) seems preferable: less filesystem clutter, and no risk
of forgetting to delete .pyc files, as this is about system-managed
Python source.

Gertjan.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
> Would you still be a -1 on making the new scheme the default if it
> used a single cache directory instead? That would actually be cleaner
> than the current solution rather than messier.

Well, I guess not, although additional directories are always more
intrusive than additional files (visually, or with tools such as "du",
for example).
Re: [Python-Dev] Improved Traceback Module
On Fri, Jan 29, 2010 at 08:02:58PM -0500, P.J. Eby wrote:
> At 01:24 AM 1/30/2010 +0100, Ludvig Ericson wrote:
>> On 28 jan 2010, at 22:47, P.J. Eby wrote:
>>> At 07:47 PM 1/28/2010 +0100, Benjamin Schweizer wrote:
>>>> I like the idea of configuring the list of variables with using a
>>>> convention like __trace__, though this requires me to specify what
>>>> variables cause an exception and that would be hard?
>>>
>>> The idea is that you simply use __trace__ to format the local
>>> variables for that frame. In other words, the programmer specifies
>>> what they want to see at that point in the traceback, by setting a
>>> variable in their code. If there's a __trace__ variable in the frame
>>> locals at that point in the traceback, simply format the traceback
>>> locals+globals using that string's .safe_format() method.
>>
>> This seems very naïve to me. It makes the code ten times as hard to
>> maintain; adding or removing a *local variable* suddenly means you
>> have to check what __trace__ or whatever refers to.
>
> You lost me; the purpose of a __trace__ is to show *context* -- the
> primary *purpose* of the code at that point in the call stack. You
> don't put every variable in it, as that would go directly against the
> purpose.
>
> That is, it's intended to *reduce* information overload, not increase
> it. If you're putting more than 2 variables in a given __trace__
> you're probably doing something wrong.

OK, so how do we get things done? Is there already an implementation of
__trace__ around? I've not yet looked at Kristján's traceback2.py;
maybe it's superior.

I've tried to implement the variable dump feature with few changes,
keeping the original structure of traceback.py as it was. It's finished
and compatible, but it requires some testing. From here, I think we
could:

1) keep traceback.py as it was
2) add my non-invasive patch
3) further improve traceback.py to implement __trace__
4) switch to Kristján's traceback2.py
5) improve Kristján's traceback2.py even further

Greetings

-- http://benjamin-schweizer.de/contact
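A rough sketch of how a traceback formatter might honor a __trace__
frame variable as discussed in this thread. The .safe_format() method
P.J. Eby mentions does not exist; it is approximated here with a
string.Formatter subclass that tolerates missing names (all names below
are illustrative, not part of any patch):

import string

class _SafeFormatter(string.Formatter):
    """Substitute a placeholder for names missing from the namespace,
    approximating the proposed (nonexistent) str.safe_format()."""
    def get_value(self, key, args, kwargs):
        try:
            return kwargs[key]
        except KeyError:
            return "<missing: %s>" % (key,)

def format_frame_context(frame):
    """If the frame defines __trace__, format it against locals+globals."""
    template = frame.f_locals.get("__trace__")
    if template is None:
        return None
    namespace = dict(frame.f_globals)
    namespace.update(frame.f_locals)
    try:
        return _SafeFormatter().vformat(template, (), namespace)
    except Exception as e:
        return "<unformattable __trace__: %s>" % e

# The programmer opts in per frame, naming only the context that matters:
#     def handle_request(path, user):
#         __trace__ = "handling {path} for {user}"
#         ...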
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Raymond Hettinger wrote:
>
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
>
> +1

+1 from here as well.

>> It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).

+1 on the idea of having a standard for Python module cache files.

+1 on having those files in the same directory as the associated module
file, just like we already do.

-1 on the idea of using directories for these. This only complicates
cleanup, management and distribution of such files. Perhaps we could
make this an option, though.

Thanks,
--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
M.-A. Lemburg wrote:
> Collin Winter wrote:
>> I added startup benchmarks for Mercurial and Bazaar yesterday
>> (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
>> can use them as more macro-ish benchmarks, rather than merely starting
>> the CPython binary over and over again. If you have ideas for better
>> Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
>> hg_startup and bzr_startup benchmarks should give us some more data
>> points for measuring improvements in startup time.
>>
>> One idea we had for improving startup time for apps like Mercurial was
>> to allow the creation of hermetic Python "binaries", with all
>> necessary modules preloaded. This would be something like Smalltalk
>> images. We haven't yet really fleshed out this idea, though.
>
> In Python you can do the same with the freeze.py utility. See
>
> http://www.egenix.com/www2002/python/mxCGIPython.html
>
> for an old project where we basically put the Python interpreter and
> stdlib into a single executable.
>
> We've recently revisited that project and created something we call
> "pyrun". It fits Python 2.5 into a single executable and a set of
> shared modules (which for various reasons cannot be linked
> statically)... 12MB in total.
>
> If you load lots of modules from the stdlib, this does provide a
> significant improvement over standard Python.
>
> Back to the PEP's proposal:
>
> Looking at the data you currently have, the negative results currently
> don't really look good in the light of the small performance
> improvements.
>
> Wouldn't it be possible to have the compiler approach work in three
> phases in order to reduce the memory footprint and startup time hit,
> i.e.:
>
> 1. run an instrumented Python interpreter to collect all the needed
>    compiler information; write this information into a .pys file
>    (Python stats)
>
> 2. create compiled versions of the code for various often-used code
>    paths and type combinations by reading the .pys file and generating
>    an .so file as a regular Python extension module
>
> 3. run an uninstrumented Python interpreter and let it use the .so
>    files instead of the .py ones
>
> In production, you'd then only use step 3 and avoid the overhead of
> steps 1 and 2.
>
> Moreover, the .so file approach would only load into memory the code
> for code paths and type combinations actually used in a particular run
> of the Python code, and would allow multiple Python processes to share
> it.
>
> As a side effect, you'd probably also avoid the need to have C++ code
> in the production Python runtime - that is, unless LLVM requires some
> kind of runtime support which is written in C++.

BTW: Some years ago we discussed the idea of pluggable VMs for Python.
Wouldn't U-S be a good motivation to revisit this idea?

We could then have a VM based on byte code using a stack machine, one
based on word code using a register machine, and perhaps one that uses
the Stackless approach.

Each VM type could use the PEP 3147 approach to store auxiliary files
holding byte code, word code or machine-compiled code.

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey MA,

On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg wrote:
> BTW: Some years ago we discussed the idea of pluggable VMs for
> Python. Wouldn't U-S be a good motivation to revisit this idea?
>
> We could then have a VM based on byte code using a stack machine, one
> based on word code using a register machine, and perhaps one that uses
> the Stackless approach.

What is the use case for having pluggable VMs? Is the idea that, at
runtime, the user would select which virtual machine they want to run
their code under? How would the user make that determination
intelligently?

I think this idea underestimates a) how deeply the current CPython VM
is intertwined with the rest of the implementation, and b) the nature
of the changes required by these separate VMs. For example, Unladen
Swallow adds fields to the C-level structs for dicts, code objects and
frame objects; how would those changes be pluggable? Stackless requires
so many modifications that it is effectively a fork; how would those
changes be pluggable?

Collin Winter
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 11:16, Guido van Rossum wrote:
> Whoa. This thread already exploded. I'm picking this message to
> respond to because it reflects my own view after reading the PEP.
>
> On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting wrote:
>> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross wrote:
>>> I don't know whether I'm in favour of using a single pyr folder or
>>> not, but if a single folder is used I'd definitely prefer the folder
>>> to be called __pyr__ rather than .pyr.
>
> Exactly what I would prefer. I worry that having many small
> directories is a fairly poor use of the filesystem. (A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
> only 57 directories.)
>
>> Do you have any specific reason for that?
>>
>> Using the leading dot notation is an established pattern to hide
>> non-essential information from directory views. What makes this
>> non-applicable in this situation and a custom Python notation better?
>
> Because we don't want to completely hide the pyc files. Also the dot
> naming convention is somewhat platform-specific.
>
> FWIW in Python 3, the __file__ variable always points to the .py
> source filename. I agreed with Georg that there ought to be an API for
> finding the pyc file for a module. This could be a small addition to
> the PEP.

Importlib somewhat does this already through a module's loader:
http://docs.python.org/py3k/library/importlib.html#importlib.abc.PyPycLoader.bytecode_path
If you want to work off of module names this is enough; if importlib
did the import then you can do __loader__.bytecode_path(__name__). And
if it has not been loaded yet, then that simply requires me exposing an
importlib.find_module() that returns a loader for the module. The trick
comes down to when you want it based on __file__ instead of the module
name.

Oh, and me finally breaking up import so that it has proper loaders, or
bootstrapping importlib; small snag. =) But at least the code already
exists for this stuff.

-Brett
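A small sketch of the loader-based lookup Brett describes, under the
assumption that the module was imported by an importlib loader that
implements bytecode_path() (the helper name is illustrative):

def cached_bytecode_path(module):
    """Best-effort lookup of the .pyc file backing an imported module."""
    loader = getattr(module, "__loader__", None)
    if loader is None or not hasattr(loader, "bytecode_path"):
        return None  # built-in import, or a loader without bytecode_path()
    return loader.bytecode_path(module.__name__)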
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Collin Winter wrote:
> Hey MA,
>
> On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg wrote:
>> BTW: Some years ago we discussed the idea of pluggable VMs for
>> Python. Wouldn't U-S be a good motivation to revisit this idea?
>>
>> We could then have a VM based on byte code using a stack machine, one
>> based on word code using a register machine, and perhaps one that
>> uses the Stackless approach.
>
> What is the use case for having pluggable VMs? Is the idea that, at
> runtime, the user would select which virtual machine they want to run
> their code under? How would the user make that determination
> intelligently?

The idea back then (IIRC) was to have a compile-time option to select
one of a few available VMs, in order to more easily experiment with new
or optimized implementations such as e.g. a register-based VM. It
should even be possible to factor out the VM into a DLL/SO which is
then selected and loaded via a command line option.

> I think this idea underestimates a) how deeply the current CPython VM
> is intertwined with the rest of the implementation, and b) the nature
> of the changes required by these separate VMs. For example, Unladen
> Swallow adds fields to the C-level structs for dicts, code objects and
> frame objects; how would those changes be pluggable? Stackless
> requires so many modifications that it is effectively a fork; how
> would those changes be pluggable?

They wouldn't be pluggable. Such changes would have to be made in a
more general way in order to serve more than just one VM.

Getting this right would certainly require a major effort, but it would
also reduce the need to have several branches of C-based Python
implementations.

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 11:04, Raymond Hettinger wrote:
>
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
>
> +1
>
>> It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).
>
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
>
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted. And it may allow faster
> start-up times when the zipfile is in the disk cache.

It also eliminates stat calls. I have not seen anyone mention this, but
on filesystems where stat calls are expensive (e.g. NFS), this is going
to increase import cost (and thus startup time, which some people are
already incredibly paranoid about). You are now going to shift from a
single stat call to check for a bytecode file to two just in the search
phase *per file check* (remember you need to search for module.py and
module/__init__.py). And then you get to repeat all of this during the
load process (potentially, depending on how aggressive the loader is
with caching).

As others have said, an uncompressed zip file could work here. Or even
a file format where the first 4 bytes are the timestamp and after that
come chunks of length-of-bytecode|magic|bytecode. That allows for
opening the file in append mode to add more bytecode, instead of a
zipfile's requirement of rewriting the TOC at the end of the file every
time you mutate the file (if I remember the zip file format correctly).
The biggest cost in this simple approach would be reading the file in
(unless you mmap the thing when possible), since once read the code
will be a bytes object, which means constant-time indexing until you
find the right magic number. And adding support to differentiate
between -O bytecode is simply a matter of adding a marker per chunk of
bytecode.

And I disagree that this would be difficult, as the PEP suggests, given
the proper file format. For zip files, zipimport already has the read
code in C; it would just require the code to write to a zip file. And
as for the format I mentioned above, that's dead simple to implement.

-Brett
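A minimal sketch of the chunked container format sketched above: a
4-byte timestamp header followed by length|magic|bytecode records,
appendable without rewriting any table of contents. The struct layout
and helper names are illustrative assumptions, not a proposed spec:

import struct

HEADER = struct.Struct("<I")         # 4-byte source timestamp
CHUNK_HEADER = struct.Struct("<II")  # length of bytecode, magic number

def append_bytecode(path, magic, bytecode):
    """Append one magic-tagged chunk; assumes the timestamp header
    was already written when the file was created."""
    with open(path, "ab") as f:
        f.write(CHUNK_HEADER.pack(len(bytecode), magic))
        f.write(bytecode)

def find_bytecode(path, wanted_magic):
    """Scan the chunks for one matching the interpreter's magic number."""
    with open(path, "rb") as f:
        data = f.read()
    (timestamp,) = HEADER.unpack_from(data, 0)
    offset = HEADER.size
    while offset < len(data):
        length, magic = CHUNK_HEADER.unpack_from(data, offset)
        offset += CHUNK_HEADER.size
        if magic == wanted_magic:
            return timestamp, data[offset:offset + length]
        offset += length  # skip a chunk for some other interpreter version
    return timestamp, None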
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Mon, Feb 1, 2010 at 11:17 AM, M.-A. Lemburg wrote:
> Collin Winter wrote:
>> I think this idea underestimates a) how deeply the current CPython VM
>> is intertwined with the rest of the implementation, and b) the nature
>> of the changes required by these separate VMs. For example, Unladen
>> Swallow adds fields to the C-level structs for dicts, code objects
>> and frame objects; how would those changes be pluggable? Stackless
>> requires so many modifications that it is effectively a fork; how
>> would those changes be pluggable?
>
> They wouldn't be pluggable. Such changes would have to be made
> in a more general way in order to serve more than just one VM.

I believe these VMs would have little overlap. I cannot imagine that
Unladen Swallow's needs have much in common with Stackless's, or with
those of a hypothetical register machine to replace the current stack
machine.

Let's consider that last example in more detail: a register machine
would require completely different bytecode. This would require
replacing the bytecode compiler, the peephole optimizer, and the
bytecode eval loop. The frame object would need to be changed to hold
the registers and a new blockstack design; the code object would
potentially have to hold a new bytecode layout.

I suppose making all this pluggable would be possible, but I don't see
the point. This kind of experimentation is ideal for a branch: go off,
test your idea, report your findings, merge back. Let the branch be
long-lived, if need be. The Mercurial migration will make all this
easier.

> Getting this right would certainly require a major effort, but it
> would also reduce the need to have several branches of C-based
> Python implementations.

If such a restrictive plugin-based scheme had been available when we
began Unladen Swallow, I do not doubt that we would have ignored it
entirely. I do not like the idea of artificially tying the hands of
people trying to make CPython faster. I do not see any part of Unladen
Swallow that would have been made easier by such a scheme. If anything,
it would have made our project more difficult.

Collin Winter
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Mon, 01 Feb 2010 11:35:19 -0800, Brett Cannon wrote:
>
> As others have said, an uncompressed zip file could work here. Or even
> a file format where the first 4 bytes are the timestamp and after that
> come chunks of length-of-bytecode|magic|bytecode. That allows for
> opening the file in append mode to add more bytecode, instead of a
> zipfile's requirement of rewriting the TOC at the end of the file
> every time you mutate the file (if I remember the zip file format
> correctly).

Making the file append-only doesn't eliminate the problems with
concurrent modification. You still have to specify and implement a
robust cross-platform file locking system which will have to be shared
by all implementations. This is really a great deal of complication to
add to the interpreter(s). And, besides, it might not even work on NFS,
which was the motivation for your proposal :)
Re: [Python-Dev] Forking and Multithreading - enemy brothers
So, if a patch were proposed for multiprocessing, allowing a unified,
thread-safe "spawnl" semantic, do you think anything would prevent its
integration?

We may ignore the subprocess module, since fork+exec shouldn't be
bothered by the (potentially disastrous) state of child process data.
But it bothers me to think that multithreading and multiprocessing are
currently at odds, when theoretically nothing justifies it...

Regards,
Pascal
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On 2/1/2010 1:32 PM, Collin Winter wrote:
> Hey MA,
>
> On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg wrote:
>> BTW: Some years ago we discussed the idea of pluggable VMs for
>> Python. Wouldn't U-S be a good motivation to revisit this idea?
>>
>> We could then have a VM based on byte code using a stack machine, one
>> based on word code using a register machine, and perhaps one that
>> uses the Stackless approach.
>
> What is the use case for having pluggable VMs?

Running an application full time on multiple machines.

> Is the idea that, at runtime, the user would select which virtual
> machine they want to run their code under?

From your comments below, I would presume the selection should be at
startup, from the command line, before building Python objects.

> How would the user make that determination intelligently?

The same way people would determine whether to select JIT or not -- by
testing space and time performance for their app, with test time
proportioned to expected run time.

Terry Jan Reedy
Re: [Python-Dev] PEP 3147: PYC Repository Directories
> And I disagree that this would be difficult, as the PEP suggests,
> given the proper file format. For zip files, zipimport already has the
> read code in C; it would just require the code to write to a zip file.
> And as for the format I mentioned above, that's dead simple to
> implement.

How do you write to a zipfile while others are reading it?

Regards,
Martin
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 3:09 PM, Pascal Chambon wrote:
>
> So, if a patch were proposed for multiprocessing, allowing a unified,
> thread-safe "spawnl" semantic, do you think anything would prevent its
> integration?
>
> We may ignore the subprocess module, since fork+exec shouldn't be
> bothered by the (potentially disastrous) state of child process data.
> But it bothers me to think that multithreading and multiprocessing are
> currently at odds, when theoretically nothing justifies it...
>
> Regards,
> Pascal

I don't see the need for the change from fork as of yet (for
multiprocessing) and I am leery to change the internal implementation
and semantics right now, or anytime soon. I'd be interested in seeing
the patch, but if the concern is that global threading objects could be
left in the state that they're in at the time of the fork(), I think
people know that, or we can easily document this fact.

jesse
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis" wrote:
>> And I disagree that this would be difficult, as the PEP suggests,
>> given the proper file format. For zip files, zipimport already has
>> the read code in C; it would just require the code to write to a zip
>> file. And as for the format I mentioned above, that's dead simple to
>> implement.
>
> How do you write to a zipfile while others are reading it?

By hating concurrency (i.e. I don't have an answer, which kills my
idea).

-Brett
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Jesse Noller writes:
>
> I don't see the need for the change from fork as of yet (for
> multiprocessing) and I am leery to change the internal implementation
> and semantics right now, or anytime soon. I'd be interested in seeing
> the patch, but if the concern is that global threading objects could
> be left in the state that they're in at the time of the fork(), I
> think people know that, or we can easily document this fact.

If Pascal provides a patch I think it would really be good to consider
it. Not being able to mix threads and multiprocessing is a potentially
annoying wart. It means you must choose one or the other from the
start, and once you've made this decision you are stuck with it.
(Not to mention that libraries can use threads and locks behind your
back.)

Regards

Antoine.
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 4:32 PM, Antoine Pitrou wrote:
> Jesse Noller writes:
>>
>> I don't see the need for the change from fork as of yet (for
>> multiprocessing) and I am leery to change the internal implementation
>> and semantics right now, or anytime soon. I'd be interested in seeing
>> the patch, but if the concern is that global threading objects could
>> be left in the state that they're in at the time of the fork(), I
>> think people know that, or we can easily document this fact.
>
> If Pascal provides a patch I think it would really be good to consider
> it. Not being able to mix threads and multiprocessing is a potentially
> annoying wart. It means you must choose one or the other from the
> start, and once you've made this decision you are stuck with it.
> (Not to mention that libraries can use threads and locks behind your
> back.)

I don't see spawnl as a viable alternative to fork. I imagine that I,
and others, successfully mix threads and multiprocessing on non-win32
platforms just fine, knowing of course that fork() can cause heartburn
if you have global locks that code within the forked processes might
depend on. In fact, multiprocessing uses threading internally.

For the win32 implementation, given that win32 doesn't implement
fork(), see
http://svn.python.org/view/python/trunk/Lib/multiprocessing/forking.py?view=markup,
about halfway down. We already have an implementation that spawns a
subprocess and then pushes the required state to the child. The
fundamental need for things to be pickleable *all the time* kinda makes
it annoying to work with.

jesse
Re: [Python-Dev] PEP 3147: PYC Repository Directories
> On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis" wrote:
>> How do you write to a zipfile while others are reading it?

On Mon, Feb 1, 2010 at 1:23 PM, Brett Cannon wrote:
> By hating concurrency (i.e. I don't have an answer, which kills my
> idea).

The python I use (win32 2.6.2) does not complain if it cannot read from
or write to a .pyc, and thus it handles multiple python processes
trying to create .pyc files at the same time. Is the .zip case really
any different? Since .pyc files are an optimization, it seems natural
and correct that .pyc IO errors pass silently (apologies to Tim).

It's an interesting challenge to write the file in such a way that it's
safe for a reader and writer to co-exist. Like Brett, I considered an
append-only scheme, but one needs to handle the case where the bytecode
for a particular magic number changes. At some point you'd need to
sweep garbage from the file. All solutions seem unnecessarily complex,
and unnecessary, since in practice the case should not come up.

paul
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Jesse Noller writes:
>
> I don't see spawnl as a viable alternative to fork. I imagine that I,
> and others, successfully mix threads and multiprocessing on non-win32
> platforms just fine, knowing of course that fork() can cause heartburn
> if you have global locks that code within the forked processes might
> depend on. In fact, multiprocessing uses threading internally.

Sure. But sometimes global locks are out of your control. They can be
e.g. in a library (Pascal gave the example of the logging module; we
may have more of them in the stdlib). It's certainly much more
practical to give a choice in multiprocessing than to ask all library
writers to eliminate global locks.

> We already have an implementation that spawns a subprocess and then
> pushes the required state to the child.

I know, and that's why it wouldn't be a very controversial change to
add an option to enable this path under Linux, would it?

> The fundamental need for things to be pickleable *all the time* kinda
> makes it annoying to work with.

Oh, sure. But if you want to be Windows-compatible you'd better comply
with this requirement anyway.

Regards

Antoine.
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 5:08 PM, Antoine Pitrou wrote:
>> I don't see spawnl as a viable alternative to fork. I imagine that I,
>> and others, successfully mix threads and multiprocessing on non-win32
>> platforms just fine, knowing of course that fork() can cause
>> heartburn if you have global locks that code within the forked
>> processes might depend on. In fact, multiprocessing uses threading
>> internally.
>
> Sure. But sometimes global locks are out of your control. They can be
> e.g. in a library (Pascal gave the example of the logging module; we
> may have more of them in the stdlib).

I don't disagree there; but then again, I haven't seen this issue arise
(in my own code): no bug reports, no test cases that show this to be a
consistent issue. I'm perfectly OK with being wrong, I'm just leery of
tearing out the internals for something else "not forking".

> It's certainly much more practical to give a choice in multiprocessing
> than to ask all library writers to eliminate global locks.

Trust me, I'm all for giving users choice, and I'd never turn down help
- I'm just doubtful/wary on this count. If we wanted a fork-free
non-Windows implementation, a la the current win32 implementation, then
we should just clone the windows implementation and work forward from
there.

> I know, and that's why it wouldn't be a very controversial change to
> add an option to enable this path under Linux, would it?

Fair enough; but then you have the additional problem of trying to
explain to users the difference between
multiprocessing.Process(nofork=True) or .Process(fork=False) or
whatever argument we'd decide to use: "use this if you have no threads,
global locks, etc, etc".

> Oh, sure. But if you want to be Windows-compatible you'd better comply
> with this requirement anyway.

Kinda, sorta. So far, 99% of the multiprocessing code I've seen doesn't
attempt to be compatible with both Unix and win32 - commonly, it's
either win32 *or* Linux. Obviously, being compatible with both means
picking the lowest common denominator.

jesse
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Antoine Pitrou wrote:
> Jesse Noller writes:
>> I don't see the need for the change from fork as of yet (for
>> multiprocessing) and I am leery to change the internal implementation
>> and semantics right now, or anytime soon. I'd be interested in seeing
>> the patch, but if the concern is that global threading objects could
>> be left in the state that they're in at the time of the fork(), I
>> think people know that, or we can easily document this fact.
>
> If Pascal provides a patch I think it would really be good to consider
> it. Not being able to mix threads and multiprocessing is a potentially
> annoying wart.

I don't know what spawnl is supposed to do, but it really sounds like
the wrong solution.

Instead, we should aim to make Python fork-safe. If the primary concern
is that locks get inherited, we should change the Python locks so that
they get auto-released on fork (unless otherwise specified on lock
creation). This may sound like an uphill battle, but if there were a
smart and easy solution to the problem, POSIX would be providing it.

Regards,
Martin
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 5:20 PM, "Martin v. Löwis" wrote:
> Antoine Pitrou wrote:
>> If Pascal provides a patch I think it would really be good to
>> consider it. Not being able to mix threads and multiprocessing is a
>> potentially annoying wart.
>
> I don't know what spawnl is supposed to do, but it really sounds like
> the wrong solution.
>
> Instead, we should aim to make Python fork-safe. If the primary
> concern is that locks get inherited, we should change the Python locks
> so that they get auto-released on fork (unless otherwise specified on
> lock creation). This may sound like an uphill battle, but if there
> were a smart and easy solution to the problem, POSIX would be
> providing it.
>
> Regards,
> Martin

POSIX threads expose the pthread_atfork function.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
> The python I use (win32 2.6.2) does not complain if it cannot read
> from or write to a .pyc, and thus it handles multiple python processes
> trying to create .pyc files at the same time. Is the .zip case really
> any different? Since .pyc files are an optimization, it seems natural
> and correct that .pyc IO errors pass silently (apologies to Tim).
>
> It's an interesting challenge to write the file in such a way that
> it's safe for a reader and writer to co-exist.

I grant you that this may actually work for concurrent readers
(although on Windows, you'll have to pick the file share mode
carefully). The reader would have to be fairly robust, as the central
directory may disappear or get garbled while it is being read.

So what would you do for concurrent writers, then? The current
implementation relies on creat(O_EXCL) being atomic, so a second writer
would just fail. This is about the only IO operation that is guaranteed
to be atomic (along with mkdir(2)), so reusing the current approach
doesn't work.

Regards,
Martin
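For reference, a minimal sketch of the create-exclusively pattern
Martin refers to, where the second of two racing writers simply fails;
the helper name and error handling are illustrative:

import errno
import os

def write_pyc_exclusively(path, data):
    """Create the cache file atomically, losing any race silently.

    os.open() with O_CREAT | O_EXCL fails with EEXIST if another
    process created the file first - the one atomicity guarantee
    the current implementation relies on.
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False  # another writer won; its data is just as valid
        raise
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True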
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Monday 01 February 2010 at 23:20 +0100, "Martin v. Löwis" wrote:
>
> I don't know what spawnl is supposed to do, but it really sounds like
> the wrong solution.

As far as I understand, I think the idea is to use the same mechanism
as under Windows: spawn a new Python interpreter (in a separate
process) in order to run the right Python module with the right
command-line options.

> Instead, we should aim to make Python fork-safe. If the primary
> concern is that locks get inherited, we should change the Python locks
> so that they get auto-released on fork (unless otherwise specified on
> lock creation). This may sound like an uphill battle, but if there
> were a smart and easy solution to the problem, POSIX would be
> providing it.

We must distinguish between locks owned by the thread which survived
the fork() and locks owned by other threads. I guess it's possible if
we keep track of the thread id which acquired the lock, and if we give
_whatever_after_fork() the thread id of the thread which initiated the
fork() in the parent process. Do you think it would be enough to
guarantee correctness?
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 5:18 PM, Jesse Noller wrote:
> I don't disagree there; but then again, I haven't seen this issue
> arise (in my own code): no bug reports, no test cases that show this
> to be a consistent issue. I'm perfectly OK with being wrong, I'm just
> leery of tearing out the internals for something else "not forking".

I'd appreciate it. It made my life a lot harder when trying to move JIT
compilation to a background thread, for exactly the reasons we've been
talking about. All the locks in the queue can be left in an undefined
state. I solved my problem by digging into the posix module and
inserting the code I needed to stop the background thread.

Another problem with forking from a threaded Python application is that
you leak all the references held by the other thread's stack. This
isn't a problem if you're planning on exec'ing soon, but it's something
we don't usually think about.

It would be nice if threads + multiprocessing worked out of the box
without people having to think about it. Using threads and fork without
exec is evil.

Reid
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Feb 1, 2010, at 5:35 PM, Reid Kleckner wrote:
> I'd appreciate it. It made my life a lot harder when trying to move
> JIT compilation to a background thread, for exactly the reasons we've
> been talking about. All the locks in the queue can be left in an
> undefined state. I solved my problem by digging into the posix module
> and inserting the code I needed to stop the background thread.
>
> Another problem with forking from a threaded Python application is
> that you leak all the references held by the other thread's stack.
> This isn't a problem if you're planning on exec'ing soon, but it's
> something we don't usually think about.
>
> It would be nice if threads + multiprocessing worked out of the box
> without people having to think about it. Using threads and fork
> without exec is evil.
>
> Reid

Your reasonable argument is making it difficult for me to be irrational
about this. This raises the question: assuming a patch that clones the
behavior of win32 for multiprocessing, would the default continue to be
the forking behavior, or the new one?

Jesse
Re: [Python-Dev] Forking and Multithreading - enemy brothers
>> Instead, we should aim to make Python fork-safe. If the primary
>> concern is that locks get inherited, we should change the Python
>> locks so that they get auto-released on fork (unless otherwise
>> specified on lock creation). This may sound like an uphill battle,
>> but if there were a smart and easy solution to the problem, POSIX
>> would be providing it.
>
> POSIX threads expose the pthread_atfork function.

Yes - that's a solution, but one that I would call neither smart nor
simple. It requires you to register all resources with atfork that you
want to see released.

I was puzzled as to how precisely to use atfork, but the Open Group's
atfork documentation actually explains it: you need to acquire all
mutexes in the prepare handler, and then release them in the parent and
child handlers. This, again, is tedious; ISTM it can also cause
deadlocks if the prepare handler acquires locks in the wrong order.

Regards,
Martin
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 5:20 PM, "Martin v. Löwis" wrote:
> Instead, we should aim to make Python fork-safe. If the primary
> concern is that locks get inherited, we should change the Python locks
> so that they get auto-released on fork (unless otherwise specified on
> lock creation). This may sound like an uphill battle, but if there
> were a smart and easy solution to the problem, POSIX would be
> providing it.

The "right" (as if you can actually use fork and threads at the same
time correctly) way to do this is to acquire all locks before the fork,
and release them after the fork. The reason is that if you don't,
whatever data the locks guarded will be in an undefined state, because
the thread that used to own the lock was in the middle of modifying it.

POSIX does provide pthread_atfork, but it's not quite enough. It's
basically good enough for things like libc's malloc or other global
locks held for a short duration buried in libraries. No one will ever
try to fork while doing an allocation, for example. The problem is that
if you have a complicated set of locks that must be acquired in a
certain order, you can express that to pthread_atfork by giving it the
callbacks in the right order, but it's hard.

However, I think for Python it would be good enough to have an at_fork
registration mechanism so people can acquire and release locks at fork.
If we assume that most library locks are like malloc's, and won't
actually be held while forking, it's good enough.

Reid
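A hand-rolled sketch of the at_fork registration mechanism Reid
suggests (Python itself only grew os.register_at_fork() much later). It
assumes all forks go through the wrapper below; the names and the
single-module design are illustrative:

import os

_atfork_locks = []  # locks to hold across fork, in acquisition order

def register_fork_lock(lock):
    """Register a lock to be acquired before fork and released after.

    Registration order defines acquisition order, so callers must
    register consistently to avoid deadlock in the prepare step.
    """
    _atfork_locks.append(lock)

def fork_with_locks():
    """fork() wrapper implementing the prepare/parent/child protocol."""
    for lock in _atfork_locks:  # prepare: quiesce all guarded state
        lock.acquire()
    try:
        pid = os.fork()
    finally:
        # both parent and child release, leaving locks in a sane state
        for lock in reversed(_atfork_locks):
            lock.release()
    return pid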
Re: [Python-Dev] Forking and Multithreading - enemy brothers
> We must distinguish between locks owned by the thread which survived
> the fork() and locks owned by other threads. I guess it's possible if
> we keep track of the thread id which acquired the lock, and if we give
> _whatever_after_fork() the thread id of the thread which initiated the
> fork() in the parent process. Do you think it would be enough to
> guarantee correctness?

Interestingly, the POSIX pthread_atfork documentation defines how you
are supposed to do that: create an atfork handler set, and acquire all
mutexes in the prepare handler. Then fork, and have the parent and
child handlers release the locks. Of course, unless your locks are
recursive, you still have to know what locks you are already holding.

Regards,
Martin
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Mon, Feb 1, 2010 at 5:48 PM, Jesse Noller wrote:
> Your reasonable argument is making it difficult for me to be
> irrational about this.

No problem. :)

> This raises the question: assuming a patch that clones the behavior of
> win32 for multiprocessing, would the default continue to be the
> forking behavior, or the new one?

Pros of forking:
- probably faster (control doesn't start back at Py_Main)
- more shared memory (but not that much, because of refcounts)
- objects sent to child processes don't have to be pickleable

Cons:
- leaks memory with threads
- can lead to deadlocks or races with threads

I think the fork+exec or spawnl version is probably the better default
because it's safer. If people can't be bothered to make their objects
pickleable or really want the old behavior, it can be left as an
option.

Reid
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Reid Kleckner wrote:
> On Mon, Feb 1, 2010 at 5:18 PM, Jesse Noller wrote:
>> I don't disagree there; but then again, I haven't seen this issue
>> arise (in my own code): no bug reports, no test cases that show this
>> to be a consistent issue. I'm perfectly OK with being wrong, I'm just
>> leery of tearing out the internals for something else "not forking".
>
> I'd appreciate it. It made my life a lot harder when trying to move
> JIT compilation to a background thread, for exactly the reasons we've
> been talking about. All the locks in the queue can be left in an
> undefined state. I solved my problem by digging into the posix module
> and inserting the code I needed to stop the background thread.
>
> Another problem with forking from a threaded Python application is
> that you leak all the references held by the other thread's stack.
> This isn't a problem if you're planning on exec'ing soon, but it's
> something we don't usually think about.
>
> It would be nice if threads + multiprocessing worked out of the box
> without people having to think about it. Using threads and fork
> without exec is evil.

Yup, but that's true for *any* POSIXy environment, not just Python. The
only sane non-exec mixture is to have a single-threaded parent fork,
and restrict spawning threads to the children.

Tres.
--
Tres Seaver
Palladion Software
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Monday 01 February 2010 at 23:58 +0100, "Martin v. Löwis" wrote:
>
> Interestingly, the POSIX pthread_atfork documentation defines how you
> are supposed to do that: create an atfork handler set, and acquire all
> mutexes in the prepare handler. Then fork, and have the parent and
> child handlers release the locks. Of course, unless your locks are
> recursive, you still have to know what locks you are already holding.

So, if we restrict ourselves to Python-level locks (thread.Lock and
thread.RLock), I guess we could just chain them in a doubly-linked list
and add an internal _PyThread_AfterFork() function?
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On 01/02/2010 23:03, Reid Kleckner wrote:
> Pros of forking:
> - probably faster (control doesn't start back at Py_Main)
> - more shared memory (but not that much, because of refcounts)
> - objects sent to child processes don't have to be pickleable
>
> Cons:
> - leaks memory with threads
> - can lead to deadlocks or races with threads
>
> I think the fork+exec or spawnl version is probably the better default
> because it's safer. If people can't be bothered to make their objects
> pickleable or really want the old behavior, it can be left as an
> option.

Wouldn't changing the default be backwards incompatible?

Michael
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Tres Seaver writes:
>
> Yup, but that's true for *any* POSIXy environment, not just Python.
> The only sane non-exec mixture is to have a single-threaded parent
> fork, and restrict spawning threads to the children.

The problem is that we're advocating multiprocessing as the solution
for multiprocessor scalability. We can't just say "oh and, by the way,
you shouldn't use it with several threads, hope you don't mind".

Antoine.
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Feb 1, 2010, at 6:25 PM, Michael Foord wrote:
> On 01/02/2010 23:03, Reid Kleckner wrote:
>> I think the fork+exec or spawnl version is probably the better
>> default because it's safer. If people can't be bothered to make their
>> objects pickleable or really want the old behavior, it can be left as
>> an option.
>
> Wouldn't changing the default be backwards incompatible?
>
> Michael

Yes, it would, which is why it would have to be a switch for 2.x, and
could only possibly be changed/broken for 3.x. Note, this is only off
the top of my head; if Pascal is still game to do a patch (skipping
spawnl, and going with something more akin to the current windows
implementation) and it comes out as agreeable to all parties, the exact
integration details can be worked out then.

Part of me wonders, though - this is a problem with Python, fork and
threads in general. Things in the community such as gunicorn, which are
"bringing forking back", are going to slam into this too.

Jesse
Re: [Python-Dev] PEP 3147: PYC Repository Directories
>> The Python I use (win32 2.6.2) does not complain if it cannot read
>> from or write to a .pyc, and thus it handles multiple Python processes
>> trying to create .pyc files at the same time. Is the .zip case really
>> any different?

[ snip discussion of the difficulty of writing a sharing-safe update ]

On Mon, Feb 1, 2010 at 2:28 PM, "Martin v. Löwis" wrote:
> So what would you do for concurrent writers, then? The current
> implementation relies on creat(O_EXCL) to be atomic, so a second
> writer would just fail. This is about the only IO operation that is
> guaranteed to be atomic (along with mkdir(2)), so reusing the current
> approach doesn't work.

Sorry, I'm guilty of having assumed that the POSIX API has an operation
analogous to win32 CreateFile(GENERIC_WRITE, 0 /* i.e.,
"FILE_SHARE_NONE" */).

If shared-reader/single-writer semantics are not available, the only
other possibility I can think of is to avoid opening the .pyc for
write. To write a .pyc, one would read it, write and flush updates to a
temp file, and rename(). This isn't atomic, but given the invariant
that the .pyc always contains consistent data, the new file will also
only contain consistent data. Races manifest as updates getting lost.
One obvious drawback is that the .pyc inode would change on every
update.

paul
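
A rough sketch of what I mean (the helper name and the fsync/cleanup
details are my own assumptions):

    import os, tempfile

    def rewrite_pyc(path, data):
        # Write the new contents to a temp file in the same directory,
        # then rename() over the target. On POSIX, rename() atomically
        # replaces the destination, so readers see either the old file
        # or the new one, never a mix; a losing racer's update is lost.
        fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
        try:
            with os.fdopen(fd, "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
            os.rename(tmp, path)
        except Exception:
            os.unlink(tmp)
            raise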
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Antoine Pitrou wrote:
> Tres Seaver <[email protected]> writes:
>> Yup, but that's true for *any* POSIXy environment, not just Python. The
>> only sane non-exec mixture is to have a single-thread parent fork, and
>> restrict spawning threads to the children.
>
> The problem is that we're advocating multiprocessing as the solution for
> multiprocessor scalability. We can't just say "oh, and by the way, you
> shouldn't use it with several threads, hope you don't mind".

I think it is perfectly reasonable to say, "Oh, by the way, *don't*
spawn any threads before calling fork(), or else exec() a new process
immediately": wishing won't make the underlying realities any different.

Note that the "we" in your sentence is not anything like the "quod
semper, quod ubique, quod ab omnibus" (always, everywhere, by all)
criterion for accepting dogma: multiprocessing is a tool, and needs to
be used according to its nature, just as with threading.

Tres.
--
===
Tres Seaver +1 540-429-0999 [email protected]
Palladion Software "Excellence by Design" http://palladion.com
[Python-Dev] PEP 3147: PYC Repository Directories
On Mon, Feb 1, 2010 at 11:56 PM, Pablo Mouzo wrote:
> On Mon, Feb 1, 2010 at 10:23 PM, Paul Du Bois wrote:
> [...]
>> Sorry, I'm guilty of having assumed that the POSIX API has an
>> operation analogous to win32 CreateFile(GENERIC_WRITE, 0 /* i.e.,
>> "FILE_SHARE_NONE" */).
>>
>> If shared-reader/single-writer semantics are not available, the only
>> other possibility I can think of is to avoid opening the .pyc for
> [...]

Actually, there are (sadly) many ways to do that. Projects like sqlite
support shared-reader/single-writer on many platforms, so that code
could be reused.

Pablo
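
For example, POSIX advisory locking via flock(2) already gives
shared-reader/single-writer behavior; a minimal sketch (advisory only:
every cooperating process has to take the lock, and network
filesystems vary):

    import fcntl

    def read_shared(path):
        with open(path, "rb") as f:
            fcntl.flock(f, fcntl.LOCK_SH)    # any number of readers
            try:
                return f.read()
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)

    def write_exclusive(path, data):
        with open(path, "r+b") as f:
            fcntl.flock(f, fcntl.LOCK_EX)    # excludes readers and writers
            try:
                f.seek(0)
                f.write(data)
                f.truncate()
            finally:
                fcntl.flock(f, fcntl.LOCK_UN)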
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
2010/2/1 Collin Winter:
> I believe these VMs would have little overlap. I cannot imagine that
> Unladen Swallow's needs have much in common with Stackless's, or with
> those of a hypothetical register machine to replace the current stack
> machine.
>
> Let's consider that last example in more detail: a register machine
> would require completely different bytecode. This would require
> replacing the bytecode compiler, the peephole optimizer, and the
> bytecode eval loop. The frame object would need to be changed to hold
> the registers and a new blockstack design; the code object would have
> to potentially hold a new bytecode layout.
>
> I suppose making all this pluggable would be possible, but I don't see
> the point. This kind of experimentation is ideal for a branch: go off,
> test your idea, report your findings, merge back. Let the branch be
> long-lived, if need be. The Mercurial migration will make all this
> easier.
>
>> Getting this right would certainly require a major effort, but it
>> would also reduce the need to have several branches of C-based
>> Python implementations.
>
> If such a restrictive plugin-based scheme had been available when we
> began Unladen Swallow, I do not doubt that we would have ignored it
> entirely. I do not like the idea of artificially tying the hands of
> people trying to make CPython faster. I do not see any part of Unladen
> Swallow that would have been made easier by such a scheme. If
> anything, it would have made our project more difficult.
>
> Collin Winter

I completely agree.

Working with wpython, I have changed a lot of code, ranging from the
ASDL grammar to the eval loop, including some library modules and tests
(primarily the Python-based parser and the disassembly tools; the
module finder required work, too). I haven't changed the Python objects
or the object model (except in the alpha release; I later dropped this
"invasive" change), but I've added some helper functions in object.c,
dict.c, etc.

A pluggable VM isn't feasible because we are talking about a brand new
CPython (library included) to be chosen each time. If approved, this
model would greatly limit the optimizations that can be implemented to
make CPython run faster.

Cesare Di Mauro
Re: [Python-Dev] subprocess docs patch
On Mon, Feb 1, 2010 at 12:14 AM, "Martin v. Löwis" wrote:
>> Any help you could provide would be appreciated.
>
> Please use unified diffs in the future.
Duly noted.
> I'm -0 on this patch; it still has the negative, cautionary-patronizing
> tone ("Do not", "can be tricky", "be mindful"),
Thanks to yours and other feedback, I've tried to address this in a
newer version of the patch.
> as if readers are unable
> to grasp the description that they just read (and indeed, in the patch,
> you claim that readers *are* unable to understand how command lines
> work).
I don't think I made any statement quite that blunt; however, as the
c.l.p threads that drove me to write this patch show, there are indeed
some people who don't (correctly) understand the details of command
line tokenization.
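For instance, the difference is easy to show with shlex.split, which
follows POSIX shell tokenization rules:

    import shlex

    print(shlex.split('gcc -o "my program" main.c'))
    # -> ['gcc', '-o', 'my program', 'main.c']
    #
    # Passing that list to subprocess.Popen sidesteps tokenization
    # entirely; passing the raw string with shell=False would try to
    # run a program literally named 'gcc -o "my program" main.c'.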
Thanks for responding!
Cheers,
Chris
