Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> 3. In each top level directory on sys.path, shadow file hierarchy
>   Major Pro: trivial to separate out all cached files
>   Major Con: ??? (I got nuthin')

The major con of this option (and of option 2) is ambiguity about where
to look in the case of packages. In particular, for namespace packages
(of the setuptools kind, or the PEP 382 kind), the directory where a
package is found on sys.path can change across Python runs.

So when you run Python several times, and install additional eggs
in-between, you get different directories all caching the same pyc
files. If you then uninstall some of the eggs, it may be difficult to
find out what pyc files to delete.
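[Editorial note: the cleanup problem Martin describes can be made concrete with a short script. The PEP 3147-style layout (pkg/__pycache__/mod.<tag>.pyc caching pkg/mod.py) and the helper name are assumptions for illustration, not part of the PEP.]

```python
import os

def find_stale_pycs(root):
    """Return cached .pyc files whose source .py no longer exists.

    Assumes the PEP 3147 layout: pkg/__pycache__/mod.<tag>.pyc
    caches pkg/mod.py.  Illustrative sketch only.
    """
    stale = []
    for dirpath, dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) != '__pycache__':
            continue
        parent = os.path.dirname(dirpath)
        for fn in filenames:
            if not fn.endswith('.pyc'):
                continue
            # "mod.cpython-32.pyc" -> source name "mod.py"
            modname = fn.split('.', 1)[0]
            if not os.path.exists(os.path.join(parent, modname + '.py')):
                stale.append(os.path.join(dirpath, fn))
    return stale
```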

> Note that with option two, creating a bytecode only zipfile would be
> trivial: just add the __pycache__ directory as the top-level directory
> in the zipfile and leave out everything else (assume there were no data
> files in the package that were still needed).

I think any scheme that uses directories for pyc files will cause stale
pyc files to be located on disk. I then think it is important to never
automatically use these in imports - i.e. only ever consider a file in
a __pycache__ directory if you found a .py file earlier.

If that is the policy, then a __pycache__ directory in a zipfile would
have no effect (and rightly so). Instead, to run code from bytecode,
the byte code files should be on sys.path themselves (probably still
named the same way as they are named inside __pycache__).
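[Editorial note: the lookup policy Martin proposes — never trust a __pycache__ entry unless the .py source was found first — might look roughly like this. The function name and the cache tag are illustrative assumptions.]

```python
import os

def cached_pyc_for(source_path, tag='cpython-32'):
    """Return the __pycache__ pyc to use for source_path, or None.

    Policy sketch: a file in a __pycache__ directory is only ever
    considered if the corresponding .py source exists.
    """
    if not os.path.exists(source_path):
        return None          # no source -> never trust the cache
    dirname, basename = os.path.split(source_path)
    stem = os.path.splitext(basename)[0]
    pyc = os.path.join(dirname, '__pycache__',
                       '%s.%s.pyc' % (stem, tag))
    return pyc if os.path.exists(pyc) else None
```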

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] subprocess docs patch

2010-02-01 Thread Martin v. Löwis
> Any help you could provide would be appreciated.

Please use unified diffs in the future.

I'm -0 on this patch; it still has the negative, cautionary-patronizing
tone ("Do not", "can be tricky", "be mindful"), as if readers are unable
to grasp the description that they just read (and indeed, in the patch,
you claim that readers *are* unable to understand how command lines
work). I know that Raymond Hettinger is very much opposed to this style
of documentation. So I, whose native language is not English, cannot
approve/apply this patch. Try convincing Raymond.

Regards,
Martin


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Gertjan Klein
Hanno Schlichting wrote:

>+1 for a single strategy that is used in all cases. The current
>solution could be phased out across multiple releases, but in the end
>there should be a single approach and no flag. Otherwise some code and
>tools will only support one of the approaches, especially if this is
>seen as something "only a minority of Linux distributions uses".

-1. As far as I can tell, this PEP proposes to solve a specific problem
that Linux distributions have. As they have decent package managers,
this PEP makes their maintainers' lives a lot easier. If implemented, I
believe it would eventually be used by all of them, not just "a
minority".

For just about anyone else, I believe the current situation works
perfectly fine, and should not be changed. Personally, I work mainly on
Windows, and things I install are placed in the site-packages directory
of the Python version I use. There is no need to place .pyc files in
subdirectories there, as there will only ever be one. Programs I write
myself are also rarely, if ever, run by multiple Python versions. They
get run by the default Python on my system; if I change the default, the
.pyc files get overwritten, which is exactly what I want: I no longer
need the old ones.

As to the single cache directory per directory versus per .py file
issue: a subdirectory per .py file is easier to manipulate manually;
listing the .py file and the subdirectory containing the compiled
versions belonging to it makes it somewhat easier to prevent errors due
to deleting the source but not the compiled version. However, as the
use-case for this PEP seems to be to make life easier for Linux
packagers, it seems that a single __pycache__ subdirectory (or whatever
the name would be) is preferable: less filesystem clutter, and no risks
of forgetting to delete .pyc files, as this is about system-managed
Python source.

Gertjan.





Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Antoine Pitrou

> Would you still be a -1 on making it the new scheme the default if it
> used a single cache directory instead? That would actually be cleaner
> than the current solution rather than messier.

Well, I guess no, although additional directories are always more
intrusive than additional files (visually, or with tools such as "du"
for example).





Re: [Python-Dev] Improved Traceback Module

2010-02-01 Thread Benjamin Schweizer
On Fri, Jan 29, 2010 at 08:02:58PM -0500, P.J. Eby wrote:
> At 01:24 AM 1/30/2010 +0100, Ludvig Ericson wrote:
>
>> On 28 jan 2010, at 22:47, P.J. Eby wrote:
>>
>> > At 07:47 PM 1/28/2010 +0100, Benjamin Schweizer wrote:
>> >>
>> >> I like the idea of configuring the list of variables with using a
>> >> convention like __trace__, though this requires me to specify what
>> >> variables cause an exception and that would be hard?
>> >
>> > The idea is that you simply use __trace__ to format the local  
>> variables for that frame.  In other words, the programmer specifies  
>> what they want to see at that point in the traceback, by setting a  
>> variable in their code.  If there's a __trace__ variable in the frame 
>> locals at that point in the traceback, simply format the traceback 
>> locals+globals using that string's .safe_format() method.
>>
>> This seems very naïve to me. It makes the code ten times as hard to  
>> maintain; adding or removing a *local variable* suddenly means you  
>> have to check what __trace__ or whatever refers to.
>
> You lost me; the purpose of a __trace__ is to show *context* -- the  
> primary *purpose* of the code at that point in the call stack.  You  
> don't put every variable in it, as that would go directly against the 
> purpose.
>
> That is, it's intended to *reduce* information overload, not increase  
> it.  If you're putting more than 2 variables in a given __trace__ you're 
> probably doing something wrong.

Ok, so how do we get things done?
Is there already an implementation of __trace__ around?
I've not yet looked at Kristján's traceback2.py; maybe it's superior.

I've tried to implement the variable dump feature with a few changes,
keeping the original structure of traceback.py intact. It's finished
and compatible, but it requires some testing.

From here, I think we could
1) keep traceback.py as it was
2) add my non-invasive patch
3) further improve traceback.py to implement __trace__
4) switch to Kristján's traceback2.py
5) improve Kristján's traceback2.py even further.


Greetings

-- 
http://benjamin-schweizer.de/contact


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread M.-A. Lemburg
Raymond Hettinger wrote:
> 
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>> 
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
> 
> +1 

+1 from here as well.

>>  It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).  

+1 on the idea of having a standard for Python module cache
files.

+1 on having those files in the same directory as the associated
module file, just like we already do.

-1 on the idea of using directories for these. This only
complicates cleanup, management and distribution of such
files. Perhaps we could make this an option, though.

Thanks,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Feb 01 2010)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread M.-A. Lemburg
M.-A. Lemburg wrote:
> Collin Winter wrote:
>> I added startup benchmarks for Mercurial and Bazaar yesterday
>> (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we
>> can use them as more macro-ish benchmarks, rather than merely starting
>> the CPython binary over and over again. If you have ideas for better
>> Mercurial/Bazaar startup scenarios, I'd love to hear them. The new
>> hg_startup and bzr_startup benchmarks should give us some more data
>> points for measuring improvements in startup time.
>>
>> One idea we had for improving startup time for apps like Mercurial was
>> to allow the creation of hermetic Python "binaries", with all
>> necessary modules preloaded. This would be something like Smalltalk
>> images. We haven't yet really fleshed out this idea, though.
> 
> In Python you can do the same with the freeze.py utility. See
> 
> http://www.egenix.com/www2002/python/mxCGIPython.html
> 
> for an old project where we basically put the Python
> interpreter and stdlib into a single executable.
> 
> We've recently revisited that project and created something
> we call "pyrun". It fits Python 2.5 into a single executable
> and a set of shared modules (which for various reasons cannot
> be linked statically)... 12MB in total.
> 
> If you load lots of modules from the stdlib this does provide
> a significant improvement over standard Python.
> 
> Back to the PEP's proposal:
> 
> Looking at the data you currently have, the negative results
> currently don't really look good in the light of the small
> performance improvements.
> 
> Wouldn't it be possible to have the compiler approach work
> in three phases in order to reduce the memory footprint and
> startup time hit, ie.
> 
>  1. run an instrumented Python interpreter to collect all
> the needed compiler information; write this information into
> a .pys file (Python stats)
> 
>  2. create compiled versions of the code for various often
> used code paths and type combinations by reading the
> .pys file and generating an .so file as regular
> Python extension module
> 
>  3. run an uninstrumented Python interpreter and let it
> use the .so files instead of the .py ones
> 
> In production, you'd then only use step 3 and avoid the
> overhead of steps 1 and 2.
> 
> Moreover, the .so file approach
> would only load the code for code paths and type combinations
> actually used in a particular run of the Python code into
> memory and allow multiple Python processes to share it.
> 
> As side effect, you'd probably also avoid the need to have
> C++ code in the production Python runtime - that is unless
> LLVM requires some kind of runtime support which is written
> in C++.

BTW: Some years ago we discussed the idea of pluggable VMs for
Python. Wouldn't U-S be a good motivation to revisit this idea ?

We could then have a VM based on byte code using a stack
machine, one based on word code using a register machine,
and perhaps one that uses the Stackless approach.

Each VM type could use the PEP 3147 approach to store auxiliary
files containing byte code, word code or machine-compiled code.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Collin Winter
Hey MA,

On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg  wrote:
> BTW: Some years ago we discussed the idea of pluggable VMs for
> Python. Wouldn't U-S be a good motivation to revisit this idea ?
>
> We could then have a VM based on byte code using a stack
> machines, one based on word code using a register machine
> and perhaps one that uses the Stackless approach.

What is the use case for having pluggable VMs? Is the idea that, at
runtime, the user would select which virtual machine they want to run
their code under? How would the user make that determination
intelligently?

I think this idea underestimates a) how deeply the current CPython VM
is intertwined with the rest of the implementation, and b) the nature
of the changes required by these separate VMs. For example, Unladen
Swallow adds fields to the C-level structs for dicts, code objects and
frame objects; how would those changes be pluggable? Stackless
requires so many modifications that it is effectively a fork; how
would those changes be pluggable?

Collin Winter


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Sun, Jan 31, 2010 at 11:16, Guido van Rossum  wrote:
> Whoa. This thread already exploded. I'm picking this message to
> respond to because it reflects my own view after reading the PEP.
>
> On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting  wrote:
>> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross
>>  wrote:
>>> I don't know whether I'm in favour of using a single pyr folder or not
>>> but if a single folder is used I'd definitely prefer the folder to be
>>> called __pyr__ rather than .pyr.
>
> Exactly what I would prefer. I worry that having many small
> directories is a fairly poor use of the filesystem. A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but
> only 57 directories.
>
>> Do you have any specific reason for that?
>>
>> Using the leading dot notation is an established pattern to hide
>> non-essential information from directory views. What makes this
>> non-applicable in this situation and a custom Python notation better?
>
> Because we don't want to completely hide the pyc files. Also the dot
> naming convention is somewhat platform-specific.
>
> FWIW in Python 3, the __file__ variable always points to the .py
> source filename. I agreed with Georg that there ought to be an API for
> finding the pyc file for a module. This could be a small addition to
> the PEP.

Importlib somewhat does this already through a module's loader:
http://docs.python.org/py3k/library/importlib.html#importlib.abc.PyPycLoader.bytecode_path
. If you want to work off of module names this is enough; if importlib
did the import then you can do __loader__.bytecode_path(__name__). And
if it has not been loaded yet then that simply requires me exposing an
importlib.find_module() that returns a loader for the module.

The trick comes down to when you want it based on __file__ instead of
the module name. Oh, and me finally breaking up import so that it has
proper loaders, or bootstrapping importlib; small snag. =) But at least
the code already exists for this stuff.
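[Editorial note: for reference, the accepted version of PEP 3147 added exactly the kind of source-to-pyc mapping API Guido asks for above, as helpers in importlib.util in Python 3.2. The sketch below assumes a 3.2+ interpreter; the cache tag in the result varies with the interpreter running it.]

```python
# Map a source path to its PEP 3147 cache path and back
# (importlib.util helpers, Python 3.2+).
import importlib.util

pyc = importlib.util.cache_from_source('/tmp/example/foo.py')
# e.g. '/tmp/example/__pycache__/foo.cpython-32.pyc'; the tag
# depends on the running interpreter
src = importlib.util.source_from_cache(pyc)
assert src == '/tmp/example/foo.py'
```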

-Brett

>
> --
> --Guido van Rossum (python.org/~guido)


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread M.-A. Lemburg
Collin Winter wrote:
> Hey MA,
> 
> On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg  wrote:
>> BTW: Some years ago we discussed the idea of pluggable VMs for
>> Python. Wouldn't U-S be a good motivation to revisit this idea ?
>>
>> We could then have a VM based on byte code using a stack
>> machines, one based on word code using a register machine
>> and perhaps one that uses the Stackless approach.
> 
> What is the usecase for having pluggable VMs? Is the idea that, at
> runtime, the user would select which virtual machine they want to run
> their code under? How would the user make that determination
> intelligently?

The idea back then (IIRC) was to have a compile time option to select
one of a few available VMs, in order to more easily experiment with
new or optimized implementations such as e.g. a register based VM.

It should even be possible to factor out the VM into a DLL/SO which
is then selected and loaded via a command line option.

> I think this idea underestimates a) how deeply the current CPython VM
> is intertwined with the rest of the implementation, and b) the nature
> of the changes required by these separate VMs. For example, Unladen
> Swallow adds fields to the C-level structs for dicts, code objects and
> frame objects; how would those changes be pluggable? Stackless
> requires so many modifications that it is effectively a fork; how
> would those changes be pluggable?

They wouldn't be pluggable. Such changes would have to be made
in a more general way in order to serve more than just one VM.

Getting this right would certainly require a major effort, but it
would also reduce the need to have several branches of C-based
Python implementations.

-- 
Marc-Andre Lemburg
eGenix.com



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Sun, Jan 31, 2010 at 11:04, Raymond Hettinger
 wrote:
>
> On Jan 30, 2010, at 4:00 PM, Barry Warsaw wrote:
>> Abstract
>> 
>>
>> This PEP describes an extension to Python's import mechanism which
>> improves sharing of Python source code files among multiple installed
>> different versions of the Python interpreter.
>
> +1
>
>
>>  It does this by
>> allowing many different byte compilation files (.pyc files) to be
>> co-located with the Python source file (.py file).
>
> It would be nice if all the compilation files could be tucked
> into one single zipfile per directory to reduce directory clutter.
>
> It has several benefits besides tidiness. It hides the implementation
> details of when magic numbers get shifted.  And it may allow faster
> start-up times when the zipfile is in the disk cache.

It also eliminates stat calls. I have not seen anyone mention this,
but on filesystems where stat calls are expensive (e.g. NFS), the
PEP's directory scheme is going to increase import cost (and thus
startup time, which some people are already incredibly paranoid
about). You now go from a single stat call to check for a bytecode
file to two just in the search phase *per file check* (remember you
need to search for module.py and module/__init__.py). And then you get
to repeat all of this during the load process (potentially, depending
on how aggressive the loader is with caching).

As others have said, an uncompressed zip file could work here. Or even
a file format where the first 4 bytes are the timestamp, followed by
chunks of length-of-bytecode|magic|bytecode. That allows for
opening a file in append mode to add more bytecode, instead of a
zipfile's requirement of rewriting the TOC at the end of the file
every time you mutate the file (if I remember the zip file format
correctly). Biggest cost in this simple approach would be reading the
file in (unless you mmap the thing when possible) since once read the
code will be a bytes object which means constant time indexing until
you find the right magic number. And adding support to differentiate
between -O bytecode is simply adding a marker per chunk of bytecode.
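[Editorial note: a minimal sketch of the chunked format described above. The field widths (4-byte little-endian length and magic) and the helper names are assumptions for illustration, not a specification.]

```python
import os
import struct

def append_chunk(path, magic, bytecode, timestamp=0):
    """Append one magic-tagged bytecode chunk to a cache file.

    Layout: a 4-byte source timestamp header, then repeated
    <4-byte length><4-byte magic><bytecode> chunks.
    """
    new = not os.path.exists(path)
    with open(path, 'ab') as f:
        if new:
            f.write(struct.pack('<I', timestamp))
        f.write(struct.pack('<II', len(bytecode), magic))
        f.write(bytecode)

def find_chunk(path, magic):
    """Return the bytecode stored under 'magic', or None."""
    with open(path, 'rb') as f:
        data = f.read()
    pos = 4                          # skip the timestamp header
    while pos + 8 <= len(data):
        length, chunk_magic = struct.unpack_from('<II', data, pos)
        pos += 8
        if chunk_magic == magic:
            return data[pos:pos + length]
        pos += length                # skip straight to the next chunk
    return None
```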

And I disagree this would be difficult as the PEP suggests given the
proper file format. For zip files zipimport already has the read code
in C; it just would require the code to write to a zip file. And as
for the format I mentioned above, that's dead-simple to implement.

-Brett


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Collin Winter
On Mon, Feb 1, 2010 at 11:17 AM, M.-A. Lemburg  wrote:
> Collin Winter wrote:
>> I think this idea underestimates a) how deeply the current CPython VM
>> is intertwined with the rest of the implementation, and b) the nature
>> of the changes required by these separate VMs. For example, Unladen
>> Swallow adds fields to the C-level structs for dicts, code objects and
>> frame objects; how would those changes be pluggable? Stackless
>> requires so many modifications that it is effectively a fork; how
>> would those changes be pluggable?
>
> They wouldn't be pluggable. Such changes would have to be made
> in a more general way in order to serve more than just one VM.

I believe these VMs would have little overlap. I cannot imagine that
Unladen Swallow's needs have much in common with Stackless's, or with
those of a hypothetical register machine to replace the current stack
machine.

Let's consider that last example in more detail: a register machine
would require completely different bytecode. This would require
replacing the bytecode compiler, the peephole optimizer, and the
bytecode eval loop. The frame object would need to be changed to hold
the registers and a new blockstack design; the code object would have
to potentially hold a new bytecode layout.

I suppose making all this pluggable would be possible, but I don't see
the point. This kind of experimentation is ideal for a branch: go off,
test your idea, report your findings, merge back. Let the branch be
long-lived, if need be. The Mercurial migration will make all this
easier.

> Getting the right would certainly require a major effort, but it
> would also reduce the need to have several branches of C-based
> Python implementations.

If such a restrictive plugin-based scheme had been available when we
began Unladen Swallow, I do not doubt that we would have ignored it
entirely. I do not like the idea of artificially tying the hands of
people trying to make CPython faster. I do not see any part of Unladen
Swallow that would have been made easier by such a scheme. If
anything, it would have made our project more difficult.

Collin Winter


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Antoine Pitrou
On Mon, 01 Feb 2010 11:35:19 -0800, Brett Cannon wrote:
> 
> As others have said, an uncompressed zip file could work here. Or even a
> file format where the first 4 bytes is the timestamp and then after that
> are chunks of length-of-bytecode|magic|bytecode. That allows for opening
> a file in append mode to add more bytecode instead of a zipfile's
> requirement of rewriting the TOC on the end of the file every time you
> mutate the file (if I remember the zip file format correctly).

Making the file append-only doesn't eliminate the problems with 
concurrent modification. You still have to specify and implement a robust 
cross-platform file locking system which will have to be shared by all 
implementations. This is really a great deal of complication to add to 
the interpreter(s).

And, besides, it might not even work on NFS which was the motivation for 
your proposal :)




Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Pascal Chambon


So, if a patch were proposed for multiprocessing, adding a unified,
thread-safe "spawnl" semantic, do you think anything could prevent
its integration?


We may ignore the subprocess module, since fork+exec shouldn't be
affected by the (potentially disastrous) state of child-process data.
But it bothers me that multithreading and multiprocessing are
currently presented as opposed, when in theory nothing justifies it...

Regards,
Pascal







Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Terry Reedy

On 2/1/2010 1:32 PM, Collin Winter wrote:
> Hey MA,
>
> On Mon, Feb 1, 2010 at 9:58 AM, M.-A. Lemburg  wrote:
>> BTW: Some years ago we discussed the idea of pluggable VMs for
>> Python. Wouldn't U-S be a good motivation to revisit this idea ?
>>
>> We could then have a VM based on byte code using a stack
>> machines, one based on word code using a register machine
>> and perhaps one that uses the Stackless approach.
>
> What is the use case for having pluggable VMs?

Running an application full time on multiple machines.

> Is the idea that, at runtime, the user would select which virtual
> machine they want to run their code under?

From your comments, I would presume the selection should be at
startup, from the command line, before building Python objects.

> How would the user make that determination intelligently?

The same way people would determine whether to select JIT or not -- by
testing space and time performance for their app, with test time
proportioned to expected run time.

Terry Jan Reedy



Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> And I disagree this would be difficult as the PEP suggests given the
> proper file format. For zip files zipimport already has the read code
> in C; it just would require the code to write to a zip file. And as
> for the format I mentioned above, that's dead-simple to implement.

How do you write to a zipfile while others are reading it?

Regards,
Martin


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller
On Mon, Feb 1, 2010 at 3:09 PM, Pascal Chambon  wrote:
>
> So, if a patch was proposed for the multiprocessing, allowing an unified
> "spawnl", thread-safe, semantic, do you think something could prevent its
> integration ?
>
> We may ignore the subprocess module, since fork+exec shouldn't be bothered
> by the (potentially disastrous) state of child process data.
> But it bothers me to think multithreading and multiprocessing are currently
> opposed whereas theoretically nothing justifies it...
>
> Regards,
> Pascal


I don't see the need to change from fork as of yet (for
multiprocessing), and I am leery of changing the internal implementation
and semantics right now, or anytime soon. I'd be interested in seeing
the patch, but if the concern is that global threading objects could
be left in the state that they're in at the time of the fork(), I
think people know that or we can easily document this fact.

jesse


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Brett Cannon
On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis"  wrote:
>> And I disagree this would be difficult as the PEP suggests given the
>> proper file format. For zip files zipimport already has the read code
>> in C; it just would require the code to write to a zip file. And as
>> for the format I mentioned above, that's dead-simple to implement.
>
> How do you write to a zipfile while others are reading it?
>

By hating concurrency (i.e., I don't have an answer, which kills my idea).

-Brett


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Antoine Pitrou
Jesse Noller  gmail.com> writes:
> 
> I don't see the need for the change from fork as of yet (for
> multiprocessing) and I am leery to change the internal implementation
> and semantics right now, or anytime soon. I'd be interested in seeing
> the patch, but if the concern is that global threading objects could
> be left in the state that they're in at the time of the fork(), I
> think people know that or we can easily document this fact.

If Pascal provides a patch I think it would really be good to consider it.
Not being able to mix threads and multiprocessing is a potentially annoying
wart. It means you must choose one or the other from the start, and once you've
made this decision you are stuck with it.
(not to mention that libraries can use threads and locks behind your back)

Regards

Antoine.




Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller
On Mon, Feb 1, 2010 at 4:32 PM, Antoine Pitrou  wrote:
> Jesse Noller  gmail.com> writes:
>>
>> I don't see the need for the change from fork as of yet (for
>> multiprocessing) and I am leery to change the internal implementation
>> and semantics right now, or anytime soon. I'd be interested in seeing
>> the patch, but if the concern is that global threading objects could
>> be left in the state that they're in at the time of the fork(), I
>> think people know that or we can easily document this fact.
>
> If Pascal provides a patch I think it would really be good to consider it.
> Not being able to mix threads and multiprocessing is a potentially annoying
> wart. It means you must choose one or the other from the start, and
> once you've made this decision you are stuck with it.
> (not to mention that libraries can use threads and locks behind your back)
>

I don't see spawnl as a viable alternative to fork. I imagine that I,
and others successfully mix threads and multiprocessing on non-win32
platforms just fine, knowing of course that fork() can cause heartburn
if you have global locks that code within the forked processes might
depend on. In fact multiprocessing uses threading internally.

For the win32 implementation, given win32 doesn't implement fork(),
see 
http://svn.python.org/view/python/trunk/Lib/multiprocessing/forking.py?view=markup,
about halfway down. We already have an implementation that spawns a
subprocess and then pushes the required state to the child. The
fundamental need for things to be pickleable *all the time* kinda
makes it annoying to work with.

jesse
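A minimal sketch of the win32-style path Jesse describes, spawning a fresh interpreter and pushing pickled state to it. The `spawn` helper name and its wire format are made up for illustration; multiprocessing's real implementation (in `forking.py`) is considerably more involved.

```python
import pickle
import subprocess
import sys

def spawn(fn, args):
    """Hypothetical helper: run fn(*args) in a freshly started interpreter.

    With no fork(), everything sent to the child must be pickleable,
    which is the constraint Jesse mentions."""
    payload = pickle.dumps((fn.__module__, fn.__name__, args))
    child_code = (
        "import sys, pickle, importlib\n"
        "mod, name, args = pickle.loads(sys.stdin.buffer.read())\n"
        "getattr(importlib.import_module(mod), name)(*args)\n"
    )
    proc = subprocess.Popen([sys.executable, "-c", child_code],
                            stdin=subprocess.PIPE)
    proc.communicate(payload)   # send the pickled state, wait for exit
    return proc.returncode
```

Something like `spawn(print, ("hello",))` works because `print` is importable by name; a lambda or an open file handle would fail to pickle, which is exactly the annoyance being discussed.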


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Paul Du Bois
> On Mon, Feb 1, 2010 at 13:19, "Martin v. Löwis"  wrote:
>> How do you write to a zipfile while others are reading it?

On Mon, Feb 1, 2010 at 1:23 PM, Brett Cannon  wrote:
> By hating concurrency (i.e. I don't have an answer which kills my idea).

The python I use (win32 2.6.2) does not complain if it cannot read
from or write to a .pyc; and thus it handles multiple python processes
trying to create .pyc files at the same time. Is the .zip case really
any different? Since .pyc files are an optimization, it seems natural
and correct that .pyc IO errors pass silently (apologies to Tim).

It's an interesting challenge to write the file in such a way that
it's safe for a reader and writer to co-exist. Like Brett, I
considered an append-only scheme, but one needs to handle the case
where the bytecode for a particular magic number changes. At some
point you'd need to sweep garbage from the file. All solutions seem
unnecessarily complex, and unnecessary since in practice the case
should not come up.

paul


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Antoine Pitrou
Jesse Noller  gmail.com> writes:
> 
> I don't see spawnl as a viable alternative to fork. I imagine that I,
> and others successfully mix threads and multiprocessing on non-win32
> platforms just fine, knowing of course that fork() can cause heartburn
> if you have global locks code within the forked() processes might be
> dependent on. In fact multiprocessing uses threading internally.

Sure. But sometimes global locks are out of your control. They can be e.g. in a
library (Pascal gave the example of the logging module; we may have more of them
in the stdlib).

It's certainly much more practical to give a choice in multithreading than ask
all library writers to eliminate global locks.

> We already have an implementation that spawns a
> subprocess and then pushes the required state to the child.

I know, and that's why it wouldn't be a very controversial change to add an
option to enable this path under Linux, would it?

> The
> fundamental need for things to be pickleable *all the time* kinda
> makes it annoying to work with.

Oh, sure. But if you want to be Windows-compatible you'd better comply with this
requirement anyway.

Regards

Antoine.




Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller
On Mon, Feb 1, 2010 at 5:08 PM, Antoine Pitrou  wrote:

>> I don't see spawnl as a viable alternative to fork. I imagine that I,
>> and others successfully mix threads and multiprocessing on non-win32
>> platforms just fine, knowing of course that fork() can cause heartburn
>> if you have global locks code within the forked() processes might be
>> dependent on. In fact multiprocessing uses threading internally.
>
> Sure. But sometimes global locks are out of your control. They can be
> e.g. in a library (Pascal gave the example of the logging module; we
> may have more of them in the stdlib).

I don't disagree there; but then again, I haven't seen this issue
arise (in my own code)/no bug reports/no test cases that show this to
be a consistent issue. I'm perfectly OK with being wrong, I'm just
leery to tearing out the internals for something else "not forking".

> It's certainly much more practical to give a choice in multithreading than ask
> all library writers to eliminate global locks.

Trust me, I'm all for giving users choice, and I'd never turn down
help - I'm just doubtful/wary on this count. If we wanted a fork-free
non-windows implementation, ala the current win32 implementation, then
we should just clone the windows implementation and work forward from
there.

> I know, and that's why it wouldn't be a very controversial change to add an
> option to enable this path under Linux, would it?

Fair enough; but then you have the additional problem of trying to
explain to users the difference between
multiprocessing.Process(nofork=True) or .Process(fork=False) or
whatever argument we'd decide to use "use this if you have no threads,
global locks, etc, etc".

> Oh, sure. But if you want to be Windows-compatible you'd better
> comply with this requirement anyway.
>

Kinda, sorta. So far, 99% of the multiprocessing code I've seen
doesn't attempt to be compatible with both unixes, and win32 -
commonly, it's either win32 *or* linux. Obviously, being compatible
with both means picking the lowest common denominator.

jesse


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Martin v. Löwis
Antoine Pitrou wrote:
> Jesse Noller  gmail.com> writes:
>> I don't see the need for the change from fork as of yet (for
>> multiprocessing) and I am leery to change the internal implementation
>> and semantics right now, or anytime soon. I'd be interested in seeing
>> the patch, but if the concern is that global threading objects could
>> be left in the state that they're in at the time of the fork(), I
>> think people know that or we can easily document this fact.
> 
> If Pascal provides a patch I think it would really be good to consider it.
> Not being able to mix threads and multiprocessing is a potentially annoying
> wart.

I don't know what spawnl is supposed to do, but it really sounds like
the wrong solution.

Instead, we should aim to make Python fork-safe. If the primary concern
is that locks get inherited, we should change the Python locks so that
they get auto-released on fork (unless otherwise specified on lock
creation). This may sound like an uphill battle, but if there was a
smart and easy solution to the problem, POSIX would be providing it.

Regards,
Martin


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller
On Mon, Feb 1, 2010 at 5:20 PM, "Martin v. Löwis"  wrote:
> Antoine Pitrou wrote:
>> Jesse Noller  gmail.com> writes:
>>> I don't see the need for the change from fork as of yet (for
>>> multiprocessing) and I am leery to change the internal implementation
>>> and semantics right now, or anytime soon. I'd be interested in seeing
>>> the patch, but if the concern is that global threading objects could
>>> be left in the state that they're in at the time of the fork(), I
>>> think people know that or we can easily document this fact.
>>
>> If Pascal provides a patch I think it would really be good to consider it.
>> Not being able to mix threads and multiprocessing is a potentially annoying
>> wart.
>
> I don't know what spawnl is supposed to do, but it really sounds like
> the wrong solution.
>
> Instead, we should aim to make Python fork-safe. If the primary concern
> is that locks get inherited, we should change the Python locks so that
> they get auto-released on fork (unless otherwise specified on lock
> creation). This may sound like an uphill battle, but if there was a
> smart and easy solution to the problem, POSIX would be providing it.
>
> Regards,
> Martin

POSIX threads expose the pthread_atfork function


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Martin v. Löwis
> The python I use (win32 2.6.2) does not complain if it cannot read
> from or write to a .pyc; and thus it handles multiple python processes
> trying to create .pyc files at the same time. Is the .zip case really
> any different? Since .pyc files are an optimization, it seems natural
> and correct that .pyc IO errors pass silently (apologies to Tim).
> 
> It's an interesting challenge to write the file in such a way that
> it's safe for a reader and writer to co-exist. 

I grant you that this may actually work for concurrent readers
(although on Windows, you'll have to pick the file share mode
carefully). The reader would have to be fairly robust, as the central
directory may disappear or get garbled while it is reading.

So what would you do for concurrent writers, then? The current
implementation relies on creat(O_EXCL) to be atomic, so a second
writer would just fail. This is about the only I/O operation that is
guaranteed to be atomic (along with mkdir(2)), so reusing the current
approach doesn't work.

Regards,
Martin
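The atomicity guarantee Martin relies on can be illustrated with os.open. This is only a sketch (`try_create_pyc` is an invented name, not the interpreter's actual code path):

```python
import os

def try_create_pyc(path, data):
    """Create path exclusively; lose any race cleanly instead of corrupting."""
    try:
        # O_CREAT | O_EXCL is atomic: at most one concurrent writer succeeds.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
    except FileExistsError:
        return False  # another process won the race; just skip writing
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return True
```

A second writer observes `FileExistsError` and backs off, which is why concurrent .pyc creation is harmless with one-file-per-module but hard to reproduce inside a shared zip archive.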


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Antoine Pitrou
Le lundi 01 février 2010 à 23:20 +0100, "Martin v. Löwis" a écrit :
> 
> I don't know what spawnl is supposed to do, but it really sounds like
> the wrong solution.

As far as I understand, I think the idea is to use the same mechanism as
under Windows: spawn a new Python interpreter (in a separate process) in
order to run the right Python module with the right command-line
options.

> Instead, we should aim to make Python fork-safe. If the primary concern
> is that locks get inherited, we should change the Python locks so that
> they get auto-released on fork (unless otherwise specified on lock
> creation). This may sound like an uphill battle, but if there was a
> smart and easy solution to the problem, POSIX would be providing it.

We must distinguish between locks owned by the thread which survived the
fork(), and locks owned by other threads. I guess it's possible if we
keep track of the thread id which acquired the lock, and if we give
_whatever_after_fork() the thread id of the thread which initiated the
fork() in the parent process. Do you think it would be enough to
guarantee correctness?





Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Reid Kleckner
On Mon, Feb 1, 2010 at 5:18 PM, Jesse Noller  wrote:
> I don't disagree there; but then again, I haven't seen this issue
> arise (in my own code)/no bug reports/no test cases that show this to
> be a consistent issue. I'm perfectly OK with being wrong, I'm just
> leery to tearing out the internals for something else "not forking".

I'd appreciate it.  It made my life a lot harder when trying to move
JIT compilation to a background thread, for exactly the reasons we've
been talking about.  All the locks in the queue can be left in an
undefined state.   I solved my problem by digging into the posix
module and inserting the code I needed to stop the background thread.

Another problem with forking from a threaded Python application is
that you leak all the references held by the other thread's stack.
This isn't a problem if you're planning on exec'ing soon, but it's
something we don't usually think about.

It would be nice if threads + multiprocessing worked out of the box
without people having to think about it.  Using threads and fork
without exec is evil.

Reid


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller



On Feb 1, 2010, at 5:35 PM, Reid Kleckner  wrote:
> On Mon, Feb 1, 2010 at 5:18 PM, Jesse Noller  wrote:
>> I don't disagree there; but then again, I haven't seen this issue
>> arise (in my own code)/no bug reports/no test cases that show this to
>> be a consistent issue. I'm perfectly OK with being wrong, I'm just
>> leery to tearing out the internals for something else "not forking".
>
> I'd appreciate it.  It made my life a lot harder when trying to move
> JIT compilation to a background thread, for exactly the reasons we've
> been talking about.  All the locks in the queue can be left in an
> undefined state.  I solved my problem by digging into the posix
> module and inserting the code I needed to stop the background thread.
>
> Another problem with forking from a threaded Python application is
> that you leak all the references held by the other thread's stack.
> This isn't a problem if you're planning on exec'ing soon, but it's
> something we don't usually think about.
>
> It would be nice if threads + multiprocessing worked out of the box
> without people having to think about it.  Using threads and fork
> without exec is evil.
>
> Reid


Your reasonable argument is making it difficult for me to be
irrational about this.

This begs the question - assuming a patch that clones the behavior of
win32 for multiprocessing, would the default continue to be forking
behavior, or the new?


Jesse


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Martin v. Löwis
>> Instead, we should aim to make Python fork-safe. If the primary concern
>> is that locks get inherited, we should change the Python locks so that
>> they get auto-released on fork (unless otherwise specified on lock
>> creation). This may sound like an uphill battle, but if there was a
>> smart and easy solution to the problem, POSIX would be providing it.
>>
>> Regards,
>> Martin
> 
> POSIX threads expose the pthread_atfork function

Yes - that's a solution, but one that I would call neither smart nor
simple. It requires you to register all resources with atfork that you
want to see released.

I was puzzled as to how precisely to use atfork, but the OpenGroup atfork
documentation actually explains it: you need to acquire all mutexes in
the prepare handler, and then release them in the parent and child
handlers. This, again, is tedious; ISTM it can also cause deadlocks if
the prepare handler acquires locks in the wrong order.

Regards,
Martin


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Reid Kleckner
On Mon, Feb 1, 2010 at 5:20 PM, "Martin v. Löwis"  wrote:
> Instead, we should aim to make Python fork-safe. If the primary concern
> is that locks get inherited, we should change the Python locks so that
> they get auto-released on fork (unless otherwise specified on lock
> creation). This may sound like an uphill battle, but if there was a
> smart and easy solution to the problem, POSIX would be providing it.

The "right" (as if you can actually use fork and threads at the same
time correctly) way to do this is to acquire all locks before the
fork, and release them after the fork.  The reason is that if you
don't, whatever data the locks guarded will be in an undefined state,
because the thread that used to own the lock was in the middle of
modifying it.

POSIX does provide pthread_atfork, but it's not quite enough.  It's
basically good enough for things like libc's malloc or other global
locks held for a short duration buried in libraries.  No one will ever
try to fork while doing an allocation, for example.  The problem is
that if you have a complicated set of locks that must be acquired in a
certain order, you can express that to pthread_atfork by giving it the
callbacks in the right order, but it's hard.

However, I think for Python, it would be good enough to have an
at_fork registration mechanism so people can acquire and release locks
at fork.  If we assume that most library locks are like malloc, and
won't actually be held while forking, it's good enough.

Reid


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Martin v. Löwis
> We must distinguish between locks owned by the thread which survived the
> fork(), and locks owned by other threads. I guess it's possible if we
> keep track of the thread id which acquired the lock, and if we give
> _whatever_after_fork() the thread id of the thread which initiated the
> fork() in the parent process. Do you think it would be enough to
> guarantee correctness?

Interestingly, the POSIX pthread_atfork documentation defines how you
are supposed to do that: create an atfork handler set, and acquire all
mutexes in the prepare handler. Then fork, and have the parent and child
handlers release the locks. Of course, unless your locks are recursive,
you still have to know what locks you are already holding.

Regards,
Martin




Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Reid Kleckner
On Mon, Feb 1, 2010 at 5:48 PM, Jesse Noller  wrote:
> Your reasonable argument is making it difficult for me to be irrational
> about this.

No problem.  :)

> This begs the question - assuming a patch that clones the behavior of win32
> for multiprocessing, would the default continue to be forking behavior, or
> the new?

Pros of forking:
- probably faster (control doesn't start back at Py_Main)
- more shared memory (but not that much because of refcounts)
- objects sent to child processes don't have to be pickleable

Cons:
- leaks memory with threads
- can lead to deadlocks or races with threads

I think the fork+exec or spawnl version is probably the better default
because it's safer.  If people can't be bothered to make their objects
pickleable or really want the old behavior, it can be left as an
option.

Reid


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Reid Kleckner wrote:
> On Mon, Feb 1, 2010 at 5:18 PM, Jesse Noller  wrote:
>> I don't disagree there; but then again, I haven't seen this issue
>> arise (in my own code)/no bug reports/no test cases that show this to
>> be a consistent issue. I'm perfectly OK with being wrong, I'm just
>> leery to tearing out the internals for something else "not forking".
> 
> I'd appreciate it.  It made my life a lot harder when trying to move
> JIT compilation to a background thread, for exactly the reasons we've
> been talking about.  All the locks in the queue can be left in an
> undefined state.   I solved my problem by digging into the posix
> module and inserting the code I needed to stop the background thread.
> 
> Another problem with forking from a threaded Python application is
> that you leak all the references held by the other thread's stack.
> This isn't a problem if you're planning on exec'ing soon, but it's
> something we don't usually think about.
> 
> It would be nice if threads + multiprocessing worked out of the box
> without people having to think about it.  Using threads and fork
> without exec is evil.

Yup, but that's true for *any* POSIXy environment, not just Python.  The
only sane non-exec mixture is to have a single-thread parent fork, and
restrict spawning threads to the children.


Tres.
- --
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktnXmUACgkQ+gerLs4ltQ7HvwCgibnpYbG2hSZUq7BbtUtQuXRu
yJUAn19nh9yQ0hlBxa7tc3VviBbZ2sVn
=VjKm
-END PGP SIGNATURE-
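The ordering Tres describes (a single-threaded parent forks all workers first; only the children ever spawn threads) looks like this in outline. A POSIX-only sketch with invented names:

```python
import os
import threading

def run_workers(n=2):
    """Fork every worker while the parent is still single-threaded."""
    pids = []
    for _ in range(n):
        pid = os.fork()            # safe: no threads exist yet
        if pid == 0:
            t = threading.Thread(target=lambda: None)
            t.start()              # threads live only in the child
            t.join()
            os._exit(0)
        pids.append(pid)
    # Collect exit statuses; 0 means a clean exit.
    return [os.waitpid(pid, 0)[1] for pid in pids]
```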



Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Antoine Pitrou
Le lundi 01 février 2010 à 23:58 +0100, "Martin v. Löwis" a écrit :
> 
> Interestingly, the POSIX pthread_atfork documentation defines how you
> are supposed to do that: create an atfork handler set, and acquire all
> mutexes in the prepare handler. Then fork, and have the parent and child
> handlers release the locks. Of course, unless your locks are recursive,
> you still have to know what locks you are already holding.

So, if we restrict ourselves to Python-level locks (thread.Lock and
thread.RLock), I guess we could just chain them in a doubly-linked list
and add an internal _PyThread_AfterFork() function?





Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Michael Foord

On 01/02/2010 23:03, Reid Kleckner wrote:
> On Mon, Feb 1, 2010 at 5:48 PM, Jesse Noller  wrote:
>> Your reasonable argument is making it difficult for me to be irrational
>> about this.
>
> No problem.  :)
>
>> This begs the question - assuming a patch that clones the behavior of win32
>> for multiprocessing, would the default continue to be forking behavior, or
>> the new?
>
> Pros of forking:
> - probably faster (control doesn't start back at Py_Main)
> - more shared memory (but not that much because of refcounts)
> - objects sent to child processes don't have to be pickleable
>
> Cons:
> - leaks memory with threads
> - can lead to deadlocks or races with threads
>
> I think the fork+exec or spawnl version is probably the better default
> because it's safer.  If people can't be bothered to make their objects
> pickleable or really want the old behavior, it can be left as an
> option.
>
> Reid

Wouldn't changing the default be backwards incompatible?

Michael



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog





Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Antoine Pitrou
Tres Seaver  palladion.com> writes:
> 
> Yup, but that's true for *any* POSIXy envirnnment, not just Python.  The
> only sane non-exec mixture is to have a single-thread parent fork, and
> restrict spawning threads to the children.

The problem is that we're advocating multiprocessing as the solution for
multiprocessor scalability. We can't just say "oh and, by the way, you
shouldn't use it with several threads, hope you don't mind".


Antoine.




Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Jesse Noller



On Feb 1, 2010, at 6:25 PM, Michael Foord  wrote:
> On 01/02/2010 23:03, Reid Kleckner wrote:
>> On Mon, Feb 1, 2010 at 5:48 PM, Jesse Noller  wrote:
>>> Your reasonable argument is making it difficult for me to be irrational
>>> about this.
>>
>> No problem.  :)
>>
>>> This begs the question - assuming a patch that clones the behavior of win32
>>> for multiprocessing, would the default continue to be forking behavior, or
>>> the new?
>>
>> Pros of forking:
>> - probably faster (control doesn't start back at Py_Main)
>> - more shared memory (but not that much because of refcounts)
>> - objects sent to child processes don't have to be pickleable
>>
>> Cons:
>> - leaks memory with threads
>> - can lead to deadlocks or races with threads
>>
>> I think the fork+exec or spawnl version is probably the better default
>> because it's safer.  If people can't be bothered to make their objects
>> pickleable or really want the old behavior, it can be left as an
>> option.
>
> Wouldn't changing the default be backwards incompatible?
>
> Michael


Yes, it would, which is why it would have to be a switch for 2.x, and
could only possibly be changed/broken for 3.x.

Note, this is only off the top of my head; if Pascal is still game to
do a patch (skipping spawnl, and going with something more akin to the
current windows implementation) and it comes out as agreeable to all
parties, the exact integration details can be worked out then.

Part of me wonders though - this is a problem with python, fork and
threads in general. Things in the community such as gunicorn, which are
"bringing forking back", are going to slam into this too.


Jesse


Re: [Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Paul Du Bois
>> The python I use (win32 2.6.2) does not complain if it cannot read
>> from or write to a .pyc; and thus it handles multiple python processes
>> trying to create .pyc files at the same time. Is the .zip case really
>> any different?

[ snip discussion of difficulty of writing a sharing-safe update ]

On Mon, Feb 1, 2010 at 2:28 PM, "Martin v. Löwis"  wrote:
> So what would you do for concurrent writers, then? The current
> implementation relies on creat(O_EXCL) to be atomic, so a second
> writer would just fail. This is but the only IO operation that is
> guaranteed to be atomic (along with mkdir(2)), so reusing the current
> approach doesn't work.

Sorry, I'm guilty of having assumed that the POSIX API has an
operation analogous to win32 CreateFile(GENERIC_WRITE, 0 /* ie,
"FILE_SHARE_NONE"*/).

If shared-reader/single-writer semantics are not available, the only
other possibility I can think of is to avoid opening the .pyc for
write. To write a .pyc one would read it, write and flush updates to a
temp file, and rename(). This isn't atomic, but given the invariant
that the .pyc always contains consistent data, the new file will also
only contain consistent data. Races manifest as updates getting lost.

One obvious drawback is that the .pyc inode would change on every update.

paul
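Paul's write-then-rename scheme sketched below (`write_pyc` is an invented name). os.rename is atomic within one POSIX filesystem, so readers always see a complete file and a lost race simply loses one update:

```python
import os
import tempfile

def write_pyc(path, data):
    """Publish a complete file or nothing: readers never see partial data."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())   # data is on disk before it becomes visible
        os.rename(tmp, path)       # atomic replace; this changes the inode
    except BaseException:
        os.unlink(tmp)
        raise
```

On Windows, plain os.rename fails when the destination exists, so a portable version would need a replace-style rename there.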


Re: [Python-Dev] Forking and Multithreading - enemy brothers

2010-02-01 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Antoine Pitrou wrote:
> Tres Seaver  palladion.com> writes:
>> Yup, but that's true for *any* POSIXy environment, not just Python.  The
>> only sane non-exec mixture is to have a single-thread parent fork, and
>> restrict spawning threads to the children.
> 
> The problem is that we're advocating multiprocessing as the solution for
> multiprocessor scalability. We can't just say "oh and, by the way, you
> shouldn't use it with several threads, hope you don't mind".

I think it is perfectly reasonable to say, "Oh, by the way, *don't*
spawn any threads before calling fork(), or else exec() a new process
immediately":  wishing won't make the underlying realities any different.

Note that the "we" in your sentence is not anything like the "quod
semper quod ubique quod ab omnibus" criterion for accepting dogma:
mutliprocessing is a tool, and needs to be used according to its nature,
just as with threading.


Tres.
- --
===
Tres Seaver  +1 540-429-0999  [email protected]
Palladion Software   "Excellence by Design"http://palladion.com
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAktniWgACgkQ+gerLs4ltQ7c6wCfZ9ohkbehfU5fbOfwH+l7jVX0
6WwAn1ZywfDsIJCB0KS0/DPwaiPq1LNJ
=86yK
-END PGP SIGNATURE-



[Python-Dev] PEP 3147: PYC Repository Directories

2010-02-01 Thread Pablo Mouzo
On Mon, Feb 1, 2010 at 11:56 PM, Pablo Mouzo  wrote:
> On Mon, Feb 1, 2010 at 10:23 PM, Paul Du Bois  wrote:
> [...]
>> Sorry, I'm guilty of having assumed that the POSIX API has an
>> operation analogous to win32 CreateFile(GENERIC_WRITE, 0 /* ie,
>> "FILE_SHARE_NONE"*/).
>>
>> If shared-reader/single-writer semantics are not available, the only
>> other possibility I can think of is to avoid opening the .pyc for
> [...]

Actually, there are (sadly) many ways to do that. Projects like sqlite support
shared-reader/single-writer on many platforms, so that code could be reused.
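
For reference, POSIX does offer advisory shared/exclusive locks; a sketch of
shared-reader/single-writer access to a .pyc using fcntl.flock (my
illustration, not code from the thread; unlike the Win32 CreateFile sharing
modes quoted above, these locks are only cooperative):

```python
import fcntl

def read_pyc(path):
    with open(path, "rb") as f:
        fcntl.flock(f, fcntl.LOCK_SH)   # shared: many readers at once
        try:
            return f.read()
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

def write_pyc(path, data):
    with open(path, "wb") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # exclusive: blocks readers and writers
        try:
            f.write(data)
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

The catch is that advisory locks protect only against processes that also
take the lock, which is why cross-platform projects like sqlite carry their
own locking layer.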

Pablo


Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython

2010-02-01 Thread Cesare Di Mauro
2010/2/1 Collin Winter 

> I believe these VMs would have little overlap. I cannot imagine that
> Unladen Swallow's needs have much in common with Stackless's, or with
> those of a hypothetical register machine to replace the current stack
> machine.
>
> Let's consider that last example in more detail: a register machine
> would require completely different bytecode. This would require
> replacing the bytecode compiler, the peephole optimizer, and the
> bytecode eval loop. The frame object would need to be changed to hold
> the registers and a new blockstack design; the code object would have
> to potentially hold a new bytecode layout.
>
> I suppose making all this pluggable would be possible, but I don't see
> the point. This kind of experimentation is ideal for a branch: go off,
> test your idea, report your findings, merge back. Let the branch be
> long-lived, if need be. The Mercurial migration will make all this
> easier.
>
> > Getting this right would certainly require a major effort, but it
> > would also reduce the need to have several branches of C-based
> > Python implementations.
>
> If such a restrictive plugin-based scheme had been available when we
> began Unladen Swallow, I do not doubt that we would have ignored it
> entirely. I do not like the idea of artificially tying the hands of
> people trying to make CPython faster. I do not see any part of Unladen
> Swallow that would have been made easier by such a scheme. If
> anything, it would have made our project more difficult.
>
>  Collin Winter


I completely agree. Working with wpython I have changed a lot of code,
ranging from the ASDL grammar to the eval loop, including some library
modules and tests (primarily the Python-based parser and the disassembly
tools; the module finder required work, too).
I haven't changed the Python objects or the object model (except in the
alpha release; then I dropped this "invasive" change), but I've added some
helper functions in object.c, dict.c, etc.

A pluggable VM isn't feasible because we would be talking about a brand
new CPython (standard library included) to be chosen each time.

If approved, this model would greatly limit the optimizations that can be
implemented to make CPython run faster.
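
To make the stack-vs-register distinction concrete, here is a toy sketch
(mine, and nothing like real CPython or wpython bytecode) evaluating
a + b * c under both models:

```python
def run_stack(code, env):
    # Stack machine: operands flow through an implicit evaluation stack.
    stack = []
    for op, arg in code:
        if op == "LOAD":
            stack.append(env[arg])
        elif op == "MUL":
            b = stack.pop(); a = stack.pop(); stack.append(a * b)
        elif op == "ADD":
            b = stack.pop(); a = stack.pop(); stack.append(a + b)
    return stack.pop()

def run_register(code, env, nregs):
    # Register machine: every instruction names its operands explicitly,
    # so the frame must hold a register file instead of a stack.
    regs = [0] * nregs
    for op, dst, x, y in code:
        if op == "LOAD":
            regs[dst] = env[x]
        elif op == "MUL":
            regs[dst] = regs[x] * regs[y]
        elif op == "ADD":
            regs[dst] = regs[x] + regs[y]
    return regs[0]

env = {"a": 2, "b": 3, "c": 4}
stack_code = [("LOAD", "a"), ("LOAD", "b"), ("LOAD", "c"),
              ("MUL", None), ("ADD", None)]
reg_code = [("LOAD", 1, "a", None), ("LOAD", 2, "b", None),
            ("LOAD", 3, "c", None), ("MUL", 2, 2, 3), ("ADD", 0, 1, 2)]
print(run_stack(stack_code, env), run_register(reg_code, env, 4))
```

Even in this toy, the instruction formats, the frame layout, and the
compiler's output are all different, which is Collin's point about how much
would have to be swapped out at once.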

Cesare Di Mauro


Re: [Python-Dev] subprocess docs patch

2010-02-01 Thread Chris Rebert
On Mon, Feb 1, 2010 at 12:14 AM, "Martin v. Löwis"  wrote:
>> Any help you could provide would be appreciated.
>
> Please use unified diffs in the future.

Duly noted.

> I'm -0 on this patch; it still has the negative, cautionary-patronizing
> tone ("Do not", "can be tricky", "be mindful"),

Thanks to yours and other feedback, I've tried to address this in a
newer version of the patch.

> as if readers are unable
> to grasp the description that they just read (and indeed, in the patch,
> you claim that readers *are* unable to understand how command lines
> work).

I don't think I made any statement quite that blunt; however, as the
c.l.p threads that drove me to write this patch show, there are indeed
some people who don't (correctly) understand the details of command
line tokenization.
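
As an aside, the pitfall those c.l.p threads keep hitting looks roughly
like this (my illustration, not text from the patch):

```python
import shlex
import subprocess

cmd = 'echo "hello world"'

# Wrong without shell=True: the entire string is taken as the program
# name, so this raises an error on POSIX rather than running echo.
# subprocess.call(cmd)

# Right: split the command the way a shell would...
args = shlex.split(cmd)          # ['echo', 'hello world']
subprocess.call(args)

# ...or build the argument list yourself; with a list, no shell quoting
# rules apply and 'hello world' stays a single argument.
subprocess.call(["echo", "hello world"])
```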

Thanks for responding!

Cheers,
Chris