Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Collin Winter wrote:
> On Mon, Feb 1, 2010 at 11:17 AM, M.-A. Lemburg wrote:
>> Collin Winter wrote:
>>> I think this idea underestimates a) how deeply the current CPython VM
>>> is intertwined with the rest of the implementation, and b) the nature
>>> of the changes required by these separate VMs. For example, Unladen
>>> Swallow adds fields to the C-level structs for dicts, code objects and
>>> frame objects; how would those changes be pluggable? Stackless
>>> requires so many modifications that it is effectively a fork; how
>>> would those changes be pluggable?
>>
>> They wouldn't be pluggable. Such changes would have to be made
>> in a more general way in order to serve more than just one VM.
>
> I believe these VMs would have little overlap. I cannot imagine that
> Unladen Swallow's needs have much in common with Stackless's, or with
> those of a hypothetical register machine to replace the current stack
> machine.
>
> Let's consider that last example in more detail: a register machine
> would require completely different bytecode. This would require
> replacing the bytecode compiler, the peephole optimizer, and the
> bytecode eval loop. The frame object would need to be changed to hold
> the registers and a new blockstack design; the code object would
> potentially have to hold a new bytecode layout.
>
> I suppose making all this pluggable would be possible, but I don't see
> the point. This kind of experimentation is ideal for a branch: go off,
> test your idea, report your findings, merge back. Let the branch be
> long-lived, if need be. The Mercurial migration will make all this
> easier.
>
>> Getting this right would certainly require a major effort, but it
>> would also reduce the need to have several branches of C-based
>> Python implementations.
>
> If such a restrictive plugin-based scheme had been available when we
> began Unladen Swallow, I do not doubt that we would have ignored it
> entirely. I do not like the idea of artificially tying the hands of
> people trying to make CPython faster. I do not see any part of Unladen
> Swallow that would have been made easier by such a scheme. If
> anything, it would have made our project more difficult.

I don't think that it has to be restrictive - much to the contrary: it would provide a consistent API to those CPython internals and also clarify the separation between the various parts, something which currently does not exist in CPython.

Note that it may be easier for you (and others) to just take CPython and patch it as necessary. However, this doesn't relieve you of the needed maintenance - which, I presume, is one of the reasons why you are suggesting to merge U-S back into CPython ;-)

The same problem exists for all other branches, such as e.g. Stackless. Now, why should we merge in your particular branch and make it harder for those other teams? Instead of having 2-3 teams maintain complete branches, it would be more efficient to just have them take care of their particular VM implementation.

Furthermore, we wouldn't need to decide which VM variant to merge into the core, since all of them would be equally usable. The idea is based on a big-picture perspective and focuses on long-term benefits for more than just one team of VM implementers.

BTW: I also doubt that Mercurial will make any of this easier. It makes creating branches easier for non-committers, but the problem of having to maintain the branches remains.

--
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source (#1, Feb 02 2010)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/

::: Try our new mxODBC.Connect Python Database Interface for free !

eGenix.com Software, Skills and Services GmbH Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
Registered at Amtsgericht Duesseldorf: HRB 46611
http://www.egenix.com/company/contact/

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Reworking the GIL
I recently saw the video of David Beazley's presentation on how poorly the old GIL implementation handled certain cases and thought "I can fix that!". Unfortunately for me, someone else has beaten me to it, and done a somewhat better job than I would've, because I wasn't thinking of doing anything about changing how ticks worked.

Antoine Pitrou wrote:
> - priority requests, which is an option for a thread requesting the
> GIL to be scheduled as soon as possible, and forcibly (rather than any
> other threads). This is meant to be used by GIL-releasing methods such
> as read() on files and sockets. The scheme, again, is very simple:
> when a priority request is done by a thread, the GIL is released as
> soon as possible by the thread holding it (including in the eval
> loop), and then the thread making the priority request is forcibly
> scheduled (by making all other GIL-awaiting threads wait in the
> meantime).

I notice that Guido vetoed this idea, but just in case it should come up again, I have some thoughts that likely have already occurred to people, but that I didn't notice on the list.

I would be very careful with this. The goal should be to avoid re-implementing a scheduler of any kind in Python. This comes perilously close. In my opinion this mechanism should _only_ be used for signals. I think implementing this for IO is fraught with peril and would almost certainly require a scheduler to be implemented in Python in order to work better than not having it at all while also avoiding pathological cases.

Also, has anybody thought of starting work on making a set of Py* calls that can be made without having the GIL acquired?

I don't suppose it will ever be ported back to Python 2.x? It doesn't look like the whole GIL concept has changed much between Python 2.x and 3.x, so I expect back-porting it would be pretty easy.

This discussion is, of course, rather old. But I browsed through all the stuff people said in reply and I (perhaps foolishly) thought my two cents might be helpful.

--
A word is nothing more or less than the series of historical connotations given to it. That's HOW we derive meaning, and to claim that there is an arbitrary meaning of words above and beyond the way people use them is a blatant misunderstanding of the nature of language.
-- Anonymous blogger

-- Eric Hopper ([email protected] http://www.omnifarious.org/~hopper)
[Python-Dev] Mail Archives
Hi,

Currently the Python-Dev archives at http://mail.python.org/pipermail/python-dev/ don't appear to have been updated for the last week or so. Is that a known issue?

Thanks,
Ben

Ben Young - Senior Software Engineer
SunGard - Enterprise House, Vision Park, Histon, Cambridge, CB24 9ZR
Tel +44 1223 266042 - Main +44 1223 266100 - http://www.sungard.com/

CONFIDENTIALITY: This email (including any attachments) may contain confidential, proprietary and privileged information, and unauthorized disclosure or use is prohibited. If you received this email in error, please notify the sender and delete this email from your system. Thank you.
Re: [Python-Dev] Mail Archives
On Tue, Feb 2, 2010 at 1:50 AM, wrote:
> Hi,
>
> Currently the Python-Dev archives at
> http://mail.python.org/pipermail/python-dev/ don't appear to have been
> updated for the last week or so. Is that a known issue?

I think so: http://mail.python.org/pipermail/python-dev/2010-January/097388.htm

Cheers,
Chris
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Although I would be in favor of an atfork callback registration system (similar to atexit), it seems there is no way to solve the fork() problem automatically with this. Any attempt to acquire/release locks automatically will lead to deadlocks, as it is necessary to know the exact program workflow to take locks in the right order.

I guess spawnl semantics (i.e., like win32's CreateProcess()) can't become the default multiprocessing behaviour, as too many programs implicitly rely on the whole sharing of data under unix (and py3k itself is maybe becoming a little too mature for new compatibility breaks); but well, as long as there are options to enforce this behaviour, it should be fine for everyone.

I'm quite busy with other libraries at the moment, but I'll study the integration of spawnl into the multiprocessing package during the coming weeks. B-)

Regards,
Pascal
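For the record, the kind of registry being discussed could look roughly like the sketch below. The names (`register`, the wrapped `fork`) are hypothetical - this is not an existing stdlib API - and the handler triple mirrors pthread_atfork()'s prepare/parent/child convention:

```python
import os

# Hypothetical atexit-style registry for fork handlers (names invented
# for illustration; this is not a real stdlib module).
_prepare, _parent, _child = [], [], []

def register(prepare=None, parent=None, child=None):
    """Register handlers to run around a fork, like pthread_atfork()."""
    if prepare is not None:
        _prepare.append(prepare)
    if parent is not None:
        _parent.append(parent)
    if child is not None:
        _child.append(child)

def fork():
    """os.fork() wrapper that invokes the registered handlers."""
    for f in reversed(_prepare):   # pthread_atfork runs prepare handlers in reverse
        f()
    pid = os.fork()
    for f in (_parent if pid else _child):
        f()
    return pid
```

As the message above points out, this only sequences the callbacks; it cannot know in which order application locks must be taken, so deadlock avoidance is still left entirely to the caller.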
Re: [Python-Dev] Reworking the GIL
Eric Hopper wrote:
> I don't suppose it will ever be ported back to Python 2.x? It doesn't
> look like the whole GIL concept has changed much between Python 2.x and
> 3.x so I expect back-porting it would be pretty easy.

There was a patch, but it has been rejected: http://bugs.python.org/issue7753

Neil
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
M.-A. Lemburg wrote:
> BTW: I also doubt that Mercurial will make any of this easier.
> It makes creating branches easier for non-committers, but the
> problem of having to maintain the branches remains.

It greatly simplifies the process of syncing the branch with the main line of development, so yes, it should help with branch maintenance (svnmerge is a pale shadow of what a true DVCS can handle). That aspect is one of the DVCS selling points that also applies to core development.

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Pascal Chambon wrote:
> I guess spawnl semantics (i.e., like win32's CreateProcess()) can't become
> the default multiprocessing behaviour, as too many programs implicitly
> rely on the whole sharing of data under unix (and py3k itself is maybe
> becoming a little too mature for new compatibility breaks); but well, as
> long as there are options to enforce this behaviour, it should be fine
> for everyone.

It would also make it much easier to write cross-platform multiprocessing code (by always using the non-forking semantics even on fork-capable systems).

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
Re: [Python-Dev] Mail Archives
Chris Rebert wrote:
> On Tue, Feb 2, 2010 at 1:50 AM, wrote:
>> Hi,
>>
>> Currently the Python-Dev archives at
>> http://mail.python.org/pipermail/python-dev/ don't appear to have been
>> updated for the last week or so. Is that a known issue?
>
> I think so:
> http://mail.python.org/pipermail/python-dev/2010-January/097388.htm

Lost the trailing 'l' on that link: http://mail.python.org/pipermail/python-dev/2010-January/097388.html

(I actually thought it was a very Zen link for a moment there...)

Cheers,
Nick.

--
Nick Coghlan | [email protected] | Brisbane, Australia
[Python-Dev] Absolute imports in Python 2.x?
What are the current plans for PEP 328 (the absolute imports PEP) in Python 2.x? The PEP says:

"""In Python 2.6, any import statement that results in an intra-package import will raise DeprecationWarning (this also applies to from <> import that fails to use the relative import syntax). In Python 2.7, import will always be an absolute import (and the __future__ directive will no longer be needed)."""

As far as I can tell, there's no DeprecationWarning in 2.6. I'm wondering whether this decision was revised at some point (I wasn't able to find anything by searching the archives), or whether the DeprecationWarning just got forgotten about. If the latter, should the DeprecationWarning be introduced in 2.7?

Mark
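A quick self-contained illustration of what the PEP changes (the scratch package built here is purely for demonstration): under absolute-import semantics a bare `import string` inside a package always gets the stdlib module, and a sibling module that shadows the name must be imported with the explicit relative form.

```python
import os
import sys
import tempfile
import textwrap

# Build a throwaway package whose 'string.py' shadows the stdlib module.
root = tempfile.mkdtemp()
pkg = os.path.join(root, 'pkg')
os.mkdir(pkg)
open(os.path.join(pkg, '__init__.py'), 'w').close()
with open(os.path.join(pkg, 'string.py'), 'w') as f:
    f.write('SHADOW = True\n')
with open(os.path.join(pkg, 'util.py'), 'w') as f:
    f.write(textwrap.dedent('''\
        from __future__ import absolute_import  # no-op on 3.x, opt-in on 2.x
        import string                           # absolute: the stdlib module
        from . import string as local_string    # explicit relative: the sibling
    '''))

sys.path.insert(0, root)
import pkg.util
print(pkg.util.string.__name__)      # 'string' (the stdlib module)
print(pkg.util.local_string.SHADOW)  # True (the package-local module)
```

Under the old implicit-relative rules, the bare `import string` in `pkg/util.py` would have silently picked up `pkg/string.py` instead.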
Re: [Python-Dev] Reworking the GIL
Hello Eric,

> I notice that Guido vetoed this idea, but just in case it should come up
> again, I have some thoughts that likely have already occurred to people,
> but I didn't notice on the list.

Yes, in hindsight I think Guido was right. (No, I'm not trying to make myself agreeable :-))

> Also, has anybody thought of starting work on making a set of Py* calls
> that can be made without having the GIL acquired?

In the current state of affairs it would be quite difficult. Each memory allocation can trigger a garbage collection, which itself needs the GIL. And Python is full of memory allocations.

Regards,
Antoine.
Re: [Python-Dev] Forking and Multithreading - enemy brothers
>> I guess spawnl semantics (i.e., like win32's CreateProcess()) can't
>> become the default multiprocessing behaviour...

Nick> It would also make it much easier to write cross-platform
Nick> multiprocessing code (by always using the non-forking semantics
Nick> even on fork-capable systems)

I don't understand. On Unix-y systems isn't spawn* layered on top of fork/exec?

One thing that nobody seems to have pointed out is that the subprocess module was originally written as a multi-processing module with an API very similar to the threading module. That is, it was intended to be used as an alternative to threading. I would find it odd to use both together, and in particular, to create threads first, then fork. If you were going to combine both threading and multiprocessing, it seems much more logical to me to fork first (coarse-grained subdivision), then create threads (finer-grained threads of control) in those processes.

Skip
Re: [Python-Dev] Forking and Multithreading - enemy brothers
skip at pobox.com writes:
> I don't understand. On Unix-y systems isn't spawn* layered on top of
> fork/exec?

The point is that exec() relinquishes all the existing resources, so the initial fork() becomes an implementation detail (IIUC).

> If you were going to
> combine both threading and multiprocessing it seems much more logical to me
> to fork first (coarse-grained subdivision) then create threads (finer
> grained threads of control) in those processes.

It depends what you're doing. For example, if you're running an event loop in your main thread, you will want to first spawn a worker thread before doing any kind of blocking communication with any child processes you spawn. There's also, as already stated, the situation where threads are launched behind your back by a third-party library.

Regards,
Antoine.
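On POSIX, spawn-style process creation is indeed a fork immediately followed by an exec; a rough sketch of why the fork is only an implementation detail - the exec replaces the child's (possibly inconsistent) post-fork image with a fresh one:

```python
import os
import sys

pid = os.fork()
if pid == 0:
    # exec replaces this process image entirely: inherited locks, thread
    # state and Python data structures all vanish, so the hazards of a
    # bare fork() do not apply.
    os.execv(sys.executable,
             [sys.executable, '-c', 'print("fresh interpreter")'])
    os._exit(127)  # only reached if execv itself fails

_, status = os.waitpid(pid, 0)
print(os.WEXITSTATUS(status))  # 0: the child ran to completion
```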
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Tres Seaver writes:
> Note that the "we" in your sentence is not anything like the "quod
> semper, quod ubique, quod ab omnibus" criterion for accepting dogma:
> multiprocessing is a tool, and needs to be used according to its nature,
> just as with threading.

I don't remember enough Latin to understand what it means :-S

The word "dogma" is a good one in this context, however. "We" ( ;-)) have accepted and promoted the dogma that multiprocessing is the solution to parallelism in the face of the GIL. While it needn't be applicable in any and every situation, we should make it so that it is applicable often enough.
Re: [Python-Dev] Reworking the GIL
Antoine Pitrou wrote:
> Also, the MSDN doc (*) says timeBeginPeriod() can have a detrimental
> effect on system performance; I don't know how much of it is true.
>
> (*) http://msdn.microsoft.com/en-us/library/dd757624(VS.85).aspx

Indeed it does. This is ancient, dusty wisdom, from the days of 50 MHz computers, passed down from OS generation to generation. As a (former) professional Windows developer I can assure you it was no longer a concern even ten years ago; its performance effect was so small as to be unmeasurable. You may call timeBeginPeriod(1) with impunity.

/larry/
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Antoine Pitrou wrote:
> Tres Seaver writes:
>> Note that the "we" in your sentence is not anything like the "quod
>> semper, quod ubique, quod ab omnibus" criterion for accepting dogma:
>> multiprocessing is a tool, and needs to be used according to its nature,
>> just as with threading.
>
> I don't remember enough Latin to understand what it means :-S

"[What has been believed] always, everywhere, and by everyone": it was the classic theologian's test for orthodoxy, because it prevents innovating dogma.

> The word "dogma" is a good one in this context, however. "We" ( ;-)) have
> accepted and promoted the dogma that multiprocessing is the solution to
> parallelism in the face of the GIL. While it needn't be applicable in any
> and every situation, we should make it so that it is applicable often
> enough.

Again, wishing won't make it so: there is no sane way to mix threading and fork-without-exec except by keeping the parent process single-threaded until after any fork() calls. Some applications may seem to work when violating this rule, but their developers are doomed to hair loss over time.

Tres.

--
=== Tres Seaver +1 540-429-0999 [email protected]
Palladion Software "Excellence by Design" http://palladion.com
Re: [Python-Dev] Absolute imports in Python 2.x?
Mark Dickinson wrote:
> What are the current plans for PEP 328 (the absolute imports PEP) in
> Python 2.x? The PEP says:
>
> """In Python 2.6, any import statement that results in an intra-package
> import will raise DeprecationWarning (this also applies to from <> import
> that fails to use the relative import syntax). In Python 2.7, import will
> always be an absolute import (and the __future__ directive will no longer
> be needed)."""
>
> As far as I can tell, there's no DeprecationWarning in 2.6. I'm wondering
> whether this decision was revised at some point (I wasn't able to find
> anything by searching the archives), or whether the DeprecationWarning
> just got forgotten about. If the latter, should the DeprecationWarning be
> introduced in 2.7?

Not sure about the decision one way or the other. But if there's not going to be a 2.8, and if DeprecationWarnings are off by default anyway, I'm not sure it makes any sense to add a DeprecationWarning in 2.7. From my quick testing, -3 doesn't warn about relative imports. Perhaps a better strategy in this particular case is to make -3 give that warning?

Aside: We really need a better way to track things we need to do in the next version of Python so things like this don't fall through the cracks. We added a 3.2 version tag before 3.1 was released so that we could add a few "remember to do this in 3.2" issues dealing with deprecations. Perhaps it's time to add a 3.3 version tag? I don't think we should add a 2.8 tag; that would give false hope.
Re: [Python-Dev] Reworking the GIL
Antoine Pitrou writes:
> Yes, in hindsight I think Guido was right.

Guido does too. One of the benefits of having a time machine is getting to turn your hindsight into foresight.
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross wrote:
> I don't know whether I'm in favour of using a single pyr folder or not,
> but if a single folder is used I'd definitely prefer the folder to be
> called __pyr__ rather than .pyr.

Guido van Rossum wrote:
> Exactly what I would prefer. I worry that having many small directories
> is a fairly poor use of the filesystem. A quick scan of
> /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but only
> 57 directories.

Just to be clear: what should go in the __pyr__ folder? I can see two possibilities:

1) All files go directly into __pyr__, a flat directory tree.

   foo.py
   bar.py
   __pyr__/
       foo.py.c.3160
       bar.py.c.3160

2) Each source file gets its own subdirectory of __pyr__.

   foo.py
   bar.py
   __pyr__/
       foo.py/
           c.3160
       bar.py/
           c.3160

2 makes it easier to clear the cache for a particular source file--just delete its matching directory. The downside is that we're back to lots of small directories. And it's not that onerous to do a "rm __pyr__/foo.py.*". So I suspect you prefer option 1.

The proposal gets a +1 from me,

/larry/
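The two layouts differ only in where the cache-tag components land in the path; a sketch of the mapping for each (the `.c.3160` naming and magic number are taken from the examples above and are purely illustrative):

```python
import os

MAGIC = 3160  # hypothetical bytecode magic number from the examples above

def cache_path_flat(source):
    """Option 1: all cache files live directly under __pyr__/."""
    directory, name = os.path.split(source)
    return os.path.join(directory, '__pyr__', '%s.c.%d' % (name, MAGIC))

def cache_path_nested(source):
    """Option 2: each source file gets its own subdirectory of __pyr__/."""
    directory, name = os.path.split(source)
    return os.path.join(directory, '__pyr__', name, 'c.%d' % MAGIC)

print(cache_path_flat('pkg/foo.py'))    # pkg/__pyr__/foo.py.c.3160
print(cache_path_nested('pkg/foo.py'))  # pkg/__pyr__/foo.py/c.3160
```

Either way the lookup is a pure path computation from the source name plus the magic number, so the choice really is only about filesystem layout.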
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Tres> Some applications may seem to work when violating this rule, but
Tres> their developers are doomed to hair loss over time.

Then for us bald guys it should be okay, right? ;-)

Skip
Re: [Python-Dev] Forking and Multithreading - enemy brothers
[email protected] wrote:
> Tres> Some applications may seem to work when violating this rule, but
> Tres> their developers are doomed to hair loss over time.
>
> Then for us bald guys it should be okay, right? ;-)

Sure: you might even grow hair elsewhere. ;)

Tres.
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Antoine Pitrou wrote:
> The word "dogma" is a good one in this context, however. "We" ( ;-)) have
> accepted and promoted the dogma that multiprocessing is the solution to
> parallelism in the face of the GIL. While it needn't be applicable in any
> and every situation, we should make it so that it is applicable often
> enough.

Tres Seaver wrote:
> Again, wishing won't make it so: there is no sane way to mix threading
> and fork-without-exec except by keeping the parent process single-threaded
> until after any fork() calls. Some applications may seem to work when
> violating this rule, but their developers are doomed to hair loss over
> time.

You pointed it out: fork() was not designed to work together with multithreading; furthermore, in many cases its data-duplication semantics are absolutely unneeded to solve the real problem.

So we can leave fork-without-exec multiprocessing (with or without threads) to those who need it, and offer safer multiprocessing for those who just seek ease of use and portability - via spawn() semantics.

Regards,
Pascal
Re: [Python-Dev] Forking and Multithreading - enemy brothers
On Tue, Feb 2, 2010 at 10:34 AM, Pascal Chambon wrote:
> You pointed it out: fork() was not designed to work together with
> multithreading; furthermore, in many cases its data-duplication semantics
> are absolutely unneeded to solve the real problem.
>
> So we can leave fork-without-exec multiprocessing (with or without
> threads) to those who need it, and offer safer multiprocessing for those
> who just seek ease of use and portability - via spawn() semantics.
>
> Regards, Pascal

And I've migrated to the camp wherein the safer semantics should be the default semantics. However, once we have a patch, I'm going to socialize it with some of the heavier multiprocessing users *I* know of to get feedback.

jesse
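For reference, this is essentially what multiprocessing later grew as selectable "start methods" (Python 3.4+): a spawn context launches a fresh interpreter rather than forking, so nothing - locks included - is inherited implicitly. A minimal sketch:

```python
import multiprocessing as mp
import os

def worker(q):
    # Runs in a freshly spawned interpreter, not a forked copy.
    q.put(os.getpid())

def run_spawned():
    ctx = mp.get_context('spawn')   # CreateProcess-like semantics everywhere
    q = ctx.Queue()
    p = ctx.Process(target=worker, args=(q,))
    p.start()
    child_pid = q.get()
    p.join()
    return child_pid

if __name__ == '__main__':
    # The child is a separate process with a fresh address space.
    print(run_spawned() != os.getpid())  # True
```

Using the spawn context uniformly gives the same semantics on Unix and Windows, which is exactly the cross-platform benefit Nick mentions earlier in the thread.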
Re: [Python-Dev] Mercurial (was: PEP 3146: Merge Unladen Swallow into CPython)
Nick Coghlan wrote:
> M.-A. Lemburg wrote:
>> BTW: I also doubt that Mercurial will make any of this easier.
>> It makes creating branches easier for non-committers, but the
>> problem of having to maintain the branches remains.
>
> It greatly simplifies the process of syncing the branch with the main
> line of development so yes, it should help with branch maintenance
> (svnmerge is a pale shadow of what a true DVCS can handle). That aspect
> is one of the DVCS selling points that also applies to core development.

Sure, the merge process is simplified and faster, but you still have to go through the changes and conflicts after each merge, and that part is IMHO what makes maintaining a branch difficult. I don't think that a version control system can really help a lot with this manual step. I'd love to be proved wrong, though :-)

--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] Forking and Multithreading - enemy brothers
Seems the problem under discussion is already taken care of in Python. Possibly remains to verify that the logic described below does not possibly generate deadlocks. >From the Python docs: http://docs.python.org/c-api/init.html "Another important thing to note about threads is their behaviour in the face of the C fork() call. On most systems with fork(), after a process forks only the thread that issued the fork will exist. That also means any locks held by other threads will never be released. Python solves this for os.fork() by acquiring the locks it uses internally before the fork, and releasing them afterwards. In addition, it resets any Lock Objects in the child. When extending or embedding Python, there is no way to inform Python of additional (non-Python) locks that need to be acquired before or reset after a fork. OS facilities such as posix_atfork() would need to be used to accomplish the same thing. Additionally, when extending or embedding Python, calling fork() directly rather than through os.fork() (and returning to or calling into Python) may result in a deadlock by one of Python’s internal locks being held by a thread that is defunct after the fork. PyOS_AfterFork() tries to reset the necessary locks, but is not always able to." On Sat, Jan 30, 2010 at 12:14 PM, Pascal Chambon wrote: > > *[...] > What dangers do you refer to specifically? Something reproducible? > -L* > > > Since it's a race condition issue, it's not easily reproducible with normal > libraries - which only take threading locks for small moments. > But it can appear if your threads make good use of the threading module. By > forking randomly, you have chances that the main locks of the logging module > you frozen in an "acquired" state (even though their owner threads are not > existing in the child process), and your next attempt to use logging will > result in a pretty deadlock (on some *nix platforms, at least). This issue > led to the creation of python-atfork by the way. 
> > > Stefan Behnel a écrit : > > Stefan Behnel, 30.01.2010 07:36: > > > Pascal Chambon, 29.01.2010 22:58: > > > I've just recently realized the huge problems surrounding the mix of > multithreading and fork() - i.e that only the main thread actually > survived the fork(), and that process data (in particular, > synchronization primitives) could be left in a dangerously broken state > because of such forks, if multithreaded programs. > > > I would *never* have even tried that, but it doesn't surprise me that it > works basically as expected. I found this as a quick intro: > http://unix.derkeiler.com/Newsgroups/comp.unix.programmer/2003-09/0672.html > > ... and another interesting link that also describes exec() usage in this > context. > http://www.linuxprogrammingblog.com/threads-and-fork-think-twice-before-using-them > > Stefan > > > > Yep, these links sum it up quite well. > But to me it's not a matter of "trying" to mix threads and fork - most > people won't on purpose seek trouble. > It's simply the fact that, in a multithreaded program (i.e, any program of > some importance), multiprocessing modules will be impossible to use safely > without a complex synchronization of all threads to prepare the underlying > forking (and we know that using multiprocessing can be a serious benefit, > for GIL/performance reasons). > Solutions to fork() issues clearly exist - just add a "use_forking=yes" > attribute to subprocess functions, and users will be free to use the > spawnl() semantic, which is already implemented on win32 platforms, and > which gives full control over both threads and subprocesses. Honestly, I > don't see how it will complicate stuffs, except slightly for the programmer > which will have to edit the code to add spwawnl() support (I might help on > that). 
> > Regards, > Pascal > > ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
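The acquire-before-fork, release-after discipline that the quoted docs describe for Python's internal locks can be sketched in pure Python. This is an illustration of the pattern only, not CPython's actual implementation; CPython much later grew os.register_at_fork() (3.7) to let user code hook into exactly this sequence. The handler names follow the POSIX prepare/parent/child convention, and the fork itself is only simulated here.

```python
import threading

log_lock = threading.Lock()  # stands in for e.g. the logging module's lock

def prepare():
    # Parent, just before fork(): take every registered lock so that no
    # other thread can be holding one across the fork.
    log_lock.acquire()

def after_in_parent():
    # Parent, just after fork(): simply release again.
    log_lock.release()

def after_in_child():
    # Child: the owning thread no longer exists there, so the lock must be
    # force-released/reset (CPython resets its internal locks similarly).
    log_lock.release()

def simulated_fork():
    """Run the handlers in the order a real fork() would trigger them."""
    prepare()
    # ... fork() would happen here; each process then runs its own hook ...
    after_in_parent()  # parent side (the child would run after_in_child())
```

After simulated_fork() the lock is free again, which is the whole point: the next user of logging in either process does not deadlock on a lock owned by a defunct thread.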
[Python-Dev] Python 2.6.5
I'm thinking about doing a Python 2.6.5 release soon. I've added the following dates to the Python release schedule Google calendar: 2010-03-01 Python 2.6.5 rc 1 2010-03-15 Python 2.6.5 final This allows us to spend some time on 2.6.5 at Pycon if we want. Please let me know if you have any concerns about those dates. -Barry
Re: [Python-Dev] Forking and Multithreading - enemy brothers
> Le lundi 01 février 2010 à 23:58 +0100, "Martin v. Löwis" a écrit : >> Interestingly, the POSIX pthread_atfork documentation defines how you >> are supposed to do that: create an atfork handler set, and acquire all >> mutexes in the prepare handler. Then fork, and have the parent and child >> handlers release the locks. Of course, unless your locks are recursive, >> you still have to know what locks you are already holding. > > So, if we restrict ourselves to Python-level locks (thread.Lock and > thread.RLock), I guess we could just chain them in a doubly-linked list > and add an internal _PyThread_AfterFork() function? According to the textbook, that's not good enough: if some other thread is holding a lock, it's probably because the data structures are inconsistent. If you follow the approach of acquiring all locks before forking, you get more sane data structures in the child processes (in addition to having all locks released). If you would just release the locks in the afterfork function, chances are that something crashes when you next try to access what was protected by the lock. Regards, Martin
Re: [Python-Dev] Forking and Multithreading - enemy brothers
> Although I would be in favor of an atfork callback registration system > (similar to atexit), it seems there is no way to solve the fork() > problem automatically with this. Any attempt to acquire/release locks > automatically will lead to deadlocks, as it is necessary to know the > exact program workflow to take locks in the right order. That's not true. If the creator of the locks registered an atfork callback set, they could arrange to acquire the locks in the right order (in cases where the order matters). Regards, Martin
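The atexit-style registry being discussed might look like the sketch below. Nothing here is a real stdlib API (CPython only gained os.register_at_fork() in 3.7); the atfork() name is invented, and the handler ordering follows POSIX pthread_atfork(): prepare handlers run in reverse registration order before the fork, while parent and child handlers run in registration order after it, which is what lets each lock creator get its own ordering right.

```python
_prepare, _parent, _child = [], [], []

def atfork(prepare=None, parent=None, child=None):
    """Register fork handlers, mirroring pthread_atfork() semantics."""
    if prepare is not None:
        _prepare.append(prepare)
    if parent is not None:
        _parent.append(parent)
    if child is not None:
        _child.append(child)

def run_prepare():
    # Reverse order: the last-registered (innermost) lock is taken first,
    # matching the nesting POSIX prescribes.
    for fn in reversed(_prepare):
        fn()

def run_parent():
    for fn in _parent:
        fn()

def run_child():
    for fn in _child:
        fn()
```

A wrapper around os.fork() would call run_prepare() just before forking, run_parent() in the parent, and run_child() in the child.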
[Python-Dev] API for VM branches (was: PEP 3146)
[Moving to python-ideas; python-dev to bcc] On Tue, Feb 2, 2010 at 2:02 AM, M.-A. Lemburg wrote: > Collin Winter wrote: [snip] >> If such a restrictive plugin-based scheme had been available when we >> began Unladen Swallow, I do not doubt that we would have ignored it >> entirely. I do not like the idea of artificially tying the hands of >> people trying to make CPython faster. I do not see any part of Unladen >> Swallow that would have been made easier by such a scheme. If >> anything, it would have made our project more difficult. > > I don't think that it has to be restrictive - much to the contrary, > it would provide a consistent API to those CPython internals and > also clarify the separation between the various parts. Something > which currently does not exist in CPython. We do not need an API to CPython's internals: we are not interfacing with them, we are replacing and augmenting them. > Note that it may be easier for you (and others) to just take > CPython and patch it as necessary. However, this doesn't relieve > you from the needed maintenance - which, I presume, is one of the > reasons why you are suggesting to merge U-S back into CPython ;-) That is incorrect. In the year we have been working on Unladen Swallow, we have only updated our vendor branch of CPython 2.6 once, going from 2.6.1 to 2.6.4. We have occasionally cherry-picked patches from the 2.6 maintenance branch to fix specific problems. The maintenance required by upstream CPython changes has been effectively zero. We are seeking to merge with CPython for three reasons: 1) verify that python-dev is interested in this project, and that we are not wasting our time; 2) expose the codebase to a wider, more heterogeneous testing environment; 3) accelerate development by having more hands on the code. Upstream maintenance requirements have had zero impact on our planning. 
In any case, I'll be interested in reading your PEP that outlines how the plugin interface should work, which systems will be pluggable, and exactly how Unladen Swallow, WPython and Stackless would benefit. Let's move further discussion of this to python-ideas until there's something more concrete here. The py3k-jit branch will live long enough that we could update it to work with a plugin system, assuming it is demonstrated to be beneficial. Thanks, Collin Winter
Re: [Python-Dev] Absolute imports in Python 2.x?
On Tue, Feb 2, 2010 at 05:20, Eric Smith wrote: > Mark Dickinson wrote: >> >> What are the current plans for PEP 328 (the absolute imports PEP) in >> Python 2.x? >> >> The PEP says: >> >> """In Python 2.6, any import statement that results in an >> intra-package import will raise DeprecationWarning (this also applies >> to from <> import that fails to use the relative import syntax). In >> Python 2.7, import will always be an absolute import (and the >> __future__ directive will no longer be needed).""" >> >> As far as I can tell, there's no DeprecationWarning in 2.6. >> >> I'm wondering whether this decision was revised at some point (I >> wasn't able to find anything by searching the archives), or whether >> the DeprecationWarning just got forgotten about. >> >> If the latter, should the DeprecationWarning be introduced in 2.7? > > Not sure about the decision one way or the other. But if there's not going > to be a 2.8, and if DeprecationWarnings are off by default anyway, I'm not > sure it makes any sense to add a DeprecationWarning in 2.7. From my quick > testing, -3 doesn't warn about relative imports. Perhaps a better strategy > in this particular case is to make -3 give that warning? +1 on the -3 option. > > Aside: > We really need a better way to track things we need to do in the next > version of Python so things like this don't fall through the cracks. We > added a 3.2 version tag before 3.1 was released so that we could add a few > "remember to do this in 3.2" issues dealing with deprecations. Perhaps it's > time to add a 3.3 version tag? I don't think we should add a 2.8 tag, that > would give false hope. It should definitely be added by the alpha, but I don't know if now is a bit early. Regardless, I looked at the tracker and the code there doesn't seem to match what has been pushed to bugs.python.org (3.2 is not listed and 2.4 still is). Martin, is that correct? Otherwise why does initial_data.py not match what's live? 
-Brett
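For reference, the PEP 328 behavior under discussion is easy to demonstrate: once absolute imports are the default, an intra-package import must use the explicit relative form. The sketch below builds a throwaway package on disk (the package name demo_pkg and its modules are invented for the demo) and imports a module that uses the explicit syntax.

```python
import importlib
import os
import sys
import tempfile

tmp = tempfile.mkdtemp()
pkg = os.path.join(tmp, 'demo_pkg')
os.mkdir(pkg)
with open(os.path.join(pkg, '__init__.py'), 'w') as f:
    f.write('')
with open(os.path.join(pkg, 'helper.py'), 'w') as f:
    f.write('VALUE = 42\n')
with open(os.path.join(pkg, 'main.py'), 'w') as f:
    # PEP 328's explicit relative form; the old implicit "import helper"
    # would instead look for a top-level module of that name (or fail).
    f.write('from . import helper\nRESULT = helper.VALUE\n')

sys.path.insert(0, tmp)
main = importlib.import_module('demo_pkg.main')
```

Under Python 2 the same package would additionally need `from __future__ import absolute_import` at the top of each module to get this behavior.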
Re: [Python-Dev] Python 2.6.5
+1 On Feb 2, 2010, at 10:08 AM, Barry Warsaw wrote: > I'm thinking about doing a Python 2.6.5 release soon. I've added the > following dates to the Python release schedule Google calendar: > > 2010-03-01 Python 2.6.5 rc 1 > 2010-03-15 Python 2.6.5 final > > This allows us to spend some time on 2.6.5 at Pycon if we want. Please let me > know if you have any concerns about those dates. > > -Barry
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey Dirkjan, [Circling back to this part of the thread] On Thu, Jan 21, 2010 at 1:37 PM, Dirkjan Ochtman wrote: > On Thu, Jan 21, 2010 at 21:14, Collin Winter wrote: [snip] >> My quick take on Cython and Shedskin is that they are >> useful-but-limited workarounds for CPython's historically-poor >> performance. Shedskin, for example, does not support the entire Python >> language or standard library >> (http://shedskin.googlecode.com/files/shedskin-tutorial-0.3.html). > > Perfect, now put something like this in the PEP, please. ;) Done. The diff is at http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed Cython, Shedskin and a bunch of other alternatives to pure CPython. Some of that information is based on conversations I've had with the respective developers, and I'd appreciate corrections if I'm out of date. Thanks, Collin Winter ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 12:44:33PM +0100, Georg Brandl wrote: > At least to me, this does not explain why an "unwanted" (why unwanted? If > it's unwanted, set PYTHONDONTWRITEBYTECODE=1) directory is worse than an > "unwanted" file. A directory "feels" different than a file. For example, typing "ls" in my shell prints regular files in black, but directories in bold blue. File managers and IDEs also highlight directories differently. In tree views, directories have expander buttons that also make them stand out. As a concrete example, have a look at these two screenshots: http://tinyurl.com/yz2fr6c and http://tinyurl.com/yg38uqt In the first one, the subpackages stand out, while in the second one they are hard to make out among the *.pyr directories. A directory just adds more clutter than a file. But overall I like the idea of using just a single __pycache__ or __pyr__ directory per path. This would also reduce the *.pyc clutter. - Sebastian
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Nick Coghlan wrote: > Henning von Bargen wrote: >> The solution is so obvious: >> >> Why not use a .pyr file that is internally a zip file? I think a Zip file might be the right approach too. Either you could have directories in the zip file for each version, e.g. 2.7/foo.pyc 3.3/foo.pyc 2.7/bar.pyc 3.3/bar.pyc Or a Zip directory for each module: foo/2.7.pyc foo/3.3.pyc I think you could get away without funky names because dot would always be in the version number. This would be implemented simply as an extension to the zip import mechanism we already have. Using the zip format would allow people to use existing zip utilities to manipulate them. > Agreed this should be discussed in the PEP, but one obvious problem is > the speed impact. Picking up a file from a subdirectory is going to > introduce less overhead than unpacking it from a zipfile. I'm pretty sure it would be better than using directories. A directory for every module is not performance friendly. Really, our current module per file is not performance friendly. Zip files could use "store" as the compression method if you are really worried about CPU time. Neil ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
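Neil's directory-per-version zip layout can be mocked up with the stdlib's own tools. The .pyr archive name below is hypothetical (no such format ever shipped); the sketch just shows byte code being compiled and stored uncompressed ("store", per the CPU-time remark above) under a version-tagged path inside a zip file.

```python
import os
import py_compile
import sys
import tempfile
import zipfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'foo.py')
with open(src, 'w') as f:
    f.write('X = 1\n')

# Compile the source to byte code in a known location.
pyc = os.path.join(tmp, 'foo.pyc')
py_compile.compile(src, cfile=pyc)

ver = '%d.%d' % sys.version_info[:2]
archive = os.path.join(tmp, 'foo.pyr')  # hypothetical .pyr container
with zipfile.ZipFile(archive, 'w', zipfile.ZIP_STORED) as z:
    # One directory per interpreter version, e.g. "3.3/foo.pyc".
    z.write(pyc, arcname='%s/foo.pyc' % ver)

with zipfile.ZipFile(archive) as z:
    names = z.namelist()
```

An import hook for this layout would look up `"<ver>/<module>.pyc"` in the archive directly, reusing the existing zipimport machinery as Neil suggests.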
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Argh. zipfiles are way too complex to be writing. If you want to use zipfiles, compile your whole world ahead of time, stuff it into a zipfile, and install / distribute that. But for the automatic writing of bytecode files as a side effect of importing the source code, please let the filesystem do its job. --Guido On Tue, Feb 2, 2010 at 4:24 PM, Neil Schemenauer wrote: > Nick Coghlan wrote: >> Henning von Bargen wrote: >>> The solution is so obvious: >>> >>> Why not use a .pyr file that is internally a zip file? > > I think a Zip file might be the right approach too. Either you > could have directories in the zip file for each version, e.g. > > 2.7/foo.pyc > 3.3/foo.pyc > 2.7/bar.pyc > 3.3/bar.pyc > > Or a Zip directory for each module: > > foo/2.7.pyc > foo/3.3.pyc > > I think you could get away without funky names because dot would > always be in the version number. > > This would be implemented simply as an extension to the zip import > mechanism we already have. Using the zip format would allow people > to use existing zip utilities to manipulate them. > >> Agreed this should be discussed in the PEP, but one obvious problem is >> the speed impact. Picking up a file from a subdirectory is going to >> introduce less overhead than unpacking it from a zipfile. > > I'm pretty sure it would be better than using directories. A > directory for every module is not performance friendly. Really, our > current module per file is not performance friendly. > > Zip files could use "store" as the compression method if you are > really worried about CPU time. 
> > Neil > > ___ > Python-Dev mailing list > [email protected] > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/guido%40python.org > -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
> Done. The diff is at > http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed > Cython, Shedskin and a bunch of other alternatives to pure CPython. > Some of that information is based on conversations I've had with the > respective developers, and I'd appreciate corrections if I'm out of > date. > Well, it's a minor nit, but it might be more fair to say something like "Cython provides the biggest improvements once type annotations are added to the code." After all, Cython is more than happy to take arbitrary Python code as input -- it's just much more effective when it knows something about types. The code to make Cython handle closures has just been merged ... hopefully support for the full Python language isn't so far off. (Let me know if you want me to actually make a comment on Rietveld ...) Now what's more interesting is whether or not U-S and Cython could play off one another -- take a Python program, run it with some "generic input data" under Unladen and record info about which functions are hot, and what types they tend to take, then let Cython/gcc -O3 have a go at these, and lather, rinse, repeat ... JIT compilation and static compilation obviously serve different purposes, but I'm curious if there aren't other interesting ways to take advantage of both. -cc ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On approximately 2/2/2010 4:28 PM, came the following characters from the keyboard of Guido van Rossum: Argh. zipfiles are way too complex to be writing. Agreed. But in reading that, it somehow triggered a question: does zipimport only work for zipfiles, or does it work for any archive format that the Python stdlib knows how to decode? And if only the former, why are they so special? -- Glenn -- http://nevcal.com/ === A protocol is complete when there is nothing left to remove. -- Stuart Cheshire, Apple Computer, regarding Zero Configuration Networking
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
Hey MA, On Fri, Jan 29, 2010 at 11:14 AM, M.-A. Lemburg wrote: > Collin Winter wrote: >> I added startup benchmarks for Mercurial and Bazaar yesterday >> (http://code.google.com/p/unladen-swallow/source/detail?r=1019) so we >> can use them as more macro-ish benchmarks, rather than merely starting >> the CPython binary over and over again. If you have ideas for better >> Mercurial/Bazaar startup scenarios, I'd love to hear them. The new >> hg_startup and bzr_startup benchmarks should give us some more data >> points for measuring improvements in startup time. >> >> One idea we had for improving startup time for apps like Mercurial was >> to allow the creation of hermetic Python "binaries", with all >> necessary modules preloaded. This would be something like Smalltalk >> images. We haven't yet really fleshed out this idea, though. > > In Python you can do the same with the freeze.py utility. See > > http://www.egenix.com/www2002/python/mxCGIPython.html > > for an old project where we basically put the Python > interpreter and stdlib into a single executable. > > We've recently revisited that project and created something > we call "pyrun". It fits Python 2.5 into a single executable > and a set of shared modules (which for various reasons cannot > be linked statically)... 12MB in total. > > If you load lots of modules from the stdlib this does provide > a significant improvement over standard Python. Good to know there are options. One feature we had in mind for a system of this sort would be the ability to take advantage of the limited/known set of modules in the image to optimize the application further, similar to link-time optimizations in gcc/LLVM (http://www.airs.com/blog/archives/100). > Back to the PEP's proposal: > > Looking at the data you currently have, the negative results > currently don't really look good in the light of the small > performance improvements. The JIT compiler we are offering is more than just its current performance benefit. 
An interpreter loop will simply never be as fast as machine code. An interpreter loop, no matter how well-optimized, will hit a performance ceiling and before that ceiling will run into diminishing returns. Machine code is a more versatile optimization target, and as such, allows many optimizations that would be impossible or prohibitively difficult in an interpreter. Unladen Swallow offers a platform to extract increasing performance for years to come. The current generation of modern, JIT-based JavaScript engines are instructive in this regard: V8 (which I'm most familiar with) delivers consistently improving performance release-over-release (see the graphs at the top of http://googleblog.blogspot.com/2009/09/google-chrome-after-year-sporting-new.html). I'd like to see CPython be able to achieve the same thing, like the new implementations of JavaScript and Ruby are able to do. We are aware that Unladen Swallow is not finished; that's why we're not asking to go into py3k directly. Unladen Swallow's memory usage will continue to decrease, and its performance will only go up. The current state is not its permanent state; I'd hate to see the perfect become the enemy of the good. > Wouldn't it be possible to have the compiler approach work > in three phases in order to reduce the memory footprint and > startup time hit, ie. > > 1. run an instrumented Python interpreter to collect all > the needed compiler information; write this information into > a .pys file (Python stats) > > 2. create compiled versions of the code for various often > used code paths and type combinations by reading the > .pys file and generating an .so file as regular > Python extension module > > 3. run an uninstrumented Python interpreter and let it > use the .so files instead of the .py ones > > In production, you'd then only use step 3 and avoid the > overhead of steps 1 and 2. That is certainly a possibility if we are unable to reduce memory usage to a satisfactory level. 
I've added a "Contingency Plans" section to the PEP, including this option: http://codereview.appspot.com/186247/diff2/8004:7005/8006. Thanks, Collin Winter
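Step 1 of the three-phase scheme quoted above (the instrumented run that produces a .pys stats file) can be caricatured in a few lines: run the workload under instrumentation and record call counts as the crudest possible stand-in for real feedback data. An actual collector would also record observed argument types, taken branches, and so on; the function and file names below are just a sample workload.

```python
import sys
from collections import Counter

call_counts = Counter()

def profiler(frame, event, arg):
    # Record every Python-level call; 'c_call' events for builtins are ignored.
    if event == 'call':
        code = frame.f_code
        call_counts[(code.co_filename, code.co_name)] += 1

def hot(n):  # sample workload the "instrumented interpreter" observes
    return sum(i * i for i in range(n))

sys.setprofile(profiler)
for _ in range(100):
    hot(10)
sys.setprofile(None)

# These are the functions that step 2 would hand to the offline compiler.
hottest = [name for (fname, name), count in call_counts.items()
           if count >= 100]
```

Step 3 would then run without any instrumentation, loading the compiled artifacts produced from this data.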
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Tue, Feb 2, 2010 at 8:57 PM, Collin Winter wrote: >> Wouldn't it be possible to have the compiler approach work >> in three phases in order to reduce the memory footprint and >> startup time hit, ie. >> >> 1. run an instrumented Python interpreter to collect all >> the needed compiler information; write this information into >> a .pys file (Python stats) >> >> 2. create compiled versions of the code for various often >> used code paths and type combinations by reading the >> .pys file and generating an .so file as regular >> Python extension module >> >> 3. run an uninstrumented Python interpreter and let it >> use the .so files instead of the .py ones >> >> In production, you'd then only use step 3 and avoid the >> overhead of steps 1 and 2. > > That is certainly a possibility if we are unable to reduce memory > usage to a satisfactory level. I've added a "Contingency Plans" > section to the PEP, including this option: > http://codereview.appspot.com/186247/diff2/8004:7005/8006. This would be another good research problem for someone to take and run. The trick is that you would need to add some kind of "linking" step to loading the .so. Right now, we just collect PyObject*'s, and don't care whether they're statically allocated or user-defined objects. If you wanted to pursue offline feedback directed compilation, you would need to write something that basically can map from the pointers in the feedback data to something like a Python dotted name import path, and then when you load the application, look up those names and rewrite the new pointers into the generated machine code. It sounds a lot like writing a dynamic loader. :) It sounds like a huge amount of work, and we haven't approached it. On the other hand, it sounds like it might be rewarding. Reid ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
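The "linking" step Reid describes, mapping raw object pointers to something like a Python dotted import path and resolving them again at load time, can be illustrated at the Python level. This is only a toy name round-trip, not anything Unladen Swallow implemented; the `module:qualname` serialization format is invented for the demo.

```python
import importlib
from os.path import join  # stands in for "some object the profiler saw"

def dotted_name(obj):
    """Serialize a function/class into a 'module:qualname' feedback entry."""
    return '%s:%s' % (obj.__module__, obj.__qualname__)

def resolve(name):
    """The load-time 'dynamic loader': look the object up again by name."""
    modname, qualname = name.split(':')
    obj = importlib.import_module(modname)
    for part in qualname.split('.'):
        obj = getattr(obj, part)
    return obj
```

The hard part in the real system is everything this toy skips: objects with no importable name, names that resolve to different objects after a code change, and rewriting the resolved pointers into already-generated machine code.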
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Tue, Feb 2, 2010 at 5:41 PM, Glenn Linderman wrote: > On approximately 2/2/2010 4:28 PM, came the following characters from the > keyboard of Guido van Rossum: >> >> Argh. zipfiles are way to complex to be writing. > > Agreed. But in reading that, it somehow triggered a question: does > zipimport only work for zipfiles, or does it work for any archive format > that Python stdlib knows how to decode? And if only the former, why are > they so special? The former. They are special because (unlike e.g. tar files) you can read the table of contents of a zipfile without parsing the entire file. Also because they are universally supported which makes it unnecessary to support other formats. Again, contrast tar files which are virtually unheard of on Windows. -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
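Guido's point shows up directly in the zipfile API: the table of contents (the "central directory") sits at the end of the archive, so a reader can list members and randomly access one of them without scanning the whole file; a tar reader, by contrast, must walk the archive sequentially. The member names below are invented for the demo.

```python
import io
import zipfile

buf = io.BytesIO()
with zipfile.ZipFile(buf, 'w') as z:
    z.writestr('pkg/__init__.py', '')
    z.writestr('pkg/big_module.py', 'DATA = "x" * 1000\n')
    z.writestr('pkg/wanted.py', 'ANSWER = 42\n')

with zipfile.ZipFile(buf) as z:
    toc = z.namelist()                 # read from the central directory only
    source = z.read('pkg/wanted.py')   # random access to a single member
```

zipimport relies on exactly this: it reads the central directory once at startup and then seeks straight to whichever module it needs.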
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Sun, Jan 31, 2010 at 11:16 AM, Guido van Rossum wrote: > Whoa. This thread already exploded. I'm picking this message to > respond to because it reflects my own view after reading the PEP. > > On Sun, Jan 31, 2010 at 4:13 AM, Hanno Schlichting wrote: >> On Sun, Jan 31, 2010 at 1:03 PM, Simon Cross >> wrote: >>> I don't know whether I in favour of using a single pyr folder or not >>> but if a single folder is used I'd definitely prefer the folder to be >>> called __pyr__ rather than .pyr. > > Exactly what I would prefer. I worry that having many small > directories is a fairly poor use of the filesystem. A quick scan of > /usr/local/lib/python3.2 on my Linux box reveals 1163 .py files but > only 57 directories). I like this option as well, but why not just name the directory .pyc instead of __pyr__ or .pyr? That way people probably won't even have to reconfigure their tools to ignore it :) -bob ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] PEP 3147: PYC Repository Directories
I have to say up front that I'm somewhat shocked at how quickly this thread has exploded! Since I'm sprinting this week, I haven't thoroughly read every message and won't have time tonight to answer every question, but I'll try to pick out some common ideas. I really appreciate everyone's input and will try to clarify the PEP where I can. It is probably not clear enough from the PEP, but I actually don't expect that most individual Python developers will use this feature. This is why the -R flag exists and the behavior is turned off by default. When I'm developing some Python code in my home directory, I usually only use one Python version, and even if I'm going to test it with multiple Python versions, I won't need to do this *simultaneously*. I will generally blow away all build artifacts (including, but not limited to, .pyc files) and then rebuild with the different Python version. I think that this feature will be limited mostly to distros, which have different use cases than individual developers. But these are important use cases for Python to support nonetheless. My rationale for choosing the file system layout in the PEP was to try to present something more familiar to today's Python and to avoid radical reorganization of the way Python caches its byte code. Thus having a sibling directory that differs from the source just by extension seemed more natural to me. Encoding the magic number in the file name under .pyr would, I thought, make the lookup scheme more efficient, since the import machinery can craft the file name directly. I agree it's not very human friendly, because nobody really knows which magic numbers are associated with which Python versions and flags. As to the question of sibling directories versus folder-per-folder, I think performance issues should be the deciding factor. There are file system limitations to consider (but also a wide variety of file systems in use). Does the number of stat calls dominate the performance costs? 
Maybe it makes sense to implement the two different approaches and do some measurements. -Barry
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 30, 2010, at 11:21 PM, Vitor Bosshard wrote: >Why not: > >foo.py >foo.pyc # < 2.7 or < 3.2 >foo.27.pyc >foo.32.pyc >etc. Because this clutters the module's directory more than it does today, which I considered to be a negative factor. And as others have pointed out, there isn't a one-to-one relationship between Python version numbers and byte code compatibility. >I'd rather have a folder cluttered with files I know I can ignore (and >can easily run a selective rm over) than one that is cluttered with >subfolders. I suppose this is going to be very subjective, but in skimming the thread it seems like most people like putting the byte code cache artifacts in subdirectories (be they siblings or folder-per-folder). -Barry
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 31, 2010, at 01:44 PM, Nick Coghlan wrote: >We deliberately don't document -U because its typical effect is "break the >world" - it makes all strings unicode in 2.x. As an aside, I think this should be documented *somewhere* other than just in import.c! I'd totally forgotten about it until I read the source and almost missed it. Either it should be documented or it should be ripped out. -Barry
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 31, 2010, at 03:07 PM, Ben Finney wrote: >In other words, my understanding is that the current PEP would have the >following tree for an example project:: > >foo/ >__init__.py >__init__.pyr/ >deadbeef.pyc >decafbad.pyc >lorem.py >lorem.pyr/ >deadbeef.pyc >decafbad.pyc [...etc...] >That's a nightmarish mess of compiled files swamping the source files, >as has been pointed out several times. Except that I think it will be quite uncommon for typical Python developers to be confronted with this. >Could we instead have a single subdirectory for each tree of module >packages, keeping them tidily out of the way of the source files, while >making them located just as deterministically:: If we do not choose the sibling folder approach, I feel pretty strongly that it ought to be more like the Subversion-like folder-per-folder approach than the Bazaar-like folder-at-top-of-hierarchy approach. If you have to manually blow away a particular pyc file, folder-per-folder makes it much easier to find exactly what you want to blow away without having to search up the file system, and then back down again to find the pyc file to delete. How many ..'s does it take until you're lost in the twisty maze of ls? -Barry
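For reference, the folder-per-folder approach is what PEP 3147 ultimately adopted: a __pycache__ directory next to each source file, with a human-readable interpreter tag in the file name instead of a magic-number directory. The mapping in both directions is exposed by importlib (the /project/pkg path below is made up):

```python
import importlib.util

# Source path -> cache path, e.g.
# '/project/pkg/__pycache__/foo.cpython-312.pyc' (the tag varies by build).
cache = importlib.util.cache_from_source('/project/pkg/foo.py')

# And back again: cache path -> source path.
back = importlib.util.source_from_cache(cache)
```

Both functions are pure path manipulation; neither file needs to exist.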
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 31, 2010, at 12:36 PM, Georg Brandl wrote:
>Not really -- much of the code I've seen that tries to guess the source
>file name from a __file__ value just does something like this:
>
> if fname.lower().endswith(('.pyc', '.pyo')): fname = fname[:-1]
>
>That's not compatible with using .pyr, either.
The rationale for the .pyr extension is because I've usually seen (and
written) this instead:
    base, ext = os.path.splitext(fname)
    py_file = base + '.py'
    # ...or...
    if ext != '.py':
        continue
I think I rarely care what the extension is if it's not '.py'.
-Barry
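A runnable version of the idiom Barry describes: mapping a module's __file__ back to its source name without caring which byte-code extension is in use. This is a minimal sketch of the pattern, not any stdlib API.

```python
import os

def source_name(fname):
    """Map a __file__ value to the corresponding .py source path."""
    base, ext = os.path.splitext(fname)
    if ext != '.py':
        # Whatever the cache extension is (.pyc, .pyo, .pyr, ...),
        # assume a sibling .py source file.
        return base + '.py'
    return fname
```

Unlike the endswith(('.pyc', '.pyo')) / strip-one-character idiom Georg quotes, this keeps working when the cache extension changes.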
Re: [Python-Dev] PEP 3147: PYC Repository Directories
On Jan 31, 2010, at 09:30 PM, Martin v. Löwis wrote: >If a single pyc folder is used, I think an additional __source__ >attribute would be needed to indicate what source file time stamp had >been checked (if any) to determine that the byte code file is current. This is a good point. __file__ is ambiguous, so I think a reasonable thing to add to the PEP is clear semantics for extracting the source file name and the cached file name from the module object. Python 3 uses the .py file for __file__, but I'd like to see a transition to __source__ for that, with __cache__ for the location of the PVM, JVM, LLVM or whatever compilation cache artifact file. I've added a note to my working update of the PEP. -Barry
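The split proposed here later landed in Python 3 under slightly different names: __file__ continues to point at the source, while __cached__ records the byte-code location. A quick look at a pure-Python stdlib module:

```python
import argparse

source = argparse.__file__    # the .py source file
cached = argparse.__cached__  # the prospective byte-code path; the file
                              # itself may or may not exist on disk
```

For source-loaded modules the loader fills in __cached__ whether or not the .pyc has actually been written.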
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Barry Warsaw writes:

> I suppose this is going to be very subjective, but in skimming the
> thread it seems like most people like putting the byte code cache
> artifacts in subdirectories (be they siblings or folder-per-folder).

I don't understand the distinction you're making between those two
options.  Can you explain what you mean by each of “siblings” and
“folder-per-folder”?

-- 
 \      “Pinky, are you pondering what I'm pondering?” “I think so, |
  `\      Brain, but Tuesday Weld isn't a complete sentence.” —_Pinky and |
 _o__)                        The Brain_ |
Ben Finney
Re: [Python-Dev] PEP 3147: PYC Repository Directories
Barry Warsaw writes:

> If you have to manually blow away a particular pyc file,
> folder-per-folder makes it much easier to find exactly what you want
> to blow away without having to search up the file system, and then
> back down again to find the pyc file to delete.  How many ..'s does
> it take until you're lost in the twisty maze of ls?

I don't think keeping the cache files in a mass of intertwingled extra
subdirectories is the way to solve that problem.  That speaks, rather,
to the need for Python to be able to find the file on behalf of the
user and blow it away on request, so the user doesn't need to go
searching.  Possible interface (with spelling of options chosen
hastily)::

    $ python foo.py                  # Use cached byte code if available.
    $ python --force-compile foo.py  # Unconditionally compile.

If removing the byte code file, without running the module, is what's
desired::

    $ python --delete-cache foo.py           # Delete cached byte code.
    $ rm $(python --show-cache-file foo.py)  # Same as above.

That should cover just about any common need for the user to know
exactly which byte code file corresponds to a given source file.  That,
in turn, frees us to choose a less obtrusive location for the byte code
files than mingled in with the source.

-- 
 \      “Pinky, are you pondering what I'm pondering?” “I think so, but |
  `\      where will we find an open tattoo parlor at this time of |
 _o__)                     night?” —_Pinky and The Brain_ |
Ben Finney
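The --delete-cache behavior proposed above reduces to "ask the
interpreter for the cache path, then unlink it".  A minimal sketch in
Python rather than as interpreter flags (the flags themselves are
hypothetical, as the message notes; `show_cache_file` here is a
caller-supplied stand-in for whatever lookup the interpreter would
expose):

```python
import os
import tempfile

def delete_cache(source_path, show_cache_file):
    # Remove the cached byte code for source_path, if any.
    # show_cache_file is a stand-in for the hypothetical
    # `python --show-cache-file` query described above; it returns the
    # cache path for a source file, or None if there is none.
    cache = show_cache_file(source_path)
    if cache is not None and os.path.exists(cache):
        os.remove(cache)
        return True
    return False

# Usage with a throwaway cache file and a dict as the lookup:
with tempfile.TemporaryDirectory() as tmp:
    pyc = os.path.join(tmp, 'deadbeef.pyc')
    open(pyc, 'w').close()
    print(delete_cache('foo.py', {'foo.py': pyc}.get))  # True
    print(os.path.exists(pyc))                          # False
```

Whatever location is eventually chosen for the cache files, a helper
like this is only as good as the interpreter's own lookup, which is
Ben's point: the user should never have to reconstruct the path by hand.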
Re: [Python-Dev] PEP 3146: Merge Unladen Swallow into CPython
On Tue, Feb 2, 2010 at 23:54, Collin Winter wrote:

> Done. The diff is at
> http://codereview.appspot.com/186247/diff2/5014:8003/7002. I listed
> Cython, Shedskin and a bunch of other alternatives to pure CPython.
> Some of that information is based on conversations I've had with the
> respective developers, and I'd appreciate corrections if I'm out of
> date.

Thanks, that's a very good list (and I think it makes for a useful
addition to the PEP).

Cheers,

Dirkjan
