[Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Larry Hastings



It's time to discuss Argument Clinic again.  I think the
implementation is ready for public scrutiny.

(It was actually ready a week ago, but I lost a couple of
days to "make distclean" corrupting my hg data store--yes,
I hadn't upped my local clinic branch in a while.  Eventually
I gave up on repairing it and just brute-forced it.  Anyway...)

My Clinic test branch is here:
https://bitbucket.org/larry/python-clinic/

And before you ask, no, the above branch should never ever
ever be merged back into trunk.  We'll start clean once Clinic
is ready for merging and do a nice neat job.

___


There's no documentation, apart from the PEP.  But you can see
plenty of examples of using Clinic; just grep for the string
"clinic" in */*.c.  For reference, here's the list:
Modules/_cursesmodule.c
Modules/_datetimemodule.c
Modules/_dbmmodule.c
Modules/posixmodule.c
Modules/unicodedata.c
Modules/_weakref.c
Modules/zlibmodule.c
Objects/dictobject.c
Objects/unicodeobject.c

I haven't reimplemented every PyArg_ParseTuple "format unit"
in the retooled Clinic, so it's not ready to try with every
single builtin yet.

The syntax is as Guido dictated it during our meeting after
the Language Summit at PyCon US 2013.  The implementation has
been retooled, several times, and is now both nicer and more
easily extensible.  The internals are just a little messy,
but the external interfaces are all ready for critique.

___

Here are the external interfaces as I foresee them.

If you add your own data types, you'll subclass
"Converter" and maybe "ReturnConverter".  Take a
look at the existing subclasses to get a feel for
what that's like.

If you implemented your own DSL, you'd make something
that quacked like "PythonParser" (implementing __init__
and parse methods), and you'd deal with "Block",
"Module", "Class", "Function", and "Parameter" objects
a lot.

What do you think?

___


What follows are six questions I'd like to put to the community,
ranked oddly enough in order of how little to how much I
care about the answer.

BTW, by convention, every time I need an arbitrary sample
function I use "os.stat".

(Please quote the question line in your responses,
otherwise I fear we'll get lost in the sea of text.)

___
Question 0: How should we integrate Clinic into the build process?

Clinic presents a catch-22: you want it as part of the build process,
but it needs Python to be built before it'll run.  Currently it
requires Python 3.3 or newer; it might work in 3.2, I've never
tried it.

We can't depend on Python 3 being available when we build.
This complicates the build process somewhat.  I imagine it's a
solvable problem on UNIX... with the right wizardry.  I have no
idea how one'd approach it on Windows, but obviously we need to
solve the problem there too.

___
Question 1: Which C function nomenclature?

Argument Clinic generates two function prototypes per Python
function: one specifying one of the traditional signatures for
builtins, whose code is generated completely by Clinic, and the
other with a custom-generated signature for just that call, whose
code is written by the user.

Currently the former doesn't have any specific name, though I
have been thinking of it as the "parse" function.  The latter
is definitely called the "impl" (pronounced IM-pull), short
for "implementation".

When Clinic generates the C code, it uses the name of the Python
function to create the C functions' names, with underscores in
place of dots.  Currently the "parse" function gets the base name
("os_stat"), and the "impl" function gets an "_impl" added to the
end ("os_stat_impl").

Argument Clinic is agnostic about the names of these functions.
It's possible it'd be nicer to name these the other way around,
say "os_stat_parse" for the parse function and "os_stat" for the
impl.

Anyone have a strong opinion one way or the other?  I don't much
care; all I can say is that the "obvious" way to do it when I
started was to add "_impl" to the impl, as it is the new creature
under the sun.

___
Question 2: Emit code for modules and classes?

Argument Clinic now understands the structure of the
modules and classes it works with.  You declare them
like so:

module os
class os.ImaginaryClassHere
def os.ImaginaryClassHere.stat(...):
...
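
(For a sense of what "emitting code for modules" could mean, here is
the standard CPython scaffolding such a feature might generate some
day; this is only a hedged sketch with invented docstring names, not
anything Clinic produces today:)

    static PyMethodDef os_methods[] = {
        {"stat", (PyCFunction)os_stat, METH_VARARGS | METH_KEYWORDS,
         os_stat__doc__},
        {NULL, NULL}                    /* sentinel */
    };

    static struct PyModuleDef osmodule = {
        PyModuleDef_HEAD_INIT,
        "os",                           /* m_name */
        os__doc__,                      /* m_doc */
        -1,                             /* m_size */
        os_methods
    };

    PyMODINIT_FUNC
    PyInit_os(void)
    {
        return PyModule_Create(&osmodule);
    }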

Currently it does very little with the information; right
now it mainly just gets baked into the documentation.
In the future I expect it to get used in the introspection
metadata, and it'll definitely be relevant to external
consumers of the Argument Clinic information (IDEs building
per

Re: [Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Armin Rigo
Hi Larry,

On Mon, Aug 5, 2013 at 10:48 AM, Larry Hastings  wrote:
> Question 4: Return converters returning success/failure?

The option generally used elsewhere is: if we throw an exception, we
return some special value; but the special value doesn't necessarily
mean by itself that an exception was set.  It's a reasonable solution
because the caller only needs to call PyErr_Occurred() for one special
value, rather than every time.  See for example any call to
PyFloat_AsDouble().
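
For reference, the calling pattern looks like this (a sketch of the
documented PyFloat_AsDouble() convention):

    double value = PyFloat_AsDouble(obj);
    if (value == -1.0 && PyErr_Occurred()) {
        /* A real error: -1.0 is only the "maybe failed" marker. */
        return NULL;
    }
    /* Otherwise value is valid, even if it is genuinely -1.0. */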


A bientôt,

Armin.


Re: [Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Nick Coghlan
On 5 August 2013 18:48, Larry Hastings  wrote:
> Question 0: How should we integrate Clinic into the build process?
>
> Clinic presents a catch-22: you want it as part of the build process,
> but it needs Python to be built before it'll run.  Currently it
> requires Python 3.3 or newer; it might work in 3.2, I've never
> tried it.
>
> We can't depend on Python 3 being available when we build.
> This complicates the build process somewhat.  I imagine it's a
> solvable problem on UNIX... with the right wizardry.  I have no
> idea how one'd approach it on Windows, but obviously we need to
> solve the problem there too.

Isn't solving the bootstrapping problem the reason for checking in the
clinic-generated output? If there's no Python available, we build what
we have (without the clinic step), then we build it again *with* the
clinic step.

> ___
> Question 1: Which C function nomenclature?

> Anyone have a strong opinion one way or the other?  I don't much
> care; all I can say is that the "obvious" way to do it when I
> started was to add "_impl" to the impl, as it is the new creature
> under the sun.

Consider this from the client side, and I believe it answers itself:
other code in the module will be expecting the existing signature, so
that signature needs to stay with the existing name, while the new C
implementation function gets the new name.

> ___
> Question 2: Emit code for modules and classes?
>
> There are some complications to this, one of which I'll
> discuss next.  But I put it to you, gentle reader: how
> much boilerplate should Argument Clinic undertake to
> generate, and how much more class and module metadata
> should be wired in to it?

I strongly recommend deferring this. Incremental development is good,
and getting this bootstrapped at all is going to be challenging enough
without trying to do everything at once.

> ___
> Question 3: #ifdef support for functions?
>
>
> Do you agree that Argument Clinic should generate this
> information, and it should use the approach in 3) ?

I think you should postpone anything related to modules and classes
until the basic function support is in and working.

> ___
> Question 4: Return converters returning success/failure?
>
> Can we live with PyErr_Occurred() here?

Armin's suggestion of a valid return value (say, -1) that indicates
"error may have occurred" sounds good to me.

> ___
> Question 5: Keep too-magical class decorator Converter.wrap?
>
> I'd like to keep it in, and anoint it as the preferred way
> of declaring Converter subclasses.  Anybody else have a strong
> opinion on this either way?

Can't you get the same effect without the magic by having a separate
"custom_init" method that the main __init__ method promises to call
with the extra keyword args after finishing the other parts of the
initialization? Then a custom converter would just look like:

    class path_t_converter(Converter):
        def custom_init(self, *, allow_fd=False, nullable=False):
            ...

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Inheritance of file descriptors and handles on Windows (PEP 446)

2013-08-05 Thread Victor Stinner
2013/7/27 Nick Coghlan :
>> Do we even need a new PEP, or should we just do it? Or can we adapt
>> Victor's PEP 446?
>
> Adapting the PEP sounds good - while I agree with switching to a sane
> default, I think the daemonisation thread suggests there may need to
> be a supporting API to help force FDs created by nominated logging
> handlers to be inherited.

Why would a Python logging handler be used in a child process? If the
child process is a fresh Python process, it starts with the default
logging handlers (no handler).

Files opened by the logging module must be closed on exec().

Victor


Re: [Python-Dev] Inheritance of file descriptors and handles on Windows (PEP 446)

2013-08-05 Thread Victor Stinner
I checked the python-daemon module: it closes all open file
descriptors except 0, 1, 2. It has a files_preserve attribute to keep
some FDs open. It redirects stdin, stdout and stderr to /dev/null and
keeps these file descriptors open. If python-daemon is used to execute
a new program, the files_preserve list can be used to mark these file
descriptors as inherited.

The zdaemon.zdrun module closes all open file descriptors except 0, 1,
2. It also uses dup2() to redirect stdout and stderr to the write end
of a pipe.
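
(Roughly, that redirection looks like the following sketch; error
handling omitted:)

    #include <unistd.h>

    int fds[2];
    if (pipe(fds) == 0) {
        dup2(fds[1], 1);    /* stdout -> write end of the pipe */
        dup2(fds[1], 2);    /* stderr -> write end of the pipe */
    }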

Victor

2013/7/25 Eric V. Smith :
> On 7/24/2013 6:25 PM, Guido van Rossum wrote:
 To reduce the need for 3rd party subprocess creation code, we should
 have better daemon creation code in the stdlib -- I wrote some damn
 robust code for this purpose in my previous job, but it never saw the
 light of day.
>>>
>>> What do you call "daemon"? An actual Unix-like daemon?
>>
>> Yeah, a background process with parent PID 1 and not associated with
>> any terminal group.
>
> There's PEP 3143 and https://pypi.python.org/pypi/python-daemon. I've
> used it often, with great success.
>
> --
> Eric.


Re: [Python-Dev] Inheritance of file descriptors and handles on Windows (PEP 446)

2013-08-05 Thread Nick Coghlan
On 5 August 2013 22:52, Victor Stinner  wrote:
> I checked the python-daemon module: it closes all open file
> descriptors except 0, 1, 2. It has a files_preserve attribute to keep
> some FDs open. It redirects stdin, stdout and stderr to /dev/null and
> keeps these file descriptors open. If python-daemon is used to execute
> a new program, the files_preserve list can be used to mark these file
> descriptors as inherited.
>
> The zdaemon.zdrun module closes all open file descriptors except 0, 1,
> 2. It also uses dup2() to redirect stdout and stderr to the write end
> of a pipe.

So closed by default, and directing people towards subprocess and
python-daemon if they need to keep descriptors open sounds really
promising.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Glenn Linderman

On 8/5/2013 1:48 AM, Larry Hastings wrote:


The impl should know whether or not it failed.  So it's the
interface we're defining that forces it to throw away that
information.  If we provided a way for it to return that
information, we could shave off some cycles.  The problem
is, how do we do that in a way that doesn't suck?
...
Can we live with PyErr_Occurred() here?


Isn't there another option? To have the impl call a special "failed" 
clinic API, prior to returning failure? And if that wasn't called, then 
the return is success. Or does that require the same level of overhead 
as PyErr_Occurred?


Reducing the chances of PyErr_Occurred per Armin's suggestion seems good 
if the above is not an improvement.


Re: [Python-Dev] Inheritance of file descriptors and handles on Windows (PEP 446)

2013-08-05 Thread Antoine Pitrou
On Mon, 5 Aug 2013 14:56:06 +0200
Victor Stinner  wrote:
> 2013/7/27 Nick Coghlan :
> >> Do we even need a new PEP, or should we just do it? Or can we adapt
> >> Victor's PEP 446?
> >
> > Adapting the PEP sounds good - while I agree with switching to a sane
> > default, I think the daemonisation thread suggests there may need to
> > be a supporting API to help force FDs created by nominated logging
> > handlers to be inherited.
> 
> Why would a Python logging handler be used in a child process? If the
> child process is a fresh Python process, it starts with the default
> logging handlers (no handler).
> 
> Files opened by the logging module must be closed on exec().

I agree with this. It is only on fork()-without-exec() that the
behaviour of python-daemon is actively anti-social.

Regards

Antoine.




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Antoine Pitrou

Hi,

On Sun, 04 Aug 2013 09:23:41 +0200
Stefan Behnel  wrote:
> 
> I'm currently catching up on PEP 442, which managed to fly completely below
> my radar so far. It's a really helpful change that could end up fixing a
> major usability problem that Cython was suffering from: user provided
> deallocation code now has a safe execution environment (well, at least in
> Py3.4+). That makes Cython a prime candidate for testing this, and I've
> just started to migrate the implementation.

That's great to hear. "Safe execution environment" for finalization
code is exactly what the PEP is about.

> One thing that I found to be missing from the PEP is inheritance handling.
> The current implementation doesn't seem to care about base types at all, so
> it appears to be the responsibility of the type to call its super type
> finalisation function. Is that really intended?

Yes, it is intended that users have to call super().__del__() in their
__del__ implementation, if they want to call the upper-level finalizer.
This is exactly the same as in __init__() and (most?) other special
functions.

> Another bit is the exception handling. According to the documentation,
> tp_finalize() is supposed to first save the current exception state, then
> do the cleanup, then call WriteUnraisable() if necessary, then restore the
> exception state.
> 
> http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize
> 
> Is there a reason why this is left to the user implementation, rather than
> doing it generically right in PyObject_CallFinalizer() ? That would also
> make it more efficient to call through the super type hierarchy, I guess. I
> don't see a need to repeat this exception state swapping at each level.

I didn't give much thought to this detail. Originally I was simply
copying this bit of semantics from tp_dealloc and tp_del, but indeed we
could do better. Do you want to open an issue about it?

Regards

Antoine.




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Antoine Pitrou
On Sun, 04 Aug 2013 17:59:57 +0200
Stefan Behnel  wrote:
> > I continued my implementation and found that calling up the base type
> > hierarchy is essentially the same code as calling up the hierarchy for
> > tp_dealloc(), so that was easy to adapt to in Cython and is also more
> > efficient than a generic loop (because it can usually benefit from
> > inlining). So I'm personally ok with leaving the super type calling code to
> > the user side, even though manual implementers may not be entirely happy.
> > 
> > I think it should get explicitly documented how subtypes should deal with a
> > tp_finalize() in (one of the) super types. It's not entirely trivial
> > because the tp_finalize slot is not guaranteed to be filled for a super
> > type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that
> > PyType_Ready() copies it would still hold, though.

Not only could it be NULL (if no upper type has a finalizer), but it
could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in
tp_flags). If an API is needed to make this easier, then why not. But
I'm not sure anyone other than Cython really has such concerns. Usually,
the class hierarchy for C extension types is known at compile-time and
therefore you know exactly which upper finalizers to call.

> Hmm, it seems to me by now that the only safe way of handling this is to
> let each tp_dealloc() level in the hierarchy call tp_finalize() through
> PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in
> tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc()
> functions in base types and subtypes.

I'm not following you. Why is it "a bit fragile" to call the base
tp_finalize from a derived tp_finalize? It should actually be totally
safe, since tp_finalize is a regular function called in a safe
environment (unlike tp_dealloc and tp_del).
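
(At the C level, a chained finalizer might look roughly like the
sketch below; PyBase_Type is a placeholder for a statically known
base type, and the exception save/restore follows the tp_finalize
documentation:)

    static void
    derived_finalize(PyObject *self)
    {
        PyObject *error_type, *error_value, *error_traceback;

        /* Save the current exception, if any. */
        PyErr_Fetch(&error_type, &error_value, &error_traceback);

        /* ... this type's own cleanup ... */

        /* Explicitly chain up to the base type's finalizer. */
        if (PyBase_Type.tp_finalize != NULL)
            PyBase_Type.tp_finalize(self);

        /* Restore the saved exception. */
        PyErr_Restore(error_type, error_value, error_traceback);
    }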

Regards

Antoine.




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Stefan Behnel
Hi,

I was just continuing in my monologue when you replied, so I'll just drop
my response below.

Antoine Pitrou, 05.08.2013 20:51:
> On Sun, 04 Aug 2013 09:23:41 +0200
> Stefan Behnel wrote:
>> I'm currently catching up on PEP 442, which managed to fly completely below
>> my radar so far. It's a really helpful change that could end up fixing a
>> major usability problem that Cython was suffering from: user provided
>> deallocation code now has a safe execution environment (well, at least in
>> Py3.4+). That makes Cython a prime candidate for testing this, and I've
>> just started to migrate the implementation.
> 
> That's great to hear. "Safe execution environment" for finalization
> code is exactly what the PEP is about.
> 
>> One thing that I found to be missing from the PEP is inheritance handling.
>> The current implementation doesn't seem to care about base types at all, so
>> it appears to be the responsibility of the type to call its super type
>> finalisation function. Is that really intended?
> 
> Yes, it is intended that users have to call super().__del__() in their
> __del__ implementation, if they want to call the upper-level finalizer.
> This is exactly the same as in __init__() and (most?) other special
> functions.

That's the Python side of things. However, if a subtype overwrites
tp_finalize(), then there should be a protocol for making sure the super
type's tp_finalize() is called, and that it's being called in the right
kind of execution environment.


>> Another bit is the exception handling. According to the documentation,
>> tp_finalize() is supposed to first save the current exception state, then
>> do the cleanup, then call WriteUnraisable() if necessary, then restore the
>> exception state.
>>
>> http://docs.python.org/3.4/c-api/typeobj.html#PyTypeObject.tp_finalize
>>
>> Is there a reason why this is left to the user implementation, rather than
>> doing it generically right in PyObject_CallFinalizer() ? That would also
>> make it more efficient to call through the super type hierarchy, I guess. I
>> don't see a need to repeat this exception state swapping at each level.
> 
> I didn't give much thought to this detail. Originally I was simply
> copying this bit of semantics from tp_dealloc and tp_del, but indeed we
> could do better. Do you want to open an issue about it?


I think the main problem I have with the PEP is this part:

"""
The PEP doesn't change the semantics of:
 * C extension types with a custom tp_dealloc function.
"""

Meaning, it was designed to explicitly ignore this use case. That's a
mistake, IMHO. If we are to add a new finalisation protocol, why not make
it work in the general case (i.e. fix the problem once and for all),
instead of restricting it to a special case and leaving the rest to each
user to figure out again?

Separating the finalisation from the deallocation is IMO a good idea. It
fixes cyclic garbage collection, that's excellent. And it removes the
differences between GC and normal refcounting cleanup by clarifying in what
states finalisation and deallocation are executed (one safe, one not).

I think what's missing is the following.

* split deallocation into a distinct finalisation and deallocation phase

* make finalisation run recursively (or iteratively) through all
inheritance levels, in a well defined execution environment (e.g. after
saving away the exception state)

* after successful finalisation, run the deallocation, as before.

My guess is that a recursive finalisation phase where subtype code calls
into the supertype is generally more efficient, so I think I'd prefer that.

I think it's a mistake that the current implementation calls the
finalisation from tp_dealloc(). Instead, both the finalisation and the
deallocation should be called externally and independently from the cleanup
mechanism behind Py_DECREF(). (There is no problem in CS that can't be
solved by adding another level of indirection...)

An obvious open question is how to deal with exceptions during
finalisation. Any break in the execution chain would mean that a part of
the type wouldn't be finalised. One way to handle this could be to simply
assume that the deallocation phase would still clean up anything that's
left over. Or the protocol could dictate that each level must swallow its
own exceptions and call the super type finaliser with a clean exception
state. This might suggest that an external iterative call loop through the
finaliser hierarchy has a usability advantage over recursive calls.

Just dropping my idea here.

Stefan




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Antoine Pitrou
On Mon, 05 Aug 2013 21:03:33 +0200
Stefan Behnel  wrote:
> I think the main problem I have with the PEP is this part:
> 
> """
> The PEP doesn't change the semantics of:
>  * C extension types with a custom tp_dealloc function.
> """
> 
> Meaning, it was designed to explicitly ignore this use case.

It doesn't ignore it. It lets you fix your C extension type to use
tp_finalize for resource finalization. It also provides the
PyObject_CallFinalizerFromDealloc() API function to make it easier to
call tp_finalize from your tp_dealloc.

What the above sentence means is that, if you don't change your legacy
tp_dealloc, your type will not take advantage of the new facilities.

(you can take a look at the _io module; it was modified to take
advantage of tp_finalize)

> * make finalisation run recursively (or iteratively) through all
> inheritance levels, in a well defined execution environment (e.g. after
> saving away the exception state)

__init__ and other methods only let the user recurse explicitly.
__del__ would be a weird exception if it recursed implicitly. Also, it
would break backwards compatibility for existing uses of __del__.

> I think it's a mistake that the current implementation calls the
> finalisation from tp_dealloc(). Instead, both the finalisation and the
> deallocation should be called externally and independently from the cleanup
> mechanism behind Py_DECREF(). (There is no problem in CS that can't be
> solved by adding another level of indirection...)

Why not, but I'm not sure that would solve anything on your side. If it
does, would you like to cook a patch? I wonder if there's some
unexpected issue with doing what you're proposing.

> An obvious open question is how to deal with exceptions during
> finalisation. Any break in the execution chain would mean that a part of
> the type wouldn't be finalised.

Let's come back to pure Python:

    class A:
        def __del__(self):
            1/0

    class B(A):
        def __del__(self):
            super().__del__()
            self.cleanup_resources()


If you want cleanup_resources() to be called always (despite
A.__del__() raising), you have to use a try/finally block. There's no
magic here.

Letting the user call the upper finalizer explicitly lets them choose
their preferred form of exception handling.

Regards

Antoine.




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Stefan Behnel
Antoine Pitrou, 05.08.2013 20:56:
> On Sun, 04 Aug 2013 17:59:57 +0200
> Stefan Behnel wrote:
>>> I continued my implementation and found that calling up the base type
>>> hierarchy is essentially the same code as calling up the hierarchy for
>>> tp_dealloc(), so that was easy to adapt to in Cython and is also more
>>> efficient than a generic loop (because it can usually benefit from
>>> inlining). So I'm personally ok with leaving the super type calling code to
>>> the user side, even though manual implementers may not be entirely happy.
>>>
>>> I think it should get explicitly documented how subtypes should deal with a
>>> tp_finalize() in (one of the) super types. It's not entirely trivial
>>> because the tp_finalize slot is not guaranteed to be filled for a super
>>> type IIUC, as opposed to tp_dealloc. I assume the recursive invariant that
>>> PyType_Ready() copies it would still hold, though.
> 
> Not only could it be NULL (if no upper type has a finalizer), but it
> could also not exist at all (if Py_TPFLAGS_HAVE_FINALIZE isn't in
> tp_flags). If an API is needed to make this easier, then why not. But
> I'm not sure anyone other than Cython really has such concerns. Usually,
> the class hierarchy for C extension types is known at compile-time and
> therefore you know exactly which upper finalizers to call.

Well, you shouldn't have to, though. Otherwise, it would be practically
impossible to insert a new finaliser into an existing hierarchy once other
people/projects have started inheriting from it.

And, sure, Cython makes these things so easy that people actually do them.
The Sage math system has type hierarchies that go up to 10 levels deep
IIRC. That's a lot of space for future changes.


>> Hmm, it seems to me by now that the only safe way of handling this is to
>> let each tp_dealloc() level in the hierarchy call tp_finalize() through
>> PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in
>> tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc()
>> functions in base types and subtypes.

I think I got confused here. PyObject_CallFinalizerFromDealloc() works on
the object, not the type. So it can't be used to call anything but the
bottom-most tp_finalize().


> I'm not following you. Why is it "a bit fragile" to call the base
> tp_finalize from a derived tp_finalize? It should actually be totally
> safe, since tp_finalize is a regular function called in a safe
> environment (unlike tp_dealloc and tp_del).

As long as there is no OWTDI, you can't really make safe assumptions about
the way a super type's tp_finalize() and tp_dealloc() play together. The
details definitely need to be spelled out here.

Stefan




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Antoine Pitrou
On Mon, 05 Aug 2013 21:32:54 +0200
Stefan Behnel  wrote:
> 
> >> Hmm, it seems to me by now that the only safe way of handling this is to
> >> let each tp_dealloc() level in the hierarchy call tp_finalize() through
> >> PyObject_CallFinalizerFromDealloc(), instead of calling up the stack in
> >> tp_finalize(). Otherwise, it's a bit fragile for arbitrary tp_dealloc()
> >> functions in base types and subtypes.
> 
> I think I got confused here. PyObject_CallFinalizerFromDealloc() works on
> the object, not the type. So it can't be used to call anything but the
> bottom-most tp_finalize().

Well, the bottom-most tp_finalize() is responsible for calling the upper
ones, if it wants to.

> > I'm not following you. Why is it "a bit fragile" to call the base
> > tp_finalize from a derived tp_finalize? It should actually be totally
> > safe, since tp_finalize is a regular function called in a safe
> > environment (unlike tp_dealloc and tp_del).
> 
> As long as there is no OWTDI, you can't really make safe assumptions about
> the way a super type's tp_finalize() and tp_dealloc() play together. The
> details definitely need to be spelled out here.

I'd be glad to make the spec more explicit if needed, but first you
need to tell me if the current behaviour is ok, or if you need
something else (within the boundaries of backwards compatibility and
reasonable expectations, though: i.e. no implicit recursion
through the __mro__).

Regards

Antoine.




Re: [Python-Dev] PEP 442 clarification for type hierarchies

2013-08-05 Thread Stefan Behnel
Antoine Pitrou, 05.08.2013 21:26:
> On Mon, 05 Aug 2013 21:03:33 +0200
> Stefan Behnel wrote:
>> I think the main problem I have with the PEP is this part:
>>
>> """
>> The PEP doesn't change the semantics of:
>>  * C extension types with a custom tp_dealloc function.
>> """
>>
>> Meaning, it was designed to explicitly ignore this use case.
> 
> It doesn't ignore it. It lets you fix your C extension type to use
> tp_finalize for resource finalization. It also provides the
> PyObject_CallFinalizerFromDealloc() API function to make it easier to
> call tp_finalize from your tp_dealloc.
> 
> What the above sentence means is that, if you don't change your legacy
> tp_dealloc, your type will not take advantage of the new facilities.
> 
> (you can take a look at the _io module; it was modified to take
> advantage of tp_finalize)

Oh, I'm aware of the backwards compatibility, but I totally *want* to take
advantage of the new feature.


>> * make finalisation run recursively (or iteratively) through all
>> inheritance levels, in a well defined execution environment (e.g. after
>> saving away the exception state)
> 
> __init__ and other methods only let the user recurse explicitly.
> __del__ would be a weird exception if it recursed implicitly. Also, it
> would break backwards compatibility for existing uses of __del__.

Hmm, it's a bit unfortunate that tp_finalize() maps so directly to
__del__(), but I think this can be fixed. In any case, each tp_finalize()
function must only ever be called once, so if a subtype inherited the
tp_finalize() slot from its parent, it mustn't be called again. Instead,
the hierarchy would be followed upwards to search for the next
tp_finalize() that's different from the current one, i.e. the function
pointer differs. That means that only the top-most super type would need to
call __del__(), after all tp_finalize() functions in subtypes have run.
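
(A sketch of that slot walk; "destructor" is CPython's typedef for
the tp_finalize signature, and the details here are illustrative:)

    static destructor
    next_tp_finalize(PyTypeObject *type, destructor current)
    {
        /* Walk up the bases until we find a tp_finalize that differs
           from the one currently running, so an inherited slot is
           never run twice. */
        PyTypeObject *base = type->tp_base;
        while (base != NULL) {
            if (PyType_HasFeature(base, Py_TPFLAGS_HAVE_FINALIZE)
                    && base->tp_finalize != NULL
                    && base->tp_finalize != current)
                return base->tp_finalize;
            base = base->tp_base;
        }
        return NULL;
    }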


>> I think it's a mistake that the current implementation calls the
>> finalisation from tp_dealloc(). Instead, both the finalisation and the
>> deallocation should be called externally and independently from the cleanup
>> mechanism behind Py_DECREF(). (There is no problem in CS that can't be
>> solved by adding another level of indirection...)
> 
> Why not, but I'm not sure that would solve anything on your side.

Well, it would untangle both phases and make it clearer what needs to call
what. If I call my super type's tp_dealloc(), does it need to call
tp_finalize() or not? And if it does: the type's one or just its own one?

If both phases are split, then the answer is simply: it doesn't and
mustn't, because that's already been taken care of. All it has to care
about is what it's there for: deallocation.


> If it does, would you like to cook a patch? I wonder if there's some
> unexpected issue with doing what you're proposing.

I was somehow expecting this question. There's still the open issue of
module initialisation, though. Not sure which is more important. Given that
this feature has already been merged, I guess it's better to fix it up
before people start making actual use of it, instead of putting work into a
new feature that's not even started yet.


>> An obvious open question is how to deal with exceptions during
>> finalisation. Any break in the execution chain would mean that a part of
>> the type wouldn't be finalised.
> 
> Let's come back to pure Python:
> 
> class A:
>     def __del__(self):
>         1/0
>
> class B(A):
>     def __del__(self):
>         super().__del__()
>         self.cleanup_resources()

What makes you think it's a good idea to call the parent type's finaliser
before doing the local finalisation, and not the other way round? What if
the subtype needs access to parts of the super type for its cleanup?

In other words, which makes more sense (at the C level):

    try:
        super().tp_finalize()
    finally:
        local_cleanup()

or

    try:
        local_cleanup()
    finally:
        super().tp_finalize()

Should that order be part of the protocol or not? (well, not for __del__()
I guess, but maybe for tp_finalize()?)

Coming back to the __del__() vs. tp_finalize() story, if tp_finalize()
first recursed into the super types, the top-most one of which then calls
__del__() and returns, we'd get an execution order that runs Python-level
__del__() methods before C-level tp_finalize() functions, but lose the
subtype-before-supertype execution order for tp_finalize() functions.

That might call for a three-step cleanup:

1) run all Python __del__() methods recursively
2) run all tp_finalize() functions recursively
3) run tp_dealloc() recursively

This would allow all three call chains to run in subtype-before-supertype
order, and execute the more sensitive Python methods before the low-level
"I know what I'm doing" C finalisers.


> If you want cleanup_resources() to be called always (despite
> A.__del__() raising), you have to use a try/finally block. There's no
> magic here.
> 
> Letting the user call the upper finalizer explicitly lets them choose
> their preferred form of exception handling.

Re: [Python-Dev] unittest.TestSuite holding references to unittest.TestCase instances too long

2013-08-05 Thread Matt McClure
On Sat, Aug 3, 2013 at 3:27 PM, Michael Foord wrote:

> It smells to me like a new feature rather than a bugfix, and it's a
> moderately big change. I don't think it can be backported to 2.7 other than
> through unittest2.
>

Is http://hg.python.org/unittest2 the place to backport to unittest2?

-- 
Matt McClure
http://matthewlmcclure.com
http://www.mapmyfitness.com/profile/matthewlmcclure


Re: [Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Larry Hastings



On 08/05/2013 02:55 AM, Nick Coghlan wrote:

On 5 August 2013 18:48, Larry Hastings  wrote:

Question 0: How should we integrate Clinic into the build process?

Isn't solving the bootstrapping problem the reason for checking in the
clinic-generated output? If there's no Python available, we build what
we have (without the clinic step), then we build it again *with* the
clinic step.


It solves the bootstrapping problem, but that's not the only problem 
Clinic presents to the development workflow.


If you modify some Clinic DSL in a C file in the CPython tree, then run 
"make", should the Makefile re-run Clinic over that file?  If you say 
"no", then there's no problem.  If you say "yes", then we have the 
problem I described.




___
Question 1: Which C function nomenclature?
Consider this from the client side, and I believe it answers itself: 
other code in the module will be expecting the existing signature, so 
that signature needs to stay with the existing name, while the new C 
implementation function gets the new name.


One vote for "os_stat_impl".  Bringing the sum total of votes up to 1!  ;-)



___
Question 2: Emit code for modules and classes?

There are some complications to this, one of which I'll
discuss next.  But I put it to you, gentle reader: how
much boilerplate should Argument Clinic undertake to
generate, and how much more class and module metadata
should be wired in to it?

I strongly recommend deferring this. Incremental development is good,
and getting this bootstrapped at all is going to be challenging enough
without trying to do everything at once.


I basically agree.  But you glossed over an important part of that 
question, "how much more class and module metadata should be wired in 
right now?".


Originally Clinic didn't ask for full class and module information, you 
just specified the full dotted path and that was that.  But that's 
ambiguous; Clinic wouldn't be able to infer what was a module vs what 
was a class.  And in the future, if/when it generates module and class 
boilerplate, obviously it'll need to know the distinction.  I figure, 
specifying the classes and modules doesn't add a lot of additional cost, 
but it'll very likely save us a lot of time in the long run, so I made 
it a requirement.  (WAGNI!)


Anyway, I guess what I was really kind of trying to get at here was:
a) are there any other obvious bits of metadata Clinic should require 
right now for functions,
b) what other metadata might Clinic take in the future--not because I 
want to add it, but just so we can figure out the next question,
c) to what degree can we future-proof Clinic 1.0 so extension authors 
can more easily straddle versions.


Thinking about it more with a fresh perspective, maybe all we need is a 
Clinic version number directive.  This would declare the minimum Clinic 
version--which would really just track the Python version it shipped 
with--that you may use to process this file. Like so:


   /*[clinic]
   clinic 3.5
   [clinic]*/

As long as the code Clinic generates is backwards compatible for Python 
3.4, I think this has it covered.  We may at times force developers 
to use fresher versions of Python to process Clinic stuff, but I don't 
think that's a big deal.




___
Question 4: Return converters returning success/failure?

Can we live with PyErr_Occurred() here?

Armin's suggestion of a valid return value (say, -1) that indicates
"error may have occurred" sounds good to me.


Yes indeed, and thanks Armin for pointing it out.  This works perfectly 
in Clinic, as each individual return converter controls the code 
generated for cleanup afterwards.  So it can be a completely local 
policy per-return-converter what the magic value is.  Heck, you could 
have multiple int converters with different magic return values (not 
that that seems like a good idea).
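
(Illustration only, not real Clinic output: an int return converter
whose magic value is -1 could emit an epilogue along these lines,
with "myfunc_impl" standing in for a hypothetical impl that returns
a plain int:)

    PyObject *return_value = NULL;
    int _return_value;

    _return_value = myfunc_impl(module, arg);
    if (_return_value == -1 && PyErr_Occurred())
        goto exit;  /* the impl signalled failure via the magic value */
    return_value = PyLong_FromLong((long)_return_value);
    exit:
        return return_value;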




___
Question 5: Keep too-magical class decorator Converter.wrap?

I'd like to keep it in, and anoint it as the preferred way
of declaring Converter subclasses.  Anybody else have a strong
opinion on this either way?

Can't you get the same effect without the magic by having a separate
"custom_init" method that the main __init__ method promises to call
with the extra keyword args after finishing the other parts of the
initialization? Then a custom converter would just look like:

    class path_t_converter(Converter):
        def custom_init(self, *, allow_fd=False, nullable=False):
            ...


I can get the same effect without reusing the name __init__, but I 
wouldn't say I can do it "without the magic".  The whole point of the 
decorator is magic.


Let's say I go with your proposal.  What happens if someone makes a 
Converter, and wraps it with Converter.wrap, and defines their own 
__init__?  It would never get called.  Silently, by default, which is 
worse--though I could explicitly detect such an __init__ and throw an 
exception I guess.  Still, now we have a class where you can't use the 
name __init__, you have to use this funny other name, for arbitrary 
"correctness" reasons.

[Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable

2013-08-05 Thread Victor Stinner
Hi,

My second try (the old PEP 446) at changing how file descriptors are
inherited was somehow rejected. Here is a third try (new PEP 446)
which should include all the information from the recent discussions
on python-dev, especially how file descriptors and handles are
inherited on Windows.

I added tables to summarize the status of Python 3.3 and the status of
atomic flags. I rewrote the rationale to introduce and explain the
inheritance of file descriptors, which is not a simple subject.

I removed the O_NONBLOCK flag from the PEP; it is a different issue
and a new PEP must be written.

Online HTML version:
http://www.python.org/dev/peps/pep-0446/

You may also want to read my first attempt:
http://www.python.org/dev/peps/pep-0433/


PEP: 446
Title: Make newly created file descriptors non-inheritable
Version: $Revision$
Last-Modified: $Date$
Author: Victor Stinner 
Status: Draft
Type: Standards Track
Content-Type: text/x-rst
Created: 5-August-2013
Python-Version: 3.4


Abstract
========

Leaking file descriptors in child processes causes various annoying
issues and is a known major security vulnerability. This PEP proposes to
make all file descriptors created by Python non-inheritable by default
to reduce the risk of these issues. This PEP also fixes a race
condition in multithreaded applications on operating systems supporting
atomic flags to create non-inheritable file descriptors.


Rationale
=========

Inheritance of File Descriptors
-------------------------------

Each operating system handles the inheritance of file descriptors
differently. Windows creates non-inheritable file descriptors by
default, whereas UNIX creates inheritable file descriptors by default.
Python prefers the POSIX API over the native Windows API to have a
single code base, and so it creates inheritable file descriptors.

There is one exception: ``os.pipe()`` creates non-inheritable pipes on
Windows, whereas it creates inheritable pipes on UNIX.  The reason is an
implementation artifact: ``os.pipe()`` calls ``CreatePipe()`` on Windows
(native API), whereas it calls ``pipe()`` on UNIX (POSIX API). The call
to ``CreatePipe()`` was added in Python in 1994, before the introduction
of ``pipe()`` in the POSIX API in Windows 98. The `issue #4708
<http://bugs.python.org/issue4708>`_ proposes to change ``os.pipe()`` on
Windows to create inheritable pipes.


Inheritance of File Descriptors on Windows
------------------------------------------

On Windows, the native type of file objects are handles (C type
``HANDLE``). These handles have a ``HANDLE_FLAG_INHERIT`` flag which
defines whether a handle can be inherited by a child process. For the
POSIX API, the C runtime (CRT) also provides file descriptors (C type
``int``). The handle of a file descriptor can be retrieved using the
function ``_get_osfhandle(fd)``. A file descriptor can be created from a
handle using the function ``_open_osfhandle(handle)``.
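
(A sketch of the CRT round trip, plus flipping the inheritance flag
on the underlying handle; ``fd`` is assumed to be an open descriptor
and error handling is omitted:)

    #include <windows.h>
    #include <io.h>        /* _get_osfhandle, _open_osfhandle */
    #include <fcntl.h>     /* _O_RDONLY */

    HANDLE h = (HANDLE)_get_osfhandle(fd);

    /* Make the handle inheritable; pass 0 as the last argument to
       clear the flag instead. */
    SetHandleInformation(h, HANDLE_FLAG_INHERIT, HANDLE_FLAG_INHERIT);

    /* Wrap an existing handle in a new file descriptor. */
    int fd2 = _open_osfhandle((intptr_t)h, _O_RDONLY);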

Using `CreateProcess()
`_,
handles are only inherited if their inheritable flag
(``HANDLE_FLAG_INHERIT``) is set and if the ``bInheritHandles``
parameter of ``CreateProcess()`` is ``TRUE``; all file descriptors
except standard streams (0, 1, 2) are closed in the child process, even
if ``bInheritHandles`` is ``TRUE``. Using the ``spawnv()`` function, all
inheritable handles and all inheritable file descriptors are inherited
in the child process. This function uses the undocumented fields
*cbReserved2* and *lpReserved2* of the `STARTUPINFO
`_
structure to pass an array of file descriptors.

To replace standard streams (stdin, stdout, stderr) using
``CreateProcess()``, the ``STARTF_USESTDHANDLES`` flag must be set in
the *dwFlags* field of the ``STARTUPINFO`` structure and the
*bInheritHandles* parameter of ``CreateProcess()`` must be set to
``TRUE``. So when at least one standard stream is replaced, all
inheritable handles are inherited by the child process.

See also:

* `Handle Inheritance
  
`_
* `Q315939: PRB: Child Inherits Unintended Handles During
  CreateProcess Call `_


Inheritance of File Descriptors on UNIX
---------------------------------------

POSIX provides a *close-on-exec* flag on file descriptors to
automatically close a file descriptor when the C function ``execv()`` is
called. File descriptors with the *close-on-exec* flag cleared are
inherited in the child process, file descriptors with the flag set are
closed in the child process.

The flag can be set in two syscalls (one to get current flags, a second
to set new flags) using ``fcntl()``::

    int flags, res;
    flags = fcntl(fd, F_GETFD);
    if (flags == -1) { /* handle the error */ }
    flags |= FD_CLOEXEC;
    /* or "flags &= ~FD_CLOEXEC;" to clear the flag */
    res = fcntl(fd, F_SETFD, flags);
    if (res == -1) { /* handle the error */ }
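
By contrast, the atomic flags mentioned in the abstract set
*close-on-exec* at creation time, leaving no window for a concurrent
``fork()``+``exec()`` (a sketch, assuming a platform providing
``O_CLOEXEC``, standardized in POSIX.1-2008)::

    int fd = open("/path/to/file", O_RDONLY | O_CLOEXEC);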

Re: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable

2013-08-05 Thread Richard Oudkerk

On 06/08/2013 1:23am, Victor Stinner wrote:

Each operating system handles the inheritance of file descriptors
differently. Windows creates non-inheritable file descriptors by
default, whereas UNIX creates inheritable file descriptors by default.


The Windows API creates non-inheritable *handles* by default.  But the C 
runtime creates inheritable fds by default.


Also the socket library creates sockets with inheritable handles by 
default.  Apparently there isn't a reliable way to make sockets 
non-inheritable because anti-virus/firewall software can interfere:



http://stackoverflow.com/questions/12058911/can-tcp-socket-handles-be-set-not-inheritable

--
Richard



Re: [Python-Dev] (New) PEP 446: Make newly created file descriptors non-inheritable

2013-08-05 Thread Victor Stinner
> On Windows, the ``subprocess`` module closes all handles and file
> descriptors in the child process by default. If at least one standard
> stream (stdin, stdout or stderr) is replaced (ex: redirected into a
> pipe), all inheritable handles are inherited in the child process.
>
> Summary:
>
> ===================  =============  ==================  =============
> Module               FD on UNIX     Handles on Windows  FD on Windows
> ===================  =============  ==================  =============
> subprocess, default  STD, pass_fds  none                STD

Oh, the summary table is wrong for the "subprocess, default" line: all
inheritable handles are inherited if at least one standard stream is
replaced.

Victor


Re: [Python-Dev] The Return Of Argument Clinic

2013-08-05 Thread Nick Coghlan
On 6 August 2013 09:53, Larry Hastings  wrote:
> On 08/05/2013 02:55 AM, Nick Coghlan wrote:
> On 5 August 2013 18:48, Larry Hastings  wrote:
>
> Question 0: How should we integrate Clinic into the build process?
>
> Isn't solving the bootstrapping problem the reason for checking in the
> clinic-generated output? If there's no Python available, we build what
> we have (without the clinic step), then we build it again *with* the
> clinic step.
>
> It solves the bootstrapping problem, but that's not the only problem Clinic
> presents to the development workflow.
>
> If you modify some Clinic DSL in a C file in the CPython tree, then run
> "make", should the Makefile re-run Clinic over that file?  If you say "no",
> then there's no problem.  If you say "yes", then we have the problem I
> described.

Ah, I think I see the problem you mean: what the makefile defines as
the way to regenerate an object file from the C file. If that rule is
"run clinic and then run the compiler", then you will get a
dependency loop. If it doesn't implicitly run clinic, then we risk
checking in inconsistent clinic metadata.

I think the simplest answer may be to have "make clinic" as an
explicit command, along with a commit hook that checks for clinic
metadata consistency. Then "make" doesn't have to change and there's
no nasty bootstrapping problem.


> ___
> Question 2: Emit code for modules and classes?
>
> There are some complications to this, one of which I'll
> discuss next.  But I put it to you, gentle reader: how
> much boilerplate should Argument Clinic undertake to
> generate, and how much more class and module metadata
> should be wired in to it?
>
> I strongly recommend deferring this. Incremental development is good,
> and getting this bootstrapped at all is going to be challenging enough
> without trying to do everything at once.
>
>
> I basically agree.  But you glossed over an important part of that question,
> "how much more class and module metadata should be wired in right now?".
>
> Originally Clinic didn't ask for full class and module information, you just
> specified the full dotted path and that was that.  But that's ambiguous;
> Clinic wouldn't be able to infer what was a module vs what was a class.  And
> in the future, if/when it generates module and class boilerplate, obviously
> it'll need to know the distinction.  I figure, specifying the classes and
> modules doesn't add a lot of additional cost, but it'll very likely save us
> a lot of time in the long run, so I made it a requirement.  (WAGNI!)

Note that setuptools entry point syntax solves the namespace ambiguity
problem by using ":" to separate the module name from the object's
name within the module (the nose test runner does the same thing). I'm
adopting that convention for the PEP 426 metadata, and it's probably
appropriate as a concise notation for clinic as well.

> As long as the code Clinic generates is backwards compatible for Python 3.4,
> I think this has it covered.  We may at times force developers to use
> fresher versions of Python to process Clinic stuff, but I don't think that's
> a big deal.

One of the nice things about explicitly versioned standards is that
you can define "no version stated" to mean the first released version and
avoid the boilerplate in the common case :)

> ___
> Question 5: Keep too-magical class decorator Converter.wrap?
> Let's say I go with your proposal.  What happens if someone makes a
> Converter, and wraps it with Converter.wrap, and defines their own __init__?
> It would never get called.  Silently, by default, which is worse--though I
> could explicitly detect such an __init__ and throw an exception I guess.
> Still, now we have a class where you can't use the name __init__, you have
> to use this funny other name, for arbitrary "correctness" reasons.

You misunderstand me: I believe a class decorator is the *wrong
solution*. I am saying Converter.wrap *shouldn't exist*, and that the
logic for what it does should be directly in Converter.__init__.

The additional initialisation method could be given a better name like
"process_custom_params" rather than "custom_init".

That is, instead of this hard to follow magic:

    @staticmethod
    def wrap(cls):
        class WrappedConverter(cls, Converter):
            def __init__(self, name, function, default=unspecified,
                         *, doc_default=None, required=False, **kwargs):
                super(cls, self).__init__(name, function, default,
                                          doc_default=doc_default,
                                          required=required)
                cls.__init__(self, **kwargs)
        return functools.update_wrapper(WrappedConverter, cls, updated=())

You would just have the simple:

    class Converter:
        def __init__(self, name, function, default=unspecified,
                     *, doc_default=None, required=False, **kwargs):
 

Re: [Python-Dev] Make extension module initialisation more like Python module initialisation

2013-08-05 Thread Stefan Behnel
Hi,

let me revive and summarize this old thread.

Stefan Behnel, 08.11.2012 13:47:
> I suspect that this will be put into a proper PEP at some point, but I'd
> like to bring this up for discussion first. This came out of issues 13429
> and 16392.
> 
> http://bugs.python.org/issue13429
> 
> http://bugs.python.org/issue16392
> 
> 
> The problem
> ===
> 
> Python modules and extension modules are not being set up in the same way.
> For Python modules, the module is created and set up first, then the module
> code is being executed. For extensions, i.e. shared libraries, the module
> init function is executed straight away and does both the creation and
> initialisation. This means that it knows neither the __file__ it is being
> loaded from nor its package (i.e. its FQMN). This hinders relative imports
> and resource loading. In Py3, it's also not being added to sys.modules,
> which means that a (potentially transitive) re-import of the module will
> really try to reimport it and thus run into an infinite loop when it
> executes the module init function again. And without the FQMN, it's not
> trivial to correctly add the module to sys.modules either.
> 
> We specifically run into this for Cython generated modules, for which it's
> not uncommon that the module init code has the same level of complexity as
> that of any 'regular' Python module. Also, the lack of a FQMN and correct
> file path hinders the compilation of __init__.py modules, i.e. packages,
> especially when relative imports are being used at module init time.

The outcome of this discussion was that the extension module import
protocol needs to change in order to provide all necessary information to
the module init function.

Brett Cannon proposed to move the module object creation into the extension
module importer, i.e. outside of the user provided module init function.
CPython would then load the extension module, create and initialise the
module object (set __file__, __name__, etc.) and pass it into the module
init function.

I proposed to make the PyModuleDef struct the new entry point instead of
just a generic C function, as that would give the module importer all
necessary information about the module to create the module object. The
only missing bit is the entry point for the new module init function.

Nick Coghlan objected to the proposal of simply extending PyModuleDef with
an initialiser function, as the struct is part of the stable ABI.

Alternatives I see:

1) Expose a struct that points to the extension module's PyModuleDef struct
and the init function and expose that struct instead.

2) Expose both the PyModuleDef and the init function as public symbols.

3) Provide a public C function as entry point that returns both a
PyModuleDef pointer and a module init function pointer.

4) Change the m_init function pointer in PyModuleDef_base from func(void)
to func(PyObject*) iff the PyModuleDef struct is exposed as a public symbol.

5) Duplicate PyModuleDef and adapt the new one as in 4).

Alternatives 1) and 2) only differ marginally by the number of public
symbols being exposed. 3) has the advantage of supporting more advanced
setups, e.g. heap allocation for the PyModuleDef struct. 4) is a hack and
has the disadvantage that the signature of the module init function cannot
be stored across reinitialisations (PyModuleDef has no "flags" or "state"
field to remember it). 5) would fix that, i.e. we could add a proper
pointer to the new module init function as well as a flags field for future
extensions. A similar effect could be achieved by carefully designing the
struct in 1).

I think 1-3 are all reasonable ways to do this, although I don't think 3)
will be necessary. 5) would be a clean fix, but has the disadvantage of
duplicating an entire struct just to change one field in it.

I'm currently leaning towards 1), with a struct that points to PyModuleDef,
module init function and a flags field for future extensions. I understand
that this would need to become part of the stable ABI, so explicit
extensibility is important to keep up backwards compatibility.
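
To make alternative 1) concrete, a rough sketch (all names invented
for illustration):

    /* One exported symbol per extension module, e.g.
       "PyModuleExport_spam" instead of today's "PyInit_spam". */
    typedef struct {
        unsigned long flags;            /* reserved for future extensions */
        PyModuleDef *def;               /* the module definition */
        int (*init)(PyObject *module);  /* called after module creation,
                                           with __file__/__name__ set */
    } PyModuleExport;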

Opinions?

Stefan




Re: [Python-Dev] Make extension module initialisation more like Python module initialisation

2013-08-05 Thread Nick Coghlan
On 6 August 2013 15:02, Stefan Behnel  wrote:
> Alternatives I see:
>
> 1) Expose a struct that points to the extension module's PyModuleDef struct
> and the init function and expose that struct instead.
>
> 2) Expose both the PyModuleDef and the init function as public symbols.
>
> 3) Provide a public C function as entry point that returns both a
> PyModuleDef pointer and a module init function pointer.
>
> 4) Change the m_init function pointer in PyModuleDef_base from func(void)
> to func(PyObject*) iff the PyModuleDef struct is exposed as a public symbol.
>
> 5) Duplicate PyModuleDef and adapt the new one as in 4).
>
> Alternatives 1) and 2) only differ marginally by the number of public
> symbols being exposed. 3) has the advantage of supporting more advanced
> setups, e.g. heap allocation for the PyModuleDef struct. 4) is a hack and
> has the disadvantage that the signature of the module init function cannot
> be stored across reinitialisations (PyModuleDef has no "flags" or "state"
> field to remember it). 5) would fix that, i.e. we could add a proper
> pointer to the new module init function as well as a flags field for future
> extensions. A similar effect could be achieved by carefully designing the
> struct in 1).
>
> I think 1-3 are all reasonable ways to do this, although I don't think 3)
> will be necessary. 5) would be a clean fix, but has the disadvantage of
> duplicating an entire struct just to change one field in it.
>
> I'm currently leaning towards 1), with a struct that points to PyModuleDef,
> module init function and a flags field for future extensions. I understand
> that this would need to become part of the stable ABI, so explicit
> extensibility is important to keep up backwards compatibility.
>
> Opinions?

I believe a better option would be to migrate module creation over to
a dynamic PyModule_Slot and PyModule_Spec approach in the stable ABI,
similar to the one that was defined for types in PEP 384.

A related topic is that over on import-sig, we're currently tinkering
with the idea of changing the way *Python* module imports happen to
include a separate "ImportSpec" object (exact name TBC). The spec
would contain preliminary info on all of the things that the import
system can figure out *without* actually importing the module. That
list includes all the special attributes that are currently set on
modules:

__loader__
__name__
__package__
__path__
__file__
__cached__

(Note that the attributes on the spec *may not* be the same as those
in the module's own namespace - for example, __name__ and
__spec__.name would differ in a module executed with -m, and __path__
and __spec__.path would end up differing in packages that directly
manipulated their __path__ attribute during __init__ execution)

The intent is to clean up some of the ad hoc hackery that was needed
to make PEP 420 work, and reduce the amount of duplicated
functionality needed in loader implementations.

If you wanted to reboot this thread on import-sig, that would probably
be a good thing :)

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] Make extension module initialisation more like Python module initialisation

2013-08-05 Thread Stefan Behnel
Nick Coghlan, 06.08.2013 07:35:
> If you wanted to reboot this thread on import-sig, that would probably
> be a good thing :)

Sigh. Yet another list to know about and temporarily follow...

The import-sig list doesn't seem to be mirrored on Gmane yet. Also, it
claims to be dead w.r.t. Py3.4:

"""
The intent is that this SIG will be re-retired after Python 3.3 is released.
"""

-> http://www.python.org/community/sigs/current/import-sig/

"""
Resurrected for landing PEP 382 in Python 3.3.
"""

-> http://mail.python.org/mailman/listinfo/import-sig

Seriously, wouldn't python-dev be just fine for this? It's not like the
import system is going to be rewritten for each minor release from now on.

Stefan

