[Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343

2005-11-28 Thread Nick Coghlan
Given the current semantics of PEP 343 and the following class:

   class null_context(object):
       def __context__(self):
           return self
       def __enter__(self):
           return self
       def __exit__(self, *exc_info):
           pass

Mistakenly writing:

with null_context:
    # Oops, passed the class instead of an instance

Would give a less than meaningful error message:

 TypeError: unbound method __context__() must be called with null_context 
instance as first argument (got nothing instead)

It's the usual metaclass problem with invoking a slot (or slot equivalent) via 
"obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the 
underlying C code does.

I think we need to fix the proposed semantics so that they access the slots 
via the type, rather than directly through the instance. Otherwise the slots 
for the with statement will behave strangely when compared to the slots for 
other magic methods.
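A minimal illustration of the lookup difference (shown with __enter__, since __context__ existed only in the PEP 343 draft of the time):

```python
class NullContext(object):
    def __enter__(self):
        return self
    def __exit__(self, *exc_info):
        pass

obj = NullContext()

# Instance lookup, as the draft semantics specified:
assert obj.__enter__() is obj

# Type-based lookup, the way the C-level slot machinery works:
assert type(obj).__enter__(obj) is obj

# Passing the class by mistake: the type-based lookup fails cleanly,
# because type(NullContext) is 'type', which has no __enter__ at all.
try:
    type(NullContext).__enter__
except AttributeError:
    print('clean failure')   # prints: clean failure
```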

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] urlparse brokenness

2005-11-28 Thread Guido van Rossum
OK, you've convinced me. But for backwards compatibility (until Python
3000), a new API should be designed. We can't change the old API in an
incompatible way. Please submit complete code + docs to SF. (If you
think this requires much design work, a PEP may be in order, but I
think that given the new RFCs it's probably straightforward enough to
not require that.)

--Guido

On 11/27/05, Mike Brown <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > IIRC I did it this way because the RFC about parsing urls specifically
> > prescribed it had to be done this way.
>
> That was true as of RFC 1808 (1995-1998), although the grammar actually
> allowed for a more generic interpretation.
>
> Such an interpretation was suggested in RFC 2396 (1998-2004) via a regular
> expression for parsing URI 'references' (a formal abstraction introduced in
> 2396) into 5 components (not six, since 'params' were moved into 'path'
> and eventually became an option on every path segment, not just the end
> of the path). The 5 components are:
>
>   scheme, authority (formerly netloc), path, query, fragment.
>
> Parsing could result in some components being undefined, which is distinct
> from being empty (e.g., 'mailto:[EMAIL PROTECTED]' would have an undefined
> authority and fragment, and a defined, but empty, query).
>
> RFC 3986 / STD 66 (2005-) did not change the regular expression, but makes
> several references to these '5 major components' of a URI, and says that these
> components are scheme-independent; parsers that operate at the generic syntax
> level "can parse any URI reference into its major components. Once the scheme
> is determined, further scheme-specific parsing can be performed on the
> components."
>
> > You have to know what the scheme means before you can
> > parse the rest -- there is (by design!) no standard parsing for
> > anything that follows the scheme and the colon.
>
> Not since 1998, IMHO. It was implicit, at least since RFC 2396, that all URI
> references can be interpreted as having the 5 components, it was made explicit
> in RFC 3986 / STD 66.
>
> > I don't even think
> > that you can trust that if the colon is followed by two slashes that
> > what follows is a netloc for all schemes.
>
> You can.
>
> > But if there's an RFC that says otherwise I'll gladly concede;
> > urlparse's main goal in life is to be RFC compliant.
>
> Its intent seems to be to split a URI into its major components, which are now
> by definition scheme-independent (and have been, implicitly, for a long time),
> so the function shouldn't distinguish between schemes.
>
> Do you want to keep returning that 6-tuple, or can we make it return a
> 5-tuple? If we keep returning 'params' for backward compatibility, then that
> means the 'path' we are returning is not the 'path' that people would expect
> (they'll have to concatenate path+params to get what the generic syntax calls
> a 'path' nowadays). It's also deceptive because params are now allowed on all
> path segments, and the current function only takes them from the last segment.
>
> Also for backward compatibility, should an absent component continue to
> manifest in the result as an empty string? I think a compliant parser should
> make a distinction between absent and empty (it could make a difference, in
> theory).
>
> If a regular expression were used for parsing, it would produce None for
> absent components and empty-string for empty ones. I implemented it this
> way in 4Suite's Ft.Lib.Uri and it works nicely.
>
> Mike
>
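For reference, the scheme-independent split described above is exactly the regular expression given in RFC 3986 Appendix B; a minimal sketch of using it:

```python
import re

# Regular expression from RFC 3986, Appendix B:
URI_RE = re.compile(
    r'^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?')

def split_uri(uri):
    """Return (scheme, authority, path, query, fragment); absent
    components come back as None, empty ones as ''."""
    m = URI_RE.match(uri)
    return m.group(2), m.group(4), m.group(5), m.group(7), m.group(9)

# Defined-but-empty query, undefined authority and fragment:
print(split_uri('mailto:someone@example.org?'))
# -> ('mailto', None, 'someone@example.org', '', None)
```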


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Metaclass problem in the "with" statement semantics in PEP 343

2005-11-28 Thread Guido van Rossum
On 11/28/05, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> Given the current semantics of PEP 343 and the following class:
>
>    class null_context(object):
>        def __context__(self):
>            return self
>        def __enter__(self):
>            return self
>        def __exit__(self, *exc_info):
>            pass
>
> Mistakenly writing:
>
> with null_context:
>     # Oops, passed the class instead of an instance
>
> Would give a less than meaningful error message:
>
>  TypeError: unbound method __context__() must be called with null_context
> instance as first argument (got nothing instead)
>
> It's the usual metaclass problem with invoking a slot (or slot equivalent) via
> "obj.__slot__()" rather than via "type(obj).__slot__(obj)" the way the
> underlying C code does.
>
> I think we need to fix the proposed semantics so that they access the slots
> via the type, rather than directly through the instance. Otherwise the slots
> for the with statement will behave strangely when compared to the slots for
> other magic methods.

Maybe it's because I'm just an old fart, but I can't make myself care
about this. The code is broken. You get an error message. It even has
the correct exception (TypeError). In this particular case the error
message isn't that great -- well, the same is true in many other cases
(like whenever the invocation is a method call from Python code).

That most built-in operations produce a different error message
doesn't mean we have to make *all* built-in operations use the same
approach. I fail to see the value of the consistency you're calling
for.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] (no subject)

2005-11-28 Thread Guido van Rossum
On 11/24/05, Duncan Grisby <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I posted this to comp.lang.python, but got no response, so I thought I
> would consult the wise people here...
>
> I have encountered a problem with the re module. I have a
> multi-threaded program that does lots of regular expression searching,
> with some relatively complex regular expressions. Occasionally, events
> can conspire to mean that the re search takes minutes. That's bad
> enough in and of itself, but the real problem is that the re engine
> does not release the interpreter lock while it is running. All the
> other threads are therefore blocked for the entire time it takes to do
> the regular expression search.

Rather than trying to fight the GIL, I suggest that you let a regex
expert look at your regex(es) and the input that causes the long
running times. As Fredrik suggested, certain patterns are just
inefficient but can be rewritten more efficiently. There are plenty of
regex experts on c.l.py.

Unless you have a multi-CPU box, the performance of your app isn't
going to improve by releasing the GIL -- it only affects the
responsiveness of other threads.
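A classic example (not from the thread itself) of a pattern that is "just inefficient" in this sense, together with its linear rewrite:

```python
import re

# Nested quantifiers force exponential backtracking on a near-miss:
slow = re.compile(r'(a+)+$')   # pathological: roughly doubles in cost
                               # for every extra 'a' before the mismatch
fast = re.compile(r'a+$')      # matches the same strings, in linear time

subject = 'a' * 30 + 'b'
# slow.search(subject) would grind for a very long time here;
# the rewritten pattern rejects the same input immediately:
assert fast.search(subject) is None
assert fast.search('a' * 30) is not None
```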

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Guido van Rossum
On 11/20/05, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> [EMAIL PROTECTED] wrote:
> > The local python community here in Sydney indicated that python.org is
> > only upset when groups port the source to 'obscure' systems and *don't*
> > submit patches... It is possible that I was misinformed.
>
> I never heard such concerns. I personally wouldn't notice if somebody
> ported Python, and did not feed back the patches.

I guess that I'm the source of that sentiment.

My reason for wanting people to contribute ports back is that if they
don't, the port is more likely to stick on some ancient version of
Python (e.g. I believe Nokia is still at 2.2.2). Then, assuming the
port remains popular, its users are going to pressure developers of
general Python packages to provide support for old versions of Python.

While I agree that maintaining port-specific code is a pain whenever
Python is upgraded, I still think that accepting patches for
odd-platform ports is the better alternative. Even if the patches
deteriorate as Python evolves, they should still (in principle) make a
re-port easier.

Perhaps the following compromise can be made: the PSF accepts patches
from reputable platform maintainers. (Of course, like all
contributions, they must be of high quality and not break anything,
etc., before they are accepted.) If such patches cause problems with
later Python versions, the PSF won't maintain them, but instead invite
the original contributors (or other developers who are interested in
that particular port) to fix them. If there is insufficient response,
or if it comes too late given the PSF release schedule, the PSF
developers may decide to break or remove support for the affected
platform.

There's a subtle balance between keeping too much old cruft and being
too aggressive in removing cruft that still serves a purpose for
someone. I bet that we've erred in both directions at times.

> Sometimes, people ask "there is this and that port, why isn't it
> integrated", to which the answer is in most cases "because authors
> didn't contribute". This is not being upset - it is merely a fact.
> This port (djgcc) is the first one in a long time (IIRC) where
> anybody proposed rejecting it.
>
> > I am not sure about the future myself. DJGPP 2.04 has been parked at beta
> > for two years now. It might be fair to say that the *general* DJGPP
> > developer base has shrunk a little bit. But the PythonD userbase has
> > actually grown since the first release three years ago. For the time
> > being, people get very angry when the servers go down here :-)
>
> It's not that much availability of the platform I worry about, but the
> commitment of the Python porter. We need somebody to forward bug
> reports to, and somebody to intervene if incompatible changes are made.
> This person would also indicate that the platform is no longer
> available, and hence the port can be removed.

It sounds like Ben Decker is for the time being volunteering to
provide patches and to maintain them. (I hope I'm reading you right,
Ben.) I'm +1 on accepting his patches, *provided* as always they pass
muster in terms of general Python development standards. (Jeff Epler's
comments should be taken to heart.)

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] SRE should release the GIL (was: no subject)

2005-11-28 Thread Duncan Grisby
On Monday 28 November, Guido van Rossum wrote:

> On 11/24/05, Duncan Grisby <[EMAIL PROTECTED]> wrote:

> > I have encountered a problem with the re module. I have a
> > multi-threaded program that does lots of regular expression searching,
> > with some relatively complex regular expressions. Occasionally, events
> > can conspire to mean that the re search takes minutes. That's bad
> > enough in and of itself, but the real problem is that the re engine
> > does not release the interpreter lock while it is running. All the
> > other threads are therefore blocked for the entire time it takes to do
> > the regular expression search.
> 
> Rather than trying to fight the GIL, I suggest that you let a regex
> expert look at your regex(es) and the input that causes the long
> running times. As Fredrik suggested, certain patterns are just
> inefficient but can be rewritten more efficiently. There are plenty of
> regex experts on c.l.py.

Part of the problem is certainly inefficient regexes, and we have
improved things to some extent by changing some of them. Unfortunately,
the regexes come from user input, so we can't be certain that our users
aren't going to do stupid things. It's not too bad if a stupid regex
slows things down for a bit, but it is bad if it causes the whole
application to freeze for minutes at a time.

> Unless you have a multi-CPU box, the performance of your app isn't
> going to improve by releasing the GIL -- it only affects the
> responsiveness of other threads.

We do have a multi-CPU box. Even with good regexes, regex matching takes
up a significant proportion of the time spent processing in our
application, so being able to release the GIL will hopefully increase
performance overall as well as increasing responsiveness.

We are currently testing our application with the patch to sre that Eric
posted. Once we get on to some performance tests, we'll post the results
of whether releasing the GIL does make a measurable difference for us.

Cheers,

Duncan.

-- 
 -- Duncan Grisby --
  -- [EMAIL PROTECTED] --
   -- http://www.grisby.org --


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Martin v. Löwis
Guido van Rossum wrote:
> Perhaps the following compromise can be made: the PSF accepts patches
> from reputable platform maintainers. (Of course, like all
> contributions, they must be of high quality and not break anything,
> etc., before they are accepted.) If such patches cause problems with
> later Python versions, the PSF won't maintain them, but instead invite
> the original contributors (or other developers who are interested in
> that particular port) to fix them. If there is insufficient response,
> or if it comes too late given the PSF release schedule, the PSF
> developers may decide to break or remove support for the affected
> platform.

This is indeed the compromise I was after. If the contributors indicate
that they will maintain it for some time (which happened in this case),
then I can happily accept any port (and did indeed in the past).

In the specific case, there is an additional twist that we deliberately
removed DOS support some time ago, and listed that as officially removed
in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
although I don't understand the differences (anymore).

But if it's fine with you, it is fine with me.

Regards,
Martin


[Python-Dev] Bug day this Sunday?

2005-11-28 Thread A.M. Kuchling
Is anyone interested in joining a Python bug day this Sunday?

A useful task might be to prepare for the python-core sprint at PyCon
by going through the bug and patch managers, and listing bugs/patches
that would be good candidates for working on at PyCon.

We'd meet in the usual location: #python-dev on irc.freenode.net, from
roughly 9AM to 3PM Eastern (2PM to 8PM UTC) on Sunday Dec. 4.

--amk


Re: [Python-Dev] Proposed additional keyword argument in logging calls

2005-11-28 Thread Guido van Rossum
On 11/22/05, Vinay Sajip <[EMAIL PROTECTED]> wrote:
> On numerous occasions, requests have been made for the ability to easily add
> user-defined data to logging events. For example, a multi-threaded server
> application may want to output specific information to a particular server
> thread (e.g. the identity of the client, specific protocol options for the
> client connection, etc.)
>
> This is currently possible, but you have to subclass the Logger class and
> override its makeRecord method to put custom attributes in the LogRecord.
> These can then be output using a customised format string containing e.g.
> "%(foo)s %(bar)d". The approach is usable but requires more work than
> necessary.
>
> I'd like to propose a simpler way of achieving the same result, which
> requires use of an additional optional keyword argument in logging calls.
> The signature of the (internal) Logger._log method would change from
>
>   def _log(self, level, msg, args, exc_info=None)
>
> to
>
>   def _log(self, level, msg, args, exc_info=None, extra_info=None)
>
> The extra_info argument will be passed to Logger.makeRecord, whose signature
> will change from
>
>   def makeRecord(self, name, level, fn, lno, msg, args, exc_info):
>
> to
>
>   def makeRecord(self, name, level, fn, lno, msg, args, exc_info,
> extra_info)
>
> makeRecord will, after doing what it does now, use the extra_info argument
> as follows:
>
> If type(extra_info) != types.DictType, it will be ignored.
>
> Otherwise, any entries in extra_info whose keys are not already in the
> LogRecord's __dict__ will be added to the LogRecord's __dict__.
>
> Can anyone see any problems with this approach? If not, I propose to post
> the approach on python-list and then if there are no strong objections,
> check it in to the trunk. (Since it could break existing code, I'm assuming
> (please correct me if I'm wrong) that it shouldn't go into the
> release24-maint branch.)

This looks like a good clean solution to me. I agree with Paul Moore's
suggestion that if extra_info is not None you should just go ahead and
use it as a dict and let the errors propagate.

What's the rationale for not letting it override existing fields?
(There may be a good one, I just don't see it without turning on my
thinking cap, which would cost extra. :-)

Perhaps it makes sense to call it 'extra' instead of 'extra_info'?

As a new feature it should definitely not go into 2.4; but I don't see
how it could break existing code.
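The merging semantics Vinay proposes can be sketched as follows; the helper name is hypothetical, and per Paul Moore's suggestion any type errors from a non-dict `extra` simply propagate:

```python
import logging

def make_record_with_extra(logger, level, msg, extra=None):
    """Hypothetical sketch of the proposed behaviour: merge 'extra'
    into the record's __dict__ without overriding existing fields."""
    record = logging.LogRecord(logger.name, level, __file__, 0,
                               msg, (), None)
    if extra is not None:                 # treat it as a dict and let
        for key, value in extra.items():  # errors propagate
            if key not in record.__dict__:
                record.__dict__[key] = value
    return record

logger = logging.getLogger('demo')
rec = make_record_with_extra(logger, logging.INFO, 'client request',
                             extra={'clientip': '10.0.0.1', 'msg': 'x'})
print(rec.clientip)   # 'clientip' was added: 10.0.0.1
print(rec.msg)        # 'msg' already existed, so it is *not* overridden
```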

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Guido van Rossum
On 11/18/05, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> Perhaps we should use the memory management technique that the rest
> of Python uses: reference counting.  I don't see why the AST
> structures couldn't be PyObjects.

Me neither. Adding yet another memory allocation scheme to Python's
already staggering number of memory allocation strategies sounds like
a bad idea.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] something is wrong with test___all__

2005-11-28 Thread Guido van Rossum
Has this been handled yet? If not, perhaps showing the good and bad
bytecode here would help trigger someone's brain into understanding
the problem.

On 11/22/05, Reinhold Birkenfeld <[EMAIL PROTECTED]> wrote:
> Hi,
>
> on my machine, "make test" hangs at test_colorsys.
>
> Careful investigation shows that when the bytecode is freshly generated
> by "make all" (precisely in test___all__) the .pyc file is different from 
> what a
> direct call to "regrtest.py test_colorsys" produces.
>
> Curiously, a call to "regrtest.py test___all__" instead of "make test" 
> produces
> the correct bytecode.
>
> I can only suspect some AST bug here.
>
> Reinhold
>
> --
> Mail address is perfectly valid!
>


--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Jeremy Hylton
On 11/28/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 11/18/05, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> > Perhaps we should use the memory management technique that the rest
> > of Python uses: reference counting.  I don't see why the AST
> > structures couldn't be PyObjects.
>
> Me neither. Adding yet another memory allocation scheme to Python's
> already staggering number of memory allocation strategies sounds like
> a bad idea.

The reason this thread started was the complaint that reference
counting in the compiler is really difficult.  Almost every line of
code can lead to an error exit.  The code becomes quite cluttered when
it uses reference counting.  Right now, the AST is created with
malloc/free, but that makes it hard to free the ast at the right time.
It would be fairly complex to convert the ast nodes to pyobjects.
They're just simple discriminated unions right now.  If they were
allocated from an arena, the entire arena could be freed when the
compilation pass ends.

Jeremy


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Martin v. Löwis
Guido van Rossum wrote:
 > I don't recall why DOS support was removed (PEP 11 doesn't say)

The PEP was actually created after the removal, so you added (or
asked me to add) this entry:

 Name: MS-DOS, MS-Windows 3.x
 Unsupported in:   Python 2.0
 Code removed in:  Python 2.1

Regards,
Martin


Re: [Python-Dev] Patch Req. # 1351020 & 1351036: PythonD modifications

2005-11-28 Thread Guido van Rossum
On 11/28/05, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
> > Perhaps the following compromise can be made: the PSF accepts patches
> > from reputable platform maintainers. (Of course, like all
> > contributions, they must be of high quality and not break anything,
> > etc., before they are accepted.) If such patches cause problems with
> > later Python versions, the PSF won't maintain them, but instead invite
> > the original contributors (or other developers who are interested in
> > that particular port) to fix them. If there is insufficient response,
> > or if it comes too late given the PSF release schedule, the PSF
> > developers may decide to break or remove support for the affected
> > platform.
>
> This is indeed the compromise I was after. If the contributors indicate
> that they will maintain it for some time (which happened in this case),
> then I can happily accept any port (and did indeed in the past).
>
> In the specific case, there is an additional twist that we deliberately
> removed DOS support some time ago, and listed that as officially removed
> in a PEP. I understand that djgpp somehow isn't quite the same as DOS,
> although I don't understand the differences (anymore).
>
> But if it's fine with you, it is fine with me.

Thanks. :-) I say, the more platforms the merrier.

I don't recall why DOS support was removed (PEP 11 doesn't say) but I
presume it was just because nobody volunteered to maintain it, not
because we have a particularly dislike for DOS. So now that we have a
volunteer let's deal with his patches without prejudice.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Guido van Rossum
On 11/28/05, Jeremy Hylton <[EMAIL PROTECTED]> wrote:
> On 11/28/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > On 11/18/05, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
> > > Perhaps we should use the memory management technique that the rest
> > > of Python uses: reference counting.  I don't see why the AST
> > > structures couldn't be PyObjects.
> >
> > Me neither. Adding yet another memory allocation scheme to Python's
> > already staggering number of memory allocation strategies sounds like
> > a bad idea.
>
> The reason this thread started was the complaint that reference
> counting in the compiler is really difficult.  Almost every line of
> code can lead to an error exit.

Sorry, I forgot that (I've been off-line for a week of quality time
with Orlijn, and am now digging my self out from under several hundred
emails :-).

> The code becomes quite cluttered when
> it uses reference counting.  Right now, the AST is created with
> malloc/free, but that makes it hard to free the ast at the right time.

Would fixing the code to add free() calls in all the error exits make
it more or less cluttered than using reference counting?

>  It would be fairly complex to convert the ast nodes to pyobjects.
> They're just simple discriminated unions right now.

Are they all the same size?

> If they were
> allocated from an arena, the entire arena could be freed when the
> compilation pass ends.

Then I don't understand why there was discussion of alloca() earlier
on -- surely the lifetime of a node should not be limited by the stack
frame that allocated it?

I'm not in principle against having an arena for this purpose, but I
worry that this will make it really hard to provide a Python API for
the AST, which has already been requested and whose feasibility
(unless I'm mistaken) also was touted as an argument for switching to
the AST compiler in the first place. I hope we'll never have to deal
with an API like the parser module provides...

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Jeremy Hylton
On 11/28/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> > The code becomes quite cluttered when
> > it uses reference counting.  Right now, the AST is created with
> > malloc/free, but that makes it hard to free the ast at the right time.
>
> Would fixing the code to add free() calls in all the error exits make
> it more or less cluttered than using reference counting?

If we had an arena API, we'd only need to call free on the arena at
top-level entry points.  If an error occurs deeps inside the compiler,
the arena will still get cleaned up by calling free at the top.
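A toy Python model of this arena idea (not CPython's eventual PyArena API): every node registers with the arena, and a single release at the top-level entry point cleans up even after an error deep inside the compiler:

```python
class Arena(object):
    def __init__(self):
        self._nodes = []
    def new(self, node):
        self._nodes.append(node)   # every allocation is tracked
        return node
    def release(self):
        del self._nodes[:]         # one free at the top level

def compile_ast(arena):
    # Hypothetical compiler pass: builds nodes, then fails mid-way.
    mod = arena.new({'kind': 'Module', 'body': []})
    mod['body'].append(arena.new({'kind': 'Expr'}))
    raise ValueError('simulated error deep inside the compiler')

arena = Arena()
try:
    compile_ast(arena)      # no per-error-exit cleanup needed ...
except ValueError:
    pass
finally:
    arena.release()         # ... because everything is freed here

assert arena._nodes == []
```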

> >  It would be fairly complex to convert the ast nodes to pyobjects.
> > They're just simple discriminated unions right now.
>
> Are they all the same size?

No.  Each type is a different size and there are actually a lot of
types -- statements, expressions, arguments, slices, &c.  All the
objects of one type are the same size.

> > If they were
> > allocated from an arena, the entire arena could be freed when the
> > compilation pass ends.
>
> Then I don't understand why there was discussion of alloca() earlier
> on -- surely the lifetime of a node should not be limited by the stack
> frame that allocated it?

Actually this is a pretty good limit, because all these data
structures are temporaries used by the compiler.  Once compilation has
finished, there's no need for the AST or the compiler state.

> I'm not in principle against having an arena for this purpose, but I
> worry that this will make it really hard to provide a Python API for
> the AST, which has already been requested and whose feasibility
> (unless I'm mistaken) also was touted as an argument for switching to
> the AST compiler in the first place. I hope we'll never have to deal
> with an API like the parser module provides...

My preference would be to have the ast shared by value.  We generate
code to serialize it to and from a byte stream and share that between
Python and C.  It is less efficient, but it is also very simple.

Jeremy


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Martin v. Löwis
Jeremy Hylton wrote:
 > The reason this thread started was the complaint that reference
 > counting in the compiler is really difficult.  Almost every line of
 > code can lead to an error exit.  The code becomes quite cluttered when
 > it uses reference counting.  Right now, the AST is created with
 > malloc/free, but that makes it hard to free the ast at the right time.
 >  It would be fairly complex to convert the ast nodes to pyobjects.
 > They're just simple discriminated unions right now.  If they were
 > allocated from an arena, the entire arena could be freed when the
 > compilation pass ends.

I haven't looked at the AST code at all so far, but my experience
with gcc is that such an approach is fundamentally flawed: you
would always have memory that ought to survive the parsing, so
you will have to copy it out of the arena. This will either lead
to dangling pointers, or garbage memory. So in gcc, they eventually
moved to a full garbage collector (after several iterations).

Reference counting has the advantage that you can always DECREF
at the end of the function. So if you put all local variables
at the beginning of the function, and all DECREFs at the end,
getting clean memory management should be doable, IMO. Plus,
contributors would be familiar with the scheme in place.

I don't know if details have already been proposed, but I would
update asdl to generate a hierarchy of classes: i.e.

class mod(object):
    pass

class Module(mod):
    def __init__(self, body):
        self.body = body  # list of stmt

#...

class Expression(mod):
    def __init__(self, body):
        self.body = body  # expr

# ...
class stmt(object):
    pass

class Raise(stmt):
    def __init__(self, dest, values, nl):
        self.dest = dest      # expr or None
        self.values = values  # list of expr
        self.nl = nl          # bool (True or False)

There would be convenience functions, like

   PyObject *mod_Module(PyObject* body);
   enum mod_kind mod_kind(PyObject* mod);
   // Module, Interactive, Expression, or mod_INVALID
   PyObject *mod_Expression_body(PyObject*);
   //...
   PyObject *stmt_Raise_dest(PyObject*);

(whether the accessors return new or borrowed reference
  could be debated; plain C struct accesses would also
  be possible)
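On the Python side, the kind query that the C-level mod_kind() accessor provides would fall out of the class hierarchy itself. A minimal sketch following Martin's example (the stmt/Pass classes and the mod_kind helper here are invented purely to make it self-contained):

```python
# Sketch of using the ASDL-generated hierarchy from Martin's example.
# The stmt/Pass classes and mod_kind() are invented for illustration.
class mod(object): pass

class Module(mod):
    def __init__(self, body):
        self.body = body      # list of stmt

class Expression(mod):
    def __init__(self, body):
        self.body = body      # expr

class stmt(object): pass

class Pass(stmt):
    def __init__(self, lineno):
        self.lineno = lineno

def mod_kind(node):
    # Python analogue of the C mod_kind() accessor: the "kind" is
    # simply the concrete class of the node.
    return type(node).__name__ if isinstance(node, mod) else "mod_INVALID"

tree = Module([Pass(1)])
assert mod_kind(tree) == "Module"
assert mod_kind("not a node") == "mod_INVALID"
```

With real classes, the C convenience accessors reduce to ordinary attribute access and isinstance checks.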

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Neil Schemenauer
On Mon, Nov 28, 2005 at 03:47:07PM -0500, Jeremy Hylton wrote:
> The reason this thread started was the complaint that reference
> counting in the compiler is really difficult.

I don't think that's exactly right.  The problem is that the AST
compiler mixes its own memory management strategy with reference
counting and the result doesn't quite work.  The AST compiler mainly
keeps track of memory via containment: for example, if B is an
attribute of A then B gets freed when A gets freed.  That works fine
as long as B is never shared.  My memory of the problems is a little
fuzzy.  Maybe Neal Norwitz can explain it better.

  Neil


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Guido van Rossum
[Guido]
> > Then I don't understand why there was discussion of alloca() earlier
> > on -- surely the lifetime of a node should not be limited by the stack
> > frame that allocated it?

[Jeremy]
> Actually this is a pretty good limit, because all these data
> structures are temporaries used by the compiler.  Once compilation has
> finished, there's no need for the AST or the compiler state.

Are you really saying that there is one function which is called only
once (per compilation) which allocates *all* the AST nodes? That's the
only situation where I'd see alloca() working -- unless your alloca()
doesn't allocate memory on the stack. I was somehow assuming that the
tree would be built piecemeal by parser callbacks or some such
mechanism. There's still a stack frame whose lifetime limits the AST
lifetime, but it is not usually the current stackframe when a new node
is allocated, so alloca() can't be used.

I guess I don't understand the AST compiler code enough to participate
in this discussion. Or perhaps we are agreeing violently?

> > I'm not in principle against having an arena for this purpose, but I
> > worry that this will make it really hard to provide a Python API for
> > the AST, which has already been requested and whose feasibility
> > (unless I'm mistaken) also was touted as an argument for switching to
> > the AST compiler in the first place. I hope we'll never have to deal
> > with an API like the parser module provides...
>
> My preference would be to have the ast shared by value.  We generate
> code to serialize it to and from a byte stream and share that between
> Python and C.  It is less efficient, but it is also very simple.

So there would still be a Python-objects version of the AST but the
compiler itself doesn't use it.

At least by-value makes sense to me -- if you're making tree
transformations you don't want accidental sharing to cause unexpected
side effects.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Brett Cannon
On 11/28/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> [Guido]
> > > Then I don't understand why there was discussion of alloca() earlier
> > > on -- surely the lifetime of a node should not be limited by the stack
> > > frame that allocated it?
>
> [Jeremy]
> > Actually this is a pretty good limit, because all these data
> > structures are temporaries used by the compiler.  Once compilation has
> > finished, there's no need for the AST or the compiler state.
>
> Are you really saying that there is one function which is called only
> once (per compilation) which allocates *all* the AST nodes?

Nope, there isn't for everything.  It's just that some are temporary
to internal functions and thus can stand to be freed later (unless my
memory is really shot).  Otherwise it is piecemeal.  There is the
main data structure such as the compiler struct and the top-level node
for the AST, but otherwise everything (currently) is allocated as
needed.

> That's the
> only situation where I'd see alloca() working -- unless your alloca()
> doesn't allocate memory on the stack. I was somehow assuming that the
> tree would be built piecemeal by parser callbacks or some such
> mechanism. There's still a stack frame whose lifetime limits the AST
> lifetime, but it is not usually the current stackframe when a new node
> is allocated, so alloca() can't be used.
>
> I guess I don't understand the AST compiler code enough to participate
> in this discussion. Or perhaps we are agreeing violently?
>

I don't think your knowledge of the codebase precludes your
participation.  Actually, I think it makes it even more important
since if some scheme is devised that is not easily explained it is
really going to hinder who can help out with maintenance and
enhancements on the compiler.

> > > I'm not in principle against having an arena for this purpose, but I
> > > worry that this will make it really hard to provide a Python API for
> > > the AST, which has already been requested and whose feasibility
> > > (unless I'm mistaken) also was touted as an argument for switching to
> > > the AST compiler in the first place. I hope we'll never have to deal
> > > with an API like the parser module provides...
> >
> > My preference would be to have the ast shared by value.  We generate
> > code to serialize it to and from a byte stream and share that between
> > Python and C.  It is less efficient, but it is also very simple.
>
> So there would still be a Python-objects version of the AST but the
> compiler itself doesn't use it.
>

Yep.  The idea would be to return a PyString formatted ala the parser
module where it is just a bunch of nested items in a Scheme-like
format.  There would then be Python or C code that would generate a
Python object representation from that.  Then, when you were finished
tweaking the structure, you would write back out as a PyString and
then recreate the internal representation.  That makes it
pass-by-value since you pass the serialized PyString version across
the C-Python boundary.
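A toy model of the round trip Brett describes (the format and node shapes here are invented for illustration): the tree is flattened to a nested, Scheme-like string, and the receiving side rebuilds a fresh copy, which is what makes it pass-by-value.

```python
# Toy s-expression round trip for a tiny AST, illustrating the
# pass-by-value idea.  The format and node shapes are invented.
def dump(node):
    if isinstance(node, tuple):            # (kind, child, child, ...)
        return "(" + " ".join(dump(c) for c in node) + ")"
    return str(node)

def parse(text):
    # Minimal recursive-descent reader for the format produced above.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()
    def read(i):
        if tokens[i] == "(":
            items, i = [], i + 1
            while tokens[i] != ")":
                item, i = read(i)
                items.append(item)
            return tuple(items), i + 1
        return tokens[i], i + 1
    return read(0)[0]

tree = ("Module", ("FunctionDef", "f", ("Pass",)))
assert dump(tree) == "(Module (FunctionDef f (Pass)))"
assert parse(dump(tree)) == tree        # same value...
assert parse(dump(tree)) is not tree    # ...but a fresh copy
```

Any mutation the Python side makes happens on its own copy, so no sharing leaks back into the compiler's internal structures.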

> At least by-value makes sense to me -- if you're making tree
> transformations you don't want accidental sharing to cause unexpected
> side effects.
>

Yeah, that could be bad.  =)

-Brett


Re: [Python-Dev] reference leaks

2005-11-28 Thread Walter Dörwald
Neal Norwitz wrote:

> On 11/25/05, Walter Dörwald <[EMAIL PROTECTED]> wrote:
>> Can you move the call to codecs.register_error() out of test_callbacks()
>> and retry?
> 
> It then leaks 3 refs on each call to test_callbacks().

This should be fixed now in r41555 and r41556.

Bye,
Walter Dörwald



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Greg Ewing
Jeremy Hylton wrote:

> Almost every line of
> code can lead to an error exit.  The code becomes quite cluttered when
> it uses reference counting.

I don't see why very many more error exits should become
possible just by introducing refcounting. Errors are possible
whenever you allocate something, however you do it, so you
need error checks on all your allocations in any case.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
[EMAIL PROTECTED]                  +--------------------------------------+


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Greg Ewing
Neal Norwitz wrote:

> This is an entire function from Python/ast.c.

> Sequences do not know what type they hold, so there needs to be
> different dealloc functions to free them properly (asdl_*_seq_free()).

Well, that's one complication that would go away if
the nodes were PyObjects.

> The memory leak occurs when FunctionDef fails.  name, args, body, and
> decorator_seq are all local and would not be freed.  The simple
> variables can be freed in each "constructor" like FunctionDef(), but
> the sequences cannot unless they keep the info about which type they
> hold.

If FunctionDef's reference semantics are defined so
that it steals references to its arguments, then here
is how the same function would look with PyObject
AST nodes, as far as I can see:

  static PyObject *
  ast_for_funcdef(struct compiling *c, const node *n)
  {
   /* funcdef: [decorators] 'def' NAME parameters ':' suite */
  PyObject *name = NULL;
  PyObject *args = NULL;
  PyObject *body = NULL;
  PyObject *decorator_seq = NULL;
  int name_i;

  REQ(n, funcdef);

  if (NCH(n) == 6) { /* decorators are present */
decorator_seq = ast_for_decorators(c, CHILD(n, 0));
if (!decorator_seq)
goto error;
name_i = 2;
  }
  else {
name_i = 1;
  }

  name = NEW_IDENTIFIER(CHILD(n, name_i));
  if (!name)
goto error;
  else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
ast_error(CHILD(n, name_i), "assignment to None");
goto error;
  }
  args = ast_for_arguments(c, CHILD(n, name_i + 1));
  if (!args)
goto error;
  body = ast_for_suite(c, CHILD(n, name_i + 3));
  if (!body)
goto error;

  return FunctionDef(name, args, body, decorator_seq, LINENO(n));

  error:
  Py_XDECREF(body);
  Py_XDECREF(decorator_seq);
  Py_XDECREF(args);
  Py_XDECREF(name);
  return NULL;
  }

The only things I've changed are turning some type
declarations into PyObject * and replacing the
deallocation functions at the end with Py_XDECREF!

Maybe there are other functions where it would not
be so straightforward, but if this really is a
typical AST function, switching to PyObjects looks
like it wouldn't be difficult at all, and would
actually make some things simpler.



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Greg Ewing
Here's a somewhat radical idea:

Why not write the parser and bytecode compiler in Python?

A .pyc could be bootstrapped from it and frozen into
the executable.



Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Martin v. Löwis
Neal Norwitz wrote:
> Hope this helps explain a bit.  Please speak up with how this can be
> improved.  Gotta run.

I would rewrite it as

static PyObject*
ast_for_funcdef(struct compiling *c, const node *n)
{
 /* funcdef: [decorators] 'def' NAME parameters ':' suite */
 PyObject *name = NULL;
 PyObject *args = NULL;
 PyObject *body = NULL;
 PyObject *decorator_seq = NULL;
 PyObject *result = NULL;
 int name_i;

 REQ(n, funcdef);

 if (NCH(n) == 6) { /* decorators are present */
decorator_seq = ast_for_decorators(c, CHILD(n, 0));
if (!decorator_seq)
goto error;
name_i = 2;
 }
 else {
name_i = 1;
 }

 name = NEW_IDENTIFIER(CHILD(n, name_i));
 if (!name)
goto error;
 else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
ast_error(CHILD(n, name_i), "assignment to None");
goto error;
 }
 args = ast_for_arguments(c, CHILD(n, name_i + 1));
 if (!args)
goto error;
 body = ast_for_suite(c, CHILD(n, name_i + 3));
 if (!body)
goto error;

 result = FunctionDef(name, args, body, decorator_seq, LINENO(n));

error:
 Py_XDECREF(name);
 Py_XDECREF(args);
 Py_XDECREF(body);
 Py_XDECREF(decorator_seq);
 return result;
}

The convention would be that ast_for_* returns new references, which
have to be released regardless of success or failure. FunctionDef
would duplicate all of its parameter references if it succeeds,
and leave them untouched if it fails.

One could develop a checker that verifies that:
a) all PyObject* local variables are initialized to NULL, and
b) all such variables are Py_XDECREF'ed after the error label.
c) result is initialized to NULL, and returned.
Then, "goto error" at any point in the code would be correct
(assuming an exception had been set prior to the goto).

No special release function for the body or the decorators
would be necessary - they would be plain Python lists.
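The checker Martin alludes to could start out quite crude. A sketch of the idea (heuristic and regex-based, not a real C parser; rules and names are my own reading of his list):

```python
import re

# Crude sketch of the checker Martin suggests: every "PyObject *x = ...;"
# local must be initialized to NULL, and every such local (other than
# result) must be Py_XDECREF'ed after the error label.  This is a
# heuristic over the source text, not a real C parser.
def check_function(src):
    problems = []
    locals_ = re.findall(r"PyObject\s*\*\s*(\w+)\s*=\s*(\w+)\s*;", src)
    for name, init in locals_:
        if init != "NULL":
            problems.append("%s not initialized to NULL" % name)
    # Everything after the error label should release the locals.
    tail = src.split("error:", 1)[1] if "error:" in src else ""
    freed = set(re.findall(r"Py_XDECREF\s*\(\s*(\w+)\s*\)", tail))
    for name, _ in locals_:
        if name != "result" and name not in freed:
            problems.append("%s not Py_XDECREF'ed after error label" % name)
    return problems

good = """
PyObject *name = NULL;
PyObject *result = NULL;
error:
    Py_XDECREF(name);
    return result;
"""
assert check_function(good) == []
```

A real tool would have to cope with macros and control flow, but even a text-level check like this would catch the common slip of adding a local without adding the matching Py_XDECREF.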

Regards,
Martin


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Brett Cannon
On 11/28/05, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Here's a somewhat radical idea:
>
> Why not write the parser and bytecode compiler in Python?
>
> A .pyc could be bootstrapped from it and frozen into
> the executable.
>

Is there a specific reason you are leaving out the AST, Greg, or do
you count that as part of the bytecode compiler (I think of that as
the AST->bytecode step handled by Python/compile.c)?

While ease of maintenance would be fantastic and would probably lead
to much more language experimentation if more of the core parts of
Python were written in Python, I would worry about performance.
Generating bytecode is not necessarily an every-time thing, but I know
Guido has said he doesn't like punishing the performance of small
scripts in the name of large-scale apps (which is why interpreter
startup time has always been an issue), and small scripts tend not to
have a .pyc file.

-Brett


Re: [Python-Dev] CVS repository mostly closed now

2005-11-28 Thread 장혜식
On 11/27/05, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> I tried removing the CVS repository from SF; it turns
> out that this operation is not supported. Instead, it
> is only possible to remove it from the project page;
> pserver and ssh access remain indefinitely, as does
> viewcvs.

There's a hacky trick to remove them:
 put "rm -rf $CVSROOT/src" into CVSROOT/loginfo,
then remove the line and commit again. :)


Hye-Shik


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Neal Norwitz
On 11/28/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>
> I guess I don't understand the AST compiler code enough to participate
> in this discussion.

I hope everyone will chime in here.  This is important to improve and
learn from others.

Let me try to describe the current situation with a small amount of
code.  Hopefully it will give some idea of the larger problems.

This is an entire function from Python/ast.c.  It demonstrates the
issues fairly clearly.  It contains at least one memory leak.  It uses
asdl_seq, which is little more than a somewhat dynamic array.
Sequences do not know what type they hold, so there need to be
different dealloc functions to free them properly (asdl_*_seq_free()).
The ast_for_*() functions allocate memory, so in case of an error the
memory needs to be freed.  Most of this memory is internal to the AST
code.  However, there are some identifiers (PyString's) that must be
DECREF'ed.  See below for the memory leak.

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
/* funcdef: [decorators] 'def' NAME parameters ':' suite */
identifier name = NULL;
arguments_ty args = NULL;
asdl_seq *body = NULL;
asdl_seq *decorator_seq = NULL;
int name_i;

REQ(n, funcdef);

if (NCH(n) == 6) { /* decorators are present */
decorator_seq = ast_for_decorators(c, CHILD(n, 0));
if (!decorator_seq)
goto error;
name_i = 2;
}
else {
name_i = 1;
}

name = NEW_IDENTIFIER(CHILD(n, name_i));
if (!name)
goto error;
else if (!strcmp(STR(CHILD(n, name_i)), "None")) {
ast_error(CHILD(n, name_i), "assignment to None");
goto error;
}
args = ast_for_arguments(c, CHILD(n, name_i + 1));
if (!args)
goto error;
body = ast_for_suite(c, CHILD(n, name_i + 3));
if (!body)
goto error;

return FunctionDef(name, args, body, decorator_seq, LINENO(n));

error:
asdl_stmt_seq_free(body);
asdl_expr_seq_free(decorator_seq);
free_arguments(args);
Py_XDECREF(name);
return NULL;
}

The memory leak occurs when FunctionDef fails.  name, args, body, and
decorator_seq are all local and would not be freed.  The simple
variables can be freed in each "constructor" like FunctionDef(), but
the sequences cannot unless they keep the info about which type they
hold.  That would help quite a bit, but I'm not sure it's the
right/best solution.

Hope this helps explain a bit.  Please speak up with how this can be
improved.  Gotta run.

n


Re: [Python-Dev] CVS repository mostly closed now

2005-11-28 Thread Fred L. Drake, Jr.
On Monday 28 November 2005 20:14, 장혜식 wrote:
 > There's a hacky trick to remove them:
 >  put  rm -rf $CVSROOT/src into CVSROOT/loginfo
 > and remove the line then and commit again. :)

Wow, that is tricky!  Glad it wasn't me who thought of this one.  :-)


  -Fred

-- 
Fred L. Drake, Jr.   


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Greg Ewing
Brett Cannon wrote:

> Is there a specific reason you are leaving out the AST, Greg, or do
> you count that as part of the bytecode compiler

No, I consider it part of the parser. My mental model
of parsing & compiling in the presence of a parse tree
is like this:

   [source] -> scanner -> [tokens]
 -> parser -> [AST] -> code_generator -> [code]

The fact that there still seems to be another kind of
parse tree in between the scanner and the AST generator
is an oddity which I hope will eventually disappear.
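For what it's worth, the Python-level view of exactly that pipeline can be demonstrated with the stdlib ast module (which postdates this thread; shown here purely to illustrate the stages, not the 2005 code under discussion):

```python
import ast

# Greg's mental model, stage by stage (the tokenizer runs inside
# ast.parse):
#   [source] -> scanner/parser -> [AST] -> code generator -> [code]
source = "x = 1 + 2"
tree = ast.parse(source)                 # source -> AST
code = compile(tree, "<demo>", "exec")   # AST -> code object

ns = {}
exec(code, ns)
assert ns["x"] == 3
```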

> I know
> Guido has said he doesn't like punishing the performance of small
> scripts in the name of large-scale apps

To me, that's an argument in favour of always generating
a .pyc, even for scripts.

Greg


Re: [Python-Dev] CVS repository mostly closed now

2005-11-28 Thread Martin v. Löwis
장혜식 wrote:
> There's a hacky trick to remove them:
>  put  rm -rf $CVSROOT/src into CVSROOT/loginfo
> and remove the line then and commit again. :)

Sure :-) SF makes a big fuss about how good a service
this is: open source will never go away. I tend to
agree, somewhat. For historical reasons, it is surely
nice to be able to browse the CVS repository (in particular
if you need to correlate CVS revision numbers and svn
revision numbers); also, people can take any time they
want to convert CVS sandboxes.

So instead of hacking them, I thought we had better comply.
With the mechanics in place, hardly anybody should notice
we switched to subversion (but I will write something
on c.l.p.a, anyway).

Regards,
Martin

P.S. Sorry for not getting your name right in the To:
field; that's thunderbird.


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Neal Norwitz
On 11/28/05, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Neal Norwitz wrote:
> > Hope this helps explain a bit.  Please speak up with how this can be
> > improved.  Gotta run.
>
> I would rewrite it as

[code snipped]

For those watching, Greg's and Martin's versions were almost the same.
However, Greg's version left in the memory leak, while Martin fixed it
by letting the result fall through.  Martin added some helpful rules
about dealing with the memory.  Martin also gets bonus points for
talking about developing a checker. :-)

In both cases, their modified code is similar to the existing AST
code, but all deallocation is done with Py_[X]DECREFs rather than a
type specific deallocator.  Definitely nicer than the current
situation.  It's also the same as the rest of the python code.

With arenas the code would presumably look something like this:

static stmt_ty
ast_for_funcdef(struct compiling *c, const node *n)
{
/* funcdef: [decorators] 'def' NAME parameters ':' suite */
identifier name;
arguments_ty args;
asdl_seq *body;
asdl_seq *decorator_seq = NULL;
int name_i;

REQ(n, funcdef);

if (NCH(n) == 6) { /* decorators are present */
decorator_seq = ast_for_decorators(c, CHILD(n, 0));
if (!decorator_seq)
return NULL;
name_i = 2;
}
else {
name_i = 1;
}

name = NEW_IDENTIFIER(CHILD(n, name_i));
if (!name)
return NULL;
Py_AST_Register(name);
if (!strcmp(STR(CHILD(n, name_i)), "None")) {
ast_error(CHILD(n, name_i), "assignment to None");
return NULL;
}
args = ast_for_arguments(c, CHILD(n, name_i + 1));
body = ast_for_suite(c, CHILD(n, name_i + 3));
if (!args || !body)
return NULL;

return FunctionDef(name, args, body, decorator_seq, LINENO(n));
}

All the goto's become return NULLs.  After allocating a PyObject, it
would need to be registered (ie, the mythical Py_AST_Register(name)). 
This is easier than using all PyObjects in that when an error occurs,
there's nothing to think about, just return.  Only optional values
(like decorator_seq) need to be initialized.  It's harder in that one
must remember to register any PyObject so it can be Py_DECREFed at the
end.  Since the arena is allocated in big hunk(s), it would presumably
be faster than using PyObjects since there would be less memory
allocation (and fragmentation).  It should be possible to get rid of
some of the conditionals too (I joined body and args above).
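The arena idea can be modeled in miniature (the Arena type and register method here are hypothetical, following Neal's description of the mythical Py_AST_Register): every allocation is registered as it is made, and everything is released in one sweep when compilation ends.

```python
# Miniature model of the arena approach Neal describes.  All names
# here are hypothetical; this only illustrates the ownership pattern.
class Arena:
    def __init__(self):
        self._objects = []

    def register(self, obj):
        # Plays the role of the mythical Py_AST_Register: remember the
        # object so it can be released with everything else later.
        self._objects.append(obj)
        return obj

    def free_all(self):
        # One bulk release at the end of compilation; returns how many
        # objects were held, for illustration.
        count = len(self._objects)
        del self._objects[:]
        return count

arena = Arena()
name = arena.register("f")                 # like NEW_IDENTIFIER(...)
args = arena.register(("arguments",))      # like ast_for_arguments(...)
body = arena.register([("Pass",)])         # like ast_for_suite(...)
assert arena.free_all() == 3               # everything goes at once
```

The appeal is exactly what Neal notes: on an error path there is nothing to think about, you just return, because ownership lives in the arena rather than in each caller.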

Using all PyObjects has another benefit that may have been mentioned
elsewhere, ie that the rest of Python uses the same techniques for
handling deallocation.

I'm not really advocating any particular approach.  I *think* arenas
would be easiest, but it's not a clear winner.  I think Martin's note
about GCC using GC is interesting.  AFAIK GCC is a lot more complex
than the Python code, so I'm not sure it's 100% relevant.  OTOH, we
need to weigh that experience.

n
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Memory management in the AST parser & compiler

2005-11-28 Thread Martin v. Löwis
Neal Norwitz wrote:
> For those watching, Greg's and Martin's version were almost the same. 
> However, Greg's version left in the memory leak, while Martin fixed it
> by letting the result fall through.

Actually, Greg said (correctly) that his version also fixes the
leak: he assumed that FunctionDef would *consume* the references
being passed (whether it is successful or not).

I don't think this is a good convention, though.

Regards,
Martin