Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3

2005-01-02 Thread Jack Jansen
On 2-jan-05, at 4:40, Bob Ippolito wrote:
>> +SCRIPT="""#!/bin/sh
>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>> +exec %s "[EMAIL PROTECTED]"
>
> This script should check to see if MACOSX_DEPLOYMENT_TARGET is already
> set.  If I have some reason to set MACOSX_DEPLOYMENT_TARGET=10.4 for
> compilation (say I'm compiling an extension that requires 10.4
> features) then I'm going to have some serious problems with this fix.

I was going to do that, but then I thought it didn't make any sense, 
because this script is *only* used in the context of Apple-provided 
Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other than 
10.3 (be it lower or higher) while compiling an extension for Apple's 
2.3 is going to produce disappointing results anyway.

But, if I've missed a use case, please enlighten me.
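
For illustration only, here is a sketch of what the "check whether it is already set" variant might look like. It is not the committed code: SCRIPT_TEMPLATE and render_wrapper are hypothetical names, and the "$@" argument forwarding is an assumption, since the archive obscured that part of the original script. As argued above, the check buys little when the wrapper is only used with Apple's Python 2.3.

```python
# Hypothetical wrapper template: keep a caller-supplied deployment target
# and fall back to 10.3 only when the variable is unset or empty.
SCRIPT_TEMPLATE = """#!/bin/sh
export MACOSX_DEPLOYMENT_TARGET="${MACOSX_DEPLOYMENT_TARGET:-10.3}"
exec %s "$@"
"""


def render_wrapper(real_tool):
    """Return the text of a wrapper script that execs `real_tool`."""
    return SCRIPT_TEMPLATE % real_tool


if __name__ == "__main__":
    print(render_wrapper("/usr/bin/gcc"))
```

The `${VAR:-default}` form is standard POSIX shell parameter expansion, so a value exported by the caller survives and 10.3 is used otherwise.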
--
Jack Jansen, <[EMAIL PROTECTED]>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3

2005-01-02 Thread Bob Ippolito
On Jan 2, 2005, at 4:28 PM, Jack Jansen wrote:
> On 2-jan-05, at 4:40, Bob Ippolito wrote:
>>> +SCRIPT="""#!/bin/sh
>>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>>> +exec %s "[EMAIL PROTECTED]"
>>
>> This script should check to see if MACOSX_DEPLOYMENT_TARGET is
>> already set.  If I have some reason to set
>> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an
>> extension that requires 10.4 features) then I'm going to have some
>> serious problems with this fix.
>
> I was going to do that, but then I thought it didn't make any sense,
> because this script is *only* used in the context of Apple-provided
> Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other
> than 10.3 (be it lower or higher) while compiling an extension for
> Apple's 2.3 is going to produce disappointing results anyway.
>
> But, if I've missed a use case, please enlighten me.

You're right, of course.  I had realized that I was commenting on the 
fixpython script after I had replied, but my concern is still 
applicable to whatever solution is used for Python 2.4.1.  Anything 
lower than 10.3 is of course an error, in either case.

-bob


Re: [Python-Dev] Re: [Python-checkins] python/dist/src/Mac/OSX fixapplepython23.py, 1.2, 1.3

2005-01-02 Thread Jack Jansen
On 2-jan-05, at 22:35, Bob Ippolito wrote:
>> On 2-jan-05, at 4:40, Bob Ippolito wrote:
>>>> +SCRIPT="""#!/bin/sh
>>>> +export MACOSX_DEPLOYMENT_TARGET=10.3
>>>> +exec %s "[EMAIL PROTECTED]"
>>>
>>> This script should check to see if MACOSX_DEPLOYMENT_TARGET is
>>> already set.  If I have some reason to set
>>> MACOSX_DEPLOYMENT_TARGET=10.4 for compilation (say I'm compiling an
>>> extension that requires 10.4 features) then I'm going to have some
>>> serious problems with this fix.
>>
>> I was going to do that, but then I thought it didn't make any sense,
>> because this script is *only* used in the context of Apple-provided
>> Python 2.3. And setting MACOSX_DEPLOYMENT_TARGET to anything other
>> than 10.3 (be it lower or higher) while compiling an extension for
>> Apple's 2.3 is going to produce disappointing results anyway.
>>
>> But, if I've missed a use case, please enlighten me.
>
> You're right, of course.  I had realized that I was commenting on the
> fixpython script after I had replied, but my concern is still
> applicable to whatever solution is used for Python 2.4.1.  Anything
> lower than 10.3 is of course an error, in either case.

2.4.1 will install this fix into Apple-installed Python 2.3 (if 
applicable, i.e. if you're installing 2.4.1 on 10.3), but for its own 
use it will have the newer distutils, which understands that it needs 
to pick up MACOSX_DEPLOYMENT_TARGET from the Makefile, so it'll never 
see these scripts.
--
Jack Jansen, <[EMAIL PROTECTED]>, http://www.cwi.nl/~jack
If I can't dance I don't want to be part of your revolution -- Emma 
Goldman



[Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

2005-01-02 Thread Bob Ippolito
Quite a few notable places in the Python sources expect realloc(...) to 
relinquish some memory if the requested size is smaller than the 
currently allocated size.  This is definitely not true on Darwin, and 
possibly other platforms.  I have tested this on OpenBSD and Linux, and 
the implementations on these platforms do appear to relinquish memory, 
but I didn't read the implementation.  I haven't been able to find any 
documentation that states that realloc should make this guarantee, but 
I figure Darwin does this as an "optimization" and because Darwin 
probably can't resize mmap'ed memory (at least it can't from Python, 
but this probably means it doesn't have this capability at all).

It is possible to "fix" this for Darwin, because you can ask the 
default malloc zone how big a particular allocation is, and how big an 
allocation of a given size will actually be (see: ).  
The obvious place to put this would be PyObject_Realloc, because this 
is at least called by _PyString_Resize (which will fix 
).

Should I write up a patch that "fixes" this?  I guess the best thing to 
do would be to determine whether the fix should be used at runtime, by 
allocating a meg or so, resizing it to 1 byte, and see if the size of 
the allocation changes.  If the size of the allocation does change, 
then the system realloc can be trusted to do what Python expects it to 
do, otherwise realloc should be done "cleanly" by allocating a new 
block (returning the original on failure, because it's good enough and 
some places in Python seem to expect that shrink will never fail), 
memcpy, free, return new block.

I wrote up a small hack that does this realloc indirection to CVS 
trunk, and it doesn't seem to cause any measurable difference in 
pystone performance.

Note that all versions of Darwin that I've looked at (6.x, 7.x, and 
8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have 
this "issue", but it might go away by Mac OS X 10.4 or some later 
release.

This URL points to the sf bug and Darwin 7.7's realloc(...) 
implementation: 
http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/

-bob


Re: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

2005-01-02 Thread Tim Peters
[Bob Ippolito]
> Quite a few notable places in the Python sources expect realloc(...) to
> relinquish some memory if the requested size is smaller than the
> currently allocated size.

I don't know what "relinquish some memory" means.  If it means
something like "returns memory to the OS, so that the reported process
size shrinks", then no, nothing in Python ever assumes that.  That's
simply because "returns memory to the OS" and "process size" aren't
concepts in the C standard, and so nothing can be said about them in
general -- not in theory, and neither in practice, because platforms
(OS+libc combos) vary so widely in behavior here.

As a pragmatic matter, I *expect* that a production-quality realloc()
implementation will at least be able to reuse released memory,
provided that the amount released is at least half the amount
originally malloc()'ed (and, e.g., reasonable buddy systems may not be
able to do better than that).

> This is definitely not true on Darwin, and possibly other platforms.
> I have tested this on OpenBSD and Linux, and the implementations on
> these platforms do appear to relinquish memory,

As above, don't know what this means.

> but I didn't read the implementation.  I haven't been able to find any
> documentation that states that realloc should make this guarantee,

realloc() guarantees very little; it certainly doesn't guarantee
anything, e.g., about OS interactions or process sizes.

> but I figure Darwin does this as an "optimization" and because Darwin
> probably can't resize mmap'ed memory (at least it can't from Python,
> but this probably means it doesn't have this capability at all).
>
> It is possible to "fix" this for Darwin,

I don't understand what's "broken".  Small objects go thru Python's
own allocator, which has its own realloc policies and its own
peculiarities (chiefly that pymalloc never free()s any memory
allocated for small objects).

> because you can ask the default malloc zone how big a particular
> allocation is, and how big an allocation of a given size will actually
> be (see: ).
> The obvious place to put this would be PyObject_Realloc, because this
> is at least called by _PyString_Resize (which will fix
> ).

The diagnosis in the bug report seems to leave it pointing at
socket.py's _fileobject.read(), although I suspect the real cause is
in socketmodule.c's sock_recv().  We've had other reports of various
problems when people pass absurdly large values to socket recv().  A
better fix here would probably amount to rewriting sock_recv() to
refuse to pass enormous numbers to the platform recv() (it appears
that many platform recv() implementations simply don't expect a recv()
argument to be much bigger than the native network buffer size, and
screw up when that's not so).
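
To make the "refuse to pass enormous numbers" idea concrete, here is a minimal Python-level sketch. It is not a patch to socketmodule.c: recv_bounded and MAX_RECV_CHUNK are hypothetical names, and the 64 KB cap is an arbitrary assumption rather than a measured buffer size. The point is only that a caller asking for a huge read never hands more than a modest number to the platform recv().

```python
MAX_RECV_CHUNK = 64 * 1024  # hypothetical cap, not a measured buffer size


def recv_bounded(sock, size, chunk_limit=MAX_RECV_CHUNK):
    """Read up to `size` bytes, asking recv() for at most `chunk_limit` at a time.

    `sock` is any object with a socket-style recv(n) method. Reading stops
    early when recv() returns an empty result (the peer closed the connection).
    """
    chunks = []
    remaining = size
    while remaining > 0:
        data = sock.recv(min(remaining, chunk_limit))
        if not data:
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)
```

Because each piece is at most chunk_limit bytes, no single buffer is allocated millions of bytes too large and then shrunk, which is the pattern that triggers the realloc() behavior discussed here.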

> Should I write up a patch that "fixes" this?  I guess the best thing to
> do would be to determine whether the fix should be used at runtime, by
> allocating a meg or so, resizing it to 1 byte, and see if the size of
> the allocation changes.  If the size of the allocation does change,
> then the system realloc can be trusted to do what Python expects it to
> do, otherwise realloc should be done "cleanly" by allocating a new
> block (returning the original on failure, because it's good enough and
> some places in Python seem to expect that shrink will never fail),

Yup, that assumption (that a non-growing realloc can't fail) is all
over the place.

> memcpy, free, return new block.
>
> I wrote up a small hack that does this realloc indirection to CVS
> trunk, and it doesn't seem to cause any measurable difference in
> pystone performance.
> 
> Note that all versions of Darwin that I've looked at (6.x, 7.x, and
> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have
> this "issue", but it might go away by Mac OS X 10.4 or some later
> release.
> 
> This URL points to the sf bug and Darwin 7.7's realloc(...)
> implementation:
> http://bob.pythonmac.org/archives/2005/01/01/realloc-doesnt/

It would be good to rewrite sock_recv() more defensively in any case. 
Best I can tell, this implementation of realloc() is
standard-conforming but uniquely brain dead in its downsize behavior. 
I don't expect the latter will last (as you say on your page,
"probably plenty of other software" also makes the same pragmatic
assumptions about realloc downsize behavior), so I'm not keen to gunk
up Python to worm around it.


Re: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

2005-01-02 Thread Bob Ippolito
On Jan 3, 2005, at 12:13 AM, Tim Peters wrote:
> [Bob Ippolito]
>> Quite a few notable places in the Python sources expect realloc(...)
>> to relinquish some memory if the requested size is smaller than the
>> currently allocated size.
>
> I don't know what "relinquish some memory" means.  If it means
> something like "returns memory to the OS, so that the reported process
> size shrinks", then no, nothing in Python ever assumes that.  That's
> simply because "returns memory to the OS" and "process size" aren't
> concepts in the C standard, and so nothing can be said about them in
> general -- not in theory, and neither in practice, because platforms
> (OS+libc combos) vary so widely in behavior here.
>
> As a pragmatic matter, I *expect* that a production-quality realloc()
> implementation will at least be able to reuse released memory,
> provided that the amount released is at least half the amount
> originally malloc()'ed (and, e.g., reasonable buddy systems may not be
> able to do better than that).

This is what I meant by relinquish (c/o merriam-webster):
a : to stop holding physically : RELEASE 
b : to give over possession or control of : YIELD 

Your expectation is not correct for Darwin's memory allocation scheme.  
It seems that Darwin creates allocations of immutable size.  The only 
way ANY part of an allocation will ever be used by ANYTHING else is if 
free() is called with that allocation.  free() can be called either 
explicitly, or implicitly by calling realloc() with a size larger than 
the size of the allocation.  In that case, it will create a new 
allocation of at least the requested size, copy the contents of the 
original allocation into the new allocation (probably with 
copy-on-write pages if it's large enough, so it might be cheap), and 
free() the allocation.  In the case where realloc() specifies a size 
that is not greater than the allocation's size, it will simply return 
the given allocation and cause no side-effects whatsoever.

Was this a good decision?  Probably not!  However, it is our (in the "I 
know you use Windows but I am not the only one that uses Mac OS X" 
sense) problem so long as Darwin is a supported platform, because it is 
highly unlikely that Apple will backport any "fix" to the allocator 
unless we can prove it has some security implications in software 
shipped with their OS.  I attempted to look for some easy ones by 
performing a quick audit of Apache, OpenSSH, and OpenSSL.  
Unfortunately, their developers did not share your expectation.  I 
found one sprintf-like routine in Apache that could be affected by this 
behavior, and one instance of immutable string creation in Apple's 
CoreFoundation CFString implementation, but I have yet to find an easy 
way to exploit this behavior from the outside.  I should probably be 
looking at PHP and Perl instead ;)

>> but I figure Darwin does this as an "optimization" and because Darwin
>> probably can't resize mmap'ed memory (at least it can't from Python,
>> but this probably means it doesn't have this capability at all).
>>
>> It is possible to "fix" this for Darwin,
>
> I don't understand what's "broken".  Small objects go thru Python's
> own allocator, which has its own realloc policies and its own
> peculiarities (chiefly that pymalloc never free()s any memory
> allocated for small objects).

What's broken is that there are several places in Python that seem to 
assume that you can allocate a large chunk of memory, and make it 
smaller in some meaningful way with realloc(...).  This is not true 
with Darwin.  You are right about small objects.  They don't matter 
because they're small, and because they're handled by Python's 
allocator.

>> because you can ask the default malloc zone how big a particular
>> allocation is, and how big an allocation of a given size will actually
>> be (see: ).
>> The obvious place to put this would be PyObject_Realloc, because this
>> is at least called by _PyString_Resize (which will fix
>> ).
>
> The diagnosis in the bug report seems to leave it pointing at
> socket.py's _fileobject.read(), although I suspect the real cause is
> in socketmodule.c's sock_recv().  We've had other reports of various
> problems when people pass absurdly large values to socket recv().  A
> better fix here would probably amount to rewriting sock_recv() to
> refuse to pass enormous numbers to the platform recv() (it appears
> that many platform recv() implementations simply don't expect a recv()
> argument to be much bigger than the native network buffer size, and
> screw up when that's not so).

You are correct.  The real cause is in sock_recv(), and/or 
_PyString_Resize(), depending on how you look at it.

>> Note that all versions of Darwin that I've looked at (6.x, 7.x, and
>> 8.0b1 corresponding to publicly available WWDC 2004 Tiger code) have
>> this "issue", but it might go away by Mac OS X 10.4 or some later
>> release.
>
> It would be good to rewrite sock_recv() more defensively in any case.
> Best I can tell, this implementation of realloc() is
> standard-conforming but uni

Re: [Python-Dev] Darwin's realloc(...) implementation never shrinks allocations

2005-01-02 Thread Tim Peters
[Bob Ippolito]
> ...
> Your expectation is not correct for Darwin's memory allocation scheme.
> It seems that Darwin creates allocations of immutable size.  The only
> way ANY part of an allocation will ever be used by ANYTHING else is if
> free() is called with that allocation.

Ya, I understood that.  My conclusion was that Darwin's realloc()
implementation isn't production-quality.  So it goes.

> free() can be called either explicitly, or implicitly by calling
> realloc() with a size larger than the size of the allocation.  In that
> case, it will create a new allocation of at least the requested size,
> copy the contents of the original allocation into the new allocation
> (probably with copy-on-write pages if it's large enough, so it might
> be cheap), and free() the allocation.

Really?  Another near-universal "quality of implementation"
expectation is that a growing realloc() will strive to extend
in-place.  Like realloc(malloc(100), 101).  For example, the
theoretical guarantee that one-at-a-time list.append() has amortized
linear time doesn't depend on that, but pragmatically it's greatly
helped by a reasonable growing realloc() implementation.

> In the case where realloc() specifies a size that is not greater than
> the allocation's size, it will simply return the given allocation and
> cause no side-effects whatsoever.
>
> Was this a good decision?  Probably not!

Sounds more like a bug (or two) to me than "a decision", but I don't know.

> However, it is our (in the "I know you use Windows but I am not the
> only one that uses Mac OS X" sense) problem so long as Darwin is a
> supported platform, because it is highly unlikely that Apple will
> backport any "fix" to the allocator unless we can prove it has some
> security implications in software shipped with their OS. ...

Is there any known case where Python performs poorly on this OS, for
this reason, other than the "pass giant numbers to recv() and then
shrink the string because we didn't get anywhere near that many bytes"
case?  Claiming rampant performance problems should require evidence
too.

...
> Presumably this can happen at other places (including third party
> extensions), so a better place to do this might be _PyString_Resize().
> list_resize() is another reasonable place to put this.  I'm sure there
> are other places that use realloc() too, and the majority of them do
> this through obmalloc.  So maybe instead of trying to track down all
> the places where this can manifest, we should just "gunk up" Python and
> patch PyObject_Realloc()?

There is no "choke point" for allocations in Python -- some places
call the system realloc() directly.  Maybe the latter matter on Darwin
too, but maybe they don't.  The scope of this hack spreads if they do.
I have no idea how often realloc() is called directly by 3rd-party
extension modules.  It's called directly a lot in Zope's C code, but
AFAICT only to grow vectors, never to shrink them.

> Since we are both pretty confident that other allocators aren't like Darwin,
> this "gunk" can be #ifdef'ed to the __APPLE__ case.

#ifdef's are a last resort:  they almost never go away, so they
complicate the code forever after, and typically stick around for
years even after the platform problems they intended to address have
been fixed.  For obvious reasons, they're also an endless source of
platform-specific bugs.

Note that pymalloc already does a memcpy+free when in
PyObject_Realloc(p, n) p was obtained from the system malloc or
realloc but n is small enough to meet the "small object" threshold
(pymalloc "takes over" small blocks that result from a
PyObject_Realloc()).  That's a reasonable strategy *because* n is
always small in such cases.  If you're going to extend this strategy
to n of arbitrary size, then you may also create new performance
problems for some apps on Darwin (copying n bytes can get arbitrarily
expensive).
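
A rough way to see how the copying cost could add up is the sketch below. It is a deliberately pessimistic model, not a description of what CPython does: it assumes every one-element shrink goes through allocate + memcpy + free, whereas real containers generally avoid calling realloc() on every removal. The function name is hypothetical.

```python
def elements_copied_when_shrinking(n):
    """Count elements copied if a block of n elements is shrunk one element
    at a time and every shrink copies the survivors into a fresh block."""
    copied = 0
    size = n
    while size > 0:
        size -= 1
        copied += size  # the `size` surviving elements move to the new block
    return copied


if __name__ == "__main__":
    for n in (10, 1000, 100000):
        print(n, elements_copied_when_shrinking(n))
```

Under that assumption the total is n*(n-1)/2 element copies, i.e. quadratic in the original length, which is why applying memcpy+free to blocks of arbitrary size needs care.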

> ...
>  I'm sure I'll find something, but what's important to me is that Python
> works well on Mac OS X, so something should happen.

I agree the socket-abuse case should be fiddled, and for more reasons
than just Darwin's realloc() quirks.  I don't know that there are
actual problems on Darwin broader than that case (and I'm not
challenging you to contrive one, I'm asking whether realloc() quirks
are suspected in any other case that's known).  Part of what you
demonstrated when you said that pystone didn't slow down when you
fiddled stuff is that pystone also didn't speed up.  I also don't know
that the memcpy+free wormaround is actually going to help more than it
hurts overall.  Yes, in the socket-abuse case, where the program
routinely malloc()s strings millions of bytes larger than the socket
can deliver, it would obviously help.  That's not typical program
behavior (however typical it may be of that specific app).  More
typical is shrinking a long list one element at a time, in which case
about half the list remaining would get memcpy'd from t

Re: [Python-Dev] Zipfile needs?

2005-01-02 Thread Guido van Rossum
> Encryption/decryption support.  Will most likely require a C extension since
> the algorithm relies on ints (or longs, don't remember) wrapping around when
> the value becomes too large.

You may want to do this in C for speed, but C-style int wrapping is
easily done by doing something like "x = x & 0xFFFFFFFFL" at crucial
points in the code (for unsigned 32-bit ints) with an additional "if x
& 0x80000000L: x -= 0x100000000L" to simulate signed 32-bit ints.
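
A small sketch of the masking trick described above, written for current Python, where the "L" suffix on long literals no longer exists. The helper names to_uint32 and to_int32 are hypothetical and not part of zipfile or any other module.

```python
def to_uint32(x):
    """Wrap an arbitrary Python int to an unsigned 32-bit value."""
    return x & 0xFFFFFFFF


def to_int32(x):
    """Wrap an arbitrary Python int to a signed 32-bit value."""
    x &= 0xFFFFFFFF
    if x & 0x80000000:    # sign bit set: value is negative in two's complement
        x -= 0x100000000  # subtract 2**32
    return x


if __name__ == "__main__":
    print(to_uint32(0xFFFFFFFF + 1))
    print(to_int32(0x7FFFFFFF + 1))
```

Applying one of these after each arithmetic step that could overflow in C gives the same wrapped results without needing a C extension, at some cost in speed.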

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)