Re: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7

2010-06-24 Thread Greg Ewing

Ronald Oussoren wrote:

> That's because setgroups(3) is limited to 16 groups
> (that is, the kernel doesn't support more than 16 groups at all).

So how does an account being a member of 18 groups ever work?
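
One way to look at this from Python is sketched below. It assumes a POSIX
system and only reports what the current process sees: the number of
supplementary groups returned by os.getgroups() and the platform limit, if
os.sysconf() knows it. The helper name group_summary is made up for
illustration.

```python
import os

def group_summary():
    """Return (number of groups the process reports, NGROUPS_MAX or None)."""
    groups = os.getgroups()
    try:
        # SC_NGROUPS_MAX may be unknown on some platforms.
        limit = os.sysconf("SC_NGROUPS_MAX")
    except (ValueError, OSError):
        limit = None
    return len(groups), limit

count, limit = group_summary()
print("process groups:", count, "limit:", limit)
```

Comparing this count with the memberships listed in the account database
would show whether the process was given a truncated list.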

--
Greg
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Stephen J. Turnbull
Guido van Rossum writes:

 > For example: how we can make the suite of functions used for URL
 > processing more polymorphic, so that each developer can choose for
 > herself how URLs need to be treated in her application.

While you have come down on the side of polymorphism (as opposed to
separate functions), I'm a little nervous about it.  Specifically,
Philip Eby expressed a desire for earlier type errors, while
polymorphism seems to ensure that you'll need to Look Before You Leap
to get early error detection.



Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Lennart Regebro
On Tue, Jun 22, 2010 at 20:07, James Y Knight  wrote:
> Yeah. This is a real issue I have with the direction Python3 went: it pushes
> you into decoding everything to unicode early, even when you don't care --

Well, yes, maybe even if *you* don't care. But often the functions you
need to call must care, and then you need to decode to unicode, even
if you personally don't care. And in those cases, you should decode as
early as possible.

In the cases where neither you nor the functions you call care, then
you don't have to decode, and you can happily pass binary data from
one function to another.

So this is not really a question of the direction Python 3 went. It's
more a case that some methods that *could* do their transformations in
a well defined way on bytes don't, and then force you to decode to
unicode. But that's not a problem with direction, it's just a missing
feature in the stdlib.

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread M.-A. Lemburg
Lennart Regebro wrote:
> On Tue, Jun 22, 2010 at 20:07, James Y Knight  wrote:
>> Yeah. This is a real issue I have with the direction Python3 went: it pushes
>> you into decoding everything to unicode early, even when you don't care --
> 
> Well, yes, maybe even if *you* don't care. But often the functions you
> need to call must care, and then you need to decode to unicode, even
> if you personally don't care. And in those cases, you should decode as
> early as possible.
> 
> In the cases where neither you nor the functions you call care, then
> you don't have to decode, and you can happily pass binary data from
> one function to another.
> 
> So this is not really a question of the direction Python 3 went. It's
> more a case that some methods that *could* do their transformations in
> a well defined way on bytes don't, and then force you to decode to
> unicode. But that's not a problem with direction, it's just a missing
> feature in the stdlib.

The discussion is showing that in at least a few application spaces,
the stdlib should be able to work on both bytes and Unicode, preferably
using the same interfaces using polymorphism, i.e.

some_function(bytes) -> bytes
some_function(str) -> str

In Python2 this partially works due to the automatic bytes->str
conversion (in some cases you get some_function(bytes) -> str),
the codec base class implementations being a prime example.

In Python3, things have to be done explicitly and I think we need
to add a few helpers to make writing such str/bytes interfaces
easier.
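
A minimal sketch of what such a helper might look like, assuming only str
and bytes need to be supported. The names _literal and strip_scheme are
hypothetical and are not existing stdlib functions; the point is that the
function's own constants are converted to the type of its argument.

```python
def _literal(text, like):
    """Return the ASCII constant `text` in the same type as `like`."""
    if isinstance(like, str):
        return text
    if isinstance(like, bytes):
        return text.encode("ascii")
    raise TypeError("expected str or bytes, got %s" % type(like).__name__)

def strip_scheme(url):
    """some_function(bytes) -> bytes, some_function(str) -> str."""
    sep = _literal("://", url)
    head, found, tail = url.partition(sep)
    return tail if found else url

print(strip_scheme("http://example.org/a"))
print(strip_scheme(b"http://example.org/a"))
```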

We've already had some suggestions in that area, but probably need
to collect a few more ideas based on real-life porting attempts.

I'd like to make this a topic at the upcoming language summit
in Birmingham, if Michael agrees.

-- 
Marc-Andre Lemburg
eGenix.com

2010-07-19: EuroPython 2010, Birmingham, UK (24 days to go)


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Michael Foord

On 24/06/2010 11:58, M.-A. Lemburg wrote:
> Lennart Regebro wrote:
>> On Tue, Jun 22, 2010 at 20:07, James Y Knight wrote:
>>> Yeah. This is a real issue I have with the direction Python3 went: it pushes
>>> you into decoding everything to unicode early, even when you don't care --
>>
>> Well, yes, maybe even if *you* don't care. But often the functions you
>> need to call must care, and then you need to decode to unicode, even
>> if you personally don't care. And in those cases, you should decode as
>> early as possible.
>>
>> In the cases where neither you nor the functions you call care, then
>> you don't have to decode, and you can happily pass binary data from
>> one function to another.
>>
>> So this is not really a question of the direction Python 3 went. It's
>> more a case that some methods that *could* do their transformations in
>> a well defined way on bytes don't, and then force you to decode to
>> unicode. But that's not a problem with direction, it's just a missing
>> feature in the stdlib.
>
> The discussion is showing that in at least a few application spaces,
> the stdlib should be able to work on both bytes and Unicode, preferably
> using the same interfaces using polymorphism, i.e.
>
> some_function(bytes) -> bytes
> some_function(str) -> str
>
> In Python2 this partially works due to the automatic bytes->str
> conversion (in some cases you get some_function(bytes) -> str),
> the codec base class implementations being a prime example.
>
> In Python3, things have to be done explicitly and I think we need
> to add a few helpers to make writing such str/bytes interfaces
> easier.
>
> We've already had some suggestions in that area, but probably need
> to collect a few more ideas based on real-life porting attempts.
>
> I'd like to make this a topic at the upcoming language summit
> in Birmingham, if Michael agrees.

Yep, it sounds like a great topic for the language summit.

Michael

--
http://www.ironpythoninaction.com/



Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Guido van Rossum
On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull  wrote:
> Guido van Rossum writes:
>
>  > For example: how we can make the suite of functions used for URL
>  > processing more polymorphic, so that each developer can choose for
>  > herself how URLs need to be treated in her application.
>
> While you have come down on the side of polymorphism (as opposed to
> separate functions), I'm a little nervous about it.  Specifically,
> Philip Eby expressed a desire for earlier type errors, while
> polymorphism seems to ensure that you'll need to Look Before You Leap
> to get early error detection.

Understood, but both the majority of str/bytes methods and several
existing APIs (e.g. many in the os module, like os.listdir()) do it
this way.

Also, IMO a polymorphic function should *not* accept *mixed*
bytes/text input -- join('x', b'y') should be rejected. But join('x',
'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.
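
A minimal sketch of that rule, assuming a hypothetical join() that glues
path segments with a slash (it is not an existing stdlib function). The
type of the first argument decides the mode, and any mismatch raises
TypeError before any work is done.

```python
def join(*parts):
    """Join segments with '/', accepting all str or all bytes, never a mix."""
    if not parts:
        raise TypeError("join() needs at least one argument")
    kind = type(parts[0])
    if kind not in (str, bytes):
        raise TypeError("expected str or bytes, got %s" % kind.__name__)
    for part in parts:
        if type(part) is not kind:
            raise TypeError("cannot mix str and bytes arguments")
    sep = "/" if kind is str else b"/"
    return sep.join(parts)

print(join("x", "y"))
print(join(b"x", b"y"))
```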

So, actually, I *don't* understand what you mean by needing LBYL.

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Nick Coghlan
On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum  wrote:
> Also, IMO a polymorphic function should *not* accept *mixed*
> bytes/text input -- join('x', b'y') should be rejected. But join('x',
> 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.

A policy of allowing arguments to be either str or bytes, but not a
mixture, actually avoids one of the more painful aspects of the 2.x
"promote mixed operations to unicode" approach. Specifically, you
either had to scan all the arguments up front to check for unicode, or
else you had to stop what you were doing and start again with the
unicode version if you encountered unicode partway through. Neither
was particularly nice to implement.

As you noted elsewhere, literals and string methods are still likely
to be a major sticking point with that approach - common operations
like ''.join(seq) and b''.join(seq) aren't polymorphic, so functions
that use them won't be polymorphic either. (It's only the str->unicode
promotion behaviour in 2.x that works around this problem there).

Would it be heretical to suggest that sum() be allowed to work on
strings to at least eliminate ''.join() as something that breaks bytes
processing? It already works for bytes, although it then fails with a
confusing message for bytearray:

>>> sum(b"a b c".split(), b'')
b'abc'

>>> sum(bytearray(b"a b c").split(), bytearray(b''))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum bytes [use b''.join(seq) instead]

>>> sum("a b c".split(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Guido van Rossum
On Thu, Jun 24, 2010 at 8:25 AM, Nick Coghlan  wrote:
> On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum  wrote:
>> Also, IMO a polymorphic function should *not* accept *mixed*
>> bytes/text input -- join('x', b'y') should be rejected. But join('x',
>> 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.
>
> A policy of allowing arguments to be either str or bytes, but not a
> mixture, actually avoids one of the more painful aspects of the 2.x
> "promote mixed operations to unicode" approach. Specifically, you
> either had to scan all the arguments up front to check for unicode, or
> else you had to stop what you were doing and start again with the
> unicode version if you encountered unicode partway through. Neither
> was particularly nice to implement.

Right. Polymorphic functions should *not* allow mixing text and bytes.
It's all text or all bytes.

> As you noted elsewhere, literals and string methods are still likely
> to be a major sticking point with that approach - common operations
> like ''.join(seq) and b''.join(seq) aren't polymorphic, so functions
> that use them won't be polymorphic either. (It's only the str->unicode
> promotion behaviour in 2.x that works around this problem there).
>
> Would it be heretical to suggest that sum() be allowed to work on
> strings to at least eliminate ''.join() as something that breaks bytes
> processing? It already works for bytes, although it then fails with a
> confusing message for bytearray:
>
> >>> sum(b"a b c".split(), b'')
> b'abc'
>
> >>> sum(bytearray(b"a b c").split(), bytearray(b''))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sum() can't sum bytes [use b''.join(seq) instead]
>
> >>> sum("a b c".split(), '')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sum() can't sum strings [use ''.join(seq) instead]

I don't think we should abuse sum for this. A simple idiom to get the
*empty* string of a particular type is x[:0], so you could write
something like this to concatenate a list of strings or bytes:
xs[0][:0].join(xs). Note that if xs is empty we wouldn't know what to do
anyway, so this should be disallowed.
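
The idiom can be sketched as a small function. The name concat is
hypothetical; the empty-sequence case is rejected explicitly because there
is no element to take the type from.

```python
def concat(xs):
    """Concatenate a non-empty sequence of str or of bytes."""
    xs = list(xs)
    if not xs:
        raise ValueError("cannot infer str or bytes from an empty sequence")
    empty = xs[0][:0]      # '' for str input, b'' for bytes input
    return empty.join(xs)  # raises TypeError if the types are mixed

print(concat("a b c".split()))
print(concat(b"a b c".split()))
```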

-- 
--Guido van Rossum (python.org/~guido)


[Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
allows for Python source files from different Python versions to live together
in the same directory.  It does this by putting a magic tag in the .pyc file
name and placing the .pyc file in a __pycache__ directory.

Distros such as Debian and Ubuntu will use this to greatly simplify
deploying Python, and Python applications and libraries.  Debian and Ubuntu
usually ship more than one version of Python, and currently have to play
complex games with symlinks to make this work.  PEP 3147 will go a long way to
eliminating the need for extra directories and symlinks.

One more thing I've found we need, though, is a way to handle shared libraries
for extension modules.  Just as we can get name collisions on foo.pyc, we can
get collisions on foo.so.  We obviously cannot install foo.so built for Python
3.2 and foo.so built for Python 3.3 in the same location.  So symlink
nightmare's mini-me is back.

I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
been discussed before, but teh Googles hasn't turned up anything.

The idea is to put the Python version number in the shared library file name,
and extend .so lookup to find these extended file names.  So for example, we'd
see foo.3.2.so instead, and Python would know how to dynload both that and the
traditional foo.so file too (for backward compatibility).

(On file naming: the original patch used foo.so.3.2 and that works just as
well, but I thought there might be tools that expect exactly a '.so' suffix,
so I changed it to put the Major.Minor version number to the left of the
extension.  The exact naming scheme is of course open to debate.)
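
The lookup can be sketched in Python as follows. This is not the patch
itself (which changes dynload_shlib.c); the function names are made up, and
trying the versioned name before the traditional one is an assumption.

```python
def candidate_filenames(module, version=(3, 2)):
    """File names to try for an extension module, most specific first."""
    tag = "%d.%d" % version
    return ["%s.%s.so" % (module, tag), "%s.so" % module]

def find_extension(module, directory_listing, version=(3, 2)):
    """Return the first candidate present in the listing, else None."""
    for name in candidate_filenames(module, version):
        if name in directory_listing:
            return name
    return None

print(candidate_filenames("foo"))
print(find_extension("foo", {"foo.3.2.so", "foo.3.3.so"}, version=(3, 3)))
```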

This is a much simpler patch than PEP 3147, though I'm not 100% sure it's the
right approach.  The way this works is by modifying the configure and
Makefile.pre.in to put the version number in the $SO make variable.  Python
parses its (generated) Makefile to find $SO and it uses this deep in the
bowels of distutils to decide what suffix to use when writing shared libraries
built by 'python setup.py build_ext'.
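
For reference, the configured suffix can be inspected from Python roughly
like this. The sketch assumes a recent Python 3, where the value is exposed
as EXT_SUFFIX; SO is the older variable name and may be missing there, so it
is only used as a fallback.

```python
import sysconfig

def extension_suffix():
    """Return the suffix the build uses for extension modules, if known."""
    # EXT_SUFFIX is the newer name; SO is the older Makefile variable.
    return (sysconfig.get_config_var("EXT_SUFFIX")
            or sysconfig.get_config_var("SO"))

suffix = extension_suffix()
print(suffix)
```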

This means the patched Python only writes versioned .so files by default.  I
personally don't see that as a problem, and it does not affect the test suite,
with the exception of one easily tweaked test.  I don't know if third party
tools will care.  The fact that traditional foo.so shared libraries will still
satisfy the import should be enough, I think.

The patch is currently Linux only, since I need this for Debian and Ubuntu and
wanted to keep the change narrow.

Other possible approaches:
 * Extend the distutils API so that the .so file extension can be passed in,
   instead of being essentially hardcoded to what Python's Makefile contains.
 * Keep the dynload_shlib.c change, but modify the Debian/Ubuntu build
   environment to pass in $SO to make (though the configure.in warning and
   sleep is a little annoying).
 * Add a ./configure option to enable this, which Debuntu's build would use.

The patch is available here:

http://pastebin.ubuntu.com/454512/

and my working branch is here:

https://code.edge.launchpad.net/~barry/python/sovers

Please let me know what you think.  I'm happy to just commit this to the py3k
branch if there are no objections.  I don't think a new PEP is in
order, but an update to PEP 3147 might make sense.

Cheers,
-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Benjamin Peterson
2010/6/24 Barry Warsaw :
> Please let me know what you think.  I'm happy to just commit this to the py3k
> branch if there are no objections.  I don't think a new PEP is in
> order, but an update to PEP 3147 might make sense.

How will this interact with PEP 384 if that is implemented?



-- 
Regards,
Benjamin


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Daniel Stutzbach
On Thu, Jun 24, 2010 at 10:50 AM, Barry Warsaw  wrote:

> The idea is to put the Python version number in the shared library file name,
> and extend .so lookup to find these extended file names.  So for example, we'd
> see foo.3.2.so instead, and Python would know how to dynload both that and the
> traditional foo.so file too (for backward compatibility).
>

What use case does this address?

PEP 3147 addresses the fact that the user may have different versions of
Python installed and each wants to write a .pyc file when loading a module.
.so files are not generated simply by running the Python interpreter, ergo
.so files are not an issue for that use case.

If you want to make it so a system can install a package in just one
location to be used by multiple Python installations, then the version
number isn't enough.  You also need to distinguish debug builds, profiling
builds, Unicode width (see issue8654), and probably several other
./configure options.
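
A rough sketch of what a fuller build tag might have to encode. The tag
layout and the name build_tag are invented for illustration; only the
version, debug-build and Unicode-width checks are shown, and other
./configure options are left out.

```python
import sys

def build_tag():
    """Combine version, debug flag and Unicode width into one tag."""
    parts = ["%d.%d" % sys.version_info[:2]]
    # sys.gettotalrefcount only exists in debug builds of CPython.
    if hasattr(sys, "gettotalrefcount"):
        parts.append("debug")
    # Narrow builds had sys.maxunicode == 0xFFFF; wide builds 0x10FFFF.
    parts.append("ucs4" if sys.maxunicode > 0xFFFF else "ucs2")
    return "-".join(parts)

print(build_tag())
```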
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC 


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Baptiste Carvello

P.J. Eby wrote:
> [...] stdlib constants are almost always ASCII,
> and the main use cases for ebytes would involve ascii-extended encodings.)


Then, how about a new "ascii string" literal? This would produce a special kind 
of string that would coerce to a normal string when mixed with a str, and to a 
bytes using ascii codec when mixed with a bytes. Then you could write


>>> a"/".join([base, path])

and not worry if base and path are both str, or both bytes (mixed being of 
course forbidden).
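
Without new syntax, the behaviour can be approximated with a small wrapper
class. This is only a sketch: AsciiLiteral is a hypothetical name, and only
join() is shown.

```python
class AsciiLiteral:
    """An ASCII constant that adapts to str or bytes operands."""

    def __init__(self, text):
        text.encode("ascii")  # reject non-ASCII constants up front
        self._text = text

    def _as(self, kind):
        if kind is str:
            return self._text
        if kind is bytes:
            return self._text.encode("ascii")
        raise TypeError("expected str or bytes, got %s" % kind.__name__)

    def join(self, parts):
        parts = list(parts)
        if not parts:
            raise ValueError("cannot infer a type from an empty sequence")
        # The built-in join raises TypeError if the parts are mixed.
        return self._as(type(parts[0])).join(parts)

slash = AsciiLiteral("/")
print(slash.join(["base", "path"]))
print(slash.join([b"base", b"path"]))
```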


B.



Re: [Python-Dev] bytes / unicode

2010-06-24 Thread P.J. Eby

At 05:12 PM 6/24/2010 +0900, Stephen J. Turnbull wrote:

> Guido van Rossum writes:
>
>  > For example: how we can make the suite of functions used for URL
>  > processing more polymorphic, so that each developer can choose for
>  > herself how URLs need to be treated in her application.
>
> While you have come down on the side of polymorphism (as opposed to
> separate functions), I'm a little nervous about it.  Specifically,
> Philip Eby expressed a desire for earlier type errors, while
> polymorphism seems to ensure that you'll need to Look Before You Leap
> to get early error detection.


This doesn't have to be in the functions; it can be in the 
*types*.  Mixed-type string operations have to do type checking and 
upcasting already, but if the protocol were open, you could make an 
encoded-bytes type that would handle the error checking.


(Btw, in some earlier emails, Stephen, you implied that this could be 
fixed with codecs -- but it can't, because the problem isn't with the 
bytes containing invalid Unicode, it's with the Unicode containing 
invalid bytes -- i.e., characters that can't be encoded to the 
ultimate codec target.)
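
A minimal sketch of such an encoded-bytes type, under the assumption that
mixing with str should encode the text into the target codec immediately.
The class body is illustrative only and handles just the + operator; the
point is that an unencodable character fails at the mixing step rather than
at output time.

```python
class ebytes(bytes):
    """Bytes that remember their encoding (illustrative sketch only)."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        if isinstance(other, str):
            # Raises UnicodeEncodeError here if the text cannot be
            # represented in the target encoding.
            other = other.encode(self.encoding)
        elif not isinstance(other, bytes):
            return NotImplemented
        return ebytes(bytes.__add__(self, other), self.encoding)

name = ebytes(b"caf", "ascii")
print(name + "e")
```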




[Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Bill Janssen
Here are a couple of ideas I'm taking away from the bytes/string
discussion.

First, it would probably be a good idea to have a String ABC.

Secondly, maybe the string situation in 2.x wasn't as broken as we
thought it was.  In particular, those who deal with lots of encoded
strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
are more like numbers than we think.  We have separate types for int,
float, Decimal, etc.  But they're all numbers, and they all
cross-operate.  In 2.x, it seems there were two missing features: no
encoding attribute on str, which should have been there and should have
been required, and the default encoding being "ASCII" (I can't tell you
how many times I've had to fix that issue when a non-ASCII encoded str
was passed to some output function).

So maybe having a second string type in 3.x that consists of an encoded
sequence of bytes plus the encoding, call it "estr", wouldn't have been
a bad idea.  It would probably have made sense to have estr cooperate
with the str type, in the same way that two different kinds of numbers
cooperate, "promoting" the result of an operation only when necessary.
This would automatically achieve the kind of polymorphic functionality
that Guido is suggesting, but without losing the ability to do

  x = e(ASCII)"bar"
  a = ''.join(["foo", x])

(or whatever the syntax for such an encoded string literal would be --
I'm not claiming this is a good one) which I presume would bind "a" to a
Unicode string "foobar" -- have to work out what gets promoted to what.
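
A rough sketch of how such a type might cooperate with str. Everything here
is an assumption: the class layout, and in particular the promotion rule
(same encoding stays encoded, anything else promotes to str). It uses + and
not ''.join(), since the built-in join only accepts real str items.

```python
class estr:
    """Encoded bytes plus their encoding (illustrative sketch only)."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def decode(self):
        return self.data.decode(self.encoding)

    def __add__(self, other):
        if isinstance(other, estr):
            if other.encoding == self.encoding:
                return estr(self.data + other.data, self.encoding)
            return self.decode() + other.decode()  # promote to str
        if isinstance(other, str):
            return self.decode() + other           # promote to str
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, str):
            return other + self.decode()
        return NotImplemented

x = estr(b"bar", "ascii")
a = "foo" + x
print(a)
```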

The language moratorium kind of makes this all theoretical, but building
a String ABC still would be a good start, and presumably isn't forbidden
by the moratorium.

Bill



Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Brett Cannon
On Thu, Jun 24, 2010 at 08:50, Barry Warsaw  wrote:
> This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
> allows for Python source files from different Python versions to live together
> in the same directory.  It does this by putting a magic tag in the .pyc file
> name and placing the .pyc file in a __pycache__ directory.
>
> Distros such as Debian and Ubuntu will use this to greatly simplify
> deploying Python, and Python applications and libraries.  Debian and Ubuntu
> usually ship more than one version of Python, and currently have to play
> complex games with symlinks to make this work.  PEP 3147 will go a long way to
> eliminating the need for extra directories and symlinks.
>
> One more thing I've found we need, though, is a way to handle shared libraries
> for extension modules.  Just as we can get name collisions on foo.pyc, we can
> get collisions on foo.so.  We obviously cannot install foo.so built for Python
> 3.2 and foo.so built for Python 3.3 in the same location.  So symlink
> nightmare's mini-me is back.
>
> I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
> been discussed before, but teh Googles hasn't turned up anything.
>
> The idea is to put the Python version number in the shared library file name,
> and extend .so lookup to find these extended file names.  So for example, we'd
> see foo.3.2.so instead, and Python would know how to dynload both that and the
> traditional foo.so file too (for backward compatibility).
>
> (On file naming: the original patch used foo.so.3.2 and that works just as
> well, but I thought there might be tools that expect exactly a '.so' suffix,
> so I changed it to put the Major.Minor version number to the left of the
> extension.  The exact naming scheme is of course open to debate.)
>

While the idea is fine with me since I won't have any of my
directories cluttered with multiple .so files, I would still want to
add some moniker showing that the version number represents the
interpreter and not the .so file. If I read "foo.3.2.so", that naively
seems to mean that the foo module's 3.2 release is what is
installed, not that it's built for CPython 3.2. So even though it
might be redundant, I would still want the VM name added.

Adding the VM name also keeps extension modules from being the exclusive
domain of CPython. If some other VM decides to make its own
.so files that are not binary compatible, then we should not preclude
that, as this solution costs nothing more than making a string
comparison look at 7 more characters.
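
As an illustration, a name carrying both the VM and its version could be
built like this. The exact layout ("foo.cpython-32.so") and the function
name tagged_name are assumptions; sys.implementation.name is used as the
default VM name and is only available on newer Python 3 releases.

```python
import sys

def tagged_name(module, vm=None, version=None):
    """Build a file name such as 'foo.cpython-32.so'."""
    vm = vm or sys.implementation.name
    version = version or sys.version_info[:2]
    return "%s.%s-%d%d.so" % (module, vm, version[0], version[1])

print(tagged_name("foo", "cpython", (3, 2)))
```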

-Brett

P.S.: I wish we could drop use of the 'module.so' variant at the same
time, for consistency's sake and to cut out a stat call, but I know that
time, for consistency sake and to cut out a stat call, but I know that
is asking too much.


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:

>2010/6/24 Barry Warsaw :
>> Please let me know what you think.  I'm happy to just commit this to the
>> py3k branch if there are no objections.  I don't think a new PEP is
>> in order, but an update to PEP 3147 might make sense.
>
>How will this interact with PEP 384 if that is implemented?

Good question, I'd forgotten to mention that PEP.

I think the PEP is a good idea, and worth working on, but it is a longer term
solution to the problem of extension source code compatibility.  It's longer
term because extensions will have to be rewritten to use the new API defined
in PEP 384.  It will take a long time to get this into practice, and
supporting it will be on a case-by-case basis.

I'm trying to come up with something that will work immediately while PEP 384
is being adopted.

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Benjamin Peterson
2010/6/24 Barry Warsaw :
> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>
>>2010/6/24 Barry Warsaw :
>>> Please let me know what you think.  I'm happy to just commit this to the
>>> py3k branch if there are no objections.  I don't think a new PEP is
>>> in order, but an update to PEP 3147 might make sense.
>>
>>How will this interact with PEP 384 if that is implemented?
> I'm trying to come up with something that will work immediately while PEP 384
> is being adopted.

But how will modules specify that they support multiple ABIs then?



-- 
Regards,
Benjamin


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Brett Cannon
On Thu, Jun 24, 2010 at 10:38, Bill Janssen  wrote:
[SNIP]
> The language moratorium kind of makes this all theoretical, but building
> a String ABC still would be a good start, and presumably isn't forbidden
> by the moratorium.

Because a new ABC would go into the stdlib (I assume in collections or
string) the moratorium does not apply.
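
A minimal sketch of what a String ABC could look like. The choice of
abstract methods and the registration of str, bytes and bytearray are
assumptions made for illustration; no such class exists in the stdlib.

```python
from abc import ABC, abstractmethod

class String(ABC):
    """Hypothetical common base for text-like and bytes-like types."""

    @abstractmethod
    def join(self, iterable):
        """Concatenate the items of iterable, separated by self."""

    @abstractmethod
    def split(self, sep=None, maxsplit=-1):
        """Split self on sep into a list of the same type."""

# Register the built-in types as virtual subclasses.
String.register(str)
String.register(bytes)
String.register(bytearray)

print(isinstance("text", String), isinstance(b"data", String))
```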


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Guido van Rossum
On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon  wrote:
> On Thu, Jun 24, 2010 at 08:50, Barry Warsaw  wrote:
>> This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
>> allows for Python source files from different Python versions to live together
>> in the same directory.  It does this by putting a magic tag in the .pyc file
>> name and placing the .pyc file in a __pycache__ directory.
>>
>> Distros such as Debian and Ubuntu will use this to greatly simplify
>> deploying Python, and Python applications and libraries.  Debian and Ubuntu
>> usually ship more than one version of Python, and currently have to play
>> complex games with symlinks to make this work.  PEP 3147 will go a long way to
>> eliminating the need for extra directories and symlinks.
>>
>> One more thing I've found we need, though, is a way to handle shared libraries
>> for extension modules.  Just as we can get name collisions on foo.pyc, we can
>> get collisions on foo.so.  We obviously cannot install foo.so built for Python
>> 3.2 and foo.so built for Python 3.3 in the same location.  So symlink
>> nightmare's mini-me is back.
>>
>> I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
>> been discussed before, but teh Googles hasn't turned up anything.
>>
>> The idea is to put the Python version number in the shared library file name,
>> and extend .so lookup to find these extended file names.  So for example, we'd
>> see foo.3.2.so instead, and Python would know how to dynload both that and the
>> traditional foo.so file too (for backward compatibility).
>>
>> (On file naming: the original patch used foo.so.3.2 and that works just as
>> well, but I thought there might be tools that expect exactly a '.so' suffix,
>> so I changed it to put the Major.Minor version number to the left of the
>> extension.  The exact naming scheme is of course open to debate.)
>>
>
> While the idea is fine with me since I won't have any of my
> directories cluttered with multiple .so files, I would still want to
> add some moniker showing that the version number represents the
> interpreter and not the .so file. If I read "foo.3.2.so", that naively
> seems to mean that the foo module's 3.2 release is what is
> installed, not that it's built for CPython 3.2. So even though it
> might be redundant, I would still want the VM name added.

Well, for versions of the .so itself, traditionally version numbers
are appended *after* the .so suffix (check your /lib directory :-).

> Adding the VM name also keeps extension modules from being the exclusive
> domain of CPython. If some other VM decides to make its own
> .so files that are not binary compatible, then we should not preclude
> that, as this solution costs nothing more than making a string
> comparison look at 7 more characters.
>
> -Brett
>
> P.S.: I wish we could drop use of the 'module.so' variant at the same
> time, for consistency's sake and to cut out a stat call, but I know that
> is asking too much.

I wish so too. IIRC there used to be some modules that on Windows were
wrappers around 3rd party DLLs and you can't have foo.dll as the
module wrapping foo.dll the 3rd party DLL. (On Unix this problem
doesn't exist because the 3rd party .so would be named libfoo.so, not
foo.so.)

-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 01:00 PM, Benjamin Peterson wrote:

>2010/6/24 Barry Warsaw :
>> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>>
>>>2010/6/24 Barry Warsaw :
>>>> Please let me know what you think.  I'm happy to just commit this to the
>>>> py3k branch if there are no objections.  I don't think a new PEP is
>>>> in order, but an update to PEP 3147 might make sense.
>>>
>>>How will this interact with PEP 384 if that is implemented?
>> I'm trying to come up with something that will work immediately while PEP 384
>> is being adopted.
>
>But how will modules specify that they support multiple ABIs then?

I didn't understand, so asked Benjamin for clarification in IRC.

<gutworth> barry: if python 3.3 will only load x.3.3.so, but x.3.2.so supports
           the stable abi, will it load it?  [14:25]
<barry> gutworth: thanks, now i get it :)  [14:26]
<barry> gutworth: i think it should, but it wouldn't under my scheme.  let me
        think about it

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Brett Cannon
On Thu, Jun 24, 2010 at 11:27, Guido van Rossum  wrote:
> On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon  wrote:
>> On Thu, Jun 24, 2010 at 08:50, Barry Warsaw  wrote:
>>> This is a follow up to PEP 3147.  That PEP, already implemented in Python
>>> 3.2, allows for Python source files from different Python versions to live
>>> together in the same directory.  It does this by putting a magic tag in the
>>> .pyc file name and placing the .pyc file in a __pycache__ directory.
>>>
>>> Distros such as Debian and Ubuntu will use this to greatly simplify
>>> deploying Python, and Python applications and libraries.  Debian and Ubuntu
>>> usually ship more than one version of Python, and currently have to play
>>> complex games with symlinks to make this work.  PEP 3147 will go a long way
>>> to eliminating the need for extra directories and symlinks.
>>>
>>> One more thing I've found we need though, is a way to handle shared
>>> libraries for extension modules.  Just as we can get name collisions on
>>> foo.pyc, we can get collisions on foo.so.  We obviously cannot install
>>> foo.so built for Python 3.2 and foo.so built for Python 3.3 in the same
>>> location.  So symlink nightmare's mini-me is back.
>>>
>>> I have a fairly simple fix for this.  I'd actually be surprised if this
>>> hasn't been discussed before, but teh Googles hasn't turned up anything.
>>>
>>> The idea is to put the Python version number in the shared library file
>>> name, and extend .so lookup to find these extended file names.  So for
>>> example, we'd see foo.3.2.so instead, and Python would know how to dynload
>>> both that and the traditional foo.so file too (for backward compatibility).
>>>
>>> (On file naming: the original patch used foo.so.3.2 and that works just as
>>> well, but I thought there might be tools that expect exactly a '.so' suffix,
>>> so I changed it to put the Major.Minor version number to the left of the
>>> extension.  The exact naming scheme is of course open to debate.)
>>>
>>
>> While the idea is fine with me since I won't have any of my
>> directories cluttered with multiple .so files, I would still want to
>> add some moniker showing that the version number represents the
>> interpreter and not the .so file. If I read "foo.3.2.so", that naively
>> seems to mean the foo module's 3.2 release is what is installed, not
>> that it's built for CPython 3.2. So even though it
>> might be redundant, I would still want the VM name added.
>
> Well, for versions of the .so itself, traditionally version numbers
> are appended *after* the .so suffix (check your /lib directory :-).
>

Second thing you taught me today (first was the x[:0] trick)!

I've also been on OS X too long; /usr/lib is just .dylib files, and that
format puts the version number before the extension.

>> Adding the VM name also doesn't make extension modules the exclusive
>> domain of CPython either. If some other VM decides to make their own
>> .so files that are not binary compatible then we should not preclude
>> that, as this solution costs nothing more than making a string
>> comparison look at 7 more characters.
>>
>> -Brett
>>
>> P.S.: I wish we could drop use of the 'module.so' variant at the same
>> time, for consistency sake and to cut out a stat call, but I know that
>> is asking too much.
>
> I wish so too. IIRC there used to be some modules that on Windows were
> wrappers around 3rd party DLLs and you can't have foo.dll as the
> module wrapping foo.dll the 3rd party DLL. (On Unix this problem
> doesn't exist because the 3rd party .so would be named libfoo.so, not
> foo.so.)

Wouldn't Barry's proposed solution actually fill this need since it
will give the file a custom Python suffix that more-or-less guarantees
no name clash with a third-party DLL?


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Éric Araujo
Le 24/06/2010 17:50, Barry Warsaw (FLUFL) a écrit :
> Other possible approaches:
>  * Extend the distutils API so that the .so file extension can be passed in,
>    instead of being essentially hardcoded to what Python's Makefile contains.

Third-party code relies on Distutils internal quirks, so it’s frozen. Feel
free to open a bug against Distutils2 on the Python tracker if that
would be generally useful.

Regards



Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Éric Araujo
Le 24/06/2010 19:48, Brett Cannon a écrit :
> P.S.: I wish we could drop use of the 'module.so' variant at the same
> time, for consistency sake and to cut out a stat call, but I know that
> is asking too much.

At least, looking for spam/__init__module.so could be avoided. It seems
to me that the package definition does not allow that. The tradeoff
would be code complication for one less stat call. Worth a bug report?

Regards



Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Michael Foord

On 24/06/2010 19:11, Brett Cannon wrote:
> On Thu, Jun 24, 2010 at 10:38, Bill Janssen wrote:
> [SNIP]
>
>> The language moratorium kind of makes this all theoretical, but building
>> a String ABC still would be a good start, and presumably isn't forbidden
>> by the moratorium.
>
> Because a new ABC would go into the stdlib (I assume in collections or
> string) the moratorium does not apply.

Although it would require changes for builtin types like file to work
with a new string ABC, right?

Michael



--
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of 
your employer, to release me from all obligations and waivers arising from any 
and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, 
clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and 
acceptable use policies (”BOGUS AGREEMENTS”) that I have entered into with your 
employer, its partners, licensors, agents and assigns, in perpetuity, without 
prejudice to my ongoing rights and privileges. You further represent that you 
have the authority to release me from any BOGUS AGREEMENTS on behalf of your 
employer.




Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Brett Cannon
On Thu, Jun 24, 2010 at 12:07, Michael Foord  wrote:
> On 24/06/2010 19:11, Brett Cannon wrote:
>>
>> On Thu, Jun 24, 2010 at 10:38, Bill Janssen  wrote:
>> [SNIP]
>>
>>>
>>> The language moratorium kind of makes this all theoretical, but building
>>> a String ABC still would be a good start, and presumably isn't forbidden
>>> by the moratorium.
>>>
>>
>> Because a new ABC would go into the stdlib (I assume in collections or
>> string) the moratorium does not apply.
>>
>
> Although it would require changes for builtin types like file to work with a
> new string ABC, right?

Only if they wanted to rely on some concrete implementation of a
method contained within the ABC. Otherwise that's what abc.register
exists for.
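
A minimal sketch of the registration route mentioned above, assuming a
hypothetical String ABC (no such class exists in the stdlib; the name and its
single abstract method are invented here for illustration).  The register()
call is the standard abc.ABCMeta.register() method:

```python
from abc import ABC, abstractmethod


class String(ABC):
    """Hypothetical string ABC; the name and method are assumptions."""

    @abstractmethod
    def encode(self, encoding="utf-8", errors="strict"):
        """Return an encoded bytes form of this string."""


# Registering a builtin makes it a "virtual subclass" of the ABC.
# The builtin type itself is not modified in any way.
String.register(str)

print(isinstance("spam", String))   # str was registered
print(isinstance(b"spam", String))  # bytes was not
```

A registered type inherits nothing from the ABC, which is why a builtin would
only need changing if it wanted to reuse a concrete method defined there.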


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Ian Bicking
On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen  wrote:

> Here are a couple of ideas I'm taking away from the bytes/string
> discussion.
>
> First, it would probably be a good idea to have a String ABC.
>
> Secondly, maybe the string situation in 2.x wasn't as broken as we
> thought it was.  In particular, those who deal with lots of encoded
> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
> are more like numbers than we think.  We have separate types for int,
> float, Decimal, etc.  But they're all numbers, and they all
> cross-operate.  In 2.x, it seems there were two missing features: no
> encoding attribute on str, which should have been there and should have
> been required, and the default encoding being "ASCII" (I can't tell you
> how many times I've had to fix that issue when a non-ASCII encoded str
> was passed to some output function).
>

I've started to form a conceptual notion that I think fits these cases.

We've set up a system where we think of text as natively unicode, with
encodings to put that unicode into a byte form.  This is certainly
appropriate in a lot of cases.  But there's a significant class of problems
where bytes are the native structure.  Network protocols are what we've been
discussing, and are a notable case of that.  That is, b'/' is the most
native sense of a path separator in a URL, or b':' is the most native sense
of what separates a header name from a header value in HTTP.  To disallow
unicode URLs or unicode HTTP headers would be rather anti-social, especially
because unicode is now the "native" string type in Python 3 (as an aside for
the WSGI spec we've been talking about using "native" strings in some
positions like dictionary keys, meaning Python 2 str and Python 3 str, while
being more exacting in other areas such as a response body which would
always be bytes).

The HTTP spec and other network protocol specs seem a little fuzzy on this,
because they were written before unicode even existed, and even later activity
happened at a point when "unicode" and "text" weren't widely considered the
same thing like they are now.  But I think the original intention is
revealed in a more modern specification like WebSockets, where they are very
explicit that ':' is just shorthand for a particular byte, it is not "text"
in our new modern notion of the term.

So with this idea in mind it makes more sense to me that *specific pieces of
text* can be reasonably treated as both bytes and text.  All the string
literals in urllib.parse.urlunsplit() for example.

The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it does not
become special('/x')) and special('/')+'x'=='/x' (again it becomes str).  This
avoids some of the cases of unicode or str infecting a system as they did in
Python 2 (where you might pass in unicode and everything works fine until
some non-ASCII is introduced).
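
The semantics described above could be sketched roughly as follows.  This is
only an illustration under stated assumptions: the class name "special" comes
from the paragraph above, but the ASCII-only restriction and every
implementation detail are guesses, not an existing API.

```python
class special:
    """ASCII-only literal that takes on the type of whatever it is added to."""

    def __init__(self, text):
        # Assumption: only ASCII literals are allowed, so that the byte
        # form and the text form always agree.
        text.encode("ascii")  # raises UnicodeEncodeError otherwise
        self._text = text

    def _matching(self, other):
        # Pick the representation that matches the other operand's type.
        if isinstance(other, bytes):
            return self._text.encode("ascii")
        if isinstance(other, str):
            return self._text
        return None

    def __add__(self, other):
        mine = self._matching(other)
        if mine is None:
            return NotImplemented
        return mine + other  # result is plain bytes or plain str

    def __radd__(self, other):
        mine = self._matching(other)
        if mine is None:
            return NotImplemented
        return other + mine


print(special('/') + b'x')
print(special('/') + 'x')
```

The result is always an ordinary bytes or str object, never another special
instance, so the special type does not spread through the rest of a program.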

The one place where this might be tricky is if you have an encoding that is
not ASCII compatible.  But we can't guard against every possibility.  So it
would be entirely wrong to take a string encoded with UTF-16 and start to
use b'/' with it.  But there are other nonsensical combinations already
possible, especially with polymorphic functions, we can't guard against all
of them.  Also I'm unsure if something like UTF-16 is in any way compatible
with the kind of legacy systems that use bytes.  Can you encode your
filesystem with UTF-16?  I don't think you could encode a cookie with it.
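
To make the UTF-16 point concrete, here is a small sketch (the sample path
"a/b" is arbitrary) showing that splitting on b'/' is safe for an
ASCII-compatible encoding but cuts UTF-16 code units in half:

```python
path = "a/b"

utf8 = path.encode("utf-8")
utf16 = path.encode("utf-16-le")

# In UTF-8 the separator is the single byte 0x2F, so splitting is safe.
print(utf8.split(b"/"))

# In UTF-16 every ASCII character is followed by a NUL byte, so the same
# split leaves pieces that are no longer valid UTF-16 on their own.
print(utf16.split(b"/"))
```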

> So maybe having a second string type in 3.x that consists of an encoded
> sequence of bytes plus the encoding, call it "estr", wouldn't have been
> a bad idea.  It would probably have made sense to have estr cooperate
> with the str type, in the same way that two different kinds of numbers
> cooperate, "promoting" the result of an operation only when necessary.
> This would automatically achieve the kind of polymorphic functionality
> that Guido is suggesting, but without losing the ability to do
>
>  x = e(ASCII)"bar"
>  a = ''.join(["foo", x])
>
> (or whatever the syntax for such an encoded string literal would be --
> I'm not claiming this is a good one) which I presume would bind "a" to a
> Unicode string "foobar" -- have to work out what gets promoted to what.
>

I would be entirely happy without a literal syntax.  But as Phillip has
noted, this can't be implemented *entirely* in a library as there are some
constraints with the current str/bytes implementations.  Reading PEP 3003
I'm not clear if such changes are part of the moratorium?  They seem like
they would be (sadly), but it doesn't seem clearly noted.

I think there's a *different* use case for things like
bytes-in-a-utf8-encoding (e.g., to allow XML data to be decoded lazily), but
that could be yet another class, and maybe shouldn't be polymorphically usable
as bytes (i.e., treat it as an optimized str representation that is
otherwise semantically equivalent).  A String ABC would formalize these
things.

-- 
Ian Bicking  |  http://blog.ianbi

Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 02:28 PM, Barry Warsaw wrote:

>On Jun 24, 2010, at 01:00 PM, Benjamin Peterson wrote:
>
>>2010/6/24 Barry Warsaw :
>>> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>>>
>>>>2010/6/24 Barry Warsaw :
>>>>> Please let me know what you think.  I'm happy to just commit this to the
>>>>> py3k branch if there are no objections.  I don't think a new PEP is
>>>>> in order, but an update to PEP 3147 might make sense.
>>>>
>>>>How will this interact with PEP 384 if that is implemented?
>>> I'm trying to come up with something that will work immediately while PEP
>>> 384 is being adopted.
>>
>>But how will modules specify that they support multiple ABIs then?
>
>I didn't understand, so asked Benjamin for clarification in IRC.
>
> <gutworth> barry: if python 3.3 will only load x.3.3.so, but x.3.2.so supports
>            the stable abi, will it load it?  [14:25]
> <barry> gutworth: thanks, now i get it :)  [14:26]
> <barry> gutworth: i think it should, but it wouldn't under my scheme.  let me
>         think about it

So, we could say that PEP 384 compliant extension modules would get written
without a version specifier.  IOW, we'd treat foo.so as using the stable ABI.  It
would then be up to the Python runtime to throw ImportErrors if in fact we
were loading a legacy, non-PEP 384 compliant extension.
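
A rough sketch of how that lookup might behave, assuming the interpreter
tries its own tagged name first and then falls back to the untagged name.
The function name, the lookup order and the use of a plain set of file names
are assumptions for illustration; the check that would raise ImportError for
a legacy extension is not modelled.

```python
def find_extension(module, soabi, available):
    """Return the file name to load for `module`, or None.

    `soabi` is the running interpreter's tag (e.g. "cpython-33") and
    `available` is the set of file names present in the directory.
    """
    candidates = (
        "{}.{}.so".format(module, soabi),  # built for exactly this Python
        "{}.so".format(module),            # untagged: assumed PEP 384 ABI
    )
    for name in candidates:
        if name in available:
            return name
    return None


# Python 3.3 ignores the 3.2-specific build but accepts the untagged one.
print(find_extension("x", "cpython-33", {"x.cpython-32.so", "x.so"}))
```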

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 10:48 AM, Brett Cannon wrote:

>While the idea is fine with me since I won't have any of my
>directories cluttered with multiple .so files, I would still want to
>add some moniker showing that the version number represents the
>interpreter and not the .so file. If I read "foo.3.2.so", that naively
>seems to mean the foo module's 3.2 release is what is installed, not
>that it's built for CPython 3.2. So even though it
>might be redundant, I would still want the VM name added.

I have a new version of my patch that steals the "magic tag" idea from PEP
3147.  Note that it does not use the *actual* same piece of information to
compose the file name, but for now it does match the pyc tag string.

E.g.

% find . -name \*.so
./build/lib.linux-x86_64-3.2/math.cpython-32.so
./build/lib.linux-x86_64-3.2/select.cpython-32.so
./build/lib.linux-x86_64-3.2/_struct.cpython-32.so
...

Further, by default, ./configure doesn't add this tag so that you would have
to build Python with:

% SOABI=cpython-32 ./configure

to get anything between the module name and the extension.  I could of course
make this a configure switch instead, and could default it to some other magic
string instead of the empty string.
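
As an illustration of the naming rule only (this is not the code in the
patch, and the function name is made up), the file name could be derived from
the module name and the configured tag like this, with an empty tag giving
the traditional name:

```python
def tagged_so_name(module, soabi=""):
    """Build the shared library file name for an extension module."""
    if soabi:
        return "{}.{}.so".format(module, soabi)
    # Default ./configure: nothing between the module name and extension.
    return "{}.so".format(module)


print(tagged_so_name("math", "cpython-32"))
print(tagged_so_name("math"))
```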

>Adding the VM name also doesn't make extension modules the exclusive
>domain of CPython either. If some other VM decides to make their own
>.so files that are not binary compatible then we should not preclude
>that, as this solution costs nothing more than making a string
>comparison look at 7 more characters.
>
>-Brett
>
>P.S.: I wish we could drop use of the 'module.so' variant at the same
>time, for consistency sake and to cut out a stat call, but I know that
>is asking too much.

I think you're right that with the $SOABI trick above, you wouldn't get the
name collisions Guido recalls, and you could get rid of module.so.  OTOH, as I
am currently only targeting Linux, it seems like the module.so stat is wasted
anyway on that platform.

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 11:27 AM, Guido van Rossum wrote:

>On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon  wrote:
>> While the idea is fine with me since I won't have any of my
>> directories cluttered with multiple .so files, I would still want to
>> add some moniker showing that the version number represents the
>> interpreter and not the .so file. If I read "foo.3.2.so", that naively
>> seems to mean the foo module's 3.2 release is what is installed, not
>> that it's built for CPython 3.2. So even though it
>> might be redundant, I would still want the VM name added.
>
>Well, for versions of the .so itself, traditionally version numbers
>are appended *after* the .so suffix (check your /lib directory :-).

Which is probably another reason not to use foo.so.X.Y for Python extension
modules.  I think it would be confusing, and foo.<tag>.so looks nice and is
consistent with foo.<tag>.pyc.  (Ref to updated patch coming...)

-Barry




Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Guido van Rossum
I see it a little differently (though there is probably a common
concept lurking in here).

The protocols you mention are intentionally designed to be
encoding-neutral as long as the encoding is an ASCII superset. This
covers ASCII itself, Latin-1, Latin-N for other values of N, MacRoman,
Microsoft's code pages (most of them anyways), UTF-8, presumably at
least some of the Japanese encodings, and probably a host of others.
But it does not cover UTF-16, EBCDIC, and others. (Encodings that have
"shift bytes" that change the meaning of some or all ordinary ASCII
characters also aren't covered, unless such an encoding happens to
exclude the special characters that the protocol spec cares about).

The protocol specs typically go out of their way to specify what byte
values they use for syntactically significant positions (e.g. ':' in
headers, or '/' in URLs), while hand-waving about the meaning of "what
goes in between" since it is all typically treated as "not of
syntactic significance". So you can write a parser that looks at bytes
exclusively, and looks for a bunch of ASCII punctuation characters
(e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff
in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks
"inside" stretches of characters between the special characters and
just copies them. (Sometimes there may be *some* sections that are
required to be ASCII, and there the equivalence of a-z and A-Z is well
defined.)
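
A minimal sketch of such a bytes-only parser, using a made-up header line and
a hypothetical helper name.  It looks only for the ':' byte and copies the
rest through untouched, so it gives sensible results whether the value is
Latin-1 or UTF-8:

```python
def split_header(line):
    """Split b"Name: value" on the first ':' without decoding anything."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("no ':' separator in header line")
    # bytes.strip() removes ASCII whitespace only; other bytes are kept.
    return name.strip(), value.strip()


for encoding in ("latin-1", "utf-8"):
    raw = "X-Name: caf\u00e9".encode(encoding)
    name, value = split_header(raw)
    # The parser never looked inside `value`; the caller decodes it later.
    print(name, value, value.decode(encoding))
```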

But I wouldn't go so far as to claim that interpreting the protocols
as text is wrong. After all we're talking exclusively about protocols
that are designed intentionally to be directly "human readable"
(albeit as a fall-back option) -- the only tool you need to debug the
traffic on the wire or socket is something that knows which subset of
ASCII is considered "printable" and which renders everything else
safely as a hex escape or even a special "unknown" character (like
Unicode's "?" inside a black diamond).

Depending on the requirements of a specific app (or framework) it may
be entirely reasonable to convert everything to Unicode and process
the resulting text; in other contexts it makes more sense to keep
everything as bytes. It also makes sense to have an interface library
to deal with a specific protocol that treats the protocol side as
bytes but interacts with the application using text, since that is
often how the application programmer wants to treat it anyway.

Of course, some protocols require the application programmer to be
aware of bytes as well in *some* cases -- examples are email and HTTP
which can be used to transfer text as well as binary data (e.g.
images). There is also the bootstrap problem where the wire data must
be partially parsed in order to find out the encoding to be used to
convert it to text. But that doesn't mean it's invalid to think about
it as text in many application contexts.

Regarding the proposal of a String ABC, I hope this isn't going to
become a backdoor to reintroduce the Python 2 madness of allowing
equivalency between text and bytes for *some* strings of bytes and not
others.

Finally, I do think that we should not introduce changes to the
fundamental behavior of text and bytes while the moratorium is in
place. Changes to specific stdlib APIs are fine however.

--Guido

On Thu, Jun 24, 2010 at 12:49 PM, Ian Bicking  wrote:
> On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen  wrote:
>>
>> Here are a couple of ideas I'm taking away from the bytes/string
>> discussion.
>>
>> First, it would probably be a good idea to have a String ABC.
>>
>> Secondly, maybe the string situation in 2.x wasn't as broken as we
>> thought it was.  In particular, those who deal with lots of encoded
>> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
>> are more like numbers than we think.  We have separate types for int,
>> float, Decimal, etc.  But they're all numbers, and they all
>> cross-operate.  In 2.x, it seems there were two missing features: no
>> encoding attribute on str, which should have been there and should have
>> been required, and the default encoding being "ASCII" (I can't tell you
>> how many times I've had to fix that issue when a non-ASCII encoded str
>> was passed to some output function).
>
> I've started to form a conceptual notion that I think fits these cases.
>
> We've setup a system where we think of text as natively unicode, with
> encodings to put that unicode into a byte form.  This is certainly
> appropriate in a lot of cases.  But there's a significant class of problems
> where bytes are the native structure.  Network protocols are what we've been
> discussing, and are a notable case of that.  That is, b'/' is the most
> native sense of a path separator in a URL, or b':' is the most native sense
> of what separates a header name from a header value in HTTP.  To disallow
> unicode URLs or unicode HTTP headers would be rather anti-social, especially
> because unicode is now the "native" string type 

Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Brett Cannon
On Thu, Jun 24, 2010 at 11:53, Éric Araujo  wrote:
> Le 24/06/2010 19:48, Brett Cannon a écrit :
>> P.S.: I wish we could drop use of the 'module.so' variant at the same
>> time, for consistency sake and to cut out a stat call, but I know that
>> is asking too much.
>
> At least, looking for spam/__init__module.so could be avoided. It seems
> to me that the package definition does not allow that.

I thought no one had bothered to change import.c to allow for
extension modules to act as a package's __init__?

As for not being allowed, I don't agree with that assessment. If you
treat a package's __init__ module as simply that, a module that would
be named __init__ when imported, then __init__module.c would be valid
(and that's what importlib does).

> The tradeoff
> would be code complication for one less stat call. Worth a bug report?

Nah.


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 11:05 AM, Daniel Stutzbach wrote:

>On Thu, Jun 24, 2010 at 10:50 AM, Barry Warsaw  wrote:
>
>> The idea is to put the Python version number in the shared library file
>> name,
>> and extend .so lookup to find these extended file names.  So for example,
>> we'd
>> see foo.3.2.so instead, and Python would know how to dynload both that and
>> the
>> traditional foo.so file too (for backward compatibility).
>>
>
>What use case does this address?

Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
we can do that without collisions on the pyc files, but would still have to
symlink for extension module .so files, because they are always named foo.so
and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
foo.so.

So using the same trick as in PEP 3147, if we can name Python 3.2's foo
extension differently than the incompatible Python 3.3's foo extension, we can
have them live in the same directory without symlink tricks.

>PEP 3147 addresses the fact that the user may have different versions of
>Python installed and each wants to write a .pyc file when loading a module.
> .so files are not generated simply by running the Python interpreter, ergo
>.so files are not an issue for that use case.

See above.  It doesn't matter whether the pyc or so is created at run time by
the user or by the distro build system.  If the files for different Python
versions end up in the same directory, they must be named differently too.

>If you want to make it so a system can install a package in just one
>location to be used by multiple Python installations, then the version
>number isn't enough.  You also need to distinguish debug builds, profiling
>builds, Unicode width (see issue8654), and probably several other
>./configure options.

This is a good point, but more easily addressed.  Let's say a distro makes
three Python 3.2 variants available, one "normal" build, a debug build, and
UCS2 and UCS4 versions of the above.  All we need to do is choose a different
.so ABI tag (see previous follow-up) for each of those builds.  My updated patch
(coming soon) allows you to define that tag at configure time.  So e.g.

Normal build UCSX: SOABI=cpython-32 ./configure
Debug build UCSX:  SOABI=cpython-32-d ./configure
Normal build UCSY: SOABI=cpython-32-w ./configure
Debug build UCSY:  SOABI=cpython-32-dw ./configure

Mix and match for any other build options you care about.  Because the distro
controls how Python is configured, this should be fairly easy to achieve.
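
The tags above could be generated mechanically.  The sketch below assumes
that "d" marks a debug build and "w" marks one of the two Unicode widths, as
in the table; the function and parameter names are invented for illustration:

```python
def soabi_tag(major, minor, debug=False, wide_unicode=False):
    """Compose an SOABI tag such as "cpython-32-dw" from build options."""
    flags = ("d" if debug else "") + ("w" if wide_unicode else "")
    tag = "cpython-{}{}".format(major, minor)
    return "{}-{}".format(tag, flags) if flags else tag


for debug in (False, True):
    for wide in (False, True):
        print("SOABI={} ./configure".format(soabi_tag(3, 2, debug, wide)))
```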

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Fred Drake
On Thu, Jun 24, 2010 at 4:55 PM, Barry Warsaw  wrote:
> Which is probably another reason not to use foo.so.X.Y for Python extension
> modules.

Clearly, foo.so.3.2 is the man page for the foo.so.3 system call.

The ABI ident definitely has to be elsewhere.


  -Fred

-- 
Fred L. Drake, Jr.
"A storm broke loose in my mind."  --Albert Einstein


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 08:50 PM, Éric Araujo wrote:

>Le 24/06/2010 17:50, Barry Warsaw (FLUFL) a écrit :
>> Other possible approaches:
>>  * Extend the distutils API so that the .so file extension can be passed in,
>>    instead of being essentially hardcoded to what Python's Makefile contains.
>
>Third-party code relies on Distutils internal quirks, so it’s frozen. Feel
>free to open a bug against Distutils2 on the Python tracker if that
>would be generally useful.

Depending on how strict this constraint is, it could make things more
difficult.  I can control what shared library file names Python will load
statically, but in order to support PEP 384 I think I need to be able to
control what file extensions build_ext writes.

My updated patch does this in a backward compatible way.  Of course, distutils
hacks have their tentacles all up in the distutils internals, so maybe my
patch will break something after all.  I can think of a few even hackier ways
to work around that if necessary.

My updated patch:
 * Adds an optional argument to build_ext.get_ext_fullpath() and
   build_ext.get_ext_filename().  This extra argument is the Extension
   instance being built.  (Boy, just in case anyone's already playing with the
   time machine, it sure would have been nice if these methods had originally
   just taken the Extension instance and dug out ext.name instead of passing
   the string in.)
 * Adds an optional new keyword argument to the Extension class, called
   so_abi_tag.  If given, this overrides the Makefile $SO variable extension.

What this means is that with no changes, a non-PEP 384 compliant extension
module wouldn't have to change anything:

setup(
    name='stupid',
    version='0.0',
    packages=['stupid', 'stupid.tests'],
    ext_modules=[Extension('_stupid',
                           ['src/stupid.c'],
                           )],
    test_suite='stupid.tests',
    )

With a Python built like so:

% SOABI=cpython-32 ./configure

you'd end up with a _stupid.cpython-32.so module.

However, if you knew your extension module was PEP 384 compliant, and could be
shared on >=Python 3.2, you would do:

setup(
    name='stupid',
    version='0.0',
    packages=['stupid', 'stupid.tests'],
    ext_modules=[Extension('_stupid',
                           ['src/stupid.c'],
                           so_abi_tag='',
                           )],
    test_suite='stupid.tests',
    )

and now you'd end up with _stupid.so, which I propose to mean it's PEP 384 ABI
compliant.  (There may not be any other use case than so_abi_tag='' or
so_abi_tag=None, in which case, the Extension keyword *might* be better off as
a boolean.)
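
The naming rule described above could be sketched roughly as follows. This is only an illustration of the idea, not the patch itself: the function name ext_filename and its makefile_so parameter are hypothetical, and the branch for a non-empty tag is an assumption, since only so_abi_tag='' and so_abi_tag=None are discussed here.

```python
import os


def ext_filename(ext_name, makefile_so=".cpython-32.so", so_abi_tag=None):
    """Return the file name an extension module would be written to.

    Hypothetical helper: makefile_so stands in for the Makefile $SO value,
    so_abi_tag for the proposed Extension keyword argument.
    """
    path = os.path.join(*ext_name.split("."))
    if so_abi_tag is None:
        # No override: use whatever the interpreter was configured with.
        suffix = makefile_so
    elif so_abi_tag == "":
        # Empty tag: plain .so, proposed to mean "PEP 384 ABI compliant".
        suffix = ".so"
    else:
        # Assumption: any other tag is spliced in before .so.
        suffix = "." + so_abi_tag + ".so"
    return path + suffix


default_name = ext_filename("_stupid")
stable_name = ext_filename("_stupid", so_abi_tag="")
```

With a Python configured as SOABI=cpython-32, default_name corresponds to the first setup() call and stable_name to the second.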

Now of course PEP 384 isn't implemented, so it's a bit of a moot point.  But
if some form of versioned .so file naming is accepted for Python 3.2, I'll
update PEP 384 with possible solutions.

-Barry


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Barry Warsaw
On Jun 24, 2010, at 11:50 AM, Barry Warsaw wrote:

>Please let me know what you think.  I'm happy to just commit this to the py3k
>branch if there are no objections .  I don't think a new PEP is in
>order, but an update to PEP 3147 might make sense.

Thanks for all the quick feedback.  I've made some changes based on the
comments so far.  The bzr branch is updated, and a new patch is available
here:

http://pastebin.ubuntu.com/454688/

If reception continues to be mildly approving, I'll open an issue on
bugs.python.org and attach the patch to that.

-Barry




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Éric Araujo
Your plan seems good. Adding keyword arguments should not create
compatibility issues, and I suspect the impact on the code of build_ext
may be actually quite small. I’ll try to review your patch even though I
don’t know C or compiler oddities, but Tarek will have the best insight
and the final word.

In case the time machine’s not available, your suggestion about getting
the filename from the Extension instance instead of passing in a string
can most certainly land in distutils2.

Regards



Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Ian Bicking
On Thu, Jun 24, 2010 at 3:59 PM, Guido van Rossum  wrote:

> The protocol specs typically go out of their way to specify what byte
> values they use for syntactically significant positions (e.g. ':' in
> headers, or '/' in URLs), while hand-waving about the meaning of "what
> goes in between" since it is all typically treated as "not of
> syntactic significance". So you can write a parser that looks at bytes
> exclusively, and looks for a bunch of ASCII punctuation characters
> (e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff
> in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks
> "inside" stretches of characters between the special characters and
> just copies them. (Sometimes there may be *some* sections that are
> required to be ASCII and there equivalence of a-z and A-Z is well
> defined.)
>

Yes, these are the specific characters that I think we can handle
specially.  For instance, the list of all string literals used by urlsplit
and urlunsplit:
'//'
'/'
':'
'?'
'#'
''
'http'
A list of all valid scheme characters (a-z etc)
Some lists for scheme-specific parsing (which all contain valid scheme
characters)

All of these are constrained to ASCII, and must be constrained to ASCII, and
everything else in a URL is treated as basically opaque.

So if we turned these characters into byte-or-str objects I think we'd
basically be true to the intent of the specs, and in a practical sense we'd
be able to make these functions polymorphic.  I suspect this same pattern
will be present most places where people want polymorphic behavior.

For now we could do something incomplete and just avoid using operators we
can't overload (is it possible to at least make them produce a readable
exception?)

I think we'll avoid a lot of the confusion that was present with Python 2 by
not making the coercions transitive.  For instance, here's something that
would work in Python 2:

  urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))

And you'd get out a unicode string, except that would break the first time
that query string (u'bar=baz') was not ASCII (but not until then!)

Here's the urlunsplit code:

def urlunsplit(components):
    scheme, netloc, url, query, fragment = components
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url

If all those literals were this new special kind of string, if you call:

  urlunsplit((b'http', b'example.com', b'/foo', 'bar=baz', b''))

You'd end up constructing the URL b'http://example.com/foo' and then
running:

url = url + special('?') + query

And that would fail because b'http://example.com/foo' + special('?') would
be b'http://example.com/foo?' and you cannot add that to the str 'bar=baz'.
So we'd be avoiding the Python 2 craziness.
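
A minimal sketch of what such a "special" literal might look like, to make the non-transitive behaviour concrete. The class name special is taken from the example above; everything else (the methods, the ASCII check) is an assumption about one possible design, not an existing API.

```python
class special:
    """ASCII-only literal that adapts to bytes or str, whichever it meets."""

    def __init__(self, text):
        text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
        self._text = text

    def _like(self, other):
        # Return our literal in the same type as `other`, or None.
        if isinstance(other, bytes):
            return self._text.encode("ascii")
        if isinstance(other, str):
            return self._text
        return None

    def __add__(self, other):
        if isinstance(other, special):
            return special(self._text + other._text)
        mine = self._like(other)
        return NotImplemented if mine is None else mine + other

    def __radd__(self, other):
        mine = self._like(other)
        return NotImplemented if mine is None else other + mine


# The literal takes on the type of its neighbour...
as_bytes = b"http://example.com/foo" + special("?")
as_text = "http://example.com/foo" + special("?") + "bar=baz"

# ...but the result is an ordinary bytes or str, so mixing still fails.
try:
    b"http://example.com/foo" + special("?") + "bar=baz"
    mixed_failed = False
except TypeError:
    mixed_failed = True
```

The point of the design is that the result of each addition is a plain bytes or str object, so the coercion never carries over to the next operand.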

-- 
Ian Bicking  |  http://blog.ianbicking.org


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Antoine Pitrou
On Thu, 24 Jun 2010 20:07:41 +0100
Michael Foord  wrote:
> 
> Although it would require changes for builtin types like file to work 
> with a new string ABC, right?

There is no builtin file type in 3.x.
Besides, it is not an ABC-level problem; the IO layer is written in C
(although there's still the Python implementation to play with), which
would mandate an abstract C API to access unicode-like objects
(similarly as there's already the buffer API to access bytes-like
objects).

Regards

Antoine.




Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Scott Dial
On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>> What use case does this address?
> 
> Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
> all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
> we can do that without collisions on the pyc files, but would still have to
> symlink for extension module .so files, because they are always named foo.so
> and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
> foo.so.

If the package has .so files that aren't compatible with other versions
of Python, then what is the motivation for placing them in a shared
location (since they can't actually be shared)?

> So using the same trick as in PEP 3147, if we can name Python 3.2's foo
> extension differently than the incompatible Python 3.3's foo extension, we can
> have them live in the same directory without symlink tricks.

Why would a symlink trick even be necessary if there is a
version-unspecific directory and a version-specific directory on the
search path?

>> PEP 3147 addresses the fact that the user may have different versions of
>> Python installed and each wants to write a .pyc file when loading a module.
>> .so files are not generated simply by running the Python interpreter, ergo
>> .so files are not an issue for that use case.
> 
> See above.  It doesn't matter whether the pyc or so is created at run time by
> the user or by the distro build system.  If the files for different Python
> versions end up in the same directory, they must be named differently too.

But the only motivation for doing this with .pyc files is that the .py
files are able to be shared, since the .pyc is an on-demand-generated,
version-specific artifact (and not the source). The .so file is created
offline by another toolchain, is version-specific, and presumably you
are not suggesting that Python generate it on-demand.

> 
>> If you want to make it so a system can install a package in just one
>> location to be used by multiple Python installations, then the version
>> number isn't enough.  You also need to distinguish debug builds, profiling
>> builds, Unicode width (see issue8654), and probably several other
>> ./configure options.
> 
> This is a good point, but more easily addressed.  Let's say a distro makes
> three Python 3.2 variants available, one "normal" build, a debug build, and
> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
> .so ABI tag (see previous follow) for each of those builds.  My updated patch
> (coming soon) allows you to define that tag to configure.  So e.g.

Why is this use case not already addressed by having independent
directories? And why is there an incentive to co-mingle these
version-punned files with version-agnostic ones?

> Mix and match for any other build options you care about.  Because the distro
> controls how Python is configured, this should be fairly easy to achieve.

For packages that have .so files, won't the distro already have to build
multiple copies of that package for all versions of Python? So, why can't
it place them in separate directories that are version-specific at that
time? This is not the same as placing .py files that are
version-agnostic into a version-agnostic location.

-- 
Scott Dial
[email protected]
[email protected]


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Terry Reedy

On 6/24/2010 1:38 PM, Bill Janssen wrote:
> Secondly, maybe the string situation in 2.x wasn't as broken as we
> thought it was.  In particular, those who deal with lots of encoded
> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
> are more like numbers than we think.  We have separate types for int,
> float, Decimal, etc.  But they're all numbers, and they all
> cross-operate.


No, they do not. Decimal mixes properly only with ints and with 
nothing else, sometimes in surprising and havoc-creating ways:

>>> Decimal(0) == float(0)
False

I believe that and other comparisons may be fixed in 3.2, but I know 
there was lots of discussion of whether float + decimal should return a 
float or decimal, with good arguments both ways. To put it another way, 
there are potential problems with either choice. Automatic mixed-mode 
arithmetic is not always a slam-dunk, no-problem choice.
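
For illustration, a small sketch of how the mixing behaves on a current Python 3 release, where (as far as I know) the equality comparison has indeed been fixed to compare exact values, while mixed arithmetic is still refused. The variable names are just for the example.

```python
from decimal import Decimal

# Decimal and int mix freely.
int_sum = Decimal("1.5") + 2

# Decimal and float arithmetic is refused rather than guessing a result type.
try:
    Decimal("1.5") + 2.0
    float_mix_ok = True
except TypeError:
    float_mix_ok = False

# Equality between Decimal and float compares exact values on Python 3.2+.
zero_equal = Decimal(0) == float(0)
```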


That aside, there are a couple of places where I think the comparison 
breaks down. If one adds a thousand ints and then a float, there is only 
the final number to convert. If one adds a thousand bytes and then a 
unicode, there is the concatenation of the thousand bytes to convert. 
Or should the result be the concatenation of a thousand unicode 
conversions? This brings up the distributivity (or not) of conversion 
over summation. In general, float(i) + float(j) = float(i+j), for i,j 
ints. I am not sure the same is true if i,j are bytes with some encoding 
and the conversion is unicode. Does it depend on the encoding?
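
A quick experiment suggests that it does depend on the encoding. The sketch below splits the UTF-8 encoding of one character across two bytes objects: decoding the concatenation works, decoding the pieces separately does not. With a single-byte encoding such as Latin-1 the two orders agree. The byte values are just an example.

```python
# U+1234 is three bytes in UTF-8; split it across two chunks.
i, j = b"\xe1\x88", b"\xb4"

# Convert after concatenating: fine.
whole = (i + j).decode("utf-8")

# Convert each piece, then concatenate: the first piece is incomplete.
try:
    pieces = i.decode("utf-8") + j.decode("utf-8")
except UnicodeDecodeError:
    pieces = None

# A single-byte encoding distributes over concatenation.
latin_whole = (i + j).decode("latin-1")
latin_pieces = i.decode("latin-1") + j.decode("latin-1")
```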


--
Terry Jan Reedy



Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Nick Coghlan
On Fri, Jun 25, 2010 at 3:07 AM, P.J. Eby  wrote:
> (Btw, in some earlier emails, Stephen, you implied that this could be fixed
> with codecs -- but it can't, because the problem isn't with the bytes
> containing invalid Unicode, it's with the Unicode containing invalid bytes
> -- i.e., characters that can't be encoded to the ultimate codec target.)

That's what the surrogateescape error handler is for though - it will
happily accept mojibake on input (putting invalid bytes into the PUA),
and happily generate mojibake on output (recreating the invalid bytes
from the PUA) as well.
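
A short sketch of the round trip, for reference. Note that in CPython the surrogateescape handler (PEP 383) represents each undecodable byte as a lone surrogate code point in the range U+DC80..U+DCFF; the sample bytes below are only an example.

```python
raw = b"caf\xe9"  # Latin-1 bytes, not valid UTF-8

# Decoding keeps the bad byte as a lone surrogate instead of failing.
text = raw.decode("utf-8", "surrogateescape")

# Encoding with the same handler recreates the original bytes exactly.
back = text.encode("utf-8", "surrogateescape")

# Without the handler, the escaped character cannot be encoded at all.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```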

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Guido van Rossum
On Thu, Jun 24, 2010 at 2:44 PM, Ian Bicking  wrote:
> I think we'll avoid a lot of the confusion that was present with Python 2 by
> not making the coercions transitive.  For instance, here's something that
> would work in Python 2:
>
>   urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))
>
> And you'd get out a unicode string, except that would break the first time
> that query string (u'bar=baz') was not ASCII (but not until then!)

Actually, that wouldn't be a problem. The problem would be this:

   urlunsplit(('http', 'example.com', u'/foo', 'bar=baz', ''))

(I moved the "u" prefix from bar=baz to /foo.) And this would break
when instead of baz there was some non-ASCII UTF-8, e.g.


urlunsplit(('http', 'example.com', u'/foo', 'bar=\xe1\x88\xb4', ''))
-- 
--Guido van Rossum (python.org/~guido)


Re: [Python-Dev] bytes / unicode

2010-06-24 Thread Nick Coghlan
On Fri, Jun 25, 2010 at 1:41 AM, Guido van Rossum  wrote:
> I don't think we should abuse sum for this. A simple idiom to get the
> *empty* string of a particular type is x[:0] so you could write
> something like this to concatenate a list of strings or bytes:
> xs[:0].join(xs). Note that if xs is empty we wouldn't know what to do
> anyway so this should be disallowed.

That's a good trick, although there's a "[0]" missing from your join
example ("type(xs[0])()" is another way to spell the same idea, but
the subscripting version would likely be faster since it skips the
builtin lookup). Promoting that over explicit use of empty str and
bytes literals is probably step 1 in eliminating gratuitous breakage
of bytes/str polymorphism (this trick also has the benefit of working
with non-builtin character sequence types).
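
Spelled out with the missing "[0]", the idiom might look like this. The helper name concat is only for the example, and raising ValueError for an empty list is an assumption about how "disallowed" would be expressed.

```python
def concat(xs):
    """Join a non-empty list of str or bytes without naming the type."""
    if not xs:
        # With no elements there is no way to know which empty value to use.
        raise ValueError("cannot concatenate an empty list")
    empty = xs[0][:0]  # '' for str, b'' for bytes
    return empty.join(xs)


joined_text = concat(["ab", "cd"])
joined_bytes = concat([b"ab", b"cd"])
```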

Use of non-empty bytes/str literals is going to be harder to handle -
actually trying to apply a polymorphic philosophy to the Python 3 URL
parsing libraries may be a good way to learn more on that front.

Cheers,
Nick.

P.S. I'm off to Sydney for PyconAU this evening, so I'm not sure how
much time I'll get to follow python-dev until next week.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Terry Reedy

On 6/24/2010 4:59 PM, Guido van Rossum wrote:
> But I wouldn't go so far as to claim that interpreting the protocols
> as text is wrong. After all we're talking exclusively about protocols
> that are designed intentionally to be directly "human readable"


I agree that the claim "':' is just a byte" is a bit shortsighted.

If the designers of the protocols had intended to use uninterpreted 
bytes as protocol markers, they could and I suspect would have used 
unused control codes, of which there are several. Then there would have 
been no need for escape mechanisms to put things like :<> into content text.


I am very sure that the reason for specifying *ascii* byte values was to 
be crystal clear as to what *character* was meant and to *exclude* use on 
the internet of the main incompatible competitor encoding -- IBM's 
EBCDIC -- which IBM used in all of *its* networks. Until the IBM PC came 
out in the early 1980s (and IBM originally saw that as a minor sideline 
and something of a toy), there was a battle over byte encodings between 
IBM and everyone else.


--
Terry Jan Reedy



Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread M.-A. Lemburg
Scott Dial wrote:
> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>> What use case does this address?
>>
>>> If you want to make it so a system can install a package in just one
>>> location to be used by multiple Python installations, then the version
>>> number isn't enough.  You also need to distinguish debug builds, profiling
>>> builds, Unicode width (see issue8654), and probably several other
>>> ./configure options.
>>
>> This is a good point, but more easily addressed.  Let's say a distro makes
>> three Python 3.2 variants available, one "normal" build, a debug build, and
>> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
>> .so ABI tag (see previous follow) for each of those builds.  My updated patch
>> (coming soon) allows you to define that tag to configure.  So e.g.
> 
> Why is this use case not already addressed by having independent
> directories? And why is there an incentive to co-mingle these
> version-punned files with version-agnostic ones?

I don't think this is a good idea. After a while your Python
lib directories would need some serious dusting off to make them
maintainable again.

Disk space is cheap so setting up dedicated directories for each
variant will result in a much easier to manage installation.

If you want a really clever setup, use hard links between those
directories (you can also use symlinks if you like).
Then a change in one Python file will automatically
propagate to all other variant dirs without any maintenance
effort. Together with PYTHONHOME this makes a really nice
virtualenv-like environment.
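
A rough sketch of the hard-link idea, using throwaway directories. The directory and file names are made up for the example, and it assumes a filesystem that supports hard links. One caveat worth keeping in mind: the change only propagates when the file is modified in place; a tool that writes a new file and renames it over the old one breaks the link.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Two hypothetical per-variant library directories.
    normal = os.path.join(root, "python3.2")
    debug = os.path.join(root, "python3.2-debug")
    os.mkdir(normal)
    os.mkdir(debug)

    src = os.path.join(normal, "mod.py")
    with open(src, "w") as f:
        f.write("VALUE = 1\n")

    # Share the same file between both variant directories.
    linked = os.path.join(debug, "mod.py")
    os.link(src, linked)

    # An in-place edit through one name is visible through the other.
    with open(src, "w") as f:
        f.write("VALUE = 2\n")
    with open(linked) as f:
        seen = f.read()
    same_file = os.path.samefile(src, linked)
```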

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 25 2010)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

2010-07-19: EuroPython 2010, Birmingham, UK    23 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! 


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Nick Coghlan
On Fri, Jun 25, 2010 at 1:50 AM, Barry Warsaw  wrote:
> Please let me know what you think.  I'm happy to just commit this to the py3k
> branch if there are no objections .  I don't think a new PEP is in
> order, but an update to PEP 3147 might make sense.

I like the idea, but I think summarising the rest of this discussion
in its own (relatively short) PEP would be good (there are a few
things that are tricky - exact versioning scheme, PEP 384 forward
compatibility, impact on distutils, articulating the benefits for
distro packaging, etc).

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


[Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Stephen Thorne
Steve Holden Wrote:
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?
> 
> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).
> 
> There seems to be a perception that the PSF can help fund developments,
> and indeed Jesse Noller has made a small start with his sprint funding
> proposal (which now has some funding behind it). I think if it is to do
> so the Foundation will have to look for substantial new funding. I do
> not currently understand where this funding would come from, and would
> like to tap your developer creativity in helping to define how the
> Foundation can effectively commit more developer time to Python.
> 
> GSoC and GHOP are great examples, but there is plenty of room for all
> sorts of initiatives that result in development opportunities. I'd like
> to help.

I am extremely keen for this to happen. Does anyone have ownership of this
project? There was some discussion of it up-list but the discussion fizzled.

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Martin v. Löwis
Am 25.06.2010 01:28, schrieb Stephen Thorne:
> Steve Holden Wrote:
>> Given the amount of interest this thread has generated I can't help
>> wondering why it isn't more prominent in python.org content. Is the
>> developer community completely disjoint with the web content editor
>> community?
>>
>> If there is such a disconnect we should think about remedying it: a
>> large "Python 2 or 3?" button could link to a reasoned discussion of the
>> pros and cons as evinced in this thread. That way people will end up
>> with the right version more often (and be writing Python 2 that will
>> more easily migrate to Python 3, if they cannot yet use 3).
>>
>> There seems to be a perception that the PSF can help fund developments,
>> and indeed Jesse Noller has made a small start with his sprint funding
>> proposal (which now has some funding behind it). I think if it is to do
>> so the Foundation will have to look for substantial new funding. I do
>> not currently understand where this funding would come from, and would
>> like to tap your developer creativity in helping to define how the
>> Foundation can effectively commit more developer time to Python.
>>
>> GSoC and GHOP are great examples, but there is plenty of room for all
>> sorts of initiatives that result in development opportunities. I'd like
>> to help.
> 
> I am extremely keen for this to happen. Does anyone have ownership of this
> project? There was some discussion of it up-list but the discussion fizzled.

Can you please explain what "this project" is, in the context of your
message? GSoC? GHOP?

Regards,
Martin


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread James Y Knight


On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>> What use case does this address?
>>
>> Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
>> all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
>> we can do that without collisions on the pyc files, but would still have to
>> symlink for extension module .so files, because they are always named foo.so
>> and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
>> foo.so.
>
> If the package has .so files that aren't compatible with other version
> of python, then what is the motivation for placing that in a shared
> location (since it can't actually be shared)


Because python looks for .so files in the same place it looks for  
the .py files of the same package. E.g., given a module like lxml, it  
contains the following files (among others):

lxml/
lxml/__init__.py
lxml/__init__.pyc
lxml/builder.py
lxml/builder.pyc
lxml/etree.so

And you can only put it in one place. Really, python should store
the .py files in /usr/share/python/, the .so files in
/usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc files in
/var/lib/python2.5-debug. But python doesn't work like that.


James


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Stephen Thorne
On 2010-06-25, "Martin v. Löwis" wrote:
> Am 25.06.2010 01:28, schrieb Stephen Thorne:
> > Steve Holden Wrote:
> >> Given the amount of interest this thread has generated I can't help
> >> wondering why it isn't more prominent in python.org content. Is the
> >> developer community completely disjoint with the web content editor
> >> community?
> >>
> >> If there is such a disconnect we should think about remedying it: a
> >> large "Python 2 or 3?" button could link to a reasoned discussion of the
> >> pros and cons as evinced in this thread. That way people will end up
> >> with the right version more often (and be writing Python 2 that will
> >> more easily migrate to Python 3, if they cannot yet use 3).
> >>
> >> There seems to be a perception that the PSF can help fund developments,
> >> and indeed Jesse Noller has made a small start with his sprint funding
> >> proposal (which now has some funding behind it). I think if it is to do
> >> so the Foundation will have to look for substantial new funding. I do
> >> not currently understand where this funding would come from, and would
> >> like to tap your developer creativity in helping to define how the
> >> Foundation can effectively commit more developer time to Python.
> >>
> >> GSoC and GHOP are great examples, but there is plenty of room for all
> >> sorts of initiatives that result in development opportunities. I'd like
> >> to help.
> > 
> > I am extremely keen for this to happen. Does anyone have ownership of this
> > project? There was some discussion of it up-list but the discussion fizzled.
> 
> Can you please explain what "this project" is, in the context of your
> message? GSoC? GHOP?

Oh, I thought this was quite clear. I was specifically meaning the large
"Python 2 or 3" button on python.org. It would help users who want to know
what version of python to use if they had a clear guide as to what version
to download.

It doesn't help if someone goes to do greenfield development in python
if a library they depend upon has yet to be ported, and they're trying to
use python 3.

(As an addendum add pygtk to the list of libs that python 3 users on #python
are alarmed to find haven't been ported yet)

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue


[Python-Dev] docs - Copy

2010-06-24 Thread Rich Healey
http://docs.python.org/library/copy.html

Just near the bottom it reads:

"""Shallow copies of dictionaries can be made using dict.copy(), and
of lists by assigning a slice of the entire list, for example,
copied_list = original_list[:]."""


Surely this is a typo? To my understanding, copied_list =
original_list[:] gives you a clean copy (slicing returns a new
object)

Can this be updated? Or someone explain to me why it's correct?

Cheers

Example:


>>> t = [1, 2, 3]
>>> y = t
>>> u = t[:]
>>> y[1] = "rawr"
>>> t
[1, 'rawr', 3]
>>> u
[1, 2, 3]
>>>


[Python-Dev] FHS compliance of Python installation (was: versioned .so files for Python 3.2)

2010-06-24 Thread Ben Finney
James Y Knight  writes:

> Really, python should store the .py files in /usr/share/python/, the
> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
> files in /var/lib/python2.5-debug. But python doesn't work like that.

+1

So who's going to draft the “Filesystem Hierarchy Standard compliance”
PEP? :-)

-- 
 \ “Having sex with Rachel is like going to a concert. She yells a |
  `\  lot, and throws frisbees around the room; and when she wants |
_o__)more, she lights a match.” —Steven Wright |
Ben Finney



Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Steve Holden
Stephen Thorne wrote:
> On 2010-06-25, "Martin v. Löwis" wrote:
>> Am 25.06.2010 01:28, schrieb Stephen Thorne:
>>> Steve Holden Wrote:
>>>> Given the amount of interest this thread has generated I can't help
>>>> wondering why it isn't more prominent in python.org content. Is the
>>>> developer community completely disjoint with the web content editor
>>>> community?
>>>>
>>>> If there is such a disconnect we should think about remedying it: a
>>>> large "Python 2 or 3?" button could link to a reasoned discussion of the
>>>> pros and cons as evinced in this thread. That way people will end up
>>>> with the right version more often (and be writing Python 2 that will
>>>> more easily migrate to Python 3, if they cannot yet use 3).
>>>>
>>>> There seems to be a perception that the PSF can help fund developments,
>>>> and indeed Jesse Noller has made a small start with his sprint funding
>>>> proposal (which now has some funding behind it). I think if it is to do
>>>> so the Foundation will have to look for substantial new funding. I do
>>>> not currently understand where this funding would come from, and would
>>>> like to tap your developer creativity in helping to define how the
>>>> Foundation can effectively commit more developer time to Python.
>>>>
>>>> GSoC and GHOP are great examples, but there is plenty of room for all
>>>> sorts of initiatives that result in development opportunities. I'd like
>>>> to help.
>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>> project? There was some discussion of it up-list but the discussion fizzled.
>> Can you please explain what "this project" is, in the context of your
>> message? GSoC? GHOP?
> 
> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.
> 
> It doesn't help if someone goes to do greenfield development in python
> if a library they depend upon has yet to be ported, and they're trying to
> use python 3.
> 
> (As an addendum add pygtk to the list of libs that python 3 users on #python
> are alarmed to find haven't been ported yet)
> 
This topic really needs to go to the pydotorg list, as the guys there
maintain the site content. I know that Michael Foord is on both lists,
so he may be a good candidate for leading the charge, so to speak. This
topic is likely to assume increasing importance.

regards
 Steve
-- 
Steve Holden   +1 571 484 6266   +1 800 494 3119
See Python Video!   http://python.mirocommunity.org/
Holden Web LLC http://www.holdenweb.com/
UPCOMING EVENTS:http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
 Ian Dury, 1942-2000


Re: [Python-Dev] docs - Copy

2010-06-24 Thread Steve Holden
Rich Healey wrote:
> http://docs.python.org/library/copy.html
> 
> Just near the bottom it reads:
> 
> """Shallow copies of dictionaries can be made using dict.copy(), and
> of lists by assigning a slice of the entire list, for example,
> copied_list = original_list[:]."""
> 
> 
> Surely this is a typo? To my understanding, copied_list =
> original_list[:] gives you a clean copy (slicing returns a new
> object)
> 
Yes, but it's a shallow copy: the new object references exactly the same
objects as the original list (not copies of those objects). A deep copy
would need to copy any referenced lists, and so on.
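
A minimal sketch of the distinction, assuming a nested list (the names
`original`, `shallow`, and `deep` are illustrative only, not from the docs):

```python
import copy

original = [[1, 2], [3, 4]]
shallow = original[:]            # new outer list, same inner list objects
deep = copy.deepcopy(original)   # new outer list and new inner lists

shallow.append([5, 6])           # only the shallow copy's outer list grows
original[0].append("rawr")       # mutates an inner list shared with `shallow`

print(original)  # [[1, 2, 'rawr'], [3, 4]]
print(shallow)   # [[1, 2, 'rawr'], [3, 4], [5, 6]]
print(deep)      # [[1, 2], [3, 4]]
```

With a flat list of immutable items (ints, strings) the sharing is
invisible, which is probably why the slice looks like a "clean" copy.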

> Can this be updated? Or someone explain to me why it's correct?
> 
It sounds correct to me.

regards
 Steve


> Cheers
> 
> Example:
> 
> 
> >>> t = [1, 2, 3]
> >>> y = t
> >>> u = t[:]
> >>> y[1] = "rawr"
> >>> t
> [1, 'rawr', 3]
> >>> u
> [1, 2, 3]





Re: [Python-Dev] docs - Copy

2010-06-24 Thread Alexander Belopolsky
On Thu, Jun 24, 2010 at 8:51 PM, Rich Healey  wrote:
> http://docs.python.org/library/copy.html
>
> Just near the bottom it reads:
>
> """Shallow copies of dictionaries can be made using dict.copy(), and
> of lists by assigning a slice of the entire list, for example,
> copied_list = original_list[:]."""
>
>
> Surely this is a typo? To my understanding, copied_list =
> original_list[:] gives you a clean copy (slicing returns a new
> object)
>

If you read the doc excerpt carefully, you will realize that it says
the same thing.  I agree that the language can be improved, though.
There is no need to bring in assignment to explain that a[:] makes a
copy of list a.   Please create a documentation issue at
http://bugs.python.org .  If you can suggest a better formulation, it
is likely to be accepted.


Re: [Python-Dev] versioned .so files for Python 3.2

2010-06-24 Thread Greg Ewing

Scott Dial wrote:
> But the only motivation for doing this with .pyc files is that the .py
> files are able to be shared,

In an application made up of a mixture of pure Python and
extension modules, the .py files are able to be shared too.
Seems to me that a similar motivation exists here as well.
Not exactly the same, but closely related.

--
Greg


Re: [Python-Dev] docs - Copy

2010-06-24 Thread Rich Healey
On Fri, Jun 25, 2010 at 11:04 AM, Steve Holden  wrote:
> Rich Healey wrote:
>> http://docs.python.org/library/copy.html
>>
>> Just near the bottom it reads:
>>
>> """Shallow copies of dictionaries can be made using dict.copy(), and
>> of lists by assigning a slice of the entire list, for example,
>> copied_list = original_list[:]."""
>>
>>
>> Surely this is a typo? To my understanding, copied_list =
>> original_list[:] gives you a clean copy (slicing returns a new
>> object)
>>
> Yes, but it's a shallow copy: the new object references exactly the same
> objects as the original list (not copies of those objects). A deep copy
> would need to copy any referenced lists, and so on.
>

My apologies guys, I see now.

I will see if I can think of a less ambiguous way to word this and submit a bug.

Thank you!


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Terry Reedy

On 6/24/2010 8:31 PM, Stephen Thorne wrote:
> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.


I think everyone on pydev agrees that that would be good, but I do not
believe anyone has taken ownership of the issue as yet. I am not sure
who currently maintains the site and whether they are aware of the proposal.


I believe there is material on the wiki as well as the two existing 
pages on other sites that were discussed here. So a new page on 
python.org could consist of a few links. Someone just has to write it.


> It doesn't help if someone goes to do greenfield development in python
> if a library they depend upon has yet to be ported, and they're trying to
> use python 3.
>
> (As an addendum add pygtk to the list of libs that python 3 users on #python
> are alarmed to find haven't been ported yet)


The list, if it exists, should be on the wiki, where any registered user 
can edit it, rather than on the .org page.


I suspect that the feedback about Python on #python is somewhat 
different from that on python-list. I also suspect that some of it could 
be used to improve python, the docs, and the site. Is that happening 
much? I know I regularly open tracker issues (such as 6507, 8824, and 
8945) based on python-list discussions, and I know others have made
wiki edits.



--
Terry Jan Reedy



Re: [Python-Dev] thoughts on the bytes/string discussion

2010-06-24 Thread Greg Ewing

Terry Reedy wrote:
> On 6/24/2010 1:38 PM, Bill Janssen wrote:
>> We have separate types for int,
>> float, Decimal, etc.  But they're all numbers, and they all
>> cross-operate.
>
> No they do not. Decimal only mixes properly with ints, but not with
> anything else
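
A small sketch of that behaviour for arithmetic, assuming a current
CPython 3 (the helper `try_add` and the variable names are hypothetical,
for illustration only):

```python
from decimal import Decimal
from fractions import Fraction

price = Decimal("1.50")

# Decimal and int mix without complaint.
with_int = price + 2


def try_add(left, right):
    """Return left + right, or the TypeError raised by the attempt."""
    try:
        return left + right
    except TypeError as exc:
        return exc


# Decimal refuses implicit arithmetic with float and with Fraction.
float_result = try_add(price, 0.25)
fraction_result = try_add(price, Fraction(1, 4))

print(with_int)
print(type(float_result).__name__)
print(type(fraction_result).__name__)
```

Mixing with float has to be done explicitly, for example by converting
with Decimal(repr(x)) or float(d), depending on which side should win.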


I think there are also some important differences between
numbers and strings concerning how they interact with C code.

In C there are really only two choices for representing a
Python number in a way that C code can directly operate on --
long or double -- and there is a set of functions for coercing a
Python object into one of these that C code almost universally
uses. So a new number type only has to implement the appropriate
conversion methods to be usable by all of that C code.
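
Seen from the Python side, this can be sketched as follows, assuming a
current CPython 3 where the relevant hooks are `__float__` and
`__index__` (the class `Sixteen` is hypothetical, for illustration only):

```python
import math


class Sixteen:
    """Hypothetical number-like type that only knows how to convert itself."""

    def __float__(self):
        # Used when C code asks for a C double.
        return 16.0

    def __index__(self):
        # Used when C code asks for an exact integer, e.g. an index.
        return 16


n = Sixteen()

# math.sqrt is implemented in C and coerces its argument to a double.
root = math.sqrt(n)

# List indexing is implemented in C and coerces the index to an integer.
item = list(range(20))[n]

print(root, item)
```

Neither math.sqrt nor list indexing knows anything about `Sixteen`; the
conversion methods alone are enough to make it usable there.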

On the other hand, the existing C code that operates on Python
strings often assumes that it has a particular internal
representation. A new abstract string-access API would have to
be devised, and all existing C code updated to use it. Also,
this new API would not be as easy to use as the number API,
because it would involve asking for the data in some specified
encoding, which would require memory allocation and management.

--
Greg


Re: [Python-Dev] "2 or 3" link on python.org

2010-06-24 Thread Nick Coghlan
On Fri, Jun 25, 2010 at 11:18 AM, Terry Reedy  wrote:
> I believe there is material on the wiki as well as the two existing pages on
> other sites that were discussed here. So a new page on python.org could
> consist of a few links. Someone just has to write it.

There's material on the wiki *now* (the Python2orPython3 page), but
there wasn't before the recent discussion started. The whole
Beginner's Guide on the wiki could actually use some TLC to bring it
up to speed with the existence of Python 3.x.

Cheers,
Nick.

-- 
Nick Coghlan   |   [email protected]   |   Brisbane, Australia


Re: [Python-Dev] docs - Copy

2010-06-24 Thread Senthil Kumaran
On Thu, Jun 24, 2010 at 09:05:09PM -0400, Alexander Belopolsky wrote:
> On Thu, Jun 24, 2010 at 8:51 PM, Rich Healey  wrote:
> > http://docs.python.org/library/copy.html
> >
> > Just near the bottom it reads:
> >
> > """Shallow copies of dictionaries can be made using dict.copy(), and
> > of lists by assigning a slice of the entire list, for example,
> > copied_list = original_list[:]."""
> >
> >
> > Surely this is a typo? To my understanding, copied_list =
> > original_list[:] gives you a clean copy (slicing returns a new
> > object)
> >
> 
> If you read the doc excerpt carefully, you will realize that it says
> the same thing.  I agree that the language can be improved, though.
> There is no need to bring in assignment to explain that a[:] makes a
> copy of list a.   Please create a documentation issue at

Better still, add your doc change suggestion (possible explanation) to
this issue:
http://bugs.python.org/issue9021


-- 
Senthil