Re: [Python-Dev] Mini-Pep: An Empty String ABC

2008-06-01 Thread Antoine Pitrou
Raymond Hettinger writes:
> Also, starting with Py3.0, strings are
> essentially abstract sequences of code points, meaning that an encode() method
> is essential to being able to usefully transform them back into concrete data.

Well, that depends:
- is a String the specification of a generic range of types which one might want
to special-case in some algorithms, e.g. flatten()
- or is a String the specification of something which is meant to be used as a
replacement of str (or, perhaps, bytes)?

If you answer the former, the String API should be very minimal and there is no
reason for it to support "encoding" or "decoding". Such a String doesn't have to
be a string of characters; it can contain arbitrary objects, e.g. DNA elements.

If you answer the latter, what use is a String subclass which isn't a drop-in
replacement for either str or bytes? Saying "hello, I'm a String" is not very
useful if you can't be used anywhere in existing code. I think most Python
coders wouldn't go out of their way to allow arbitrary String instances as
parameters for their functions, rather than objects conforming to the full str
(or, perhaps, bytes) API.


I'd like to know the use cases of a String ABC representing replacements of the
str class, though. I must admit I've never used UserString and the like, and
don't know how useful they can be. However, the docs have the following to say: 

« This UserString class from this module is available for backward compatibility
only. If you are writing code that does not need to work with versions of Python
earlier than Python 2.2, please consider subclassing directly from the built-in
str type instead of using UserString ».

So, apart from compatibility purposes, what is the point currently of *not*
directly subclassing str?

Regards

Antoine.


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] Alternative to more ABCs [was:] Iterable String Redux (aka String ABC)

2008-06-01 Thread Paul Moore
2008/6/1 Guido van Rossum <[EMAIL PROTECTED]>:
>> The case for String has already been made.
>
> Actually I'm not sure. Once you know that isinstance(x, String) is
> true, what can you assume you can do with x?
[...]
> Right. I'm now beginning to wonder what exactly you're after here --
> saying that something is an "X" without saying anything about an API
> isn't very useful. You need to have at least *some* API to be able to
> do anything with that knowledge.

Apologies to Raymond if I'm putting words into his mouth, but I think
it's more about *not* doing things with the type - a String is a
Sequence that we don't wish to iterate through (in the flatten case),
so the code winds up looking like

    for elem in seq:
        if isinstance(elem, Sequence) and not isinstance(elem, String):
            recurse into the element
        else:
            deal with the element as atomic

This implies that other "empty" abstract types aren't useful, though,
as they are not subclasses of anything else...
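
A minimal runnable sketch of that loop in Python 3 terms. The String class
here is the hypothetical empty ABC from the mini-PEP, not something in the
standard library, and registering only str is an assumption made for the
example; the thread does not settle which types would be registered.

```python
from collections.abc import Sequence


class String(Sequence):
    """Hypothetical empty ABC from the mini-PEP; not part of the stdlib."""


# Assumption: only str is registered; the proposal leaves this list open.
String.register(str)


def flatten(seq):
    """Yield the atomic elements of a nested sequence, depth first."""
    for elem in seq:
        if isinstance(elem, Sequence) and not isinstance(elem, String):
            # A non-string sequence: recurse into the element.
            yield from flatten(elem)
        else:
            # Strings and non-sequences are dealt with as atomic.
            yield elem


result = list(flatten([1, "ab", [2, ["cd", 3]], (4, 5)]))
```

Note that the ABC is only ever used in a negative test: nothing is called on
the string itself, which is why an empty API is enough for this use case.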

Paul.


Re: [Python-Dev] PEP 371 Discussion (pyProcessing Module)

2008-06-01 Thread Paul Moore
2008/5/31 r. m. oudkerk <[EMAIL PROTECTED]>:
> I am certainly open to using lowercase/lower_case_with_underscores for
> all functions/methods except for Process's methods and possibly
> currentProcess(), but I would like some feedback on that.

I dislike mixedCase, but consistency with the rest of the library is
more important - and as processing is matching the API of threading,
which used mixedCase, it should follow that convention.

Wasn't there some talk of changing modules to use PEP 8 conventions
(lowercase_with_underscore) as part of the Python 3.0 conversion? Did
that ever happen?

Paul.


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-06-01 Thread Greg Ewing

Armin Ronacher wrote:
> basestring is not subclassable for example.  Also it requires subclassing
> which ABCs do not.


The use case that was cited was recognising subclasses of
UserString, and that's what I was responding to. If
basestring were made subclassable and UserString inherited
from it, that use case would be covered.

Recognising string-like objects *without* requiring
subclassing is a hopeless morass to get into, in my
opinion. You'll just have endless arguments about which
of the zillion methods of str should be in the blessed
set which confers string-ness.

I also think that the ABC idea in general suffers from
that problem, to one degree or another depending on
the class involved. Strings are just an extreme case.

--
Greg



Re: [Python-Dev] Mini-Pep: Simplifying the Integral ABC

2008-06-01 Thread Fredrik Johansson
On Sun, Jun 1, 2008 at 8:15 AM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> Discussion
> --
> The only known use cases for variants of int are types that limit the range
> of values to those that fit in a fixed storage width.

Add:
* Faster big integers (gmpy)
* Integers with exact division to rationals (e.g. sympy)

Fredrik


Re: [Python-Dev] PEP 371 Discussion (pyProcessing Module)

2008-06-01 Thread Nick Coghlan

Paul Moore wrote:
> Wasn't there some talk of changing modules to use PEP 8 conventions
> (lowercase_with_underscore) as part of the Python 3.0 conversion? Did
> that ever happen?


We fixed the module names that used mixed case - the amount of work that 
turned out to be involved in just doing that much for PEP 3108 makes me 
shudder at the thought of trying to fix all of the standard library APIs 
that currently don't follow the style guide...


Cheers,
Nick.

--
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
http://www.boredomandlaziness.org


Re: [Python-Dev] Mini-Pep: Simplifying the Integral ABC

2008-06-01 Thread Aahz
On Sat, May 31, 2008, Raymond Hettinger wrote:
>
> Proposal
> 
> Remove non-essential abstract methods like __index__, three argument
> __pow__, __lshift__, __rlshift__, __rshift__, __rrshift__, __and__,
> __rand__, __xor__, __rxor__, __or__, __ror__, and __invert__,
> numerator, and denominator.

The only thing I object to is removing __index__ -- the whole point of an
integral class is that it is substitutable as an index for sequences in
a way that other numeric types are not.  Having an __index__ special
method is a key indicator for duck-typing purposes not covered by the
ABC.
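
A small sketch of the substitutability being described, assuming a
hypothetical integer-like class named Small that does not inherit from int.
Defining __index__ is what lets it be used as a list index, while a float is
rejected.

```python
import operator


class Small:
    """Hypothetical integer-like type that is not a subclass of int."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Lossless conversion to a real int, used for indexing and slicing.
        return self.value


items = ["a", "b", "c"]
picked = items[Small(1)]              # indexing goes through __index__
as_int = operator.index(Small(2))     # explicit form of the same protocol

try:
    items[1.0]                        # floats do not define __index__
    float_ok = True
except TypeError:
    float_ok = False
```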
-- 
Aahz ([EMAIL PROTECTED])   <*> http://www.pythoncraft.com/

Need a book?  Use your library!


Re: [Python-Dev] [Python-3000] Finishing up PEP 3108

2008-06-01 Thread Georg Brandl

Mark Dickinson wrote:
> On Sat, May 31, 2008 at 11:33 AM, Georg Brandl <[EMAIL PROTECTED]> wrote:
>
>> Now that the docs are reST, the source is almost pretty enough to display
>> it raw, but I could also imagine a "text" writer that removes the more
>> obscure markup to present a casual-reader-friendly text version.
>>
>> The needed sources could then be distributed with Python -- it shouldn't
>> be more than about 200 kb.
>
> +1 from me.  Would this mean that htmllib and sgmllib could be
> removed without further ado?


OK, I've now implemented this in the trunk (will merge to 3k soon --
htmllib and sgmllib can go then).

The topic help is contained in a new module, pydoc_topics.py, which pydoc
imports to provide this help. The module can be generated with Sphinx
by running "make pydoc-topics" in the Doc/ directory. (This is one more
step for the release process, but it is an easy one.)

The module is currently ~ 400 kb in size. If this is deemed to be a problem,
we could use zlib to compress the contents -- which of course is bad for
systems without the zlib module (are there any?).
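
A rough sketch of what the zlib option could look like, including the
fallback concern. The topics dict and the load_topics helper are hypothetical
stand-ins for illustration; this is not how pydoc_topics.py is actually laid
out.

```python
import ast

try:
    import zlib
except ImportError:  # assumption: a build without the zlib module
    zlib = None

# Hypothetical stand-in for the generated topic table.
topics = {
    "assert": "The assert statement\n" * 50,
    "lambda": "Lambda expressions\n" * 50,
}

raw = repr(topics).encode("utf-8")
# Compress only when zlib is available; otherwise store the text as is.
packed = zlib.compress(raw) if zlib else raw


def load_topics(blob):
    """Reverse the packing step and rebuild the topic dict."""
    data = zlib.decompress(blob) if zlib else blob
    return ast.literal_eval(data.decode("utf-8"))


restored = load_topics(packed)
```

The weak point is the one raised above: a blob compressed at build time cannot
be read on an installation that lacks zlib.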

Georg

--
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.



Re: [Python-Dev] [Python-3000] Finishing up PEP 3108

2008-06-01 Thread Martin v. Löwis
> The module is currently ~ 400 kb in size. If this is deemed to be a problem,
> we could use zlib to compress the contents -- which of course is bad for
> systems without the zlib module (are there any?).

In the distribution, the file gets compressed, anyway. In the
installation, I don't think it is a problem.

Regards,
Martin


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-06-01 Thread Guido van Rossum
On Sun, Jun 1, 2008 at 3:54 AM, Greg Ewing <[EMAIL PROTECTED]> wrote:
> The use case that was cited was recognising subclasses of
> UserString, and that's what I was responding to. If
> basestring were made subclassable and UserString inherited
> from it, that use case would be covered.

UserString intentionally doesn't subclass basestring. When basestring
was introduced, it was specifically meant to be the base class of
*only* str and unicode. There are quite a few core APIs that accept no
substitutes, and being an instance of basestring was intended to
guarantee that a value is accepted by such APIs.
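
The same "accepts no substitutes" behaviour can still be seen in Python 3,
where basestring no longer exists. The sketch below uses str as a stand-in
for it, which is an assumption made so the example runs on a current
interpreter; it is not the 2.x situation being discussed.

```python
from collections import UserString

wrapped = UserString("abc")

# UserString wraps a str but does not inherit from it.
is_real_str = isinstance(wrapped, str)

try:
    # str.join insists on real str instances for its items.
    joined = "-".join([wrapped])
except TypeError:
    joined = None
```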

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] [Python-3000] Finishing up PEP 3108

2008-06-01 Thread Quentin Gallet-Gilles
I've uploaded a patch for the aifc module (http://bugs.python.org/issue2847).
I'm still working on the testsuite.
Comments are welcome!

Quentin

On Thu, May 29, 2008 at 5:39 PM, Quentin Gallet-Gilles <[EMAIL PROTECTED]>
wrote:

>
> On Thu, May 29, 2008 at 4:56 PM, Lars Immisch <[EMAIL PROTECTED]> wrote:
>
>> 
>>
>>>Issue 2847 - the aifc module still imports the cl module in 3.0.
>>>Problem is that the cl module is gone. =) So it seems silly to have
>>>the imports lying about. This can probably be changed to critical.
>>>
>>>
>>>It shouldn't be a problem to rip everything cl-related out of aifc.
>>>The question is how useful aifc will be after that ...
>>>
>>>
>>> Has someone already used that module ? I took a look into it, but I'm a
>>> bit confused about the various compression types, case-sensitivity and
>>> compatibility issues [1]. Are Apple's "alaw" and SGI's "ALAW" really the
>>> same encoding ? Can we use the audioop module for ALAW, just like it's
>>> already done for ULAW ?
>>>
>>
>> There is just one alaw I've ever come across (G.711), and the audioop
>> implementation could be used (audioop's alaw support is younger than the
>> aifc module, BTW)
>>
>> The capitalisation is confusing, but your document [1] says: "Apple
>> Computer's QuickTime player recognize only the Apple compression types.
>> Although "ALAW" and "ULAW" contain identical sound samples to the "alaw" and
>> "ulaw" formats and were in use long before Apple introduced the new codes,
>>  QuickTime does not recognize them."
>>
>> So this seems just a matter of naming in the AIFC, but not a matter of two
>> different alaw implementations.
>>
>> - Lars
>>
>
> Ok, I'll handle this issue. I'll be using the audioop implementation as a
> replacement of the SGI compression library. I'll also create a test suite,
> as Brett mentioned in the bug tracker the module was missing one.
>
> Quentin
>
>
>>
>> [1] http://www-mmsp.ece.mcgill.ca/Documents/AudioFormats/AIFF/AIFF.html
>>
>
>


Re: [Python-Dev] Alternative to more ABCs [was:] Iterable String Redux (aka String ABC)

2008-06-01 Thread Guido van Rossum
On Sun, Jun 1, 2008 at 6:57 AM, Paul Moore <[EMAIL PROTECTED]> wrote:
> 2008/6/1 Guido van Rossum <[EMAIL PROTECTED]>:
>>> The case for String has already been made.
>>
>> Actually I'm not sure. Once you know that isinstance(x, String) is
>> true, what can you assume you can do with x?
> [...]
>> Right. I'm now beginning to wonder what exactly you're after here --
>> saying that something is an "X" without saying anything about an API
>> isn't very useful. You need to have at least *some* API to be able to
>> do anything with that knowledge.
>
> Apologies to Raymond if I'm putting words into his mouth, but I think
> it's more about *not* doing things with the type - a String is a
> Sequence that we don't wish to iterate through (in the flatten case),
> so the code winds up looking like
>
>    for elem in seq:
>        if isinstance(elem, Sequence) and not isinstance(elem, String):
>            recurse into the element
>        else:
>            deal with the element as atomic

I thought that was what he meant too, until he said he rejected my offhand
suggestion of Atomic with these words: "Earlier in the thread it was
made clear that atomicity is not an intrinsic property of a type;
instead it varies across applications [...]"

> This implies that other "empty" abstract types aren't useful, though,
> as they are not subclasses of anything else...

There's a thread on this out now I believe.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Mini-Pep: An Empty String ABC

2008-06-01 Thread Guido van Rossum
This PEP is incomplete without specifying exactly which built-in and
stdlib types should be registered as String instances.

I'm also confused -- the motivation seems mostly "so that you can skip
iterating over it when flattening a nested sequence" but earlier you
rejected my "Atomic" proposal, saying "Earlier in the thread it was
made clear that atomicity is not an intrinsic property of a type;
instead it varies across applications [...]". Isn't this String
proposal just that by another name?

Finally, I fully expect lots of code writing isinstance(x, String) and
making many more assumptions than promised by the String ABC. For
example that s[0] has the same type as s (not true for bytes). Or that
it is hashable (the Sequence class doesn't define __hash__). Or that
s1+s2 will work (not in the Sequence class either). And many more.
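
Each of these assumptions can be checked against existing Sequence types in
Python 3. The sketch uses bytes, list and range as counterexamples; the
supports helper is a hypothetical convenience for the example only.

```python
from collections.abc import Sequence

text_item = "abc"[0]     # a str of length 1, same type as the container
bytes_item = b"abc"[0]   # an int in Python 3, not a bytes object


def supports(operation):
    """Return True if calling operation() does not raise TypeError."""
    try:
        operation()
        return True
    except TypeError:
        return False


# list is a Sequence, but it is not hashable.
list_is_sequence = isinstance([], Sequence)
list_hashable = supports(lambda: hash([]))

# range is a Sequence, but it does not support concatenation.
range_is_sequence = isinstance(range(3), Sequence)
range_concat = supports(lambda: range(3) + range(3))
```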

All this makes me lean towards a rejection of this proposal -- it
seems worse than no proposal at all. It could perhaps be rescued by
adding some small set of defined operations.

--Guido

On Sat, May 31, 2008 at 11:59 PM, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> Mini-Pep:  An Empty String ABC
> Target:  Py2.6 and Py3.0
> Author:  Raymond Hettinger
>
> Proposal
> 
>
> Add a new collections ABC specified as:
>
>   class String(Sequence):
>       pass
>
> Motivation
> --
> Having an ABC for strings allows string look-alike classes to declare
> themselves as sequences that contain text.  Client code (such as a flatten
> operation or tree searching tool) may use that ABC to usefully differentiate
> strings from other sequences (i.e. containers vs containees).  And in code
> that only relies on sequence behavior, isinstance(x,str) may be usefully
> replaced by isinstance(x,String) so that look-alikes can be substituted in
> calling code.
>
> A natural temptation is to add other methods to the String ABC, but strings
> are a tough case.  Beyond simple sequence manipulation, the string methods
> get very complex.  An ABC that included those methods would make it tough to
> write a compliant class that could be registered as a String.  The split(),
> rsplit(), partition(), and rpartition() methods are examples of methods that
> would be difficult to emulate correctly.  Also, starting with Py3.0, strings
> are essentially abstract sequences of code points, meaning that an encode()
> method is essential to being able to usefully transform them back into
> concrete data.  Unfortunately, the encode method is so complex that it cannot
> be readily emulated by an aspiring string look-alike.
>
> Besides complexity, another problem with the concrete str API is the
> extensive number of methods.  If string look-alikes were required to emulate
> the likes of zfill(), ljust(), title(), translate(), join(), etc., it would
> significantly add to the burden of writing a class complying with the String
> ABC.
>
> The fundamental problem is that of balancing a client function's desire to
> rely on a broad number of behaviors against the difficulty of writing a
> compliant look-alike class.  For other ABCs, the balance is more easily
> struck because the behaviors are fewer in number, because they are easier to
> implement correctly, and because some methods can be provided as mixins.
> For a String ABC, the balance should lean toward minimalism due to the large
> number of methods and how difficult it is to implement some of them
> correctly.
>
> A last reason to avoid expanding the String API is that almost none of the
> candidate methods characterize the notion of "stringiness".  With something
> calling itself an integer, an __add__() method would be expected as it is
> fundamental to the notion of "integeriness".  In contrast, methods like
> startswith() and title() are non-essential extras -- we would not discount
> something as being not stringlike if those methods were not present.



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] [Python-3000] Stabilizing the C API of 2.6 and 3.0

2008-06-01 Thread Gregory P. Smith
On Fri, May 30, 2008 at 1:37 AM, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> On 2008-05-30 00:57, Nick Coghlan wrote:
>>
>> M.-A. Lemburg wrote:
>>>
>>> * Why can't we have both PyString *and* PyBytes exposed in 2.x,
>>> with one redirecting to the other ?
>>
>> We do have that - the PyString_* names still work perfectly fine in 2.x.
>> They just won't be used in the Python core codebase anymore - everything in
>> the Python core will use either PyBytes_* or PyUnicode_* regardless of which
>> branch (2.x or 3.x) you're working on. I think that's a good thing for ease
>> of maintenance in the future, even if it takes people a while to get their
>> heads around it right now.
>
> Sorry, I probably wasn't clear enough:
>
> Why can't we have both PyString *and* PyBytes exposed as C
> APIs (ie. visible in code and in the linker) in 2.x, with one redirecting
> to the other ?
>
>>> * Why should the 2.x code base turn to hacks, just because 3.x wants
>>> to restructure itself ?
>>
>> With the better explanation from Greg of what the checked in approach
>> achieves (i.e. preserving exact ABI compatibility for PyString_*, while
>> allowing PyBytes_* to be used at the source code level), I don't see what
>> has been done as being any more of a hack than the possibly more common
>> "#define  " (which *would* break binary compatibility).
>>
>> The only things that I think would tidy it up further would be to:
>> - include an explanation of the approach and its effects on API and ABI
>> backward and forward compatibility within 2.x and between 2.x and 3.x in
>> stringobject.h
>> - expose the PyBytes_* functions to the linker in 2.6 as well as 3.0
>
> Which is what I was suggesting all along; sorry if I wasn't
> clear enough on that.
>
> The standard approach is that you provide #define redirects from the
> old APIs to the new ones (which are then picked up by the compiler)
> *and* add function wrappers to the same effect (to make linkers,
> dynamic load APIs such as ctypes and debuggers happy).
>
>
> Example from pythonrun.h|c:
> ---
>
> /* Use macros for a bunch of old variants */
> #define PyRun_String(str, s, g, l) PyRun_StringFlags(str, s, g, l, NULL)
>
> /* Deprecated C API functions still provided for binary compatibility */
>
> #undef PyRun_String
> PyAPI_FUNC(PyObject *)
> PyRun_String(const char *str, int s, PyObject *g, PyObject *l)
> {
>     return PyRun_StringFlags(str, s, g, l, NULL);
> }
>

Okay, how about this?  http://codereview.appspot.com/1521

Using that patch, both PyString_ and PyBytes_ APIs are available using
function stubs similar to the above.  I opted to define the stub
functions right next to the ones they were stubbing rather than
putting them all at the end of the file or in another file but they
could be moved if someone doesn't like them that way.

> I still believe that we should *not* make "ease of merging" the
> primary motivation for backporting changes in 3.x to 2.x. Software
> design should not be guided by restrictions in the tool chain,
> if not absolutely necessary.
>
> The main argument for a backport needs to be general usefulness
> to the 2.x users, IMHO... just like any other feature that
> makes it into 2.x.
>
> If merging is difficult then this needs to be addressed, but
> there are more options to that than always going back to the
> original 2.x trunk code. I've given a few suggestions on how
> this could be approached in other emails on this thread.

I am not the one doing the merging or working on merge tools so I'll
leave this up to those that are.

-gps


Re: [Python-Dev] Iterable String Redux (aka String ABC)

2008-06-01 Thread Greg Ewing

Guido van Rossum wrote:
> There are quite a few core APIs that accept no
> substitutes, and being an instance of basestring was intended to
> guarantee that a value is accepted by such APIs.


In that case, the idea of a user-defined string class
that doesn't inherit from str or unicode seems to be
a lost cause, since it will never be acceptable in
those places, whatever is done with ABCs.

--
Greg