date:20050824

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Niko Matsakis

> As for the rest, I'm not as sure and it would be helpful to get  
> thoughts
> from others on this one.  My sense is that blocking the clause from
> appearing in the middle is treating the symptom and not the disease.

+1

It would be better to prohibit bare except entirely (well, presumably  
at some point in the future with appropriate warnings at the moment)  
than change its semantics.  I agree that its intuitive meaning is "if  
anything is thrown", not, "if a non-programmer-error exception is  
thrown," but I'm not sure if that's even important.  The point is  
that it has existing well defined semantics; changing them just seems  
unnecessary to the aims of the rewrite and confusing to existing  
Python programmers.

I've written plenty of code with bare excepts and they all intended  
to catch *any* exception, usually in a user interface where I wanted  
to return to the main loop on programmer error not abort the entire  
program.  I don't relish the thought of going back and changing  
existing code, and I imagine there are few who do.



My 2 cents,
Niko
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Nick Coghlan

Raymond Hettinger wrote:
> The latest version of PEP 348 still proposes that a bare except clause
> will default to Exception instead of BaseException.  Initially, I had
> thought that might be a good idea but now think it is doomed and needs
> to be removed from the PEP.

One thing I assumed was that _if_ bare excepts were kept, they would still 
only be allowed as the last except clause.

That is, this example:
>   try: ...
>   except:  ... # A bare except in the middle.  WTF?
>   except (KeyboardInterrupt, SystemExit): ...

would still be a syntax error, even if bare excepts were allowed.

I still have some qualms about the idea of a bare except that doesn't catch 
everything (I'd prefer to see them gone altogether), but I don't mind quite as 
much if the above code stays as a syntax error.

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://boredomandlaziness.blogspot.com
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Keir Mierle wrote:

> Hi, I'm working on Argon (http://www.third-bit.com/trac/argon) with Greg
> Wilson this summer
> 
> We're having a very strange problem with Python's unicode parsing of source
> files. Basically, our CGI script was running extremely slowly on our 
> production
> box (a pokey dual-Xeon 3GHz w/ 4GB RAM and 15K SCSI drives). Slow to the tune
> of 6-10 seconds per request. I eventually tracked this down to imports of our
> source tree; the actual request was completing in 300ms, the rest of the time
> was spent in __import__.

This is caused by the chances to the codecs in 2.4. Basically the codecs 
no longer rely on C's readline() to do line splitting (which can't work 
for UTF-16), but do it themselves (via unicode.splitlines()).

> After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was
> getting called 51 million times. Our code is 1.2 million characters, so I
> hardly think it makes sense to call IsLinebreak 50 times for each character;
> and we're not even importing our entire source tree on every invocation.

But if you're using CGI, you're importing your source on every 
invocation. Switching to a different server side technology might help. 
Nevertheless 50 million calls seems to be a bit much.

> Our code is a fork of Trac, and originally had these lines at the top:
> 
> # -*- coding: iso8859-1 -*-  
> 
> This made me suspicious, so I removed all of them. The CGI execution time
> immediately dropped to ~1 second. gprof revealed that
> _PyUnicodeUCS2_IsLinebreak is not called at all anymore.
> 
> Now that our code works fast enough, I don't really care about this, but I
> thought python-dev might want to know something weird is going on with unicode
> splitlines.

I wonder if we should switch back to a simple readline() implementation 
for those codecs that don't require the current implementation 
(basically every charmap codec). AFAIK source files are opened in 
universal newline mode, so at least we'd get proper treatment of "\n", 
"\r" and "\r\n" line ends, but we'd loose u"\x1c", u"\x1d", u"\x1e", 
u"\x85", u"\u2028" and u"\u2029" (which are line terminators according 
to unicode.splitlines()).

> I documented my investigation of this problem; if anyone wants further 
> details,
> just email me. (I'm not on python-dev)
> http://www.third-bit.com/trac/argon/ticket/525

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

Walter Dörwald wrote:
> This is caused by the chances to the codecs in 2.4. Basically the codecs 
> no longer rely on C's readline() to do line splitting (which can't work 
> for UTF-16), but do it themselves (via unicode.splitlines()).

That explains why you get any calls to IsLineBreak; it doesn't explain
why you get so many of them.

I investigated this a bit, and one issue seems to be that
StreamReader.readline performs splitline on the entire input, only to
fetch the first line. It then joins the rest for later processing.
In addition, it also performs splitlines on a single line, just to
strip any trailing line breaks.

The net effect is that, for a file with N lines, IsLineBreak is invoked
up to N*N/2 times per character (atleast for the last character).

So I think it would be best if Unicode characters exposed a .islinebreak
method (or, failing that, codecs just knew what the line break
characters are in Unicode 3.2), and then codecs would split off
the first line of input itself.

>>After doing some gprof profiling, I discovered _PyUnicodeUCS2_IsLinebreak was
>>getting called 51 million times. Our code is 1.2 million characters, so I
>>hardly think it makes sense to call IsLinebreak 50 times for each character;
>>and we're not even importing our entire source tree on every invocation.
> 
> 
> But if you're using CGI, you're importing your source on every 
> invocation.

Well, no. Only the CGI script needs to be parsed every time; all modules
could load off bytecode files.

Which suggests that Keir Mierle doesn't use bytecode files, I think he
should.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread M.-A. Lemburg

Walter Dörwald wrote:
> I wonder if we should switch back to a simple readline() implementation 
> for those codecs that don't require the current implementation 
> (basically every charmap codec). 

That would be my preference as well. The 2.4 .readline() approach
is really only needed for codecs that have to deal with encodings
that:

a) use multi-byte formats, or
b) support more line-end formats than just CR, CRLF, LF, or
c) are stateful.

This can easily be had by using a mix-in class for
codecs which do need the buffered .readline() approach.

> AFAIK source files are opened in 
> universal newline mode, so at least we'd get proper treatment of "\n", 
> "\r" and "\r\n" line ends, but we'd loose u"\x1c", u"\x1d", u"\x1e", 
> u"\x85", u"\u2028" and u"\u2029" (which are line terminators according 
> to unicode.splitlines()).

While the Unicode standard defines these characters as line
end code points, I think their definition does not necessarily
apply to data that is converted from a certain encoding to
Unicode, so that's not a big loss.

E.g. in ASCII or Latin-1, FILE, GROUP and RECORD
SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85)
are not interpreted as line end characters.

Furthermore, we had no reports of anyone complaining in
Python 1.6, 2.0 - 2.3 that line endings were not detected
properly. All these Python versions relied on the stream's
.readline() method to get the next line. The only bug reports
we had were for UTF-16 which falls into the above
category a) and did not support .readline() until Python 2.4.

A note on the performance of _PyUnicode_IsLinebreak():
in Python 2.0 Fredrik changed this to use the two step
lookup (reducing the size of the lookup tables considerably).

I think it's worthwhile reconsidering this approach for
character type queries that do no involve a huge number
of code points.

In Python 1.6 the function looked like this (and was
inlined by the compiler using its own fast lookup
table):

int _PyUnicode_IsLinebreak(register const Py_UNICODE ch)
{
switch (ch) {
case 0x000A: /* LINE FEED */
case 0x000D: /* CARRIAGE RETURN */
case 0x001C: /* FILE SEPARATOR */
case 0x001D: /* GROUP SEPARATOR */
case 0x001E: /* RECORD SEPARATOR */
case 0x0085: /* NEXT LINE */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
return 1;
default:
return 0;
}
}

another candidate to convert back is:

int _PyUnicode_IsWhitespace(register const Py_UNICODE ch)
{
switch (ch) {
case 0x0009: /* HORIZONTAL TABULATION */
case 0x000A: /* LINE FEED */
case 0x000B: /* VERTICAL TABULATION */
case 0x000C: /* FORM FEED */
case 0x000D: /* CARRIAGE RETURN */
case 0x001C: /* FILE SEPARATOR */
case 0x001D: /* GROUP SEPARATOR */
case 0x001E: /* RECORD SEPARATOR */
case 0x001F: /* UNIT SEPARATOR */
case 0x0020: /* SPACE */
case 0x0085: /* NEXT LINE */
case 0x00A0: /* NO-BREAK SPACE */
case 0x1680: /* OGHAM SPACE MARK */
case 0x2000: /* EN QUAD */
case 0x2001: /* EM QUAD */
case 0x2002: /* EN SPACE */
case 0x2003: /* EM SPACE */
case 0x2004: /* THREE-PER-EM SPACE */
case 0x2005: /* FOUR-PER-EM SPACE */
case 0x2006: /* SIX-PER-EM SPACE */
case 0x2007: /* FIGURE SPACE */
case 0x2008: /* PUNCTUATION SPACE */
case 0x2009: /* THIN SPACE */
case 0x200A: /* HAIR SPACE */
case 0x200B: /* ZERO WIDTH SPACE */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
case 0x202F: /* NARROW NO-BREAK SPACE */
case 0x3000: /* IDEOGRAPHIC SPACE */
return 1;
default:
return 0;
}
}

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 23 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/

::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

M.-A. Lemburg wrote:
> I think it's worthwhile reconsidering this approach for
> character type queries that do no involve a huge number
> of code points.

I would advise against that. I measure both versions
(your version called PyUnicode_IsLinebreak2) with the
following code

volatile int result;
void unibench()
{
#define REPS 100LL
  long long i;
  clock_t s1,s2,s3,s4,s5;
  s1 = clock();
  for(i=0;ihttp://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>This is caused by the chances to the codecs in 2.4. Basically the codecs 
>>no longer rely on C's readline() to do line splitting (which can't work 
>>for UTF-16), but do it themselves (via unicode.splitlines()).
> 
> That explains why you get any calls to IsLineBreak; it doesn't explain
> why you get so many of them.
> 
> I investigated this a bit, and one issue seems to be that
> StreamReader.readline performs splitline on the entire input, only to
> fetch the first line. It then joins the rest for later processing.
> In addition, it also performs splitlines on a single line, just to
> strip any trailing line breaks.

This is because unicode.splitlines() is the only API available to Python 
that knows about unicode line feeds.

> The net effect is that, for a file with N lines, IsLineBreak is invoked
> up to N*N/2 times per character (atleast for the last character).
 >
> So I think it would be best if Unicode characters exposed a .islinebreak
> method (or, failing that, codecs just knew what the line break
> characters are in Unicode 3.2), and then codecs would split off
> the first line of input itself.

I think a maxsplit argument (just as for unicode.split()) would help too.

> [...]

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread M.-A. Lemburg

Martin v. Löwis wrote:
> M.-A. Lemburg wrote:
> 
>>I think it's worthwhile reconsidering this approach for
>>character type queries that do no involve a huge number
>>of code points.
> 
> 
> I would advise against that. I measure both versions
> (your version called PyUnicode_IsLinebreak2) with the
> following code
> 
> volatile int result;
> void unibench()
> {
> #define REPS 100LL
>   long long i;
>   clock_t s1,s2,s3,s4,s5;
>   s1 = clock();
>   for(i=0;i result = _PyUnicode_IsLinebreak('(');
>   s2 = clock();
>   for(i=0;i result = PyUnicode_IsLinebreak2('(');
>   s3 = clock();
>   for(i=0;i result = _PyUnicode_IsLinebreak('\n');
>   s4 = clock();
>   for(i=0;i result = PyUnicode_IsLinebreak2('\n');
>   s5 = clock();
>   printf("f1, (: %d\nf2, (: %d\nf1, CR: %d\n, f2, CR: %d\n",
>(int)(s2-s1),(int)(s3-s2),(int)(s4-s3),(int)(s5-s4));
> }
> 
> and got those numbers
> 
> f1, (: 1321
> f2, (: 1330
> f1, CR: 1322
> , f2, CR: 1325
> 
> What can be seen is that performance the two versions is nearly
> identical, with the code currently used being slightly better.
> What can also be seen is that, on my machine, 1e10 calls to
> IsLinebreak take 13.2 seconds. So 51  Mio calls take about 70ms.

Your test is somewhat biased: the current solution
works using type records, so it has to swap in a new
record for each character you test. In you benchmark,
the same character is tested over and over again
and the type record likely already stored in the
CPU cache.

The .splitlines() routine itself calls the above
function for each and every character in the string,
so quite a few of these type records have to be
looked up.

Here's a version that uses os.py as basis:

#include 
#include 
#include "Python.h"

int _PyUnicode_IsLinebreak16(register const Py_UNICODE ch)
{
switch (ch) {
case 0x000A: /* LINE FEED */
case 0x000D: /* CARRIAGE RETURN */
case 0x001C: /* FILE SEPARATOR */
case 0x001D: /* GROUP SEPARATOR */
case 0x001E: /* RECORD SEPARATOR */
case 0x0085: /* NEXT LINE */
case 0x2028: /* LINE SEPARATOR */
case 0x2029: /* PARAGRAPH SEPARATOR */
return 1;
default:
return 0;
}
}

#define REPS 1
#define BUFFERSIZE 3

int main(void)
{
long i, j;
clock_t s1,s2,s3;
char *buffer;
FILE *datafile;
long filelen;
int result;

datafile = fopen("os.py", "rb");
if (datafile == NULL) {
printf("could not find os.py\n");
return -1;
}
buffer = (char *)malloc(BUFFERSIZE);
filelen = fread(buffer, 1, BUFFERSIZE, datafile);
printf("filelen=%li bytes\n", filelen);

s1 = clock();

/* Python 2.4 */
for(i = 0; i < REPS; i++)
for (j = 0; j < filelen; j++)
result = _PyUnicode_IsLinebreak((Py_UNICODE)buffer[j]);
s2 = clock();

/* Python 1.6 */
for(i = 0; i < REPS; i++)
for (j = 0; j < filelen; j++)
result = _PyUnicode_IsLinebreak16((Py_UNICODE)buffer[j]);
s3 = clock();

printf("2.4: %d\n"
   "1.6: %d\n",
   (int)(s2-s1),
   (int)(s3-s2));
return 0;
}

Output, compiled with -O3:

filelen=23147 bytes
2.4: 257
1.6: 123

That's a factor 2.

> The reported performance problem is more likely in the allocation
> of all these splitlines results, and the copying of the same
> strings over and over again.

True.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Aug 23 2005)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free ! 
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

Walter Dörwald wrote:
> I think a maxsplit argument (just as for unicode.split()) would help too.

Correct - that would allow to get rid of the quadratic part.
We should also strive for avoiding the second copy of the line,
if the user requested keepends.

I wonder whether it would be worthwhile to cache the .splitlines result.
An application that has just invoked .readline() will likely invoke
.readline() again. If there is more than one line left, we could return
the first line right away (potentially trimming the line ending if
necessary). Only when a single line is left, we would attempt to
read more data. In a plain .read(), we would first join the lines
back.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Michael Chermside

Raymond Hettinger writes:
> The latest version of PEP 348 still proposes that a bare except clause
> will default to Exception instead of BaseException.  Initially, I had
> thought that might be a good idea but now think it is doomed and needs
> to be removed from the PEP.

Guido writes:
> If we syntactically enforce that the bare except, if present, must be
> last, would that remove your objection? I agree that a bare except in
> the middle is an anomaly, but that doesn't mean we can't keep bare
> except: as a shorthand for except Exception:.

Explicit is better than Implicit. I think that in newly written code
"except Exception:" is better (more explicit and easier to understand)
than "except:" Legacy code that uses "except:" can remain unchanged *IF*
the meaning of "except:" is unchanged... but I think we all agree that
this is unwise because the existing meaning is a tempting trap for the
unwary. So I don't see any advantage to keeping bare "except:" in the
long run. What we do to ease the transition is a different question,
but one more easily resolved.

-- Michael Chermside

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>I think a maxsplit argument (just as for unicode.split()) would help too.
> 
> Correct - that would allow to get rid of the quadratic part.

OK, such a patch should be rather simple. I'll give it a try.

> We should also strive for avoiding the second copy of the line,
> if the user requested keepends.

Your suggested unicode method islinebreak() would help with that. Then 
we could add the following to the string module:

unicodelinebreaks = u"".join(unichr(c) for c in xrange(0, 
sys.maxunicode) if unichr(c).islinebreak())

Then

 if line and not keepends:
 line = line.splitlines(False)[0]

could be

 if line and not keepends:
 line = line.rstrip(string.unicodelinebreaks)

> I wonder whether it would be worthwhile to cache the .splitlines result.
> An application that has just invoked .readline() will likely invoke
> .readline() again. If there is more than one line left, we could return
> the first line right away (potentially trimming the line ending if
> necessary). Only when a single line is left, we would attempt to
> read more data. In a plain .read(), we would first join the lines
> back.

OK, this would mean we'd have to distinguish between a direct call to 
read() and one done by readline() (which we do anyway through the 
firstline argument).

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

Walter Dörwald wrote:
> Martin v. Löwis wrote:
> 
>> Walter Dörwald wrote:
>>
>>> I think a maxsplit argument (just as for unicode.split()) would help
>>> too.
>>
>>
>> Correct - that would allow to get rid of the quadratic part.
> 
> 
> OK, such a patch should be rather simple. I'll give it a try.

Actually, on a second thought - it would not remove the quadratic
aspect. You would still copy the rest string completely on each
split. So on the first split, you copy N lines (one result line,
and N-1 lines into the rest string), on the second split, N-2
lines, and so on, totalling N*N/2 line copies again. The only
thing you save is the join (as the rest is already joined), and
the IsLineBreak calls (which are necessary only for the first
line).

Please see python.org/sf/1268314; it solves the problem by
keeping the splitlines result. It only invokes IsLineBreak
once per character, and also copies each character only once,
and allocates each line only once, totalling in O(N) for
these operations. It still does contain a quadratic operation:
the lines are stored in a list, and the result list is
removed from the list with del lines[0]. This copies N-1
pointers, result in N*N/2 pointer copies. That should still
be much faster than the current code.

> unicodelinebreaks = u"".join(unichr(c) for c in xrange(0,
> sys.maxunicode) if unichr(c).islinebreak())

That is very inefficient. I would rather add a static list
to the string module, and have a test that says

assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c
in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or
unicodedata.category(ch)=='Zl')

unicodelinebreaks could then be defined as

# u"\r\n\x1c\x1d\x1e\x85\u2028\u2029
'\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8")

> OK, this would mean we'd have to distinguish between a direct call to
> read() and one done by readline() (which we do anyway through the
> firstline argument).

See my patch. If we have cached lines, we don't need to call .read
at all.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread James Y Knight


On Aug 23, 2005, at 10:41 PM, Raymond Hettinger wrote:

> [Guido van Rossum]
>
>> If we syntactically enforce that the bare except, if present, must be
>> last, would that remove your objection? I agree that a bare except in
>> the middle is an anomaly, but that doesn't mean we can't keep bare
>> except: as a shorthand for except Exception:.
>>
>
> Hmm.  Prohibiting mid-suite bare excepts is progress and eliminates  
> the
> case that causes immediate indigestion.
>
> As for the rest, I'm not as sure and it would be helpful to get  
> thoughts
> from others on this one.  My sense is that blocking the clause from
> appearing in the middle is treating the symptom and not the disease.
>

I would rather see "except:" be deprecated eventually, and force the  
user to say either except Exception, except BaseException, or even  
better, except ActualExceptionIWantToCatch.

James
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Barry Warsaw

On Wed, 2005-08-24 at 10:34, James Y Knight wrote:

> I would rather see "except:" be deprecated eventually, and force the  
> user to say either except Exception, except BaseException, or even  
> better, except ActualExceptionIWantToCatch.

I agree about bare except, but there is a very valid use case for an
except clause that catches every possible exception.  We need to make
sure we don't overlook this use case.  As an example, say I'm building a
transaction-aware system, I'm going to want to write code like this:

txn = new_transaction()
try:
txn.begin()
rtn = do_work()
except AllPossibleExceptions:
txn.abort()
raise
else:
txn.commit()
return rtn

I'm fine with not spelling that except statement as "except:" but I
don't want there to be any exceptions that can sneak past that middle
suite, including non-errors like SystemExit or KeyboardInterrupt.

I can't remember ever writing a bare except with a suite that didn't
contain (end in?) a bare raise.  Maybe we can allow bare except, but
constrain things so that the only way out of its suite is via a bare
raise.

-Barry

signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, Michael Chermside <[EMAIL PROTECTED]> wrote:
> Explicit is better than Implicit. I think that in newly written code
> "except Exception:" is better (more explicit and easier to understand)
> than "except:" Legacy code that uses "except:" can remain unchanged *IF*
> the meaning of "except:" is unchanged... but I think we all agree that
> this is unwise because the existing meaning is a tempting trap for the
> unwary. So I don't see any advantage to keeping bare "except:" in the
> long run. What we do to ease the transition is a different question,
> but one more easily resolved.

OK, I'm convinced. Let's drop bare except for Python 3.0, and
deprecate them until then, without changing the meaning.

The deprecation message (to be generated by the compiler!) should
steer people in the direction of specifying one particular exception
(e.g. KeyError etc.) rather than Exception.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread James Y Knight


On Aug 24, 2005, at 11:10 AM, Guido van Rossum wrote:

> On 8/24/05, Michael Chermside <[EMAIL PROTECTED]> wrote:
>
>> Explicit is better than Implicit. I think that in newly written code
>> "except Exception:" is better (more explicit and easier to  
>> understand)
>> than "except:" Legacy code that uses "except:" can remain  
>> unchanged *IF*
>> the meaning of "except:" is unchanged... but I think we all agree  
>> that
>> this is unwise because the existing meaning is a tempting trap for  
>> the
>> unwary. So I don't see any advantage to keeping bare "except:" in the
>> long run. What we do to ease the transition is a different question,
>> but one more easily resolved.
>>
>
> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> deprecate them until then, without changing the meaning.
>
> The deprecation message (to be generated by the compiler!) should
> steer people in the direction of specifying one particular exception
> (e.g. KeyError etc.) rather than Exception.

I agree but there's the minor nit of non-Exception exceptions.

I think it must be the case that raising an object which does not  
derive from an exception class must be deprecated as well in order  
for "except:" to be deprecated. Otherwise, there is nothing you can  
change "except:" to in order not to get a deprecation warning and  
still have your code be correct in the face of documented features of  
python.

James

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] FW: Bare except clauses in PEP 348

2005-08-24 Thread Raymond Hettinger

Hey guys, don't give up your bare except clauses so easily.

They are useful.  And, if given the natural meaning of "catch
everything" and put in a natural position at the end of a suite, their
meaning is plain and obvious.  Remember beauty counts.  I don't think
there would be similar temptation to eliminate a dangling else clause
and replace it with "else Everything".  Nor would a final default case
in a switch statement benefit from being written as "default
Everything".

The thought is that it is okay to have useful defaults.  My whole issue
was that the PEP was choosing the wrong default.  If we leave it alone,
all is well.  An empty except will continue to mean "catch everything",
it will always appear at the end, its meaning will be obvious, and
existing working code won't break :-)

On the occasions where you really intended to catch everything, do you
really want to go on an editing binge just to uglify the code to
something like:

   try:   
   ...
   except SomeException: 
   ...
   except BaseException:  
   ...


It is more beautiful and clear as:

   try:   
   ...
   except SomeException: 
   ...
   except:  
   ...

To me, the latter is more attractive and is more obviously a catchall,
just like an else-clause or a default-statement.  It is a strong visual
cue that at least one of the except clauses will always be triggered.
In contrast, the first example makes you think twice about whether the
final case really does get everything (sometimes implicit IS better than
explicit).



Raymond

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Shane Hathaway

Barry Warsaw wrote:
> I agree about bare except, but there is a very valid use case for an
> except clause that catches every possible exception.  We need to make
> sure we don't overlook this use case.  As an example, say I'm building a
> transaction-aware system, I'm going to want to write code like this:
> 
> txn = new_transaction()
> try:
> txn.begin()
> rtn = do_work()
> except AllPossibleExceptions:
> txn.abort()
> raise
> else:
> txn.commit()
> return rtn
> 
> I'm fine with not spelling that except statement as "except:" but I
> don't want there to be any exceptions that can sneak past that middle
> suite, including non-errors like SystemExit or KeyboardInterrupt.
> 
> I can't remember ever writing a bare except with a suite that didn't
> contain (end in?) a bare raise.  Maybe we can allow bare except, but
> constrain things so that the only way out of its suite is via a bare
> raise.

I also use this idiom quite frequently, but I wonder if a finally clause 
would be a better way to write it:

txn = new_transaction()
try:
 txn.begin()
 rtn = do_work()
finally:
 if exception_occurred():
 txn.abort()
 else:
 txn.commit()
 return rtn

Since this doesn't use try/except/else, it's not affected by changes to 
the meaning of except clauses.  However, it forces more indentation and 
expects a new builtin, and the name "exception_occurred" is probably too 
long for a builtin.

Now for a weird idea.

txn = new_transaction()
try:
 txn.begin()
 rtn = do_work()
finally except:
 txn.abort()
finally else:
 txn.commit()
 return rtn

This is what I would call qualified finally clauses.  The interpreter 
chooses exactly one of the finally clauses.  If a "finally except" 
clause is chosen, the exception is re-raised before execution continues. 
  Most code that currently uses bare raise inside bare except could just 
prefix the "except" and "else" keywords with "finally".

Shane
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Niko Matsakis

>
> txn = new_transaction()
> try:
>  txn.begin()
>  rtn = do_work()
> finally:
>  if exception_occurred():
>  txn.abort()
>  else:
>  txn.commit()
>  return rtn
>

Couldn't you just do:

txn = new_transaction ()
try:
 complete = 0
 txn.begin ()
 rtn = do_work ()
 complete = 1
finally:
 if not complete: txn.abort ()
 else: txn.commit ()

and then not need new builtins or anything fancy?


Niko

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] FW: Bare except clauses in PEP 348

2005-08-24 Thread Michael Chermside

Raymond writes:
> Hey guys, don't give up your bare except clauses so easily.
  [...]

Raymond:

I agree that when comparing:

   // Code snippet A
   try:
   ...
   except SomeException:
   ...
   except BaseException:
   ...

with

   // Code snippet B
   try:
   ...
   except SomeException:
   ...
   except:
   ...

that B is nicer than A. Slightly nicer. It's a minor esthetic point. But
consider these:

// Code snippet C
try:
...
except Exception:
...

// Code snippet D
try:
...
except:
...

Esthetically I'd say that D is nicer than A for the same reasons. It's a minor
esthetic point. But you see, this case is different. You and I would likely
never bother to compare C and D because they do different things! (D is
equivalent to catching BaseException, not Exception). But we know that people
who are not so careful or not so knowlegable WILL make this mistake... they
make it all the time today!

Since situation C (catching an exception) is hundreds of times more common than
situation A (needing default processing for exceptions that don't get caught,
but doing it with try-except instead of try-finally because the
nothing-was-thrown case is different), I would FAR rather protect beginners
from erroniously confusing C and D than I would provide a marginally more
elegent syntax for the experts using A or B. And that elegence is arguable...
there's something to be said for simplicity, and having only one kind of
"except" clause for try statements is clearly simpler than having both "except
:" and also bare "except:".

-- Michael Chermside

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Martin v. Löwis

Niko Matsakis wrote:
> Couldn't you just do:
> 
> txn = new_transaction ()
> try:
>  complete = 0
>  txn.begin ()
>  rtn = do_work ()
>  complete = 1
> finally:
>  if not complete: txn.abort ()
>  else: txn.commit ()
> 
> and then not need new builtins or anything fancy?

I personally dislike recording the execution path in
local variables. This is like setting a flag in a loop
before the break, and testing the flag afterwards.
You can do this, but the else: clause of the loop is
just more readable.

This specific fragment has also the bug that a
KeyboardInterrupt before the assignment to complete
will cause a NameError/UnboundLocalError; this
can easily be fixed by moving the assignment before
the try block.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Shane Hathaway

Niko Matsakis wrote:
>>
>> txn = new_transaction()
>> try:
>>  txn.begin()
>>  rtn = do_work()
>> finally:
>>  if exception_occurred():
>>  txn.abort()
>>  else:
>>  txn.commit()
>>  return rtn
>>
> 
> Couldn't you just do:
> 
> txn = new_transaction ()
> try:
> complete = 0
> txn.begin ()
> rtn = do_work ()
> complete = 1
> finally:
> if not complete: txn.abort ()
> else: txn.commit ()
> 
> and then not need new builtins or anything fancy?

That would work, though it's less readable.  If I were looking over code 
like that written by someone else, I'd have verify that the "complete" 
variable is handled correctly in all cases.  (As Martin noted, your code 
already has a bug.)  The nice try/except/else idiom we have today, with 
a bare except and bare raise, is much easier to verify.

Shane
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>Martin v. Löwis wrote:
>>
>>>Walter Dörwald wrote:
>>>
I think a maxsplit argument (just as for unicode.split()) would help
too.
>>>
>>>Correct - that would allow to get rid of the quadratic part.
>>
>>OK, such a patch should be rather simple. I'll give it a try.
> 
> Actually, on a second thought - it would not remove the quadratic
> aspect.

At least it would remove the quadratic number of calls to 
_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only 
once.

> You would still copy the rest string completely on each
> split. So on the first split, you copy N lines (one result line,
> and N-1 lines into the rest string), on the second split, N-2
> lines, and so on, totalling N*N/2 line copies again.

OK, that's true.

We could prevent string copying if we kept the unsplit string and the 
position of the current line terminator, but this would require a "first 
position after a line terminator" method.

> The only
> thing you save is the join (as the rest is already joined), and
> the IsLineBreak calls (which are necessary only for the first
> line).
> 
> Please see python.org/sf/1268314;

The last part of the patch seems to be more related to bug #1235646.

With the patch test_pep263 and test_codecs fail (and test_parser, but 
this might be unrelated):

python Lib/test/test_pep263.py gives the following output:

File "Lib/test/test_pep263.py", line 22
SyntaxError: list index out of range

test_codecs.py has the following two complaints:

File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", 
line 366, in readline
 self.charbuffer = lines[1] + self.charbuffer
IndexError: list index out of range

and

File "/var/home/walter/Achtung/Python-linecache/dist/src/Lib/codecs.py", 
line 336, in readline
 line = result.splitlines(False)[0]
NameError: global name 'result' is not defined

> it solves the problem by
> keeping the splitlines result. It only invokes IsLineBreak
> once per character, and also copies each character only once,
> and allocates each line only once, totalling in O(N) for
> these operations. It still does contain a quadratic operation:
> the lines are stored in a list, and the result list is
> removed from the list with del lines[0]. This copies N-1
> pointers, result in N*N/2 pointer copies. That should still
> be much faster than the current code.

Using collections.deque() should get rid of this problem.

>>unicodelinebreaks = u"".join(unichr(c) for c in xrange(0,
>>sys.maxunicode) if unichr(c).islinebreak())
> 
> That is very inefficient. I would rather add a static list
> to the string module, and have a test that says
> 
> assert str.unicodelinebreaks == u"".join(ch for ch in (unichr(c) for c
> in xrange(0, sys.maxunicode)) if unicodedata.bidirectional(ch)=='B' or
> unicodedata.category(ch)=='Zl')

You mean, in the test suite?

> unicodelinebreaks could then be defined as
> 
> # u"\r\n\x1c\x1d\x1e\x85\u2028\u2029
> '\n\r\x1c\x1d\x1e\xc2\x85\xe2\x80\xa8\xe2\x80\xa9'.decode("utf-8")

That might be better, as this definition won't change very often.

BTW, why the decode() call? For a Python without unicode?

>>OK, this would mean we'd have to distinguish between a direct call to
>>read() and one done by readline() (which we do anyway through the
>>firstline argument).
> 
> See my patch. If we have cached lines, we don't need to call .read
> at all.

I wonder what happens, if calls to read() and readline() are mixed (e.g. 
if I'm reading Fortran source or anything with a fixed line header): 
read() would be used to read the first n character (which joins the line 
buffer) and readline() reads the rest (which would split it again) etc.
(Of course this could be done via a single readline() call).

But, I think a maxsplit argument for splitlines() woould make sense 
independent of this problem.

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] FW: Bare except clauses in PEP 348

2005-08-24 Thread Ron Adam

Raymond Hettinger wrote:
 > Hey guys, don't give up your bare except clauses so easily.

Yes, Don't give up.  I often write code starting with a bare except, 
then after it works, stick a raise in it to determine exactly what 
exception I'm catching. Then use that to rewrite a more explicit except 
statement.

Your comment earlier about treating the symptom is also accurate.  This 
isn't just an issue with bare excepts not being allowed in the middle, 
it also comes up whenever we catch exceptions out of order in the tree. 
  Ie.. catching an exception closer to the base will block a following 
except clause that tries to catch an exception on the same branch.

So should except clauses be checked for orderliness?

Regards, Ron

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, James Y Knight <[EMAIL PROTECTED]> wrote:
> I think it must be the case that raising an object which does not
> derive from an exception class must be deprecated as well in order
> for "except:" to be deprecated. Otherwise, there is nothing you can
> change "except:" to in order not to get a deprecation warning and
> still have your code be correct in the face of documented features of
> python.

I agree; isn't that already in ther PEP? This surely has been the
thinking all along.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] FW: Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, Raymond Hettinger <[EMAIL PROTECTED]> wrote:
> Hey guys, don't give up your bare except clauses so easily.

They are an attractive nuisance by being so much shorter to type than
the "right thing to do".

Especially if they default to something whose use cases are rather
esoteric (i.e. BaseException).

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Donovan Baarda

On Wed, 2005-08-24 at 07:33, "Martin v. Löwis" wrote:
> Walter Dörwald wrote:
> > Martin v. Löwis wrote:
> > 
> >> Walter Dörwald wrote:
[...]
> Actually, on a second thought - it would not remove the quadratic
> aspect. You would still copy the rest string completely on each
> split. So on the first split, you copy N lines (one result line,
> and N-1 lines into the rest string), on the second split, N-2
> lines, and so on, totalling N*N/2 line copies again. The only
> thing you save is the join (as the rest is already joined), and
> the IsLineBreak calls (which are necessary only for the first
> line).
[...]

In the past, I've avoided the string copy overhead inherent in split()
by using buffers...

I've always wondered why Python didn't use buffer type tricks internally
for split-type operations. I haven't looked at Python's string
implementation, but the fact that strings are immutable surely means
that you can safely and efficiently reference an implementation level
"data" object for all strings... ie all strings are "buffers".

The only problem I can see with this is huge "data" objects might hang
around just because some small fragment of it is still referenced by a
string. Surely a simple huristic or two like "if len(string) <
len(data)/8: copy data; else: reference data" would go a long way
towards avoiding that.

In my limited playing around with manipulating of strings and
benchmarking stuff, the biggest overhead is nearly always the copys.

-- 
Donovan Baarda <[EMAIL PROTECTED]>

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

M.-A. Lemburg wrote:

> Walter Dörwald wrote:
> 
>>I wonder if we should switch back to a simple readline() implementation 
>>for those codecs that don't require the current implementation 
>>(basically every charmap codec). 
> 
> That would be my preference as well. The 2.4 .readline() approach
> is really only needed for codecs that have to deal with encodings
> that:
> 
> a) use multi-byte formats, or
> b) support more line-end formats than just CR, CRLF, LF, or
> c) are stateful.
> 
> This can easily be had by using a mix-in class for
> codecs which do need the buffered .readline() approach.

Should this be a mix-in or should we simply have two base classes? Which 
of those bases/mix-ins should be the default?

>>AFAIK source files are opened in 
>>universal newline mode, so at least we'd get proper treatment of "\n", 
>>"\r" and "\r\n" line ends, but we'd loose u"\x1c", u"\x1d", u"\x1e", 
>>u"\x85", u"\u2028" and u"\u2029" (which are line terminators according 
>>to unicode.splitlines()).
> 
> While the Unicode standard defines these characters as line
> end code points, I think their definition does not necessarily
> apply to data that is converted from a certain encoding to
> Unicode, so that's not a big loss.
> 
> E.g. in ASCII or Latin-1, FILE, GROUP and RECORD
> SEPARATOR and NEXT LINE characters (0x1c, 0x1d, 0x1e, 0x85)
> are not interpreted as line end characters.
> 
> Furthermore, we had no reports of anyone complaining in
> Python 1.6, 2.0 - 2.3 that line endings were not detected
> properly.  All these Python versions relied on the stream's
> .readline() method to get the next line. The only bug reports
> we had were for UTF-16 which falls into the above
> category a) and did not support .readline() until Python 2.4.

True.

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

Walter Dörwald wrote:
> At least it would remove the quadratic number of calls to
> _PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
> once.

Correct. However, I very much doubt that this is the cause of the
slowdown.

> The last part of the patch seems to be more related to bug #1235646.

You mean the last chunk (linebuffer=None)? This is just the extension
to reset.

> With the patch test_pep263 and test_codecs fail (and test_parser, but
> this might be unrelated):

Oops, I thought I ran the test suite, but apparently with the patch
removed. New version uploaded.

> Using collections.deque() should get rid of this problem.

Alright. There are so many types in Python I've never heard of :-)

> You mean, in the test suite?

Right.

> BTW, why the decode() call? For a Python without unicode?

Right. Not sure what people think whether this should still be
supported, but I keep supporting it whenever I think of it.

> I wonder what happens, if calls to read() and readline() are mixed (e.g.
> if I'm reading Fortran source or anything with a fixed line header):
> read() would be used to read the first n character (which joins the line
> buffer) and readline() reads the rest (which would split it again) etc.
> (Of course this could be done via a single readline() call).

Then performance would drop again - it should still be correct, though.

If this is becomes a frequent problem, we could satisfy read requests
from the split lines as well (i.e. join as many lines as you need).
However, I would rather expect that callers of read() typically want
the entire file, or want to read in large chunks (with no line
orientation at all).

> But, I think a maxsplit argument for splitlines() woould make sense
> independent of this problem.

I'm not so sure anymore. It is good for consistency, but I doubt there
are actual use cases: how often do you want only the first n lines
of some string? Reading the first n lines of a file might be an
application, but then, you would rather use .readline() directly.

For readline, I don't think there is a clear case for splitting of
only the first line (unless you want to return an index instead of
the rest string): if the application eventually wants all of the
data, we better split it right away into individual strings, instead
of dealing with a gradually decreasing trailer.

Anyway, I don't think we should go back to C's readline/fgets. This
is just too messy wrt. buffering and text vs. binary mode. I wish
Python would stop using stdio entirely.

Regards,
Martin

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Martin v. Löwis wrote:

> Walter Dörwald wrote:
> 
>>At least it would remove the quadratic number of calls to
>>_PyUnicodeUCS2_IsLinebreak(). For each character it would be called only
>>once.
> 
> Correct. However, I very much doubt that this is the cause of the
> slowdown.

Probably. We'd need a test with the original Argon source to really know.

>>The last part of the patch seems to be more related to bug #1235646.
> 
> You mean the last chunk (linebuffer=None)? This is just the extension
> to reset.

Ouch, you're right: The part of "cvs diff" was part of my checkout, not 
your patch. I have so many Python checkouts, that I sometimes forget 
which is which! ;)

>>With the patch test_pep263 and test_codecs fail (and test_parser, but
>>this might be unrelated):
> 
> Oops, I thought I ran the test suite, but apparently with the patch
> removed. New version uploaded.

Looks much better now.

>>Using collections.deque() should get rid of this problem.
> 
> Alright. There are so many types in Python I've never heard of :-)

The problem is that unicode.splitlines() returns a list, so the push/pop 
performance advantange of collections.deque might be eaten by having to 
create a collections.deque object in the first place.

>>You mean, in the test suite?
> 
> Right.
> 
>>BTW, why the decode() call? For a Python without unicode?
> 
> Right. Not sure what people think whether this should still be
> supported, but I keep supporting it whenever I think of it.

OK, so should we add this for 2.4.2 or only for 2.5?

Should this really be put into string.py, or should it be a class 
attribute of unicode? (At least that's what was proposed for the other 
strings in string.py (string.whitespace etc.) too.

>>I wonder what happens, if calls to read() and readline() are mixed (e.g.
>>if I'm reading Fortran source or anything with a fixed line header):
>>read() would be used to read the first n character (which joins the line
>>buffer) and readline() reads the rest (which would split it again) etc.
>>(Of course this could be done via a single readline() call).
> 
> Then performance would drop again - it should still be correct, though.
> 
> If this is becomes a frequent problem, we could satisfy read requests
> from the split lines as well (i.e. join as many lines as you need).
> However, I would rather expect that callers of read() typically want
> the entire file, or want to read in large chunks (with no line
> orientation at all).

Agreed! Don't fix a bug that hasn't been reported! ;)

>>But, I think a maxsplit argument for splitlines() woould make sense
>>independent of this problem.
> 
> I'm not so sure anymore. It is good for consistency, but I doubt there
> are actual use cases: how often do you want only the first n lines
> of some string? Reading the first n lines of a file might be an
> application, but then, you would rather use .readline() directly.

Not every unicode string is read from a StreamReader.

> For readline, I don't think there is a clear case for splitting of
> only the first line (unless you want to return an index instead of
> the rest string): if the application eventually wants all of the
> data, we better split it right away into individual strings, instead
> of dealing with a gradually decreasing trailer.

True, this would be best for a readline loop.

Another solution would be to have a unicode.itersplitlines() and store 
the iterator. Then we wouldn't need a maxsplit because you simply can 
stop iterating once you have what you want.

> Anyway, I don't think we should go back to C's readline/fgets. This
> is just too messy wrt. buffering and text vs. binary mode.

I don't know about C's readline, but StreamReader.read() and 
StreamReader.readline() are messy enough. But at least it's something we 
can fix ourselves.

> I wish
> Python would stop using stdio entirely.

So reverting to the 2.3 behaviour for simple codecs is out?

Bye,
Walter Dörwald
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Barry Warsaw

On Wed, 2005-08-24 at 12:38, "Martin v. Löwis" wrote:

> I personally dislike recording the execution path in
> local variables. This is like setting a flag in a loop
> before the break, and testing the flag afterwards.
> You can do this, but the else: clause of the loop is
> just more readable.

Agreed!

> This specific fragment has also the bug that a
> KeyboardInterrupt before the assignment to complete
> will cause a NameError/UnboundLocalError; this
> can easily be fixed by moving the assignment before
> the try block.

And that begs the question whether getting rid of this common idiom is
trading one common problem for another.

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] Docs/Pointer to Tools/scripts?

2005-08-24 Thread Reinhold Birkenfeld

Hi,

after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder
whether the Tools directory is documented at all. There are many useful
scripts there which many people will not find if they are not listed
anywhere in the docs.

Just a thought.

Reinhold

-- 
Mail address is perfectly valid!

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Docs/Pointer to Tools/scripts?

2005-08-24 Thread Oleg Broytmann

Hello!

On Wed, Aug 24, 2005 at 08:33:02PM +0200, Reinhold Birkenfeld wrote:
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts

   What's more, pysource.py is more than just a script - it's a generally
useful module.

   Thank you for committing the code.

Oleg.
-- 
 Oleg Broytmannhttp://phd.pp.ru/[EMAIL PROTECTED]
   Programmers don't die, they just GOSUB without RETURN.
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Martin v. Löwis

Walter Dörwald wrote:
>> Right. Not sure what people think whether this should still be
>> supported, but I keep supporting it whenever I think of it.
> 
> 
> OK, so should we add this for 2.4.2 or only for 2.5?

You mean, string.unicodelinebreaks? I think something needs to be
done to fix the performance problem. In doing so, API changes
might occur. We should not add API changes in 2.4.2 unless they
contribute to the bug fix, and even then, the release manager
probably needs to approve them (in any case, they certainly
need to be backwards compatible)

> Should this really be put into string.py, or should it be a class
> attribute of unicode? (At least that's what was proposed for the other
> strings in string.py (string.whitespace etc.) too.

If the 2.4.2 fix is based on this kind of data, I think it should go
into a private attribute of codecs.py. For 2.5, I would put it
into strings for tradition. There is no point in having some of these
constants in strings and others as class attributes (unless we also
add them as class attributes in 2.5, in which case adding
unicodelinebreaks into strings would be pointless).

So I think in 2.5, I would like to see

# string.py
ascii_letters = str.ascii_letters

in which case unicode.linebreaks would be the right spelling.

>> I'm not so sure anymore. It is good for consistency, but I doubt there
>> are actual use cases: how often do you want only the first n lines
>> of some string? Reading the first n lines of a file might be an
>> application, but then, you would rather use .readline() directly.
> 
> 
> Not every unicode string is read from a StreamReader.

Sure: but how often do you want to fetch the first line of a Unicode
string you happen to have in memory, without iterating over all lines
eventually?

> Another solution would be to have a unicode.itersplitlines() and store
> the iterator. Then we wouldn't need a maxsplit because you simply can
> stop iterating once you have what you want.

That might work. I would then ask for itersplitlines to return pairs
of (line, truncated) so you can easily know whether you merely ran
into the end of the string, or whether you got a complete line
(although it might be a bit too specific for the readlines() case)

> So reverting to the 2.3 behaviour for simple codecs is out?

I'm -1, atleast. It would also fix the problem at hand, for the reported
case. However, it does leave some codecs in the cold, most notably
UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is
built-in in the parser). I think the UTF-8 stream reader should support
all Unicode line breaks, so it should continue to use the Python
approach. However, UTF-8 is fairly common, so that reading an
UTF-8-encoded file line-by-line shouldn't suck.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Raymond Hettinger

[Guido van Rossum]
> > OK, I'm convinced. Let's drop bare except for Python 3.0, and
> > deprecate them until then, without changing the meaning.
> >
> > The deprecation message (to be generated by the compiler!) should
> > steer people in the direction of specifying one particular exception
> > (e.g. KeyError etc.) rather than Exception.

[James Y Knight]
> I agree but there's the minor nit of non-Exception exceptions.
> 
> I think it must be the case that raising an object which does not
> derive from an exception class must be deprecated as well in order
> for "except:" to be deprecated. Otherwise, there is nothing you can
> change "except:" to in order not to get a deprecation warning and
> still have your code be correct in the face of documented features of
> python.

Hmm, that may not be a killer.  I wonder if it is possible to treat
BaseException as a constant (like we do with None) and teach the
compiler to interpret it as catching anything that gets raised so that
"except BaseException" will work like a bare except clause does now.



Raymond

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Barry Warsaw

On Wed, 2005-08-24 at 15:15, Raymond Hettinger wrote:

> Hmm, that may not be a killer.  I wonder if it is possible to treat
> BaseException as a constant (like we do with None) and teach the
> compiler to interpret it as catching anything that gets raised so that
> "except BaseException" will work like a bare except clause does now.

Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
big change in the semantics of exception matching.  I think I'd rather
keep bare except than add that!

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Raymond Hettinger

> > Hmm, that may not be a killer.  I wonder if it is possible to treat
> > BaseException as a constant (like we do with None) and teach the
> > compiler to interpret it as catching anything that gets raised so
that
> > "except BaseException" will work like a bare except clause does now.
> 
> Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
> big change in the semantics of exception matching.  I think I'd rather
> keep bare except than add that!

That may be your only other option if we're waiting until 3.0 to
eliminate string exceptions and class exceptions not derived from the
hierarchy.


Raymond

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Michael Hudson

"Raymond Hettinger" <[EMAIL PROTECTED]> writes:

>> > Hmm, that may not be a killer.  I wonder if it is possible to treat
>> > BaseException as a constant (like we do with None) and teach the
>> > compiler to interpret it as catching anything that gets raised so
> that
>> > "except BaseException" will work like a bare except clause does now.
>> 
>> Sorry Raymond, but my first reaction is "ick" :).  That seems to be a
>> big change in the semantics of exception matching.  I think I'd rather
>> keep bare except than add that!
>
> That may be your only other option if we're waiting until 3.0 to
> eliminate string exceptions and class exceptions not derived from the
> hierarchy.

I really hope string exceptions can be killed off before 3.0.  They
should be fully deprecated in 2.5.

Cheers,
mwh

-- 
  The Oxford Bottled Beer Database heartily disapproves of the 
  excessive consumption of alcohol.  No, really.
-- http://www.bottledbeer.co.uk/beergames.html
   (now sadly gone to the big 404 in the sky)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Am 24.08.2005 um 21:15 schrieb Martin v. Löwis:

> Walter Dörwald wrote:
>
>
>>> Right. Not sure what people think whether this should still be
>>> supported, but I keep supporting it whenever I think of it.
>>>
>>
>> OK, so should we add this for 2.4.2 or only for 2.5?
>>
>
> You mean, string.unicodelinebreaks?
>

Yes.

> I think something needs to be
> done to fix the performance problem. In doing so, API changes
> might occur. We should not add API changes in 2.4.2 unless they
> contribute to the bug fix, and even then, the release manager
> probably needs to approve them (in any case, they certainly
> need to be backwards compatible)
>

OK. Your version of the patch (without replacing line =  
line.splitlines(False)[0] with something better) might be enough for  
2.4.2.

>> Should this really be put into string.py, or should it be a class
>> attribute of unicode? (At least that's what was proposed for the  
>> other
>> strings in string.py (string.whitespace etc.) too.
>>
>
> If the 2.4.2 fix is based on this kind of data, I think it should go
> into a private attribute of codecs.py.
>

I think codecs.unicodelinebreaks has one big problem: it will not  
work for codecs that do str->str decoding.

> For 2.5, I would put it
> into strings for tradition. There is no point in having some of these
> constants in strings and others as class attributes (unless we also
> add them as class attributes in 2.5, in which case adding
> unicodelinebreaks into strings would be pointless).
>
> So I think in 2.5, I would like to see
>
> # string.py
> ascii_letters = str.ascii_letters
>
> in which case unicode.linebreaks would be the right spelling.
>

And it would have the advantage, that it could work both with str and  
unicode if we had both str.linebreaks and unicode.linebreaks

>>> I'm not so sure anymore. It is good for consistency, but I doubt  
>>> there
>>> are actual use cases: how often do you want only the first n lines
>>> of some string? Reading the first n lines of a file might be an
>>> application, but then, you would rather use .readline() directly.
>>>
>>
>> Not every unicode string is read from a StreamReader.
>>
>
> Sure: but how often do you want to fetch the first line of a Unicode
> string you happen to have in memory, without iterating over all lines
> eventually?
>

I don't know. The only obvious spot in the standard library (apart  
from codecs.py) seems to be
def shortdescription(self): return self.description().splitlines() 
[0]
in Lib/plat-mac/pimp.py

>> Another solution would be to have a unicode.itersplitlines() and  
>> store
>> the iterator. Then we wouldn't need a maxsplit because you simply can
>> stop iterating once you have what you want.
>>
>
> That might work. I would then ask for itersplitlines to return pairs
> of (line, truncated) so you can easily know whether you merely ran
> into the end of the string, or whether you got a complete line
> (although it might be a bit too specific for the readlines() case)
>

Or maybe (line, terminatorlength) which gives you the same info  
(terminatorlength == 0 means truncated) and makes it easy to strip  
the terminator.

>> So reverting to the 2.3 behaviour for simple codecs is out?
>>
>
> I'm -1, atleast. It would also fix the problem at hand, for the  
> reported
> case. However, it does leave some codecs in the cold, most notably
> UTF-8 (which, in turn, isn't an issue for PEP 262, since UTF-8 is
> built-in in the parser).
>

You meant PEP 263, right?

> I think the UTF-8 stream reader should support
> all Unicode line breaks, so it should continue to use the Python
> approach.
>

OK.

> However, UTF-8 is fairly common, so that reading an
> UTF-8-encoded file line-by-line shouldn't suck.
>

OK, so what's missing is a solution for str->str codecs (or we keep  
line = line.splitlines(False)[0] and test, whether this is fast enough).

Bye,
Walter Dörwald


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Docs/Pointer to Tools/scripts?

2005-08-24 Thread Christos Georgiou

"Reinhold Birkenfeld" <[EMAIL PROTECTED]> wrote in 
message news:[EMAIL PROTECTED]
> Hi,
>
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I wonder
> whether the Tools directory is documented at all. There are many useful
> scripts there which many people will not find if they are not listed
> anywhere in the docs.

AFAIK the only documentation is the README file in said directory. 


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] [Argon] Re: 51 Million calls to _PyUnicodeUCS2_IsLinebreak() (???)

2005-08-24 Thread Walter Dörwald

Am 24.08.2005 um 20:20 schrieb Greg Wilson:

>>> Walter Dörwald wrote:
>>>
 At least it would remove the quadratic number of calls to
 _PyUnicodeUCS2_IsLinebreak(). For each character it would be  
 called only
 once.
>
>> Martin v. Löwis wrote:
>>
>>> Correct. However, I very much doubt that this is the cause of the
>>> slowdown.
>
>> Walter Dörwald wrote:
>> Probably. We'd need a test with the original Argon source to  
>> really know.
>
> We can do that.

So, can you try Martin's patch?

>> OK, so should we add this for 2.4.2 or only for 2.5?
>
> 2.4.2 please ;-)

If we use the patch as is, I think it can go into 2.4.2.

Bye,
Walter Dörwald

___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Docs/Pointer to Tools/scripts?

2005-08-24 Thread Martin v. Löwis

Reinhold Birkenfeld wrote:
> after adding Oleg Broytmann's findnocoding.py to Tools/scripts, I
> wonder whether the Tools directory is documented at all. There are
> many useful scripts there which many people will not find if they are
> not listed anywhere in the docs.

Christos already mentioned it: there is a README file in both Tools
and Tools/scripts; you should update it whenever you add something.

Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, Michael Hudson <[EMAIL PROTECTED]> wrote:
> I really hope string exceptions can be killed off before 3.0.  They
> should be fully deprecated in 2.5.

But what about class exceptions that don't inherit from Exception?
That will take a while before we can deprecate that.

Anyway, there have been plenty of cases where I was only interested in
catching arbitrary exceptions generated by *Python* (as opposed to
broken 3rd party code or even obscure Python library code) and those
all inherit from Exception. And in those cases I've written "except
Exception:" and so far never regretted it.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Brett Cannon

On 8/24/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 8/24/05, James Y Knight <[EMAIL PROTECTED]> wrote:
> > I think it must be the case that raising an object which does not
> > derive from an exception class must be deprecated as well in order
> > for "except:" to be deprecated. Otherwise, there is nothing you can
> > change "except:" to in order not to get a deprecation warning and
> > still have your code be correct in the face of documented features of
> > python.
> 
> I agree; isn't that already in ther PEP? This surely has been the
> thinking all along.
> 

Requiring inheritance of BaseException in order to pass it to 'raise'
has been in the PEP since the beginning.

-Brett
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Brett Cannon

On 8/24/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> On 8/24/05, Michael Chermside <[EMAIL PROTECTED]> wrote:
> > Explicit is better than Implicit. I think that in newly written code
> > "except Exception:" is better (more explicit and easier to understand)
> > than "except:" Legacy code that uses "except:" can remain unchanged *IF*
> > the meaning of "except:" is unchanged... but I think we all agree that
> > this is unwise because the existing meaning is a tempting trap for the
> > unwary. So I don't see any advantage to keeping bare "except:" in the
> > long run. What we do to ease the transition is a different question,
> > but one more easily resolved.
> 
> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> deprecate them until then, without changing the meaning.
> 

Woohoo!  I am currently on vacation before school starts (orientation
is Sept 1., classes start Sept. 6), so it might take me a little while
to edit the PEP, but I will try to fit into my schedule ASAP (assuming
the tide doesn't turn on me before then).

> The deprecation message (to be generated by the compiler!) should
> steer people in the direction of specifying one particular exception
> (e.g. KeyError etc.) rather than Exception.

Is there any desire for a __future__ statement that makes it a syntax
error?  How about making 'raise' statements only work with objects
that inherit from BaseException?

-Brett
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, Brett Cannon <[EMAIL PROTECTED]> wrote:
> Is there any desire for a __future__ statement that makes it a syntax
> error?  How about making 'raise' statements only work with objects
> that inherit from BaseException?

I doubt it. Few people are going to put a __future__ statement in to
make sure that *don't* use a particular feature: it's just as easy to
grep your source code for "except:". __future__ is in general only
used to enable new syntax that previously has a different meaning.

Anyway, you can make it an error globally by using the -W option creatively.
-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

[Python-Dev] New mailbox module

2005-08-24 Thread A.M. Kuchling

Gregory K. Johnson, who's been working on the mailbox module in
nondist/sandbox/mailbox for Google's Summer of Code, thinks his
project is essentially complete.  He's added the ability to modifying
mailboxes by adding and removing messages, adding test cases for the
new features, and written the corresponding documentation.

So, it's time to start considering it for inclusion in the standard
library.  This is a big change to a non-obscure module, so don't feel
able to make this decision on my own.

I believe the code quality is acceptable, but would appreciate
comments on any cleanups that need to be made.  I still need to read
through the docs and make editing suggestions, and check that the code
is still backward-compatible with the old version of the module.

--amk
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] New mailbox module

2005-08-24 Thread Barry Warsaw

On Wed, 2005-08-24 at 22:33, A.M. Kuchling wrote:

> So, it's time to start considering it for inclusion in the standard
> library.  This is a big change to a non-obscure module, so don't feel
> able to make this decision on my own.
> 
> I believe the code quality is acceptable, but would appreciate
> comments on any cleanups that need to be made.  I still need to read
> through the docs and make editing suggestions, and check that the code
> is still backward-compatible with the old version of the module.

I plan to take a look at it, but won't get a chance to do so for several
days.

-Barry



signature.asc
Description: This is a digitally signed message part
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread James Y Knight


On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote:
> On 8/24/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>> On 8/24/05, James Y Knight <[EMAIL PROTECTED]> wrote:
>>> I think it must be the case that raising an object which does not
>>> derive from an exception class must be deprecated as well in order
>>> for "except:" to be deprecated. Otherwise, there is nothing you can
>>> change "except:" to in order not to get a deprecation warning and
>>> still have your code be correct in the face of documented  
>>> features of
>>> python.
>>>
>>
>> I agree; isn't that already in ther PEP? This surely has been the
>> thinking all along.
>>
>>
>
> Requiring inheritance of BaseException in order to pass it to 'raise'
> has been in the PEP since the beginning.

Yes, it talks about that as a change that will happen in Python 3.0.  
I was responding to

>> OK, I'm convinced. Let's drop bare except for Python 3.0, and
>> deprecate them until then, without changing the meaning.

which is talking about deprecating bare excepts in Python 2.5. Now  
maybe it's the idea that everything that's slated for removal in  
Python 3.0 by PEP 348 is supposed to be getting a deprecation warning  
in Python 2.5, but that certainly isn't stated. The transition plan  
section says that all that will happen in Python 2.5 is the addition  
of "BaseException".

James
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

Re: [Python-Dev] Bare except clauses in PEP 348

2005-08-24 Thread Guido van Rossum

On 8/24/05, James Y Knight <[EMAIL PROTECTED]> wrote:
> On Aug 24, 2005, at 9:39 PM, Brett Cannon wrote:
> > On 8/24/05, Guido van Rossum <[EMAIL PROTECTED]> wrote:
> >> On 8/24/05, James Y Knight <[EMAIL PROTECTED]> wrote:
> >>> I think it must be the case that raising an object which does not
> >>> derive from an exception class must be deprecated as well in order
> >>> for "except:" to be deprecated. Otherwise, there is nothing you can
> >>> change "except:" to in order not to get a deprecation warning and
> >>> still have your code be correct in the face of documented
> >>> features of
> >>> python.
> >>>
> >>
> >> I agree; isn't that already in ther PEP? This surely has been the
> >> thinking all along.
> >>
> >>
> >
> > Requiring inheritance of BaseException in order to pass it to 'raise'
> > has been in the PEP since the beginning.
> 
> Yes, it talks about that as a change that will happen in Python 3.0.
> I was responding to
> 
> >> OK, I'm convinced. Let's drop bare except for Python 3.0, and
> >> deprecate them until then, without changing the meaning.
> 
> which is talking about deprecating bare excepts in Python 2.5. Now
> maybe it's the idea that everything that's slated for removal in
> Python 3.0 by PEP 348 is supposed to be getting a deprecation warning
> in Python 2.5, but that certainly isn't stated. The transition plan
> section says that all that will happen in Python 2.5 is the addition
> of "BaseException".

Then maybe the PEP isn't perfect just yet. :-)

It's never too early to start deprecating a feature we know will
disappear in 3.0.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com

50 matches

Mail list logo