[Python-Dev] Pickle format machine independence

2008-04-14 Thread Hrvoje Nikšić
Are pickle files intended to be readable across different machine
architectures?  The documentation states unequivocally that they are
compatible across Python versions, but compatibility across machine
architectures (with respect to differences in size and layout of primitive C
types) is not explicitly addressed.

One example where I stumbled upon the incompatibility is the pickling of
arrays.  While pickle is normally very careful to write out numbers in a
platform-independent way, arrays are written out in "tostring" format.
This is filed under http://bugs.python.org/issue2389.

I can work around this issue in my application, but if this is
considered a bug, I'd prefer to fix it in Python instead.
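
To illustrate the kind of machine dependence described above, here is a minimal sketch. It assumes a current Python 3, where array.tobytes() plays the role that tostring() played in 2008; the helper names native_dump and portable_dump are hypothetical and are not taken from pickle or from the issue. The native dump follows the byte order and C int size of the host, while the struct-based dump is fixed by its format string.

```python
import array
import struct
import sys


def native_dump(values):
    """Raw machine representation, analogous to the old array.tostring()."""
    return array.array("i", values).tobytes()


def portable_dump(values):
    """Explicit little-endian 32-bit encoding, identical on every machine."""
    return struct.pack("<%di" % len(values), *values)


if __name__ == "__main__":
    # The first hex string can differ between hosts; the second cannot.
    print(sys.byteorder, native_dump([1]).hex(), portable_dump([1]).hex())
```

A fix along these lines would have to record the byte order and item size next to the data, which is presumably where the backwards-compatibility difficulty comes from.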


___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


[Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
Hi all,

I'm using mercurial with the release25-maint branch. I noticed that checking
out a local repository now takes more than
5 minutes (it should be around 30s).

I've tracked it down to this change:
http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
this is svn revision 61009. Here is the diff inline:

--- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
+++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
@@ -305,7 +305,7 @@
             self._rbuf = ""
             while True:
                 left = size - buf_len
-                recv_size = max(self._rbufsize, left)
+                recv_size = min(self._rbufsize, left)
                 data = self._sock.recv(recv_size)
                 if not data:
                     break



self._rbufsize is 1, so the code reads one byte at a time. This is
clearly wrong. I'm posting it to the mailing list because I don't want
this issue to get lost in the bugtracker.
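
The effect of the one-line change can be sketched with a small simulation. This is not the socket.py code: count_recv_calls is a hypothetical helper, and the "socket" is idealised so that every recv() returns as many bytes as were asked for, up to what is available.

```python
def count_recv_calls(total, size, rbufsize, pick):
    """Count recv() calls needed to read `size` bytes when `total` are available.

    `pick` is min or max, mirroring the two versions of the changed line.
    """
    buf_len = 0
    calls = 0
    while buf_len < size:
        left = size - buf_len
        recv_size = pick(rbufsize, left)
        # Idealised socket: hands back everything requested that is available.
        got = min(recv_size, total - buf_len)
        calls += 1
        if got == 0:  # EOF
            break
        buf_len += got
    return calls


if __name__ == "__main__":
    print("max:", count_recv_calls(4096, 4096, 1, max))
    print("min:", count_recv_calls(4096, 4096, 1, min))
```

With rbufsize 1, the max() version needs a single call for 4096 bytes, while the min() version needs one call per byte.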


- Ralf


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Terry Reedy

"Ralf Schmitt" <[EMAIL PROTECTED]> wrote in message 
news:[EMAIL PROTECTED]
| Hi all,
|
| I'm using mercurial with the release25-maint branch. I noticed that checking
| out a local repository now takes more than
| 5 minutes (it should be around 30s).
|
| I've tracked it down to this change:
| http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
| this is svn revision 61009. Here is the diff inline:
|
| --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
| +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
| @@ -305,7 +305,7 @@
| self._rbuf = ""
| while True:
| left = size - buf_len
| -recv_size = max(self._rbufsize, left)
| +recv_size = min(self._rbufsize, left)
| data = self._sock.recv(recv_size)
| if not data:
| break
|
|
|
| self._rbufsize if 1, and so the code reads one byte at a time. this is
| clearly wrong, I'm posting it to the mailing list, as I don't want
| this issue to get lost in the bugtracker.



It is at least as likely to get lost here.  There is a mailing list for new 
tracker items that many devs subscribe to.





Re: [Python-Dev] very bad network performance

2008-04-14 Thread Guido van Rossum
Ralf,

Terry is right. Please file a bug. I do think there may be a problem
with that change but I don't have the time to review it in depth.
Hopefully others will. I do recall that sockets reading one byte at a
time has been a problem before -- I recall a bug about this in the
1.5.2 era for Windows... Too bad it's back. :-(

--Guido

On Mon, Apr 14, 2008 at 10:25 AM, Terry Reedy <[EMAIL PROTECTED]> wrote:
>
>  "Ralf Schmitt" <[EMAIL PROTECTED]> wrote in message
>  news:[EMAIL PROTECTED]
>
>
> | Hi all,
>  |
>  | I'm using mercurial with the release25-maint branch. I noticed that checking
>  | out a local repository now takes more than
>  | 5 minutes (it should be around 30s).
>  |
>  | I've tracked it down to this change:
>  | http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
>  | this is svn revision 61009. Here is the diff inline:
>  |
>  | --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
>  | +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
>  | @@ -305,7 +305,7 @@
>  | self._rbuf = ""
>  | while True:
>  | left = size - buf_len
>  | -recv_size = max(self._rbufsize, left)
>  | +recv_size = min(self._rbufsize, left)
>  | data = self._sock.recv(recv_size)
>  | if not data:
>  | break
>  |
>  |
>  |
>  | self._rbufsize if 1, and so the code reads one byte at a time. this is
>  | clearly wrong, I'm posting it to the mailing list, as I don't want
>  | this issue to get lost in the bugtracker.
>
>  
> 
>
>  It is at least as likely to get lost here.  There is a mailing list for new
>  tracker items that many devs subscribe to.
>
>
>
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] Pickle format machine independence

2008-04-14 Thread Guido van Rossum
On Mon, Apr 14, 2008 at 6:56 AM, Hrvoje Nikšić <[EMAIL PROTECTED]> wrote:
> Are pickle files intended to be readable across different machine
>  architectures?  The documentation states unequivocally that they are
>  compatible across Python versions, but compatibility across machine
>  architectures (wrt to differences in size and layout of primitive C
>  types) is not explicitly addressed.

They're supposed to be compatible across all architectures.

>  One example where I stumbled upon the incompatibility is the pickling of
>  arrays.  While pickle is normally very careful to write out numbers in a
>  platform-independent way, arrays are written out in "tostring" format.
>  This is filed under http://bugs.python.org/issue2389.
>
>  I can work around this issue in my application, but if this is
>  considered a bug, I'd prefer to fix it in Python instead.

It may not be easy to fix this in a backwards-compatible way, but I
agree that there's something fishy there. If you can think of a fix,
please do submit a patch!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Curt Hagenlocher
On Mon, Apr 14, 2008 at 9:12 AM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
>
> I've tracked it down to this change:
> http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> this is svn revision 61009.
> [...]
> self._rbufsize if 1, and so the code reads one byte at a time

The change is correct, but exposes a flaw earlier in the same method.
"_rbufsize == 1" represents a request to buffer "by line", which is
clearly irrelevant in this context.  A request to read n bytes should
just use the default buffer size if buffering "by line".  Sample patch
is attached.


--
Curt Hagenlocher
[EMAIL PROTECTED]
***  
---  
***************
*** 277,294 ****
  
      def read(self, size=-1):
          data = self._rbuf
          if size < 0:
              # Read until EOF
              buffers = []
              if data:
                  buffers.append(data)
              self._rbuf = ""
-             if self._rbufsize <= 1:
-                 recv_size = self.default_bufsize
-             else:
-                 recv_size = self._rbufsize
              while True:
!                 data = self._sock.recv(recv_size)
                  if not data:
                      break
                  buffers.append(data)
--- 277,294 ----
  
      def read(self, size=-1):
          data = self._rbuf
+         if self._rbufsize <= 1:
+             rbufsize = self.default_bufsize
+         else:
+             rbufsize = self._rbufsize
          if size < 0:
              # Read until EOF
              buffers = []
              if data:
                  buffers.append(data)
              self._rbuf = ""
              while True:
!                 data = self._sock.recv(rbufsize)
                  if not data:
                      break
                  buffers.append(data)
***************
*** 305,311 ****
              self._rbuf = ""
              while True:
                  left = size - buf_len
!                 recv_size = max(self._rbufsize, left)
                  data = self._sock.recv(recv_size)
                  if not data:
                      break
--- 305,311 ----
              self._rbuf = ""
              while True:
                  left = size - buf_len
!                 recv_size = max(rbufsize, left)
                  data = self._sock.recv(recv_size)
                  if not data:
                      break


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Guido van Rossum
Eek! Please use the bug tracker.

On Mon, Apr 14, 2008 at 11:10 AM, Curt Hagenlocher <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 14, 2008 at 9:12 AM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
>  >
>  > I've tracked it down to this change:
>  > http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
>  > this is svn revision 61009.
>  > [...]
>
> > self._rbufsize if 1, and so the code reads one byte at a time
>
>  The change is correct, but exposes a flaw earlier in the same method.
>  "_rbufsize == 1" represents a request to buffer "by line", which is
>  clearly irrelevant in this context.  A request to read n bytes should
>  just use the default buffer size if buffering "by line".  Sample patch
>  is attached.
>
>
>  --
>  Curt Hagenlocher
>  [EMAIL PROTECTED]
>
>
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Bill Janssen
There's some really convoluted code in socket._fileobject.__init__()
here.  When initializing a _fileobject, if the 'bufsize' parameter is
explicitly given as zero, that's turned into an _rbufsize of 1, which,
combined with the 'min' change, will produce the read-one-byte
behavior.  The code for setting _rbufsize seems odd; it would be nice if it
were commented with some notes on why these specific selections were made.

    if bufsize < 0:
        bufsize = self.default_bufsize
    if bufsize == 0:
        self._rbufsize = 1
    elif bufsize == 1:
        self._rbufsize = self.default_bufsize
    else:
        self._rbufsize = bufsize
    self._wbufsize = bufsize

It also depends on whether 'read' is called with an explicit # of
bytes to read (which appears to be the case here).

So, it's not the code in socket.py, necessarily; it's the code which
opens the socket, most likely.  The only library which seems to use a
bufsize of zero is httplib (which has a lot of other problems as
well).  I think the change cited below (while IMO correct) will affect
a number of other HTTP-based services, as well.
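
The mapping quoted above can be restated as a standalone function to make the odd cases easier to see. This is only a sketch: rbufsize_for is a hypothetical name, and DEFAULT_BUFSIZE = 8192 is an assumed stand-in for _fileobject.default_bufsize.

```python
DEFAULT_BUFSIZE = 8192  # assumption: stand-in for _fileobject.default_bufsize


def rbufsize_for(bufsize, default=DEFAULT_BUFSIZE):
    """Return the _rbufsize that the quoted __init__ logic would select."""
    if bufsize < 0:
        bufsize = default
    if bufsize == 0:
        return 1        # "unbuffered" request becomes a read size of 1
    if bufsize == 1:
        return default  # "line buffered" request uses the default size
    return bufsize


if __name__ == "__main__":
    for requested in (-1, 0, 1, 16):
        print(requested, "->", rbufsize_for(requested))
```

Note that 0 and 1 map in opposite directions: an unbuffered request ends up with the smallest read size, which is what triggers the one-byte reads once min() is used.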

Bill

> Ralf,
> 
> Terry is right. Please file a bug. I do think there may be a problem
> with that change but I don't have the time to review it in depth.
> Hopefully others will. I do recall that sockets reading one byte at a
> time has been a problem before -- I recall a bug about this in the
> 1.5.2 era for Windows... Too bad it's back. :-(
> 
> --Guido
> 
> On Mon, Apr 14, 2008 at 10:25 AM, Terry Reedy <[EMAIL PROTECTED]> wrote:
> >
> >  "Ralf Schmitt" <[EMAIL PROTECTED]> wrote in message
> >  news:[EMAIL PROTECTED]
> >
> >
> > | Hi all,
> >  |
> >  | I'm using mercurial with the release25-maint branch. I noticed that checking
> >  | out a local repository now takes more than
> >  | 5 minutes (it should be around 30s).
> >  |
> >  | I've tracked it down to this change:
> >  | http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> >  | this is svn revision 61009. Here is the diff inline:
> >  |
> >  | --- a/Lib/socket.py Fri Mar 23 14:27:29 2007 +0100
> >  | +++ b/Lib/socket.py Sat Feb 23 20:30:59 2008 +0100
> >  | @@ -305,7 +305,7 @@
> >  | self._rbuf = ""
> >  | while True:
> >  | left = size - buf_len
> >  | -recv_size = max(self._rbufsize, left)
> >  | +recv_size = min(self._rbufsize, left)
> >  | data = self._sock.recv(recv_size)
> >  | if not data:
> >  | break
> >  |
> >  |
> >  |
> >  | self._rbufsize if 1, and so the code reads one byte at a time. this is
> >  | clearly wrong, I'm posting it to the mailing list, as I don't want
> >  | this issue to get lost in the bugtracker.
> >
> >  
> > 
> >
> >  It is at least as likely to get lost here.  There is a mailing list for new
> >  tracker items that many devs subscribe to.







Re: [Python-Dev] thoughts on having EOFError inherit from EnvironmentError?

2008-04-14 Thread Guido van Rossum
Offhand, -0. I don't think of EOFError as an environmental error. Its
primary purpose was to have something raised by raw_input() (in 3.0,
input()) when there is no more input. This is quite a different level
of error than what EnvironmentError typically means (a problem in the
filesystem or network, or a permissions thing).
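
For reference, a minimal sketch of the handling pattern under discussion, written for a current Python 3 (where EnvironmentError is an alias of OSError). The helper name load_or_default is hypothetical. Because EOFError is not a subclass of EnvironmentError, code that wants to treat both alike has to name both.

```python
import io
import pickle


def load_or_default(stream, default=None):
    """Treat end-of-input and OS-level I/O failures the same way."""
    try:
        return pickle.load(stream)
    except (EOFError, EnvironmentError):
        # EOFError is not an EnvironmentError subclass, so both are listed.
        return default


if __name__ == "__main__":
    print(load_or_default(io.BytesIO(b"")))                # empty input
    print(load_or_default(io.BytesIO(pickle.dumps(42))))   # valid pickle
```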

On Sat, Apr 12, 2008 at 3:01 PM, Gregory P. Smith <[EMAIL PROTECTED]> wrote:
> http://bugs.python.org/issue1481036
>
> Basically as things are now EOFError is on its own but often wants to be
> handled the same as other I/O errors that EnvironmentError currently covers.
>
> Many uses of EOFError in our code base do not provide it any arguments so it
> doesn't really fit the (errno, message [, filename]) tuple style that
> EnvironmentError promises.  But we could fudge that with a reasonable
> default (what's reasonable?) if we rerooted this under EnvironmentError.
>
> Alternatively the bug suggests a new parent exception for EnvironmentError
> and EOFError both to inherit from.
>
> Last time changing the hierarchy around this came up there was pushback
> against adding yet another exception type so I'm thinking the simple
> re-rooting may be the best answer if anything is done at all.
>
> any thoughts?
>
> -gps
>
>



-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] very bad network performance

2008-04-14 Thread A.M. Kuchling
On Mon, Apr 14, 2008 at 11:10:12AM -0700, Curt Hagenlocher wrote:
>   while True:
>   left = size - buf_len
> ! recv_size = max(self._rbufsize, left)
>   data = self._sock.recv(recv_size)

What version is this patch against?  (The last 2.5 release, maybe?)
The max() in the above line should be min(), because you want to use
the *smaller* number of the buffer size and the # of remaining bytes
to read, not the *larger*.  This code is using min() in both 25-maint
and trunk.

--amk



Re: [Python-Dev] very bad network performance

2008-04-14 Thread Curt Hagenlocher
On Mon, Apr 14, 2008 at 12:17 PM, A.M. Kuchling <[EMAIL PROTECTED]> wrote:
> On Mon, Apr 14, 2008 at 11:10:12AM -0700, Curt Hagenlocher wrote:
> >   while True:
> >   left = size - buf_len
> > ! recv_size = max(self._rbufsize, left)
> >   data = self._sock.recv(recv_size)
>
> What version is this patch against?  (The last 2.5 release, maybe?)

Yes, sorry.  I thought I had checked this against the repository --
particularly because the max->min change is what kicked off this
thread.

--
Curt Hagenlocher
[EMAIL PROTECTED]


Re: [Python-Dev] string representation of range in 3.0

2008-04-14 Thread Brad Miller
After I posted a patch to implement this, some good discussion followed;
see: http://bugs.python.org/issue2610


It was suggested that a broader discussion might be in order around  
the issue of iterators and how they are displayed in the command line  
interpreter.
Several new iterators have appeared in Python 3.0 that make the
language less transparent to beginning programmers.  The examples that
immediately come to mind are shown below, and I would guess there are  
others I haven't run across yet.



>>> range(10)
range(0, 10)
>>> myd = {chr(i):i for i in range(32,42)}
>>> myd.keys()

>>> myd.values()

>>> myd.items()



Although none of the above is a problem for intermediate or advanced
programmers, I would like to find a way for beginning students
to automatically get a more helpful representation when they
evaluate expressions in the interpreter.


Implementing the __str__ method for range is one
solution, and the same could be done for the dict_xxx objects.
Other suggestions were to include some kind
of module that overrides sys.displayhook, or simply to make the
command line interpreter more intelligent.  For example, it already
handles a return value of None in a special way; maybe it should do
something for these iterators as well.
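
As a rough sketch of the sys.displayhook idea, assuming a current Python 3: the names friendly_repr and friendly_displayhook are hypothetical and do not come from the patch on the tracker. The hook expands ranges and dict views into lists for display and otherwise follows the default hook's handling of None and of the _ variable.

```python
import builtins
import sys

# Types whose default repr hides their contents from a beginner.
_LAZY_TYPES = (range, type({}.keys()), type({}.values()), type({}.items()))


def friendly_repr(value):
    """Expand ranges and dict views into list form; use repr() otherwise."""
    if isinstance(value, _LAZY_TYPES):
        return repr(list(value))
    return repr(value)


def friendly_displayhook(value):
    """Candidate replacement for sys.displayhook in a teaching session."""
    if value is None:
        return
    builtins._ = None
    print(friendly_repr(value))
    builtins._ = value


if __name__ == "__main__":
    sys.displayhook = friendly_displayhook  # opt in explicitly
```

One cost of this approach is that a very large range would be fully expanded just to be displayed, so a real version would need some truncation.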


Any other comments or ideas?

Thanks,

Brad

On Apr 7, 2008, at 6:24 PM, Guido van Rossum wrote:


I'd object to it returning something that resembles a list too
closely, but I could live with str(range(3)) returning <0, 1, 2>. We
should probably have a cutoff so that if there are more than 6 values
it'll show the first 3 values, then dots, then the last 2 values. (The
cutoff would be computed so that '...' always represents at least 2
values.)
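
The format suggested here could look roughly like the following. This is a sketch, not the patch from issue 2610: friendly_range is a hypothetical helper, and the exact separator around the dots is an assumption.

```python
def friendly_range(r, head=3, tail=2):
    """Render a range as <a, b, c, ..., y, z>, eliding the middle of long ones."""
    n = len(r)
    if n <= head + tail + 1:
        # Up to 6 values: show everything, so '...' never hides a single value.
        items = [str(v) for v in r]
    else:
        items = [str(v) for v in r[:head]] + ["..."] + [str(v) for v in r[-tail:]]
    return "<" + ", ".join(items) + ">"


if __name__ == "__main__":
    print(friendly_range(range(3)))
    print(friendly_range(range(10)))
```

With head=3 and tail=2, the shortest elided range has 7 values, so the dots stand for at least 2 of them.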

On Mon, Apr 7, 2008 at 4:14 PM, Brad Miller <[EMAIL PROTECTED]> wrote:

Hi,

I use Python in my CS1 and CS2 curriculum and I have a question.
As I've been using the Python 3.0 alphas one of the things that I am
bothered by is that I cannot see the sequence produced by range
without introducing students to the list() function.

I typically introduce range on day 1 of class and show students what
it produces without making a big deal out of the fact that it creates
a list.  They all accept this and things work out nicely when I
introduce lists for real in a week or two.

My question is why couldn't the __str__ method for the range object be
more friendly and show a representation of the sequence?  I understand
why __repr__ should return range(0,10) for an object created using
range(10), but couldn't print(range(10)) produce [0, 1, 2, ... 9]?
The ... could even be used if the sequence were excessively long.

If this is acceptable, I would be happy to accept the challenge of
providing a patch.

Thanks,

Brad







--
--Guido van Rossum (home page: http://www.python.org/~guido/)




Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Christian Heimes
alexandre.vassalotti schrieb:
> Author: alexandre.vassalotti
> Date: Mon Apr 14 22:51:05 2008
> New Revision: 62342
> 
> Log:
> Improved bytes_extend() to avoid making a full copy of the temporary
> buffer. This also makes the code slightly cleaner.

Changes to Objects/bytesobject.c should be applied to the trunk and
merged into the py3k branch via svnmerge.py.

We need to agree on a policy for how we are going to sync the trunk and py3k
for new code like bytesobject.c and io.py. The former is easy because
the file is almost identical. The latter is going to be hard because 2.6
doesn't have annotations.

Collin:
How hard is it to write a fixer that removes all annotations from
functions? A set of small 3to2 fixers for annotations and metaclasses
would make the syncing job much easier.

Christian


Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Collin Winter
On Mon, Apr 14, 2008 at 2:05 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
> alexandre.vassalotti schrieb:
>
> > Author: alexandre.vassalotti
>  > Date: Mon Apr 14 22:51:05 2008
>  > New Revision: 62342
>  >
>  > Log:
>  > Improved bytes_extend() to avoid making a full copy of the temporary
>  > buffer. This also makes the code slightly cleaner.
>
>  Changes to Objects/bytesobject.c should be applied to the trunk and
>  merged into the py3k branch via svnmerge.py.
>
>  We need to agree on a policy how we are going to sync the trunk and py3k
>  for new code like bytesobject.c and io.py. The former is easy because
>  the file is almost identical. The later is going to be hard because 2.6
>  doesn't have annotations.
>
>  Collin:
>  How hard is it to write a fixer that removes all annotations from
>  functions? A set of small 3to2 fixers for annotations and metaclasses
>  would make the syncing job much easier.

It should be pretty easy. I'm working on a 2to3 metaclass fixer, which
could provide guidance for a 3to2 fixer.

Do we want to take this opportunity to create a real 3to2 project in
the sandbox? If so, I'd like to refactor lib2to3 into its own project,
then import that into 2to3 and 3to2. I can do the work.

Collin


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
On Mon, Apr 14, 2008 at 8:22 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote:

> Eek! Please use the bug tracker.
>

I've made some comments on http://bugs.python.org/issue1092502 (which is
the original issue). However, I cannot reopen this issue.

Regards,
- Ralf


Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Martin v. Löwis
> Do we want to take this opportunity to create a real 3to2 project in
> the sandbox? If so, I'd like to refactor lib2to3 into its own project,
> then import that into 2to3 and 3to2. I can do the work.

In that case, I would propose that the copy in the Python trunk becomes
official, and the other copies use svn:externals, or merge tracking.

Regards,
Martin


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
On Mon, Apr 14, 2008 at 8:10 PM, Curt Hagenlocher <[EMAIL PROTECTED]>
wrote:

> On Mon, Apr 14, 2008 at 9:12 AM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
> >
> > I've tracked it down to this change:
> > http://hgpy.de/py/release25-maint/rev/e9446c6ab3cd
> > this is svn revision 61009.
> > [...]
> > self._rbufsize if 1, and so the code reads one byte at a time
>
> The change is correct, but exposes a flaw earlier in the same method.
> "_rbufsize == 1" represents a request to buffer "by line", which is
> clearly irrelevant in this context.  A request to read n bytes should
> just use the default buffer size if buffering "by line".  Sample patch
> is attached.
>

Sorry to reply on the mailing list, but this change is wrong.
E.g. if you're using a buffer size of 16 bytes and try to read 256 bytes, it
should call recv with a value of 256 and not call recv 16 times with a value
of 16.
However, there should be an upper limit (as shown by the imap bug).

Regards,
- Ralf


Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Collin Winter
On Mon, Apr 14, 2008 at 2:26 PM, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > Do we want to take this opportunity to create a real 3to2 project in
>  > the sandbox? If so, I'd like to refactor lib2to3 into its own project,
>  > then import that into 2to3 and 3to2. I can do the work.
>
>  In that case, I would propose that the copy in the Python trunk becomes
>  official, and the other copies use svn:externals, or merge tracking.

Can do.


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
On Mon, Apr 14, 2008 at 11:18 PM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:

>
> On Mon, Apr 14, 2008 at 8:22 PM, Guido van Rossum <[EMAIL PROTECTED]>
> wrote:
>
> > Eek! Please use the bug tracker.
> >
>
> I 've made some comments on: http://bugs.python.org/issue1092502 (which is
> the original issue). However I cannot reopen this issue.
>

Curt opened another bug for it:
http://bugs.python.org/issue2632
Someone should change the priority.


Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Christian Heimes
Collin Winter schrieb:
> It should be pretty easy. I'm working on a 2to3 metaclass fixer, which
> could provide guidance for a 3to2 fixer.
> 
> Do we want to take this opportunity to create a real 3to2 project in
> the sandbox? If so, I'd like to refactor lib2to3 into its own project,
> then import that into 2to3 and 3to2. I can do the work.

Good idea! A 3to2 project is going to make backporting io.py and other
new stuff less painful and much faster. For the io.py backport I spent
most time on removing annotations and replacing "" with u"".

What needs to be done?

* Remove function annotations

* Add object as the base of every class definition that lists no bases

* Replace class Egg(metaclass=Spam) with class
Egg(object):\n__metaclass__ = Spam

* Add __future__ imports for print_function and unicode_literals
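
To give an idea of the first item, here is a minimal sketch of annotation stripping. It is not a lib2to3 fixer: it assumes Python 3.9 or later for ast.unparse, the name strip_annotations is hypothetical, and it loses comments and original formatting, which a real 3to2 fixer working on the concrete syntax tree would preserve.

```python
import ast


def strip_annotations(source):
    """Return `source` with parameter and return annotations removed."""
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            node.returns = None
            args = node.args
            for arg in args.posonlyargs + args.args + args.kwonlyargs:
                arg.annotation = None
            if args.vararg is not None:
                args.vararg.annotation = None
            if args.kwarg is not None:
                args.kwarg.annotation = None
    return ast.unparse(tree)


if __name__ == "__main__":
    print(strip_annotations("def f(a: int, b: str = 'x') -> bool:\n    return True\n"))
```

Annotated assignments are left alone here; only function signatures are handled.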

Anything else?

Christian


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Curt Hagenlocher
On Mon, Apr 14, 2008 at 2:29 PM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
>
> Sorry to reply on the mailing list. But this change is wrong.
> e.g. if you're using a buffer size of 16 bytes and try to read 256 bytes, it
> should call recv with a value of 256 and not call recv 16 times with a value
> of 16.
> However, there should be an upper limit (as shown by the imap bug).

There is an upper limit.  It's called "the buffer size".  If someone
specifies a buffer size of 16 bytes, it means "read 16 bytes at a
time".  I don't know why someone would want such a small buffer size,
but presumably they have their reasons.

The only reason "min" is a problem is that there's standard library
code passing a zero to socket.makefile, which gets turned into a
bufsize of 1 by the constructor.  I actually agree with Bill Janssen
-- __init__ is where the real problem lies.  But I think the change to
read() is safer.

--
Curt Hagenlocher
[EMAIL PROTECTED]


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
On Tue, Apr 15, 2008 at 12:19 AM, Curt Hagenlocher <[EMAIL PROTECTED]>
wrote:

> On Mon, Apr 14, 2008 at 2:29 PM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
> >
> > Sorry to reply on the mailing list. But this change is wrong.
> > e.g. if you're using a buffer size of 16 bytes and try to read 256
> bytes, it
> > should call recv with a value of 256 and not call recv 16 times with a
> value
> > of 16.
> > However, there should be an upper limit (as shown by the imap bug).
>
> There is an upper limit.  It's called "the buffer size".  If someone
> specifies a buffer size of 16 bytes, it means "read 16 bytes at a
> time".  I don't know why someone would want such a small buffer size,
> but presumably they have their reasons.
>

No, I don't agree. To me buffer size means buffer up to buffer_size bytes in
memory.
It does not mean that it should read only buffer_size bytes at once when
asked to read more bytes than buffer size.

The upper limit I was talking about is the buffer size limit of the
operating system, i.e. the operating system will return at most N bytes
from a recv call. It doesn't make sense to ask for more than that, and the
original problem with imaplib asking for 10MB of data and then realloc'ing
that buffer would be gone.


> The only reason "min" is a problem is that there's standard library
> code passing a zero to socket.makefile, which gets turned into a
> bufsize of 1 by the constructor.  I actually agree with Bill Janssen
> -- __init__ is where the real problem lies.  But I think the change to
> read() is safer.
>

Again, no: if I pass in 4 as the buffer size, I don't expect the system to make
1024 calls to recv when I want to read 4096 bytes.

Regards,
- Ralf


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Guido van Rossum
On Mon, Apr 14, 2008 at 3:57 PM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
>
>
>
> On Tue, Apr 15, 2008 at 12:19 AM, Curt Hagenlocher <[EMAIL PROTECTED]>
> wrote:
> >
> > On Mon, Apr 14, 2008 at 2:29 PM, Ralf Schmitt <[EMAIL PROTECTED]> wrote:
> > >
> > > Sorry to reply on the mailing list. But this change is wrong.
> > > e.g. if you're using a buffer size of 16 bytes and try to read 256
> bytes, it
> > > should call recv with a value of 256 and not call recv 16 times with a
> value
> > > of 16.
> > > However, there should be an upper limit (as shown by the imap bug).
> >
> > There is an upper limit.  It's called "the buffer size".  If someone
> > specifies a buffer size of 16 bytes, it means "read 16 bytes at a
> > time".  I don't know why someone would want such a small buffer size,
> > but presumably they have their reasons.
> >
>
> No, I don't agree. To me buffer size means buffer up to buffer_size bytes in
> memory.
> It does not mean that it should read only buffer_size bytes at once when
> asked to read more bytes than buffer size.
>
>  The upper limit I was talking about is the buffer size limit of the
> operating system, i.e. the operating system will at a maximum return N bytes
> from recv call. It doesn't make sense to ask for more then, and the original
> problem with imaplip asking for 10MB of data and then realloc'ing that
> buffer would be gone.
>
>
> >
> > The only reason "min" is a problem is that there's standard library
> > code passing a zero to socket.makefile, which gets turned into a
> > bufsize of 1 by the constructor.  I actually agree with Bill Janssen
> > -- __init__ is where the real problem lies.  But I think the change to
> > read() is safer.
> >
>
> again no, if I pass in 4 as buffer size, I don't expect the system to make
> 1024 calls to recv when I want to read 4096 bytes.

But why was imaplib apparently specifying 10MB? Did it know there was
that much data? Or did it just not want to bother looping over all the
data in smaller buffer increments (e.g. 64K, which is probably the max
of what most TCP stacks will give you)?

If I'm right with my hunch that the TCP stack will probably clamp at
64K, perhaps we should use min(system limit, max(requested size,
buffer size))?
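
A minimal sketch of that expression, for illustration only. The names pick_recv_size and SYSTEM_LIMIT are invented here, and the 64K value is just the hunch stated above, not a measured limit.

```python
SYSTEM_LIMIT = 64 * 1024  # assumed cap; 64K is a guess, not a measured value


def pick_recv_size(requested, rbufsize, system_limit=SYSTEM_LIMIT):
    """Return min(system limit, max(requested size, buffer size))."""
    return min(system_limit, max(requested, rbufsize))


print(pick_recv_size(4096, 4))                  # 4096: small buffer no longer throttles the read
print(pick_recv_size(10 * 1024 * 1024, 8192))   # 65536: a 10MB request is clamped
print(pick_recv_size(10, 8192))                 # 8192: small requests still fill the buffer
```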

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Ralf Schmitt
On Tue, Apr 15, 2008 at 1:19 AM, Guido van Rossum <[EMAIL PROTECTED]> wrote:

>
> But why was imaplib apparently specifying 10MB? Did it know there was
> that much data? Or did it just not want to bother looping over all the
> data in smaller buffer increments (e.g. 64K, which is probably the max
> of what most TCP stacks will give you)?
>

Well, calling read with a size of 10MB should be possible. The problem is
that this value ended up inside the recv call, which then did the
malloc/realloc calls.


>
> If I'm right with my hunch that the TCP stack will probably clamp at
> 64K, perhaps we should use min(system limit, max(requested size,
> buffer size))?
>

yes, this is what I was trying to explain.


>
> --
> --Guido van Rossum (home page: http://www.python.org/~guido/)
>


Re: [Python-Dev] very bad network performance

2008-04-14 Thread Curt Hagenlocher
On Mon, Apr 14, 2008 at 4:19 PM, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>
> But why was imaplib apparently specifying 10MB? Did it know there was
> that much data? Or did it just not want to bother looping over all the
> data in smaller buffer increments (e.g. 64K, which is probably the max
> of what most TCP stacks will give you)?

I'm going to guess that the code in question is

    size = int(self.mo.group('size'))
    if __debug__:
        if self.debug >= 4:
            self._mesg('read literal size %s' % size)
    data = self.read(size)

It's reading however many bytes are reported by the server as the size.
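
One way a caller could accept a large server-reported size without ever passing it to a single low-level read is to loop in bounded chunks. The following is a sketch under assumptions: read_exactly and chunk_limit are hypothetical names, and an in-memory stream stands in for the socket.

```python
import io


def read_exactly(recv, size, chunk_limit=64 * 1024):
    """Collect up to `size` bytes, never asking `recv` for more than chunk_limit."""
    chunks = []
    remaining = size
    while remaining > 0:
        data = recv(min(remaining, chunk_limit))
        if not data:  # the peer closed the connection early
            break
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


# io.BytesIO.read plays the role of sock.recv in this example.
stream = io.BytesIO(b"a" * 200000)
data = read_exactly(stream.read, 200000)
print(len(data))  # 200000
```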

> If I'm right with my hunch that the TCP stack will probably clamp at
> 64K, perhaps we should use min(system limit, max(requested size,
> buffer size))?

I have indeed missed the point of the read buffer size.  This would work.

--
Curt Hagenlocher
[EMAIL PROTECTED]


Re: [Python-Dev] thoughts on having EOFError inherit from EnvironmentError?

2008-04-14 Thread Greg Ewing
Guido van Rossum wrote:
> I don't think of EOFError as an environmental error... This is quite
> a different level of error than what EnvironmentError typically means

I think it depends. Any "expected" EOFErrors are going to be
caught by the surrounding code before propagating very far.
An *uncaught* EOFError probably means that a file was shorter
than you expected it to be, which counts as an environmental
error to my way of thinking.

My current coding style involves wrapping an "except EnvironmentError"
around any major operation and reporting it as a "File could not be
read/written/whatever because..." kind of message. Having
EOFError get missed by that would be a nuisance.
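
To make the concern concrete, here is a rough sketch of that wrapping style. The helper load_report is hypothetical, and pickle is used only because it is a convenient way to provoke an EOFError from empty input.

```python
import io
import pickle


def load_report(fileobj):
    """Hypothetical wrapper: turn low-level failures into one user-facing message."""
    try:
        return pickle.load(fileobj)
    except (EnvironmentError, EOFError) as exc:
        # EOFError has to be named explicitly; "except EnvironmentError" misses it.
        return "File could not be read because: %s" % (exc.__class__.__name__,)


print(issubclass(EOFError, EnvironmentError))  # False
print(load_report(io.BytesIO(b"")))            # empty input makes pickle raise EOFError
```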

-- 
Greg


Re: [Python-Dev] thoughts on having EOFError inherit from EnvironmentError?

2008-04-14 Thread Guido van Rossum
On Mon, Apr 14, 2008 at 6:59 PM, Greg Ewing <[EMAIL PROTECTED]> wrote:
> Guido van Rossum wrote:
>  > I don't think of EOFError as an environmental error... This is quite
>  > a different level of error than what EnvironmentError typically means
>
>  I think it depends. Any "expected" EOFErrors are going to be
>  caught by the surrounding code before propagating very far.
>  An *uncaught* EOFError probably means that a file was shorter
>  than you expected it to be, which counts as an environmental
>  error to my way of thinking.

No, that's some kind of parsing error. EnvironmentError doesn't
concern itself with the contents of files.

>  My current coding style involves wrapping an "except EnvironmentError"
>  around any major operation and reporting it as a "File could not be
>  read/written/whatever because..." kind of message. Having
>  EOFError get missed by that would be a nuisance.

But what operations raise EOFError? Surely you're not using
raw_input()? It's really only there for teaching.

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)


[Python-Dev] NeedsReview keyword

2008-04-14 Thread Benjamin Peterson
I think it would be useful for the tracker to grow a "NeedsReview"
keyword. I realize the "patch" keyword does some of this, but it may
just represent some initial or trivial work. "NeedsReview" should
represent a mature patch that some senior dev needs to look hard at
and make the choice.

-- 
Cheers,
Benjamin Peterson


Re: [Python-Dev] Python 2.4.4/2.4.5 test_pty failure on Solaris 10

2008-04-14 Thread Neal Norwitz
On Sat, Apr 12, 2008 at 11:02 AM,  <[EMAIL PROTECTED]> wrote:
>
>  I know this is old stuff, but...
>
>  I want to update our Python 2.4 installation at work from 2.4.2 to 2.4.5
>  (the latest 2.4 source release).  I get a test failure for test_pty, an
>  extra ^M at the end of one line.  I don't get a failure in the 2.4.2
>  installation, but the 2.4.4 and 2.4.5 both fail this test.  Looking at the
>  code in test_pty.py, it appears to me that r43570 fixed things for OSF/1 and
>  IRIX which both do weird things with output while breaking things for any
>  other platform by suppressing the \r\n -> \n mapping which used to be
>  performed for all platforms.  So, for Solaris, that mapping doesn't happen
>  and the actual and expected outputs don't agree.

This was probably me.  Perhaps a fix wasn't backported?  I notice the
2.5 version of the test changed from the 2.4 version and does a
str.replace rather than changing the last chars of the string.  You
can try using the 2.5 version and my guess is it will work (ie, the
test will pass).  The change is in normalize_output.
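
For illustration, a normalization along those lines might look like the sketch below. It is not copied from test_pty.py; the exact branches are an assumption about which line-ending variants need handling.

```python
def normalize_output(data):
    """Sketch only: map platform line endings back to a bare newline."""
    # Replace every occurrence rather than patching just the final characters.
    if data.endswith("\r\r\n"):   # a platform that turns \n into \r\r\n
        return data.replace("\r\r\n", "\n")
    if data.endswith("\r\n"):     # a platform that turns \n into \r\n
        return data.replace("\r\n", "\n")
    return data


print(repr(normalize_output("I wish to buy a fish license.\r\n")))
```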

n


Re: [Python-Dev] r62342 - python/branches/py3k/Objects/bytesobject.c

2008-04-14 Thread Neal Norwitz
On Mon, Apr 14, 2008 at 2:56 PM, Christian Heimes <[EMAIL PROTECTED]> wrote:
>
>  Good idea! A 3to2 project is going to make backporting io.py and other
>  new stuff less painful and much faster. For the io.py backport I spent
>  most time on removing annotations and replacing "" with u"".
>
>  What needs to be done?
>
>  * remove function annotations
>
>  * Add object to all empty class definitions
>
>  * replace class Egg(metaclass=Spam) with class
>  Egg(object):\n__metaclass__ = Spam
>
>  * Add __future__ imports for print_function and unicode literals
>
>  Anything else?

Iteration with the dict methods (e.g., keys -> iterkeys()),
map/zip/filter returning iterator rather than list.
int -> (int, long)
str -> basestring or (str, unicode)
__bool__ -> __nonzero__
exec/execfile
input -> raw_input

Most things that have a fixer in 2to3 would also require one in 3to2.
Only things that work in backwards compatible ways like apply/has_key
removal, etc. don't need a 3to2 fixer.  Most of these 3to2 fixers are
probably pretty low priority, though, as they are not very likely to be
used in the Python code base.  They are still needed for general
user code.
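
As a hand-written illustration of two items from the lists above (metaclass syntax and __bool__ vs. __nonzero__), the sketch below uses spellings that are accepted by both 2.x and 3.x. It shows the kind of rewrite a 3to2 fixer would have to produce; it is not output from any existing tool, and Flag is an invented example class.

```python
class Spam(type):
    """Example metaclass (name borrowed from the list above)."""

    def __new__(mcs, name, bases, namespace):
        namespace.setdefault("tagged", True)
        return super(Spam, mcs).__new__(mcs, name, bases, namespace)


# Python 3 spelling:   class Egg(metaclass=Spam): ...
# Python 2 spelling:   class Egg(object): __metaclass__ = Spam
# Calling the metaclass directly is accepted by both:
Egg = Spam("Egg", (object,), {})


class Flag(object):
    def __init__(self, value):
        self.value = value

    def __bool__(self):          # Python 3 name
        return self.value != 0

    __nonzero__ = __bool__       # Python 2 name for the same hook


print(Egg.tagged, bool(Flag(0)), bool(Flag(3)))
```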

n


Re: [Python-Dev] NeedsReview keyword

2008-04-14 Thread Martin v. Löwis
> I think it would be useful for the tracker to grow a "NeedsReview"
> keyword. I realize the "patch" keyword does some of this, but it may
> just represent some initial or trivial work. "NeedsReview" should
> represent a mature patch that some senior dev needs to look hard at
> and make the choice.

Not sure what problem that would solve. Over time, I would expect that
any open patch also grows the "NeedsReview" keyword, making the keyword
pointless. If somebody specifically should review a certain proposed
change, the change should be assigned to that person. If someone in
a group should review, they should be contacted by email.

Regards,
Martin


Re: [Python-Dev] NeedsReview keyword

2008-04-14 Thread Ralf Schmitt
On Tue, Apr 15, 2008 at 7:21 AM, "Martin v. Löwis" <[EMAIL PROTECTED]>
wrote:

> > I think it would be useful for the tracker to grow a "NeedsReview"
> > keyword. I realize the "patch" keyword does some of this, but it may
> > just represent some initial or trivial work. "NeedsReview" should
> > represent a mature patch that some senior dev needs to look hard at
> > and make the choice.
>
> Not sure what problem that would solve. Over time, I would expect that
> any open patch also grows the "NeedsReview" keyword, making the keyword
> pointless. If somebody specifically should review a certain proposed
> change, the change should be assigned to that person. If someone in
> a group should review, they should be contacted by email.
>

I think it would be nice if that patch keyword could be set by non-admins.
This would mean I didn't have to write to the mailing list asking for people
to look at some specific bug, like "did someone look at
http://bugs.python.org/issue2122? This issue is about mmap.flush not
raising an exception on errors, which I think is rather severe". (btw. can
someone please look at it? :) )

Regards,
- Ralf


Re: [Python-Dev] Next monthly sprint/bugfix day?

2008-04-14 Thread Neal Norwitz
On Wed, Apr 9, 2008 at 7:12 AM, Trent Nelson <[EMAIL PROTECTED]> wrote:
> Hi,
>
>  Is there another online sprint/bugfix day in the pipeline?  If not, can 
> there be? ;-)

Trent, I think you just volunteered to lead it. :-)

We should either do it this weekend Apr 19-20 or wait until after the
release.  The first available date should be May 10.

The schedule http://www.python.org/dev/peps/pep-0361/ has the upcoming
May 7 release as the last alpha.  That means we are getting closer to
an API freeze.  Anything that might change an API for 2.6/3.0 should
be addressed sooner rather than later.  If we have to change an API
before release, please update the bug tracker priority to "release
blocker".

n


Re: [Python-Dev] NeedsReview keyword

2008-04-14 Thread Martin v. Löwis
> I think it would be nice if that patch keyword could be set by non-admins.
> This would mean I didn't have to write to the mailing list asking for
> people to look at
> some specific bug. Like "did someone look at
> http://bugs.python.org/issue2122.

Just name your patch files .patch or .diff the next time, not .txt, and
the keyword will get automatically set.

> This issue is about mmap.flush not
> raising an exception on errors, which I think is rather severe". (btw.
> can someone please look at it? :) )

I've added the patch keyword. I don't think the bug is rather severe,
as it only affects the mmap module. I also don't see how this could
cause data loss if the application works correctly.

Regards,
Martin


Re: [Python-Dev] NeedsReview keyword

2008-04-14 Thread Ralf Schmitt
On Tue, Apr 15, 2008 at 7:54 AM, "Martin v. Löwis" <[EMAIL PROTECTED]>
wrote:

>
> Just name your patch files .patch or .diff the next time, not .txt, and
> the keyword will get automatically set.
>

Fine. I used .txt because I wanted to view it in my browser (without the
browser asking me for an application).


>
> > This issue is about mmap.flush not
> > raising an exception on errors, which I think is rather severe". (btw.
> > can someone please look at it? :) )
>
> I've added the patch keyword. I don't think the bug is rather severe,
>

thanks.


> as it only affects the mmap module. I also don't see how this could
> cause data loss if the application works correctly.
>

What if the flush fails but the program fails to notice, i.e. the program
assumes the data is written to disk but it isn't?
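
A sketch of what an application can only do if flush reports failure. It assumes an interpreter in which mmap.flush raises OSError on error; write_and_flush is a hypothetical helper, and a temporary file stands in for real data.

```python
import mmap
import os
import tempfile


def write_and_flush(path, payload):
    """Hypothetical helper: write payload via mmap, report whether flush succeeded."""
    with open(path, "r+b") as f:
        m = mmap.mmap(f.fileno(), len(payload))
        try:
            m[:] = payload
            try:
                m.flush()
            except OSError as exc:
                # Only reachable if flush signals the failure by raising.
                return False, str(exc)
            return True, None
        finally:
            m.close()


# Create a 5-byte scratch file, overwrite it through the mapping, read it back.
fd, path = tempfile.mkstemp()
os.write(fd, b"\0" * 5)
os.close(fd)
ok, err = write_and_flush(path, b"hello")
with open(path, "rb") as f:
    print(ok, f.read())
os.remove(path)
```

If flush silently swallowed the error, the first return would never be taken and the caller would treat the write as durable.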

Regards,
- Ralf