[ python-Bugs-1102649 ] pickle files should be opened in binary mode

2005-01-19 Thread SourceForge.net
Bugs item #1102649, was opened at 2005-01-14 16:58
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470

Category: Documentation
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: John Machin (sjmachin)
Assigned to: Nobody/Anonymous (nobody)
Summary: pickle files should be opened in binary mode

Initial Comment:
pickle (and cPickle):

At _each_ mention of the pickle file, the docs should say 
that it should be opened with 'wb' or 'rb' mode as 
appropriate, so that a pickle written on one OS can be 
read reliably on another.

The example code at the end of the section should be 
updated to use the 'b' flag.
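
A minimal sketch of what such updated example code could look like, assuming a current Python in which open() can be used as a context manager. The helper names save_pickle, load_pickle and roundtrip are hypothetical and not part of the pickle module. Both helpers open the file in binary mode whatever the protocol:

```python
import os
import pickle
import tempfile


def save_pickle(obj, path, protocol=0):
    # Always open with 'wb', even for protocol 0, so that no
    # platform-specific newline translation touches the stream.
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol)


def load_pickle(path):
    # 'rb' works for every protocol; the reader does not need to
    # know which protocol the writer used.
    with open(path, "rb") as f:
        return pickle.load(f)


def roundtrip(obj, protocol=0):
    # Write obj to a temporary file and read it back.
    fd, path = tempfile.mkstemp(suffix=".pkl")
    os.close(fd)
    try:
        save_pickle(obj, path, protocol)
        return load_pickle(path)
    finally:
        os.remove(path)
```

Calling roundtrip(obj, protocol) for each protocol from 0 up to pickle.HIGHEST_PROTOCOL should return an object equal to obj on any OS.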

--

>Comment By: Tim Peters (tim_one)
Date: 2005-01-19 08:45

Message:
Logged In: YES 
user_id=31435

Yes, binary mode should always be used, regardless of 
protocol.  Else pickles aren't portable across boxes (in 
particular, Unix can't read a protocol 0 pickle produced on 
Windows if the latter was written to a text-mode file).  "text 
mode" was a horrible name for protocol 0.

--

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2005-01-19 00:09

Message:
Logged In: YES 
user_id=3066

In response to irmen's comment:

freopen() is only an option for real file objects; pickles
are often stored or read from other sources.  These other
sources are usually binary to begin with, fortunately,
though this issue probably deserves some real coverage in
the documentation either way.


--

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2005-01-19 00:06

Message:
Logged In: YES 
user_id=3066

Is this true in all cases?  Shouldn't files containing text
pickles (protocol 0) be opened in text mode?  (A problem,
given that all protocols should be readable without prior
knowledge of the protocol used to write the pickle.)

--

Comment By: Irmen de Jong (irmen)
Date: 2005-01-16 10:07

Message:
Logged In: YES 
user_id=129426

Can't the pickle code just freopen() the file itself, using
binary mode?

Or is this against Python's rule "explicit is better than
implicit"?

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[ python-Bugs-1102649 ] pickle files should be opened in binary mode

2005-01-19 Thread SourceForge.net
Bugs item #1102649, was opened at 2005-01-15 08:58
Message generated for change (Comment added) made by sjmachin
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470

Category: Documentation
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: John Machin (sjmachin)
Assigned to: Nobody/Anonymous (nobody)
Summary: pickle files should be opened in binary mode

>Comment By: John Machin (sjmachin)
Date: 2005-01-20 00:51

Message:
Logged In: YES 
user_id=480138

Re Fred's question:
Refer to thread starting at 
http://mail.python.org/pipermail/python-dev/2003-February/033362.html

The story appears to be this:

For pickle protocol 1 or higher, always use binary mode for 
reading and writing.

For pickle protocol 0, either (a) read and write in text mode and, 
when moving the file to another OS, transfer it in text mode (i.e. 
convert the line endings where necessary), or (b) as for protocol 
1+, stick with binary mode throughout.

The docs should also add a generalisation of Tim's comment re 
Notepad, e.g. something like """A file written with pickle protocol 
0 and file mode 'wb' will contain lone linefeeds as line 
terminators. This will cause it to "look funny" when viewed on 
Windows or MacOS as a text file by editors like Notepad that 
do not understand this format."""
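
The point about line terminators can be checked at the byte level. The sketch below is illustrative only: describe_line_endings and simulate_windows_text_write are hypothetical helpers, and the second one merely assumes that a Windows text-mode write turns every "\n" into "\r\n" on its way to disk.

```python
import pickle


def describe_line_endings(data):
    """Count lone LF and CR LF terminators in a byte string."""
    crlf = data.count(b"\r\n")
    lone_lf = data.count(b"\n") - crlf
    return {"lf": lone_lf, "crlf": crlf}


def simulate_windows_text_write(data):
    # Assumption for illustration: text mode on Windows replaces
    # each "\n" with "\r\n" when the data is written.
    return data.replace(b"\n", b"\r\n")


# Protocol 0 output is printable ASCII terminated by lone linefeeds.
stream = pickle.dumps(["spam", 42], protocol=0)
as_written_binary = stream
as_written_text = simulate_windows_text_write(stream)
```

The binary-mode bytes contain only lone linefeeds, while the simulated text-mode bytes differ from what pickle produced. That difference is what a Unix reader would see if the file were copied over unchanged.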

--




[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method

2005-01-19 Thread SourceForge.net
Bugs item #1105286, was opened at 2005-01-19 16:04
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: YoHell (yohell)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undocumented implicit strip() in split(None) string method

Initial Comment:
Hi! 

I noticed that the string method split() first does an
implicit strip() before splitting when it's used with
no arguments or with None as the separator (sep in the
docs). There is no mention of this implicit strip() in
the docs.

Example 1:
s = " word1 word2 "

s.split() then returns ['word1', 'word2'] and not ['',
'word1', 'word2', ''] as one might expect.

WHY IS THIS BAD?

1. Because it's undocumented. See:
http://www.python.org/doc/current/lib/string-methods.html#l2h-197

2. Because it may lead to unexpected behavior in programs. 
Example 2:
FASTA sequence headers are one-line descriptors of
biological sequences and have this form: 
">" + Identifier + whitespace + free text description.

Let sHeader be a Python string containing a FASTA
header. One could then use the following syntax to
extract the identifier from the header:

sID = sHeader[1:].split(None, 1)[0]

However, this does not work if sHeader contains a
faulty FASTA header where the identifier is missing or
consists of whitespace. In that case sID will contain
the first word of the free text description, which is
not the desired behavior. 
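
One possible workaround, sketched under the assumption that a header whose ">" is followed directly by whitespace (or by nothing) should count as having no identifier. The function name fasta_identifier is hypothetical:

```python
def fasta_identifier(header):
    """Return the identifier of a FASTA header, or '' if it is missing."""
    if not header.startswith(">"):
        raise ValueError("not a FASTA header: %r" % (header,))
    body = header[1:]
    # split(None, 1) skips leading whitespace, so test for it
    # explicitly before splitting.
    if not body or body[0].isspace():
        return ""
    return body.split(None, 1)[0]
```

With this check, a header such as "> Homo sapiens" yields an empty identifier instead of "Homo".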

WHAT SHOULD BE DONE?

The implicit strip() should be removed, or at least
programmers should be given the option to turn it off.
At the very least it should be documented so that
programmers have a chance of adapting their code to it.

Thank you for an otherwise splendid language!
/Joel Hedlund
Ph.D. Student
IFM Bioinformatics
Linköping University

--




[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method

2005-01-19 Thread SourceForge.net
Bugs item #1105286, was opened at 2005-01-19 10:04
Message generated for change (Comment added) made by tim_one
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: YoHell (yohell)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undocumented implicit strip() in split(None) string method

>Comment By: Tim Peters (tim_one)
Date: 2005-01-19 11:56

Message:
Logged In: YES 
user_id=31435

I think the docs for split() under "String Methods" are quite 
clear:

"""
...

If sep is not specified or is None, a different splitting 
algorithm is applied. Words are separated by arbitrary length 
strings of whitespace characters (spaces, tabs, newlines, 
returns, and formfeeds). Consecutive whitespace delimiters 
are treated as a single delimiter ("'1 2 3'.split()" 
returns "['1', '2', '3']"). Splitting an empty string returns "['']". 
"""

This won't change, because mountains of code rely on this 
behavior -- it's probably the single most common use case 
for .split().
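
For comparison, a short sketch of the two splitting algorithms side by side. The variable names are only illustrative:

```python
s = " word1 word2 "

# No separator (or None): runs of whitespace delimit words, and
# leading/trailing whitespace produces no empty strings.
no_sep = s.split()          # ['word1', 'word2']

# Explicit separator: every single occurrence delimits a field,
# so the leading and trailing spaces yield empty strings.
space_sep = s.split(" ")    # ['', 'word1', 'word2', '']

# Consecutive delimiters are collapsed only in the no-separator case.
collapsed = "1  2  3".split()         # ['1', '2', '3']
not_collapsed = "1  2  3".split(" ")  # ['1', '', '2', '', '3']
```

Code that needs the empty leading and trailing fields can pass an explicit separator instead of relying on the default.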


--




[ python-Bugs-1074011 ] write failure ignored in Py_Finalize()

2005-01-19 Thread SourceForge.net
Bugs item #1074011, was opened at 2004-11-27 00:02
Message generated for change (Comment added) made by loewis
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1074011&group_id=5470

Category: Python Interpreter Core
Group: Python 2.3
Status: Open
Resolution: None
Priority: 6
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: write failure ignored in Py_Finalize()

Initial Comment:
[forwarded from http://bugs.debian.org/283108]

Write errors on stdout may be ignored, and hence may 
result in loss of valuable user data. 
 
Here's a quick demo: 
 
$ ./close-bug 
foo 
$ ./close-bug > /dev/full && echo unreported write failure 
unreported write failure 
$ cat close-bug
#!/usr/bin/python
import sys
def main ():
    try:
        print 'foo'
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()


This particular failure comes from the following unchecked
fflush of stdout in pythonrun.c:

  static void
  call_ll_exitfuncs(void)
  {
      while (nexitfuncs > 0)
          (*exitfuncs[--nexitfuncs])();

      fflush(stdout);
      fflush(stderr);
  }
 
When the stream is flushed manually, Python does raise an
exception.
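
A minimal sketch of such a manual flush with error reporting, written for a current Python. The names flush_checked and FullDevice are hypothetical; FullDevice only stands in for a stream whose underlying device is full, so the example does not depend on /dev/full being available.

```python
import errno
import sys


def flush_checked(stream, err=sys.stderr):
    """Flush stream; report a failure on err and return False."""
    try:
        stream.flush()
    except (IOError, OSError) as e:
        err.write("write failed: %s\n" % e)
        return False
    return True


class FullDevice(object):
    """Hypothetical stand-in for a stream whose device is full."""

    def write(self, text):
        pass  # pretend the data went into a buffer

    def flush(self):
        # The buffered data cannot be written out.
        raise IOError(errno.ENOSPC, "No space left on device")
```

A script could call flush_checked(sys.stdout) just before exiting and use a non-zero exit status when it returns False.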

Please note that simply adding a test for fflush failure is
not sufficient.  If you change the above to do this:

  if (fflush(stdout) != 0)
  {
      ...handle error...
  }
 
It will appear to solve the problem. 
But here is a counterexample: 
 
import sys
def main ():
    try:
        print "x" * 4095
        print
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()
 
If you run the above with stdout redirected to /dev/full,
it will silently succeed (exit 0) in spite of a write failure.
That's what happens on my Debian unstable system.

Instead of just checking the fflush return value,
it should also check ferror:

  if (fflush(stdout) != 0 || ferror(stdout))
  {
      ...handle error...
  }



--

>Comment By: Martin v. Löwis (loewis)
Date: 2005-01-19 23:28

Message:
Logged In: YES 
user_id=21627

I don't think the patch is right. If somebody explicitly
invokes sys.stdout.close(), this should have the same effect
as invoking fclose(stdout) in C.

It currently doesn't, but with meyering's patch from
2004-12-02 10:20, it still doesn't, so the patch is incorrect.

It might be better to explicitly invoke fclose() if the file
object has no associated f_close function.

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-20 00:38

Message:
Logged In: YES 
user_id=203860

Tim, these bugs are quite difficult to trigger, but they can
hide any kind of file error and lose arbitrarily large
amounts of data.

Here, the following program will run indefinitely:

full = open('/dev/full', 'w')
while 1:
    print >>full, 'x' * 1023
    print >>full

It seems to be essential that both the character that fills
the file buffer (here it is 1024 bytes long) and the next
are generated implicitly by print - otherwise the write
error will be detected.


--

Comment By: Tim Peters (tim_one)
Date: 2004-12-19 23:24

Message:
Logged In: YES 
user_id=31435

Sorry, don't care enough to spend time on it (not a bug I've 
had, not one I expect to have, don't care if it never 
changes).  Suggest not using /dev/full as an output device.

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-19 22:47

Message:
Logged In: YES 
user_id=80475

Tim, what do you think?

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-07 01:33

Message:
Logged In: YES 
user_id=203860

OK, I can reproduce the remaining problem if I substitute
1023 for 4095. The culprit seems to be the unchecked fputs()
in PyFile_WriteString, which is used for the spaces and
newlines generated by the print statement but not for the
objects. I think that's a separate bug.

--

Comment By: Jim Meyering (meyering)
Date: 2004-12-07 00:27

Message:
Logged In: YES 
user_id=41497

Even with python-2.4 (built fresh from CVS this morning),
I can still reproduce the problem on a Linux-2.6.9/ext3 system:

  /p/p/python-2.4/bin/python write-4096 > /dev/full && echo fail
  fail

The size that provokes the failure depends on the I/O block size
of your system, so you might need something as big as 131072
on some other type of system.

--


[ python-Bugs-1105699 ] Warnings in Python.h with gcc 4.0.0

2005-01-19 Thread SourceForge.net
Bugs item #1105699, was opened at 2005-01-19 19:52
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105699&group_id=5470

Category: Build
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Bob Ippolito (etrepum)
Assigned to: Nobody/Anonymous (nobody)
Summary: Warnings in Python.h with gcc 4.0.0

Initial Comment:
(this happens for every file that includes Python.h)

In file included from ../Include/Python.h:55,
                 from ../Objects/intobject.c:4:
../Include/pyport.h:396: warning: 'struct winsize' declared inside parameter list
../Include/pyport.h:397: warning: 'struct winsize' declared inside parameter list

The source lines look like this:
extern int openpty(int *, int *, char *, struct termios *, struct winsize *);
extern int forkpty(int *, char *, struct termios *, struct winsize *);

--




[ python-Bugs-1105706 ] incorrect constant names in curses window objects page

2005-01-19 Thread SourceForge.net
Bugs item #1105706, was opened at 2005-01-19 23:19
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105706&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: dcrosta (dcrosta)
Assigned to: Nobody/Anonymous (nobody)
Summary: incorrect constant names in curses window objects page

Initial Comment:
The documentation for the border() function in the
curses "Window Objects" page
(http://www.python.org/doc/2.3.4/lib/curses-window-objects.html)
says that ACS_BLCORNER and ACS_BRCORNER are the
defaults for the lower-left and lower-right corners,
respectively. The curses "Constants" page
(http://www.python.org/doc/2.3.4/lib/node218.html) has
the correct names, ACS_LLCORNER and ACS_LRCORNER,
respectively.

My system:
Python 2.3.4 on Gentoo GNU/Linux

--




[ python-Bugs-1105770 ] null source chars handled oddly

2005-01-19 Thread SourceForge.net
Bugs item #1105770, was opened at 2005-01-19 23:35
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105770&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Reginald B. Charney (rcharney)
Assigned to: Nobody/Anonymous (nobody)
Summary: null source chars handled oddly

Initial Comment:
When null characters appear in the source, outside
literals, tokenize seems to do one of two things: skip the
null character and the two characters that follow it, or
ignore the remainder of the line, including the newline
character.

(To see the invalid characters, use vim, or an editor
that displays control characters when needed.)
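
As an alternative to an editor, a small script can locate the null characters. This is only a sketch: find_nul_bytes is a hypothetical helper that expects the raw bytes of a source file (for example, read with open(path, 'rb')) and treats "\n" as the line separator.

```python
def find_nul_bytes(data):
    """Return (line, column) pairs for each NUL byte in data.

    Lines are numbered from 1 and columns from 0.
    """
    positions = []
    for lineno, line in enumerate(data.split(b"\n"), 1):
        # bytearray yields integers on both Python 2 and Python 3.
        for col, byte in enumerate(bytearray(line)):
            if byte == 0:
                positions.append((lineno, col))
    return positions
```

For example, find_nul_bytes(b"a = 1\nb\x00 = 2\n") reports one null character on line 2 at column 1.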

--
