[ python-Bugs-1102649 ] pickle files should be opened in binary mode
Bugs item #1102649, was opened at 2005-01-14 16:58
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470

Category: Documentation
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: John Machin (sjmachin)
Assigned to: Nobody/Anonymous (nobody)
Summary: pickle files should be opened in binary mode

Initial Comment:
pickle (and cPickle): At _each_ mention of the pickle file, the docs should say that it should be opened with 'wb' or 'rb' mode as appropriate, so that a pickle written on one OS can be read reliably on another. The example code at the end of the section should be updated to use the 'b' flag.

--

>Comment By: Tim Peters (tim_one)
Date: 2005-01-19 08:45

Message:
Logged In: YES
user_id=31435

Yes, binary mode should always be used, regardless of protocol. Else pickles aren't portable across boxes (in particular, Unix can't read a protocol 0 pickle produced on Windows if the latter was written to a text-mode file). "text mode" was a horrible name for protocol 0.

--

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2005-01-19 00:09

Message:
Logged In: YES
user_id=3066

In response to irmen's comment: freopen() is only an option for real file objects; pickles are often stored or read from other sources. These other sources are usually binary to begin with, fortunately, though this issue probably deserves some real coverage in the documentation either way.

--

Comment By: Fred L. Drake, Jr. (fdrake)
Date: 2005-01-19 00:06

Message:
Logged In: YES
user_id=3066

Is this true in all cases? Shouldn't files containing text pickles (protocol 0) be opened in text mode? (A problem, given that all protocols should be readable without prior knowledge of the protocol used to write the pickle.)

--

Comment By: Irmen de Jong (irmen)
Date: 2005-01-16 10:07

Message:
Logged In: YES
user_id=129426

Can't the pickle code just freopen() the file itself, using binary mode? Or is this against Python's rule "explicit is better than implicit"?

--

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
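The recommendation in this thread (always open pickle files with 'wb' and 'rb', whatever the protocol) can be sketched as follows. This is a minimal illustration rather than the documentation example the report refers to; the helper names `save_pickle` and `load_pickle` and the temporary file name are assumptions made for the sketch, and the code is written so that it also runs on current Python versions.

```python
import os
import pickle
import tempfile


def save_pickle(obj, path, protocol=0):
    # 'wb' = binary mode: no newline translation happens on any platform.
    with open(path, "wb") as f:
        pickle.dump(obj, f, protocol)


def load_pickle(path):
    # 'rb' = binary mode: the reader needs no prior knowledge of the protocol.
    with open(path, "rb") as f:
        return pickle.load(f)


data = {"name": "spam", "values": [1, 2, 3]}
path = os.path.join(tempfile.mkdtemp(), "example.pkl")  # hypothetical file name

# The same binary-mode calls work for the text-like protocol 0 and for 1 and 2.
for protocol in (0, 1, 2):
    save_pickle(data, path, protocol)
    assert load_pickle(path) == data
```

Because the file mode never depends on the protocol, the reading side stays identical no matter how the pickle was written, which is the concern Fred raises above.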
[ python-Bugs-1102649 ] pickle files should be opened in binary mode
Bugs item #1102649, was opened at 2005-01-15 08:58
Message generated for change (Comment added) made by sjmachin
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1102649&group_id=5470

Category: Documentation
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: John Machin (sjmachin)
Assigned to: Nobody/Anonymous (nobody)
Summary: pickle files should be opened in binary mode

Initial Comment:
pickle (and cPickle): At _each_ mention of the pickle file, the docs should say that it should be opened with 'wb' or 'rb' mode as appropriate, so that a pickle written on one OS can be read reliably on another. The example code at the end of the section should be updated to use the 'b' flag.

--

>Comment By: John Machin (sjmachin)
Date: 2005-01-20 00:51

Message:
Logged In: YES
user_id=480138

Re Fred's question: Refer to thread starting at
http://mail.python.org/pipermail/python-dev/2003-February/033362.html

Looks like the story is like this:

For pickle mode 1 or higher, always use binary mode for reading/writing.

For pickle mode 0, either
(a) read/write in text mode and if moving to another OS, do so in text mode i.e. convert the line endings where necessary, or
(b) as for pickle mode 1+, stick with binary throughout.

Also should add a generalisation of Tim's comment re NotePad, e.g. something like

"""A file written with pickle mode 0 and file mode 'wb' will contain lone linefeeds as line terminators. This will cause it to "look funny" when viewed on Windows or MacOS as a text file by editors like Notepad that do not understand this format."""

--

Comment By: Tim Peters (tim_one)
Date: 2005-01-20 00:45

Message:
Logged In: YES
user_id=31435

Yes, binary mode should always be used, regardless of protocol. Else pickles aren't portable across boxes (in particular, Unix can't read a protocol 0 pickle produced on Windows if the latter was written to a text-mode file). "text mode" was a horrible name for protocol 0.
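John Machin's remark about lone linefeeds can be checked directly on the bytes that protocol 0 produces. The sketch below is illustrative only: the `replace()` call is an assumed stand-in for the newline translation a Windows text-mode write would perform, not something the pickle module does itself.

```python
import pickle

payload = pickle.dumps(["spam", 42], protocol=0)

# Protocol 0 output is line-oriented and uses bare "\n" terminators.
has_lone_linefeeds = b"\n" in payload and b"\r\n" not in payload

# Assumed stand-in for what a Windows text-mode file would store on disk.
as_stored_in_text_mode = payload.replace(b"\n", b"\r\n")
bytes_changed = as_stored_in_text_mode != payload

# The untranslated bytes still round-trip.
restored = pickle.loads(payload)
```

The translated bytes are no longer what the pickler emitted, which is why a file written in text mode on one platform cannot be relied on when read in binary mode on another.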
[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method
Bugs item #1105286, was opened at 2005-01-19 16:04
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: YoHell (yohell)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undocumented implicit strip() in split(None) string method

Initial Comment:
Hi!

I noticed that the string method split() first does an implicit strip() before splitting when it's used with no arguments or with None as the separator (sep in the docs). There is no mention of this implicit strip() in the docs.

Example 1:
s = " word1 word2 "
s.split()
then returns ['word1', 'word2'] and not ['', 'word1', 'word2', ''] as one might expect.

WHY IS THIS BAD?

1. Because it's undocumented. See:
http://www.python.org/doc/current/lib/string-methods.html#l2h-197

2. Because it may lead to unexpected behavior in programs.

Example 2:
FASTA sequence headers are one-line descriptors of biological sequences and have this form:
">" + Identifier + whitespace + free text description.

Let sHeader be a Python string containing a FASTA header. One could then use the following syntax to extract the identifier from the header:

sID = sHeader[1:].split(None, 1)[0]

However, this does not work if sHeader contains a faulty FASTA header where the identifier is missing or consists of whitespace. In that case sID will contain the first word of the free text description, which is not the desired behavior.

WHAT SHOULD BE DONE?

The implicit strip() should be removed, or at least programmers should be given the option to turn it off. At the very least it should be documented so that programmers have a chance of adapting their code to it.

Thank you for an otherwise splendid language!
/Joel Hedlund
Ph.D. Student
IFM Bioinformatics
Linköping University
[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method
Bugs item #1105286, was opened at 2005-01-19 10:04
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: YoHell (yohell)
Assigned to: Nobody/Anonymous (nobody)
Summary: Undocumented implicit strip() in split(None) string method

Initial Comment:
Hi!

I noticed that the string method split() first does an implicit strip() before splitting when it's used with no arguments or with None as the separator (sep in the docs). There is no mention of this implicit strip() in the docs.

Example 1:
s = " word1 word2 "
s.split()
then returns ['word1', 'word2'] and not ['', 'word1', 'word2', ''] as one might expect.

WHY IS THIS BAD?

1. Because it's undocumented. See:
http://www.python.org/doc/current/lib/string-methods.html#l2h-197

2. Because it may lead to unexpected behavior in programs.

Example 2:
FASTA sequence headers are one-line descriptors of biological sequences and have this form:
">" + Identifier + whitespace + free text description.

Let sHeader be a Python string containing a FASTA header. One could then use the following syntax to extract the identifier from the header:

sID = sHeader[1:].split(None, 1)[0]

However, this does not work if sHeader contains a faulty FASTA header where the identifier is missing or consists of whitespace. In that case sID will contain the first word of the free text description, which is not the desired behavior.

WHAT SHOULD BE DONE?

The implicit strip() should be removed, or at least programmers should be given the option to turn it off. At the very least it should be documented so that programmers have a chance of adapting their code to it.

Thank you for an otherwise splendid language!
/Joel Hedlund
Ph.D. Student
IFM Bioinformatics
Linköping University

--

>Comment By: Tim Peters (tim_one)
Date: 2005-01-19 11:56

Message:
Logged In: YES
user_id=31435

I think the docs for split() under "String Methods" are quite clear:

"""
...
If sep is not specified or is None, a different splitting algorithm is applied. Words are separated by arbitrary length strings of whitespace characters (spaces, tabs, newlines, returns, and formfeeds). Consecutive whitespace delimiters are treated as a single delimiter ("'1 2 3'.split()" returns "['1', '2', '3']"). Splitting an empty string returns "['']".
"""

This won't change, because mountains of code rely on this behavior -- it's probably the single most common use case for .split().
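The two splitting behaviours discussed in this report can be compared side by side, together with one possible way to handle the faulty FASTA header case without changing split(). This is only a sketch: `fasta_identifier` is a hypothetical helper written for illustration, and treating leading whitespace after ">" as "identifier missing" is an assumption drawn from the reporter's description of the header format.

```python
s = " word1 word2 "

# No separator: runs of whitespace separate words, and leading/trailing
# whitespace produces no empty strings.
no_sep = s.split()        # ['word1', 'word2']

# Explicit separator: every single space delimits a field, so the edges
# yield empty strings.
with_sep = s.split(" ")   # ['', 'word1', 'word2', '']


def fasta_identifier(header):
    """Return the identifier of a FASTA header, or None if it is missing.

    Hypothetical helper: assumes the identifier must start immediately
    after the leading ">".
    """
    if not header.startswith(">"):
        return None
    body = header[1:]
    # Whitespace (or nothing) right after ">" means there is no identifier.
    if not body or body[0].isspace():
        return None
    return body.split(None, 1)[0]
```

With this check in place, `fasta_identifier(">seq1 some description")` gives `'seq1'`, while `fasta_identifier("> description only")` gives `None` instead of the first word of the description.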
[ python-Bugs-1074011 ] write failure ignored in Py_Finalize()
Bugs item #1074011, was opened at 2004-11-27 00:02
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1074011&group_id=5470

Category: Python Interpreter Core
Group: Python 2.3
Status: Open
Resolution: None
Priority: 6
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: write failure ignored in Py_Finalize()

Initial Comment:
[forwarded from http://bugs.debian.org/283108]

Write errors on stdout may be ignored, and hence may result in loss of valuable user data.

Here's a quick demo:

$ ./close-bug
foo
$ ./close-bug > /dev/full && echo unreported write failure
unreported write failure
$ cat close-bug
#!/usr/bin/python
import sys
def main ():
    try:
        print 'foo'
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()

This particular failure comes from the following unchecked fflush of stdout in pythonrun.c:

static void
call_ll_exitfuncs(void)
{
    while (nexitfuncs > 0)
        (*exitfuncs[--nexitfuncs])();

    fflush(stdout);
    fflush(stderr);
}

When the stream is flushed manually, Python does raise an exception.

Please note that simply adding a test for fflush failure is not sufficient. If you change the above to do this:

    if (fflush(stdout) != 0)
      {
        ...handle error...
      }

It will appear to solve the problem. But here is a counterexample:

import sys
def main ():
    try:
        print "x" * 4095
        print
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()

If you run the above with stdout redirected to /dev/full, it will silently succeed (exit 0) in spite of a write failure. That's what happens on my debian unstable system.

Instead of just checking the fflush return value, it should also check ferror:

    if (fflush(stdout) != 0 || ferror(stdout))
      {
        ...handle error...
      }

--

>Comment By: Martin v. Löwis (loewis)
Date: 2005-01-19 23:28

Message:
Logged In: YES
user_id=21627

I don't think the patch is right. If somebody explicitly invokes sys.stdout.close(), this should have the same effect as invoking fclose(stdout) in C.

It currently doesn't, but with meyering's patch from 2004-12-02 10:20, it still doesn't, so the patch is incorrect.

It might be better to explicitly invoke fclose() if the file object has no associated f_close function.

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-20 00:38

Message:
Logged In: YES
user_id=203860

Tim, these bugs are quite difficult to trigger, but they can hide any kind of file error and lose arbitrarily large amounts of data.

Here, the following program will run indefinitely:

full = open('/dev/full', 'w')
while 1:
    print >>full, 'x' * 1023
    print >>full

It seems to be essential that both the character that fills the file buffer (here it is 1024 bytes long) and the next are generated implicitly by print - otherwise the write error will be detected.

--

Comment By: Tim Peters (tim_one)
Date: 2004-12-19 23:24

Message:
Logged In: YES
user_id=31435

Sorry, don't care enough to spend time on it (not a bug I've had, not one I expect to have, don't care if it never changes). Suggest not using /dev/full as an output device.

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-19 22:47

Message:
Logged In: YES
user_id=80475

Tim, what do you think?

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-07 01:33

Message:
Logged In: YES
user_id=203860

OK, I can reproduce the remaining problem if I substitute 1023 for 4095. The culprit seems to be the unchecked fputs() in PyFile_WriteString, which is used for the spaces and newlines generated by the print statement but not for the objects. I think that's a separate bug.

--

Comment By: Jim Meyering (meyering)
Date: 2004-12-07 00:27

Message:
Logged In: YES
user_id=41497

Even with python-2.4 (built fresh from CVS this morning), I can still reproduce the problem on a Linux-2.6.9/ext3 system:

/p/p/python-2.4/bin/python write-4096 > /dev/full && echo fail
fail

The size that provokes the failure depends on the I/O block size of your system, so you might need something as big as 131072 on some other type of system.

--
C
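The report notes that flushing the stream manually does make Python raise an exception. A script that cannot afford to lose output could lean on that by writing and flushing in one checked step. The sketch below is a user-level workaround only and does not address the interpreter-level fflush/ferror handling discussed in the thread; `write_checked` and `FullDevice` are hypothetical names, and `FullDevice` is merely an assumed stand-in for a stream backed by /dev/full.

```python
import errno
import io


def write_checked(stream, text):
    """Write text and flush at once; return an error message, or None on success."""
    try:
        stream.write(text)
        stream.flush()  # push buffered data out so a failure surfaces here
    except (IOError, OSError) as exc:
        return "write failed: %s" % exc
    return None


class FullDevice(object):
    """Hypothetical stand-in for /dev/full: writes are buffered, flush fails."""

    def write(self, text):
        return len(text)

    def flush(self):
        raise OSError(errno.ENOSPC, "No space left on device")


ok_result = write_checked(io.StringIO(), "foo\n")
failed_result = write_checked(FullDevice(), "foo\n")
```

In a real script the caller would pass `sys.stdout`, print the returned message to `sys.stderr`, and exit with a non-zero status, mirroring the `close-bug` demo above.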
[ python-Bugs-1105699 ] Warnings in Python.h with gcc 4.0.0
Bugs item #1105699, was opened at 2005-01-19 19:52
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105699&group_id=5470

Category: Build
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Submitted By: Bob Ippolito (etrepum)
Assigned to: Nobody/Anonymous (nobody)
Summary: Warnings in Python.h with gcc 4.0.0

Initial Comment:
(this happens for every file that includes Python.h)

In file included from ../Include/Python.h:55,
                 from ../Objects/intobject.c:4:
../Include/pyport.h:396: warning: 'struct winsize' declared inside parameter list
../Include/pyport.h:397: warning: 'struct winsize' declared inside parameter list

The source lines look like this:

extern int openpty(int *, int *, char *, struct termios *, struct winsize *);
extern int forkpty(int *, char *, struct termios *, struct winsize *);
[ python-Bugs-1105706 ] incorrect constant names in curses window objects page
Bugs item #1105706, was opened at 2005-01-19 23:19
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105706&group_id=5470

Category: Documentation
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: dcrosta (dcrosta)
Assigned to: Nobody/Anonymous (nobody)
Summary: incorrect constant names in curses window objects page

Initial Comment:
The documentation for the border() function in the curses "Window Objects" page (http://www.python.org/doc/2.3.4/lib/curses-window-objects.html) says that ACS_BLCORNER and ACS_BRCORNER are the defaults for the lower left and lower right corners, respectively. The curses "Constants" page (http://www.python.org/doc/2.3.4/lib/node218.html) has the correct names, ACS_LLCORNER and ACS_LRCORNER, respectively.

My system:
Python 2.3.4 on Gentoo GNU/Linux
[ python-Bugs-1105770 ] null source chars handled oddly
Bugs item #1105770, was opened at 2005-01-19 23:35
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105770&group_id=5470

Category: Parser/Compiler
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: Reginald B. Charney (rcharney)
Assigned to: Nobody/Anonymous (nobody)
Summary: null source chars handled oddly

Initial Comment:
When null characters appear in the source, outside literals, tokenize seems to either:

  skip the null character and the next two following characters; or
  ignore the remainder of the line, including the newline character.

(To see the invalid characters, use vim, or an editor that displays control characters when needed.)
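As an alternative to inspecting the file in an editor, the null characters can be located programmatically before the source reaches the tokenizer. The sketch below is a hypothetical helper, not part of the tokenize module; it only reports positions and makes no claim about how the tokenizer treats them.

```python
def find_null_chars(source):
    """Return 1-based (line, column) pairs for every NUL character in source."""
    positions = []
    for line_no, line in enumerate(source.split("\n"), 1):
        for col_no, char in enumerate(line, 1):
            if char == "\0":
                positions.append((line_no, col_no))
    return positions


sample = "x = 1\ny = \0 2\n"
found = find_null_chars(sample)  # [(2, 5)]
```

Running such a check over a suspect file narrows down which lines to compare against the tokenizer's output.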