[ python-Bugs-1105998 ] os.stat int/float oddity
Bugs item #1105998, was opened at 2005-01-20 15:04
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105998&group_id=5470

Category: Python Library
Group: Python 2.5
>Status: Closed
>Resolution: Fixed
Priority: 5
Submitted By: George Yoshida (quiver)
Assigned to: Martin v. Löwis (loewis)
Summary: os.stat int/float oddity

Initial Comment:
Since the last change to posixmodule.c (Revision 2.332) by Martin, test_os.py fails on my Linux box.

The reason is that when an os.stat object is accessed through obj.attr or obj[index], the two forms do not always return the same type.

Take, for example, st_atime, st_ctime and st_mtime. With Martin's change, stat_obj.st_atime returns a float value, whereas stat_obj[stat.ST_ATIME] still returns an integer value.

Here is the (abbreviated) result of running test_os.py:

test_tempnam (__main__.TemporaryFileTests) ... ok
test_tmpfile (__main__.TemporaryFileTests) ... ok
test_tmpnam (__main__.TemporaryFileTests) ... ok
test_stat_attributes (__main__.StatAttributeTests) ... FAIL
[snip]
==
FAIL: test_stat_attributes (__main__.StatAttributeTests)
---
Traceback (most recent call last):
  File "./test_os.py", line 115, in test_stat_attributes
    result[getattr(stat, name)])
AssertionError: 1106224156.927747 != 1106224156

---
Ran 23 tests in 0.032s

FAILED (failures=1)

--

>Comment By: Martin v. Löwis (loewis)
Date: 2005-01-23 10:29

Message:
Logged In: YES
user_id=21627

Thanks for the report. Fixed in test_os.py 1.28

--

You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105998&group_id=5470
___
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
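The attribute/index difference described in this report can be reproduced with a short script. This is only a sketch, written for current Python 3 rather than the 2.5 development tree discussed above; the helper name stat_mtime_views and the scratch file are illustrative assumptions, and the timestamp is simply the one from the failing assertion.

```python
import os
import stat
import tempfile


def stat_mtime_views(path):
    """Return the modification time as (attribute value, indexed value)."""
    result = os.stat(path)
    return result.st_mtime, result[stat.ST_MTIME]


# Create a scratch file and give it the timestamp from the failing test.
fd, path = tempfile.mkstemp()
os.close(fd)
try:
    stamp_ns = 1106224156_927747000  # 1106224156.927747 seconds, in nanoseconds
    os.utime(path, ns=(stamp_ns, stamp_ns))
    by_attr, by_index = stat_mtime_views(path)
    # The attribute is a float; the indexed field is a whole number of seconds.
    print(type(by_attr).__name__, by_attr)
    print(type(by_index).__name__, by_index)
finally:
    os.remove(path)
```

A test that compares the two forms therefore has to truncate the attribute value (or compare types separately) instead of asserting plain equality.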
[ python-Bugs-1074011 ] write failure ignored in Py_Finalize()
Bugs item #1074011, was opened at 2004-11-27 00:02
Message generated for change (Comment added) made by loewis
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1074011&group_id=5470

Category: Python Interpreter Core
Group: Python 2.3
>Status: Closed
>Resolution: Accepted
Priority: 6
Submitted By: Matthias Klose (doko)
Assigned to: Nobody/Anonymous (nobody)
Summary: write failure ignored in Py_Finalize()

Initial Comment:
[forwarded from http://bugs.debian.org/283108]

Write errors on stdout may be ignored, and hence may result in loss of valuable user data.

Here's a quick demo:

$ ./close-bug
foo
$ ./close-bug > /dev/full && echo unreported write failure
unreported write failure
$ cat close-bug
#!/usr/bin/python
import sys

def main ():
    try:
        print 'foo'
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()

This particular failure comes from the following unchecked fflush of stdout in pythonrun.c:

static void
call_ll_exitfuncs(void)
{
    while (nexitfuncs > 0)
        (*exitfuncs[--nexitfuncs])();

    fflush(stdout);
    fflush(stderr);
}

Flushing the stream manually, python does raise an exception.

Please note that simply adding a test for fflush failure is not sufficient. If you change the above to do this:

    if (fflush(stdout) != 0)
    {
        ...handle error...
    }

It will appear to solve the problem. But here is a counterexample:

import sys

def main ():
    try:
        print "x" * 4095
        print
        sys.stdout.close ()
    except IOError, e:
        sys.stderr.write ('write failed: %s\n' % e)
        sys.exit (1)

if __name__ == '__main__':
    main ()

If you run the above with stdout redirected to /dev/full, it will silently succeed (exit 0) in spite of a write failure. That's what happens on my debian unstable system.

Instead of just checking the fflush return value, it should also check ferror:

    if (fflush(stdout) != 0 || ferror(stdout))
    {
        ...handle error...
    }

--

>Comment By: Martin v. Löwis (loewis)
Date: 2005-01-23 10:51

Message:
Logged In: YES
user_id=21627

Thanks for the report and the patch. Committed as

NEWS 1.1232
sysmodule.c 2.127
NEWS 1.1193.2.15
sysmodule.c 2.126.2.1
NEWS 1.831.4.164
sysmodule.c 2.120.6.2

--

Comment By: Jim Meyering (meyering)
Date: 2005-01-20 10:24

Message:
Logged In: YES
user_id=41497

Hi Martin,

I would have done that, but sys.stdout.close is already defined *not* to close stdout. Here's the relevant FAQ:

1.4.7 Why doesn't closing sys.stdout (stdin, stderr) really close it?
http://www.python.org/doc/faq/library.html#id28

--

Comment By: Martin v. Löwis (loewis)
Date: 2005-01-19 23:28

Message:
Logged In: YES
user_id=21627

I don't think the patch is right. If somebody explicitly invokes sys.stdout.close(), this should have the same effect as invoking fclose(stdout) in C.

It currently doesn't, but with meyering's patch from 2004-12-02 10:20, it still doesn't, so the patch is incorrect.

It might be better to explicitly invoke fclose() if the file object has no associated f_close function.

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-20 00:38

Message:
Logged In: YES
user_id=203860

Tim, these bugs are quite difficult to trigger, but they can hide any kind of file error and lose arbitrarily large amounts of data.

Here, the following program will run indefinitely:

full = open('/dev/full', 'w')
while 1:
    print >>full, 'x' * 1023
    print >>full

It seems to be essential that both the character that fills the file buffer (here it is 1024 bytes long) and the next are generated implicitly by print - otherwise the write error will be detected.

--

Comment By: Tim Peters (tim_one)
Date: 2004-12-19 23:24

Message:
Logged In: YES
user_id=31435

Sorry, don't care enough to spend time on it (not a bug I've had, not one I expect to have, don't care if it never changes). Suggest not using /dev/full as an output device.

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2004-12-19 22:47

Message:
Logged In: YES
user_id=80475

Tim, what do you think?

--

Comment By: Ben Hutchings (wom-work)
Date: 2004-12-07 01:33

Message:
Logged In: YES
user_id=203860

OK, I can reproduce the remaining problem if I substitute 1023 for 4095. The culprit seems to be the unchecked
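The general remedy discussed in this thread, checking the result of the final flush instead of ignoring it, can be sketched at the script level as follows. This is not the patch that was committed to sysmodule.c; it is a minimal Python 3 illustration, and the names flush_and_report and FullDevice are hypothetical. FullDevice only imitates /dev/full so that the sketch runs on any platform.

```python
import errno
import io
import sys


def flush_and_report(stream, err=sys.stderr):
    """Flush stream; return 0 on success, 1 if the flush raised OSError."""
    try:
        stream.flush()
    except OSError as exc:
        err.write("write failed: %s\n" % exc)
        return 1
    return 0


class FullDevice:
    """Hypothetical stand-in for /dev/full: accepts writes, fails on flush."""

    def write(self, text):
        return len(text)

    def flush(self):
        raise OSError(errno.ENOSPC, "No space left on device")


# A healthy in-memory stream flushes cleanly; the full device does not.
ok_status = flush_and_report(io.StringIO())
error_log = io.StringIO()
bad_status = flush_and_report(FullDevice(), error_log)
print(ok_status, bad_status, error_log.getvalue().strip())
```

In a real script the call would typically be the last statement, for example sys.exit(flush_and_report(sys.stdout)), so that a failed write is reflected in the exit status.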
[ python-Bugs-1105950 ] bug with idle's stdout when executing load_source
Bugs item #1105950, was opened at 2005-01-20 08:08
Message generated for change (Settings changed) made by kbk
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105950&group_id=5470

Category: IDLE
Group: Python 2.3
Status: Open
Resolution: None
Priority: 5
Submitted By: imperialfists (imperialfists)
>Assigned to: Kurt B. Kaiser (kbk)
Summary: bug with idle's stdout when executing load_source

Initial Comment:
There is a bug in idle caused by load_source, which switches the stdout of idle to something else.

Here is what I did:

Python 2.3.4 (#1, Nov 2 2004, 11:18:38)
[GCC 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)] on linux2
[...i leave this out...]
IDLE 1.0.3
>>> from sys import stdout
>>> print stdout
>>> print 'a'
a
>>> from imp import load_source
>>> print 'a'
a
>>> print stdout
>>> m = load_source('bug.py', 'bug.py', open('bug.py'))
>>> print 'a'
>>> print stdout
>>>

the file 'bug.py' contains the following line:

from types import *

meanwhile i see this on my terminal:

a

when i type "import bug" or "from bug import *" everything works fine.

This bug also works (at least for me) if I start idle from the "Run Command" dialog under kde, instead of the terminal.

--

Comment By: imperialfists (imperialfists)
Date: 2005-01-21 03:45

Message:
Logged In: YES
user_id=1201021

Also i find a file named 'c' in my current working directory.
[ python-Bugs-1105950 ] bug with idle's stdout when executing load_source
Bugs item #1105950, was opened at 2005-01-20 08:08
Message generated for change (Comment added) made by kbk
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105950&group_id=5470

Category: IDLE
Group: Python 2.3
>Status: Closed
>Resolution: Wont Fix
Priority: 5
Submitted By: imperialfists (imperialfists)
Assigned to: Kurt B. Kaiser (kbk)
Summary: bug with idle's stdout when executing load_source

Initial Comment:
There is a bug in idle caused by load_source, which switches the stdout of idle to something else.

Here is what I did:

Python 2.3.4 (#1, Nov 2 2004, 11:18:38)
[GCC 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)] on linux2
[...i leave this out...]
IDLE 1.0.3
>>> from sys import stdout
>>> print stdout
>>> print 'a'
a
>>> from imp import load_source
>>> print 'a'
a
>>> print stdout
>>> m = load_source('bug.py', 'bug.py', open('bug.py'))
>>> print 'a'
>>> print stdout
>>>

the file 'bug.py' contains the following line:

from types import *

meanwhile i see this on my terminal:

a

when i type "import bug" or "from bug import *" everything works fine.

This bug also works (at least for me) if I start idle from the "Run Command" dialog under kde, instead of the terminal.

--

>Comment By: Kurt B. Kaiser (kbk)
Date: 2005-01-23 19:10

Message:
Logged In: YES
user_id=149084

These low level imp module functions don't just import, they reload() the module. This destroys the redirection of stdin, stdout, stderr that IDLE has set up to work with Tkinter.

Using IDLE without its subprocess, and with verbose output:

hydra /home/kbk/PYSRC/Lib/idlelib$ ../../python -v ./PyShell.py -n
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
[...]

In the IDLE shell:

IDLE 1.2a0      No Subprocess
>>> import sys
>>> sys.stdout
<__main__.PseudoFile object at 0x42feac>
>>> reload(sys)
>>> sys.stdout
>>>

And on the console:

[...]
import sys # previously loaded (sys)
<open file '<stdout>', mode 'w' at 0xfa068>

imp.load_source() is incompatible with IDLE. Please use the high level import statements.

I didn't find any extra files.

--

Comment By: imperialfists (imperialfists)
Date: 2005-01-21 03:45

Message:
Logged In: YES
user_id=1201021

Also i find a file named 'c' in my current working directory.
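The redirection kbk refers to works because sys.stdout is an ordinary, replaceable attribute: print looks it up at call time and writes to whatever object is bound there. The sketch below illustrates that mechanism in Python 3. It is not IDLE's actual code; the PseudoFile class here is a hypothetical stand-in built on io.StringIO, and run_with_redirect is an illustrative helper.

```python
import io
import sys


class PseudoFile(io.StringIO):
    """Hypothetical stand-in for the file-like object a shell installs."""


def run_with_redirect(func):
    """Call func() with sys.stdout rebound to a PseudoFile; return its text."""
    saved = sys.stdout
    sys.stdout = PseudoFile()
    try:
        func()
        return sys.stdout.getvalue()
    finally:
        # Rebinding sys.stdout is all it takes to undo the redirection.
        sys.stdout = saved


captured = run_with_redirect(lambda: print("a"))
print(repr(captured))
```

Anything that rebinds sys.stdout to the original stream while the shell is running, as the reload described above does, sends later output to the console instead of the shell window.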
[ python-Bugs-1108060 ] "\0" not listed as a valid escape in the lang reference
Bugs item #1108060, was opened at 2005-01-24 13:09
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1108060&group_id=5470

Category: Documentation
Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Andrew Bennetts (spiv)
Assigned to: Nobody/Anonymous (nobody)
Summary: "\0" not listed as a valid escape in the lang reference

Initial Comment:
According to table in http://docs.python.org/ref/strings.html, the list of valid escape sequences in strings does not include \0.

It appears that the parser actually allows \n for values of n in the range 0-7, but this is not documented.

Many people with exposure to C expect \0 to be valid (and it does work, after all!). A quick grep on my system finds many libraries use \0 in string literals, including:

- Twisted
- HTMLgen
- PIL
- numarray
- Reportlab
- and of course the standard library: tarfile, gzip, pystone, binhex, and others.

I suggest the documentation be updated to officially support \0 as a valid escape. I don't care as much about \1 through to \7... I was surprised they worked (and then surprised that \8 and \9 didn't), and I think they might as well be deprecated, but I don't care much either way.
[ python-Bugs-1105950 ] bug with idle's stdout when executing load_source
Bugs item #1105950, was opened at 2005-01-20 14:08
Message generated for change (Comment added) made by imperialfists
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105950&group_id=5470

Category: IDLE
Group: Python 2.3
Status: Closed
Resolution: Wont Fix
Priority: 5
Submitted By: imperialfists (imperialfists)
Assigned to: Kurt B. Kaiser (kbk)
Summary: bug with idle's stdout when executing load_source

Initial Comment:
There is a bug in idle caused by load_source, which switches the stdout of idle to something else.

Here is what I did:

Python 2.3.4 (#1, Nov 2 2004, 11:18:38)
[GCC 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)] on linux2
[...i leave this out...]
IDLE 1.0.3
>>> from sys import stdout
>>> print stdout
>>> print 'a'
a
>>> from imp import load_source
>>> print 'a'
a
>>> print stdout
>>> m = load_source('bug.py', 'bug.py', open('bug.py'))
>>> print 'a'
>>> print stdout
>>>

the file 'bug.py' contains the following line:

from types import *

meanwhile i see this on my terminal:

a

when i type "import bug" or "from bug import *" everything works fine.

This bug also works (at least for me) if I start idle from the "Run Command" dialog under kde, instead of the terminal.

--

>Comment By: imperialfists (imperialfists)
Date: 2005-01-24 03:50

Message:
Logged In: YES
user_id=1201021

I did not always find a file named c either, it was more of a random occurrence than something reliably reproducible.

--

Comment By: Kurt B. Kaiser (kbk)
Date: 2005-01-24 01:10

Message:
Logged In: YES
user_id=149084

These low level imp module functions don't just import, they reload() the module. This destroys the redirection of stdin, stdout, stderr that IDLE has set up to work with Tkinter.

Using IDLE without its subprocess, and with verbose output:

hydra /home/kbk/PYSRC/Lib/idlelib$ ../../python -v ./PyShell.py -n
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
[...]

In the IDLE shell:

IDLE 1.2a0      No Subprocess
>>> import sys
>>> sys.stdout
<__main__.PseudoFile object at 0x42feac>
>>> reload(sys)
>>> sys.stdout
>>>

And on the console:

[...]
import sys # previously loaded (sys)
<open file '<stdout>', mode 'w' at 0xfa068>

imp.load_source() is incompatible with IDLE. Please use the high level import statements.

I didn't find any extra files.

--

Comment By: imperialfists (imperialfists)
Date: 2005-01-21 09:45

Message:
Logged In: YES
user_id=1201021

Also i find a file named 'c' in my current working directory.
[ python-Bugs-1105950 ] idle's stdout is redirected when executing imp.load_source
Bugs item #1105950, was opened at 2005-01-20 14:08
Message generated for change (Settings changed) made by imperialfists
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105950&group_id=5470

Category: IDLE
>Group: Not a Bug
Status: Closed
Resolution: Wont Fix
Priority: 5
Submitted By: imperialfists (imperialfists)
Assigned to: Kurt B. Kaiser (kbk)
Summary: idle's stdout is redirected when executing imp.load_source

Initial Comment:
There is a bug in idle caused by load_source, which switches the stdout of idle to something else.

Here is what I did:

Python 2.3.4 (#1, Nov 2 2004, 11:18:38)
[GCC 3.3.4 20040623 (Gentoo Linux 3.3.4-r1, ssp-3.3.2-2, pie-8.7.6)] on linux2
[...i leave this out...]
IDLE 1.0.3
>>> from sys import stdout
>>> print stdout
>>> print 'a'
a
>>> from imp import load_source
>>> print 'a'
a
>>> print stdout
>>> m = load_source('bug.py', 'bug.py', open('bug.py'))
>>> print 'a'
>>> print stdout
>>>

the file 'bug.py' contains the following line:

from types import *

meanwhile i see this on my terminal:

a

when i type "import bug" or "from bug import *" everything works fine.

This bug also works (at least for me) if I start idle from the "Run Command" dialog under kde, instead of the terminal.

--

Comment By: imperialfists (imperialfists)
Date: 2005-01-24 03:50

Message:
Logged In: YES
user_id=1201021

I did not always find a file named c either, it was more of a random occurrence than something reliably reproducible.

--

Comment By: Kurt B. Kaiser (kbk)
Date: 2005-01-24 01:10

Message:
Logged In: YES
user_id=149084

These low level imp module functions don't just import, they reload() the module. This destroys the redirection of stdin, stdout, stderr that IDLE has set up to work with Tkinter.

Using IDLE without its subprocess, and with verbose output:

hydra /home/kbk/PYSRC/Lib/idlelib$ ../../python -v ./PyShell.py -n
# installing zipimport hook
import zipimport # builtin
# installed zipimport hook
[...]

In the IDLE shell:

IDLE 1.2a0      No Subprocess
>>> import sys
>>> sys.stdout
<__main__.PseudoFile object at 0x42feac>
>>> reload(sys)
>>> sys.stdout
>>>

And on the console:

[...]
import sys # previously loaded (sys)
<open file '<stdout>', mode 'w' at 0xfa068>

imp.load_source() is incompatible with IDLE. Please use the high level import statements.

I didn't find any extra files.

--

Comment By: imperialfists (imperialfists)
Date: 2005-01-21 09:45

Message:
Logged In: YES
user_id=1201021

Also i find a file named 'c' in my current working directory.
[ python-Bugs-1108060 ] "\0" not listed as a valid escape in the lang reference
Bugs item #1108060, was opened at 2005-01-23 21:09
Message generated for change (Comment added) made by tim_one
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1108060&group_id=5470

Category: Documentation
>Group: Not a Bug
>Status: Closed
>Resolution: Invalid
Priority: 5
Submitted By: Andrew Bennetts (spiv)
Assigned to: Nobody/Anonymous (nobody)
Summary: "\0" not listed as a valid escape in the lang reference

Initial Comment:
According to table in http://docs.python.org/ref/strings.html, the list of valid escape sequences in strings does not include \0.

It appears that the parser actually allows \n for values of n in the range 0-7, but this is not documented.

Many people with exposure to C expect \0 to be valid (and it does work, after all!). A quick grep on my system finds many libraries use \0 in string literals, including:

- Twisted
- HTMLgen
- PIL
- numarray
- Reportlab
- and of course the standard library: tarfile, gzip, pystone, binhex, and others.

I suggest the documentation be updated to officially support \0 as a valid escape. I don't care as much about \1 through to \7... I was surprised they worked (and then surprised that \8 and \9 didn't), and I think they might as well be deprecated, but I don't care much either way.

--

>Comment By: Tim Peters (tim_one)
Date: 2005-01-23 22:21

Message:
Logged In: YES
user_id=31435

Look at the table of escapes again, especially the line with footnotes 3 and 5; that line documents the octal escapes, including all cases you've mentioned here.
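The octal escapes Tim points to cover \0 through \7 as well as longer forms of up to three octal digits. The following Python 3 sketch shows a few cases; the variable names are illustrative only, and \8 and \9 are deliberately left out because they are not octal escapes.

```python
# Each literal below uses the octal escape form: a backslash followed by
# one to three octal digits (0-7).
nul = "\0"         # one digit: the NUL character
bell = "\7"        # one digit: BEL
newline = "\12"    # two digits: octal 12 is decimal 10
letter_a = "\101"  # three digits: octal 101 is decimal 65

# A non-octal digit ends the escape, so this is NUL followed by "8".
nul_then_eight = "\08"

print(ord(nul), ord(bell), ord(newline), letter_a, len(nul_then_eight))
```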
[ python-Bugs-1108060 ] "\0" not listed as a valid escape in the lang reference
Bugs item #1108060, was opened at 2005-01-24 13:09
Message generated for change (Comment added) made by spiv
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1108060&group_id=5470

Category: Documentation
Group: Not a Bug
Status: Closed
Resolution: Invalid
Priority: 5
Submitted By: Andrew Bennetts (spiv)
Assigned to: Nobody/Anonymous (nobody)
Summary: "\0" not listed as a valid escape in the lang reference

Initial Comment:
According to table in http://docs.python.org/ref/strings.html, the list of valid escape sequences in strings does not include \0.

It appears that the parser actually allows \n for values of n in the range 0-7, but this is not documented.

Many people with exposure to C expect \0 to be valid (and it does work, after all!). A quick grep on my system finds many libraries use \0 in string literals, including:

- Twisted
- HTMLgen
- PIL
- numarray
- Reportlab
- and of course the standard library: tarfile, gzip, pystone, binhex, and others.

I suggest the documentation be updated to officially support \0 as a valid escape. I don't care as much about \1 through to \7... I was surprised they worked (and then surprised that \8 and \9 didn't), and I think they might as well be deprecated, but I don't care much either way.

--

>Comment By: Andrew Bennetts (spiv)
Date: 2005-01-24 15:02

Message:
Logged In: YES
user_id=50945

So it does. I suck. Thanks Tim!

--

Comment By: Tim Peters (tim_one)
Date: 2005-01-24 14:21

Message:
Logged In: YES
user_id=31435

Look at the table of escapes again, especially the line with footnotes 3 and 5; that line documents the octal escapes, including all cases you've mentioned here.
[ python-Bugs-1105286 ] Undocumented implicit strip() in split(None) string method
Bugs item #1105286, was opened at 2005-01-19 10:04
Message generated for change (Comment added) made by tjreedy
You can respond by visiting:
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1105286&group_id=5470

Category: Documentation
Group: None
Status: Open
Resolution: None
Priority: 5
Submitted By: YoHell (yohell)
Assigned to: Raymond Hettinger (rhettinger)
Summary: Undocumented implicit strip() in split(None) string method

Initial Comment:
Hi!

I noticed that the string method split() first does an implicit strip() before splitting when it's used with no arguments or with None as the separator (sep in the docs). There is no mention of this implicit strip() in the docs.

Example 1:

s = " word1 word2 "

s.split() then returns ['word1', 'word2'] and not ['', 'word1', 'word2', ''] as one might expect.

WHY IS THIS BAD?

1. Because it's undocumented. See:
http://www.python.org/doc/current/lib/string-methods.html#l2h-197

2. Because it may lead to unexpected behavior in programs.

Example 2:

FASTA sequence headers are one line descriptors of biological sequences and are on this form:

">" + Identifier + whitespace + free text description.

Let sHeader be a Python string containing a FASTA header. One could then use the following syntax to extract the identifier from the header:

sID = sHeader[1:].split(None, 1)[0]

However, this does not work if sHeader contains a faulty FASTA header where the identifier is missing or consists of whitespace. In that case sID will contain the first word of the free text description, which is not the desired behavior.

WHAT SHOULD BE DONE?

The implicit strip() should be removed, or at least programmers should be given the option to turn it off. At the very least it should be documented so that programmers have a chance of adapting their code to it.

Thank you for an otherwise splendid language!

/Joel Hedlund
Ph.D. Student
IFM Bioinformatics
Linköping University

--

Comment By: Terry J. Reedy (tjreedy)
Date: 2005-01-24 02:15

Message:
Logged In: YES
user_id=593130

To me, the removal of whitespace at the ends (stripping) is consistent with the removal (or collapsing) of extra whitespace in between so that .split() does not return empty words anywhere. Consider:

>>> ',1,,2,'.split(',')
['', '1', '', '2', '']

If ' 1  2 '.split() were to return null strings at the beginning and end of the list, then to be consistent, it should also put one in the middle. One can get this by being explicit (mixed WS can be handled by translation):

>>> ' 1  2 '.split(' ')
['', '1', '', '2', '']

Having said this, I also agree that the extra words proposed by jj are helpful.

BUG?? In 2.2, splitting an empty or whitespace only string produces an empty list [], not a list with a null word [''].

>>> ''.split()
[]
>>> ' '.split()
[]

which is what I see as consistent with the rest of the no-null-word behavior. Has this changed since? (Yes, must upgrade.) I could find no indication of such change in either the tracker or CVS.

--

Comment By: YoHell (yohell)
Date: 2005-01-20 09:59

Message:
Logged In: YES
user_id=1008220

Brilliant, guys!

Thanks again for a superb scripting language, and with documentation to match!

Take care!
/Joel Hedlund

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-01-20 09:50

Message:
Logged In: YES
user_id=80475

The proposed wording is fine.

If there are no objections or concerns, I'll apply it soon.

--

Comment By: Jim Jewett (jimjjewett)
Date: 2005-01-20 09:28

Message:
Logged In: YES
user_id=764593

Replacing the quoted line:

"""
...
If sep is not specified or is None, a different splitting algorithm is applied. First whitespace (spaces, tabs, newlines, returns, and formfeeds) is stripped from both ends. Then words are separated by arbitrary length strings of whitespace characters. Consecutive whitespace delimiters are treated as a single delimiter ("'1 2 3'.split()" returns "['1', '2', '3']"). Splitting an empty (or whitespace-only) string returns "['']".
"""

--

Comment By: Raymond Hettinger (rhettinger)
Date: 2005-01-20 09:04

Message:
Logged In: YES
user_id=80475

What new wording do you propose to be added?

--

Comment By: YoHell (yohell)
Date: 2005-01-20 05:15

Message:
Logged In: YES
user_id=1008220

In RE to tim_one:

> I think the docs for split() under "String Methods" are quite
> clear:

On the contrary, my friend, and here's why:

> """
> ...
> If sep is not specified or is None, a different splitting
> algorithm is applied.

This sentence does not say that whitespace will be impli
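The behaviors discussed in this thread can be checked with a short Python 3 sketch. The helper names fasta_id_naive and fasta_id_strict are hypothetical; the first reproduces the expression from the initial comment, and the second is one possible way (an assumption, not something proposed in the thread) to reject a header whose identifier is missing.

```python
def fasta_id_naive(header):
    """Expression from the report: leading whitespace is silently skipped."""
    return header[1:].split(None, 1)[0]


def fasta_id_strict(header):
    """Return '' when the identifier is missing instead of the next word."""
    body = header[1:]
    if not body or body[0].isspace():
        return ""
    return body.split(None, 1)[0]


# Whitespace splitting drops empty words at the ends and in the middle;
# an explicit separator keeps them.
no_sep = " word1 word2 ".split()
explicit_sep = " 1  2 ".split(" ")
empty_case = "".split()
blank_case = "   ".split()

good_header = ">seq1 some description"
faulty_header = ">  some description"
print(no_sep, explicit_sep, empty_case, blank_case)
print(fasta_id_naive(good_header), fasta_id_naive(faulty_header))
print(repr(fasta_id_strict(faulty_header)))
```

Checking the first character before splitting keeps the convenient whitespace handling for well-formed headers while making the faulty case explicit.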