[ python-Bugs-1662581 ] the re module can perform poorly: O(2**n) versus O(n**2)
Bugs item #1662581, was opened at 2007-02-17 15:39 Message generated for change (Comment added) made by josiahcarlson You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Performance Group: None Status: Open Resolution: None Priority: 4 Private: No Submitted By: Gregory P. Smith (greg) Assigned to: Nobody/Anonymous (nobody) Summary: the re module can perform poorly: O(2**n) versus O(n**2) Initial Comment: in short, the re module can degenerate to really really horrid performance. See this for how and why: http://swtch.com/~rsc/regexp/regexp1.html exponential decline instead of squared. I don't have a patch so i'm filing this bug as a starting point for future work. The Modules/_sre.c files implementation could be updated to use the parallel stepping Thompson approach instead of recursive backtracking. filing this as a bug until me or someone else comes up with a patch. -- Comment By: Josiah Carlson (josiahcarlson) Date: 2007-02-22 00:51 Message: Logged In: YES user_id=341410 Originator: NO I would file this under "feature request"; the current situation isn't so much buggy, as slow. While you can produce a segfault with the current regular expression engine (due to stack overflow), you can do the same thing with regular Python on Linux (with sys.setrecursionlimit), ctypes, etc., and none of those are considered as buggy. My only concern with such a change is that it may or may not change the semantics of the repeat operators '*' and '+', which are currently defined as "greedy". If I skimmed the article correctly late at night, switching to a Thompson family regular expression engine may result in those operators no longer being greedy. Please correct me if I am wrong. 
-- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1661745 ] finditer stuck in infinite loop
Bugs item #1661745, was opened at 2007-02-16 12:11 Message generated for change (Comment added) made by rhamphoryncus You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Regular Expressions Group: Python 2.5 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Milan (migues) Assigned to: Gustavo Niemeyer (niemeyer) Summary: finditer stuck in infinite loop Initial Comment: Using iterator on Match Object results in infinite unbreakable loop. Attached is sample script and sample file. My OS: Win XP Pro. -- Comment By: Adam Olsen (rhamphoryncus) Date: 2007-02-22 03:44 Message: Logged In: YES user_id=12364 Originator: NO I've rewritten the test case. It's not an infinite loop but rather exponential runtime based on the length of the string. Matching on a string of 'x.x.', increasing the length of the left x or right x by one doubles the runtime. Increasing both quadruples it. 0: 0.350475 1: 0.259876 2: 0.669956 3: 0.0002369881 4: 0.0009140968 5: 0.0038359165 6: 0.0148119926 7: 0.0732769966 8: 0.2570281029 9: 0.9819128513 10: 3.9152498245 11:16.4304330349 12:64.8596510887 13: 264.2261950970 I'm not a re guru though, so I don't know if this is a real bug or just one of those special cases re is prone to. Now I just need to find out how to attach my file, SF doesn't want to let me.. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1661745 ] finditer stuck in infinite loop
Bugs item #1661745, was opened at 2007-02-16 12:11 Message generated for change (Comment added) made by rhamphoryncus You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Regular Expressions Group: Python 2.5 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Milan (migues) Assigned to: Gustavo Niemeyer (niemeyer) Summary: finditer stuck in infinite loop Initial Comment: Using iterator on Match Object results in infinite unbreakable loop. Attached is sample script and sample file. My OS: Win XP Pro. -- Comment By: Adam Olsen (rhamphoryncus) Date: 2007-02-22 03:45 Message: Logged In: YES user_id=12364 Originator: NO Nope, won't let me attach a file. Pasted instead: #!/usr/bin/env python import re from timeit import Timer reexpr = re.compile(r"(.+\n?)+?((\.\n)|(\n\n))") def test(count): text = '%s.%s.' % ('x' * count, 'x' * count) for m in reexpr.finditer(text): pass for count in range(21): print '%2i: %20.10f' % (count, Timer('test(%i)' % count, "from __main__ import test").timeit(number=1)) -- Comment By: Adam Olsen (rhamphoryncus) Date: 2007-02-22 03:44 Message: Logged In: YES user_id=12364 Originator: NO I've rewritten the test case. It's not an infinite loop but rather exponential runtime based on the length of the string. Matching on a string of 'x.x.', increasing the length of the left x or right x by one doubles the runtime. Increasing both quadruples it. 0: 0.350475 1: 0.259876 2: 0.669956 3: 0.0002369881 4: 0.0009140968 5: 0.0038359165 6: 0.0148119926 7: 0.0732769966 8: 0.2570281029 9: 0.9819128513 10: 3.9152498245 11:16.4304330349 12:64.8596510887 13: 264.2261950970 I'm not a re guru though, so I don't know if this is a real bug or just one of those special cases re is prone to. 
Now I just need to find out how to attach my file, SF doesn't want to let me..
[ python-Bugs-1661745 ] finditer stuck in infinite loop
Bugs item #1661745, was opened at 2007-02-16 19:11 Message generated for change (Comment added) made by gbrandl You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Regular Expressions Group: Python 2.5 >Status: Closed >Resolution: Duplicate Priority: 5 Private: No Submitted By: Milan (migues) Assigned to: Gustavo Niemeyer (niemeyer) Summary: finditer stuck in infinite loop Initial Comment: Using iterator on Match Object results in infinite unbreakable loop. Attached is sample script and sample file. My OS: Win XP Pro. -- >Comment By: Georg Brandl (gbrandl) Date: 2007-02-22 11:59 Message: Logged In: YES user_id=849994 Originator: NO I'd say this is a duplicate of #1662581. -- Comment By: Adam Olsen (rhamphoryncus) Date: 2007-02-22 10:45 Message: Logged In: YES user_id=12364 Originator: NO Nope, won't let me attach a file. Pasted instead: #!/usr/bin/env python import re from timeit import Timer reexpr = re.compile(r"(.+\n?)+?((\.\n)|(\n\n))") def test(count): text = '%s.%s.' % ('x' * count, 'x' * count) for m in reexpr.finditer(text): pass for count in range(21): print '%2i: %20.10f' % (count, Timer('test(%i)' % count, "from __main__ import test").timeit(number=1)) -- Comment By: Adam Olsen (rhamphoryncus) Date: 2007-02-22 10:44 Message: Logged In: YES user_id=12364 Originator: NO I've rewritten the test case. It's not an infinite loop but rather exponential runtime based on the length of the string. Matching on a string of 'x.x.', increasing the length of the left x or right x by one doubles the runtime. Increasing both quadruples it. 
0: 0.350475 1: 0.259876 2: 0.669956 3: 0.0002369881 4: 0.0009140968 5: 0.0038359165 6: 0.0148119926 7: 0.0732769966 8: 0.2570281029 9: 0.9819128513 10: 3.9152498245 11:16.4304330349 12:64.8596510887 13: 264.2261950970 I'm not a re guru though, so I don't know if this is a real bug or just one of those special cases re is prone to. Now I just need to find out how to attach my file, SF doesn't want to let me.. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1659171 ] Calling tparm from extension lib fails in Python 2.5
Bugs item #1659171, was opened at 2007-02-13 17:27 Message generated for change (Settings changed) made by gbrandl You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Extension Modules Group: None >Status: Pending Resolution: None Priority: 5 Private: No Submitted By: Richard B. Kreckel (richyk) Assigned to: Nobody/Anonymous (nobody) Summary: Calling tparm from extension lib fails in Python 2.5 Initial Comment: Attached is a little C++ module that fetches the terminal capability string for turning off all attributes and runs it through tparm(). (All this is done in a static Ctor of a class without init function, but never mind.) Compile with: g++ -c testlib.cc g++ testlib.o -o testlib.so -shared -Wl,-soname,testlib.so -lncurses On SuSE Linux 10.1 (and older), I get the expected behavior: Python 2.4.2 (#1, Oct 13 2006, 17:11:24) [GCC 4.1.0 (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import testlib Terminal is "xterm" Dump of sgr0: 1b 5b 30 6d Dump of instance: 1b 5b 30 6d Traceback (most recent call last): File "", line 1, in ? ImportError: dynamic module does not define init function (inittestlib) >>> However, on SuSE Linux 10.2, tparm creates a NULL pointer: Python 2.5 (r25:51908, Jan 9 2007, 16:59:32) [GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import testlib Terminal is "xterm" Dump of sgr0: 1b 5b 30 6d Rats! tparm made a NULL pointer! Traceback (most recent call last): File "", line 1, in ImportError: dynamic module does not define init function (inittestlib) >>> Why, oh why? -- Comment By: Martin v. 
Löwis (loewis) Date: 2007-02-14 21:24 Message: Logged In: YES user_id=21627 Originator: NO I fail to see the bug. The exception precisely describes the error in your code ImportError: dynamic module does not define init function (inittestlib) Why do you expect any meaningful behavior in the presence of this error? Your shared library isn't an extension module. If you think it is related to #1548092, please try out the subversion trunk, which has fixed this bug. -- Comment By: Richard B. Kreckel (richyk) Date: 2007-02-14 08:52 Message: Logged In: YES user_id=1718463 Originator: YES I suspect that this is a duplicate of Bug [1548092]. Note that, there it is asserted that tparm returns NULL on certain invalid strings. That does not seem to be true. It returns NULL for valid trivial strings, too. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1659171 ] Calling tparm from extension lib fails in Python 2.5
Bugs item #1659171, was opened at 2007-02-13 18:27 Message generated for change (Comment added) made by richyk You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Extension Modules Group: None >Status: Open Resolution: None Priority: 5 Private: No Submitted By: Richard B. Kreckel (richyk) Assigned to: Nobody/Anonymous (nobody) Summary: Calling tparm from extension lib fails in Python 2.5 Initial Comment: Attached is a little C++ module that fetches the terminal capability string for turning off all attributes and runs it through tparm(). (All this is done in a static Ctor of a class without init function, but never mind.) Compile with: g++ -c testlib.cc g++ testlib.o -o testlib.so -shared -Wl,-soname,testlib.so -lncurses On SuSE Linux 10.1 (and older), I get the expected behavior: Python 2.4.2 (#1, Oct 13 2006, 17:11:24) [GCC 4.1.0 (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import testlib Terminal is "xterm" Dump of sgr0: 1b 5b 30 6d Dump of instance: 1b 5b 30 6d Traceback (most recent call last): File "", line 1, in ? ImportError: dynamic module does not define init function (inittestlib) >>> However, on SuSE Linux 10.2, tparm creates a NULL pointer: Python 2.5 (r25:51908, Jan 9 2007, 16:59:32) [GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import testlib Terminal is "xterm" Dump of sgr0: 1b 5b 30 6d Rats! tparm made a NULL pointer! Traceback (most recent call last): File "", line 1, in ImportError: dynamic module does not define init function (inittestlib) >>> Why, oh why? -- >Comment By: Richard B. 
Kreckel (richyk) Date: 2007-02-22 13:25 Message: Logged In: YES user_id=1718463 Originator: YES The error message about the undefined init function is a red herring. The example is actually a stripped-down testcase from a much larger Boost.Python module, which of course does have an init function. The point here is the NULL pointer returned by tparm. -- Comment By: Martin v. Löwis (loewis) Date: 2007-02-14 22:24 Message: Logged In: YES user_id=21627 Originator: NO I fail to see the bug. The exception precisely describes the error in your code ImportError: dynamic module does not define init function (inittestlib) Why do you expect any meaningful behavior in the presence of this error? Your shared library isn't an extension module. If you think it is related to #1548092, please try out the subversion trunk, which has fixed this bug. -- Comment By: Richard B. Kreckel (richyk) Date: 2007-02-14 09:52 Message: Logged In: YES user_id=1718463 Originator: YES I suspect that this is a duplicate of Bug [1548092]. Note that, there it is asserted that tparm returns NULL on certain invalid strings. That does not seem to be true. It returns NULL for valid trivial strings, too. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1656559 ] I think, I have found this bug on time.mktime()
Bugs item #1656559, was opened at 2007-02-10 03:41 Message generated for change (Comment added) made by sergiomb You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1656559&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: None Group: 3rd Party Status: Closed Resolution: Invalid Priority: 5 Private: No Submitted By: Sérgio Monteiro Basto (sergiomb) Assigned to: Nobody/Anonymous (nobody) Summary: I think, I have found this bug on time.mktime() Initial Comment: well, I think, I have found this bug on time.mktime() for dates earlier than 1976-09-26 when I do stringtotime of 1976-09-25 print "timeint %d" % time.mktime(__extract_date(m) + __extract_time(m) + (0, 0, 0)) extract date = 1976 9 25 extract time = 0 0 0 timeint 212454000 and timetostring(212454000) = 1976-09-24T23:00:00Z !? To be honest, the date that prompted me to act was 1-1-1970, which appears as 31-12-1969. After timetostring(stringtotime(date))) I made the test and time.mktime has a bug when the date is earlier than 1976-09-26, see: for 1976-09-27T00:00:00Z time.mktime gives 212630400 for 1976-09-26T00:00:00Z time.mktime gives 212544000 for 1976-09-25T00:00:00Z time.mktime gives 212454000 212630400 - 212544000 = 86400 (seconds), one day, correct! but 212544000 - 212454000 = 90000 (seconds), one day plus 3600 (seconds), one hour more ?!? -- Sérgio M. B.
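The arithmetic in the report can be checked directly, and the call it describes can be reproduced with the standard `time` module. This is a sketch only: `local_midnight_epoch` is a hypothetical helper introduced here, and its `isdst` parameter is an assumption about what is worth varying. The report's code appends `(0, 0, 0)`, which sets the last field (`tm_isdst`) to 0; passing -1 instead asks the C library to decide whether daylight saving time applies.

```python
import time

# Values quoted in the report for three consecutive local midnights.
T_27 = 212630400
T_26 = 212544000
T_25 = 212454000

DAY = 24 * 60 * 60    # 86400 seconds
HOUR = 60 * 60        # 3600 seconds

gap_26_27 = T_27 - T_26   # exactly one day
gap_25_26 = T_26 - T_25   # one day plus one hour


def local_midnight_epoch(year, month, day, isdst=-1):
    """Return time.mktime() for local midnight of the given date.

    The weekday and day-of-year fields are passed as 0, as in the
    report; isdst is 0 in the report's code and -1 here by default,
    which lets the platform library determine DST itself.
    """
    return time.mktime((year, month, day, 0, 0, 0, 0, 0, isdst))


if __name__ == "__main__":
    print(gap_26_27, gap_25_26)
    # Results depend on the local timezone database, so none are asserted.
    for flag in (0, -1):
        d25 = local_midnight_epoch(1976, 9, 25, flag)
        d26 = local_midnight_epoch(1976, 9, 26, flag)
        print('isdst=%2d: %d' % (flag, d26 - d25))
```

Comparing the two `isdst` rows under the reporter's timezone (Europe/Lisbon, per the later comments) is one way to see whether the extra hour comes from the forced flag or from the platform's timezone data.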
-- >Comment By: Sérgio Monteiro Basto (sergiomb) Date: 2007-02-22 16:13 Message: Logged In: YES user_id=4882 Originator: YES please forget my last comment, it is all wrong -- Comment By: Sérgio Monteiro Basto (sergiomb) Date: 2007-02-21 22:34 Message: Logged In: YES user_id=4882 Originator: YES well I found the bug is in ./site-packages/_xmlplus/utils/iso8601.py gmt = __extract_date(m) + __extract_time(m) + (0, 0, 0) this is wrong My sugestion is: gmt = __extract_date(m) + __extract_time(m) gmt = datetime(gmt).timetuple() (0,0,0) zero for week of day, zero for day of the year and zero isdst is the error here. timetuple calculate this last 3 numbers well. and my problem is gone ! references http://docs.python.org/lib/module-time.html: 0 tm_year (for example, 1993) 1 tm_mon range [1,12] 2 tm_mday range [1,31] 3 tm_hour range [0,23] 4 tm_min range [0,59] 5 tm_sec range [0,61]; see (1) in strftime() description 6 tm_wday range [0,6], Monday is 0 7 tm_yday range [1,366] 8 tm_isdst0, 1 or -1; see below -- Comment By: Martin v. Löwis (loewis) Date: 2007-02-13 15:54 Message: Logged In: YES user_id=21627 Originator: NO cvalente, thanks for the research. Making a second attempt at closing this as third-party bug. -- Comment By: Sérgio Monteiro Basto (sergiomb) Date: 2007-02-13 14:25 Message: Logged In: YES user_id=4882 Originator: YES ok bug openned on http://sources.redhat.com/bugzilla/show_bug.cgi?id=4033 -- Comment By: Claudio Valente (cvalente) Date: 2007-02-13 12:47 Message: Logged In: YES user_id=627298 Originator: NO OK. This is almost surely NOT a Python bug but most likely a libc bug. 
In c: -- #include #include int main(int argc, char* argv[]){ struct tm t1; struct tm t2; /* midnight 26/SET/1076*/ t1.tm_sec = 0; t1.tm_min = 0; t1.tm_hour = 0; t1.tm_mday = 26; t1.tm_mon = 8; t1.tm_year = 76; /* midnight 25/SET/1076*/ t2.tm_sec = 0; t2.tm_min = 0; t2.tm_hour = 0; t2.tm_mday = 25; t2.tm_mon = 8; t2.tm_year = 76; printf("%li\n", mktime(&t1)-mktime(&t2)); printf("%li\n", mktime(&t1)-mktime(&t2)); return 0; } -- Outputs: 9 86400 In perl: - perl -le 'use POSIX; $t1=POSIX::mktime(0,0,0,26,8,76) -POSIX::mktime(0,0,0,25,8,76); $t2 = POSIX::mktime(0,0,0,26,8,76) -POSIX::mktime(0,0,0,25,8,76) ; print $t1."\n". $t2' - Outputs 9 86400 - My system is gentoo with glibc 2.4-r4 and my timezone is: /usr/share/zoneinfo/Europe/Lisbon When I changed this to another timezone (Say London) the problem didn't exist. Thank you all for your time. -- Comment By: Sérgio Monteiro Basto (sergiomb) Date: 2007-02-13 12:22 Message: Logged In: YES user_id=4882 Originator: YES timezone : WET in winter WEST in summer I try same with timezone of NEW YORK and >>> time.mktime
[ python-Bugs-1493676 ] time.strftime() %z error
Bugs item #1493676, was opened at 2006-05-23 15:58 Message generated for change (Comment added) made by bwooster47 You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1493676&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.4 Status: Closed Resolution: Invalid Priority: 5 Private: No Submitted By: Cillian Sharkey (csharkey) Assigned to: Nobody/Anonymous (nobody) Summary: time.strftime() %z error Initial Comment: According to the time module documentation, if the time argument for strftime() is not provided, it will use the current time as returned by localtime(). However, when the value of localtime() is explicitly given to strftime(), this produces an error in the value of the timezone offset (%z) as seen here: >>> from time import * >>> strftime("%a %b %e %H:%M:%S %Y %Z %z") 'Tue May 23 16:28:31 2006 IST +0100' >>> strftime("%a %b %e %H:%M:%S %Y %Z %z", localtime()) 'Tue May 23 16:28:31 2006 IST +' This same problem happens for other timezones (the offset is always + when localtime() is explicitly given). This problem is present in both these versions: Python 2.4.2 (#2, Sep 30 2005, 21:19:01) [GCC 4.0.2 20050808 (prerelease) (Ubuntu 4.0.1-4ubuntu8)] on linux2 Python 2.3.5 (#2, Sep 4 2005, 22:01:42) [GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2 -- Comment By: bwooster47 (bwooster47) Date: 2007-02-22 16:22 Message: Logged In: YES user_id=1209659 Originator: NO Can we confirm whether this issue is not a python issue? We are talking about small z, not capital Z. >From Python docs at http://docs.python.org/lib/module-time.html : "The use of %Z is now deprecated, but the %z escape that expands to the preferred hour/minute offset is not supported by all ANSI C libraries." 
Most current C libraries support %z, it is in fact the preferred way to do things, would be bad to see python reject this. Even then - isn't the above a bug? If not supported, %z should always provide a empty character, but not print out totally incorrect data as + for EST. -- Comment By: Brett Cannon (bcannon) Date: 2006-05-24 21:26 Message: Logged In: YES user_id=357491 Closing as invalid since, as Georg pointed out, %z is not supported by Python. -- Comment By: Georg Brandl (gbrandl) Date: 2006-05-23 16:58 Message: Logged In: YES user_id=849994 Note that %z isn't officially supported by Python, judging by the docs. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1493676&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
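Since the thread notes that `%z` is not supported by every C library, one workaround is to build the offset string from the `time` module's own attributes. The following is a hedged sketch, not the behaviour of `strftime` itself: `utc_offset_string` is a hypothetical helper, and it assumes that `time.timezone` and `time.altzone` (the current rules for the local zone) are adequate for the moment being formatted, which may not hold for historical dates.

```python
import time


def utc_offset_string(secs=None):
    """Return the local UTC offset as '+HHMM' or '-HHMM'.

    secs is seconds since the epoch; None means the current time.
    """
    lt = time.localtime(secs)
    # time.timezone and time.altzone count seconds WEST of UTC,
    # so they are negated to get the conventional east-positive offset.
    if lt.tm_isdst > 0 and time.daylight:
        offset = -time.altzone
    else:
        offset = -time.timezone
    sign = '+' if offset >= 0 else '-'
    hours, minutes = divmod(abs(offset) // 60, 60)
    return '%s%02d%02d' % (sign, hours, minutes)


if __name__ == "__main__":
    stamp = time.strftime("%a %b %d %H:%M:%S %Y %Z", time.localtime())
    print(stamp, utc_offset_string())
```

The helper takes the same kind of timestamp that `localtime()` accepts, so the offset and the formatted time can be derived from one value.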
[ python-Bugs-1666318 ] shutil.copytree doesn't preserve directory permissions
Bugs item #1666318, was opened at 2007-02-22 11:26 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Jeff McNeil (j_mcneil) Assigned to: Nobody/Anonymous (nobody) Summary: shutil.copytree doesn't preserve directory permissions Initial Comment: I am using shutil.copytree to setup new user home directories within an automated system. The copy2 function is called in order to copy individual files and preserve stat data. However, copytree simply calls os.mkdir and leaves directory creation at the mercy of my current umask (in my case, that's daemon context - 0). I've got to then iterate through the newly copied tree and set permissions on each individual subdirectory. Adding a simple copystat(src, dst) on line 112 of shutil.py fixes the problem. The result should be uniform; either preserve permissions across the board, or leave it to the mercy of the caller. I know there's an enhancement request already open to supply a 'func=' kw argument to copytree. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[ python-Bugs-1666318 ] shutil.copytree doesn't preserve directory permissions
Bugs item #1666318, was opened at 2007-02-22 11:26 Message generated for change (Comment added) made by j_mcneil You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: None Status: Open Resolution: None Priority: 5 Private: No Submitted By: Jeff McNeil (j_mcneil) Assigned to: Nobody/Anonymous (nobody) Summary: shutil.copytree doesn't preserve directory permissions Initial Comment: I am using shutil.copytree to setup new user home directories within an automated system. The copy2 function is called in order to copy individual files and preserve stat data. However, copytree simply calls os.mkdir and leaves directory creation at the mercy of my current umask (in my case, that's daemon context - 0). I've got to then iterate through the newly copied tree and set permissions on each individual subdirectory. Adding a simple copystat(src, dst) on line 112 of shutil.py fixes the problem. The result should be uniform; either preserve permissions across the board, or leave it to the mercy of the caller. I know there's an enhancement request already open to supply a 'func=' kw argument to copytree. -- >Comment By: Jeff McNeil (j_mcneil) Date: 2007-02-22 11:28 Message: Logged In: YES user_id=1726175 Originator: YES python -V Python 2.4.3 on Linux marvin 2.6.18-1.2257.fc5smp #1 SMP Fri Dec 15 16:33:51 EST 2006 i686 i686 i386 GNU/Linux -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
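The change proposed in the report (calling `copystat(src, dst)` for directories as well as files) can be sketched as a small stand-alone function. This is an illustration under stated assumptions, not the code in `shutil.py`: `copytree_preserving_dirs` and `demo` are hypothetical names, symlinks and error collection are ignored, and the permission check assumes a POSIX filesystem. Current Python 3 versions of `shutil.copytree` already copy directory stat data, so a helper like this is mainly relevant to the 2.4-era behaviour described above.

```python
import os
import shutil
import stat
import tempfile


def copytree_preserving_dirs(src, dst):
    """Recursively copy src to dst, keeping stat data on directories too."""
    os.mkdir(dst)  # created subject to the current umask
    for name in os.listdir(src):
        s = os.path.join(src, name)
        d = os.path.join(dst, name)
        if os.path.isdir(s) and not os.path.islink(s):
            copytree_preserving_dirs(s, d)
        else:
            shutil.copy2(s, d)  # copies file data and stat data
    # Done last so that a restrictive source mode cannot block the
    # copy into dst, and so the directory times are not disturbed.
    shutil.copystat(src, dst)


def demo():
    """Copy a tiny tree and return (source mode, copied mode) of a subdir."""
    base = tempfile.mkdtemp()
    try:
        src = os.path.join(base, 'skel')
        private = os.path.join(src, 'private')
        os.makedirs(private)
        os.chmod(private, 0o750)
        with open(os.path.join(private, 'note.txt'), 'w') as fh:
            fh.write('hello\n')
        dst = os.path.join(base, 'home')
        copytree_preserving_dirs(src, dst)
        src_mode = stat.S_IMODE(os.stat(private).st_mode)
        dst_mode = stat.S_IMODE(os.stat(os.path.join(dst, 'private')).st_mode)
        return src_mode, dst_mode
    finally:
        shutil.rmtree(base)


if __name__ == "__main__":
    print('%o %o' % demo())
```

Calling `copystat` after the contents are copied, rather than right after `os.mkdir`, is a design choice made in this sketch; the report only says where the call could be added.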
[ python-Bugs-1663329 ] subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi
Bugs item #1663329, was opened at 2007-02-19 11:17 Message generated for change (Comment added) made by hvbargen You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1663329&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Performance Group: Python 2.5 Status: Open Resolution: None Priority: 5 Private: No Submitted By: H. von Bargen (hvbargen) Assigned to: Nobody/Anonymous (nobody) Summary: subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi Initial Comment: If the value of sysconf("SC_OPEN_MAX") is high and you try to start a subprocess with subprocess.py or os.popen2 with close_fds=True, then starting the other process is very slow. This boils down to the following code in subprocess.py: def _close_fds(self, but): for i in xrange(3, MAXFD): if i == but: continue try: os.close(i) except: pass resp. the similar code in popen2.py: def _run_child(self, cmd): if isinstance(cmd, basestring): cmd = ['/bin/sh', '-c', cmd] for i in xrange(3, MAXFD): try: os.close(i) except OSError: pass There has been an optimization already (range has been replaced by xrange to reduce memory impact), but I think the problem is that for high values of MAXFD, usually a high percentage of the os.close statements will fail, raising an exception (which is an "expensive" operation). It has been suggested already to add a C implementation called "rclose" or "close_range" that tries to close all FDs in a given range (min, max) without the overhead of Python exception handling. I'd like emphasize that this is not a theoretical, but a real world problem: We have a Python application in a production environment on Sun Solaris. Some other software running on the same server needed a high value of 26 for SC_OPEN_MAX (set with ulimit -n XXX or in some /etc/-file (don't know which one). 
Suddenly calling any other process with subprocess.Popen (..., close_fds=True) now took 14 seconds (!) instead of some microseconds. This caused a huge performance degradation, since the subprocess itself only needs only a few seconds. See also: Patches item #1607087 "popen() slow on AIX due to large FOPEN_MAX value". This contains a fix, but only for AIX - and I think the patch does not support the "but" argument used in subprocess.py. The correct solution should be coded in C, and should do the same as the _close_fds routine in subprocess.py. It could be optimized to make use of (operating-specific) system calls to close all handles from (but+1) to MAX_FD with "closefrom" or "fcntl" as proposed in the patch. -- >Comment By: H. von Bargen (hvbargen) Date: 2007-02-22 21:16 Message: Logged In: YES user_id=1008979 Originator: YES Of course I am already closing any files as soon as possible. I know that I could use FD_CLOEXEC. But this would require that I do it explicitly for each descriptor that I use in my program. But this would be a tedious work and require platform-specific coding all around the program. And the whole bunch of python library functions (i.e. the logging module) do not use FD_CLOEXEC as well. Right now, more or less the only platform specific code in the program is where I call subprocesses, and I like to keep it that way. The same is true for the socket module. All sockets are by default inherited to child processes. So, the only way to prevent unwanted handles from inheriting to child processes, is in fact to specify close_fds=True in subprocess.py. If you think that a performance patch similar to the patch #16078087 makes no sense, then the close_fds argument should either be marked as deprecated or at least the documentation should mention that the implementation is slow for large values of SC_OPEN_MAX. -- Comment By: Martin v. 
Löwis (loewis) Date: 2007-02-21 19:18 Message: Logged In: YES user_id=21627 Originator: NO I understand you don't want the subprocess to inherit "incorrect" file descriptors. However, there are other ways to prevent that from happening: - you should close file descriptors as soon as you are done with the files - you should set the FD_CLOEXEC flag on all file descriptors you don't want to be inherited, using fcntl(fd, F_SETFD, 1) I understand that there are cases where neither of these strategies is practical, but if you follow them, the performance will be much better, as the closing of unused file descriptors is done in the exec(2) implementation of the operating system. -- Comment B
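The FD_CLOEXEC suggestion above can be sketched in Python with the standard `fcntl` module (POSIX only). The helper names `set_cloexec` and `is_cloexec` are hypothetical and introduced here for illustration. The sketch reads the existing descriptor flags first and ORs in `FD_CLOEXEC`, a slightly more defensive variant of writing the constant 1 directly.

```python
import fcntl
import os


def set_cloexec(fd):
    """Mark fd so the kernel closes it automatically on exec()."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)


def is_cloexec(fd):
    """Return True if fd currently has the close-on-exec flag set."""
    return bool(fcntl.fcntl(fd, fcntl.F_GETFD) & fcntl.FD_CLOEXEC)


if __name__ == "__main__":
    r, w = os.pipe()
    try:
        # Make the read end inheritable first so the change is visible.
        os.set_inheritable(r, True)
        print('before:', is_cloexec(r))
        set_cloexec(r)
        print('after: ', is_cloexec(r))
    finally:
        os.close(r)
        os.close(w)
```

As the reporter points out, this has to be applied to every descriptor the program opens, including those opened inside library code, which is the practical objection raised in the thread.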
[ python-Bugs-1662581 ] the re module can perform poorly: O(2**n) versus O(n**2)
Bugs item #1662581, was opened at 2007-02-17 15:39 Message generated for change (Comment added) made by greg You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Performance >Group: Feature Request Status: Open Resolution: None >Priority: 3 Private: No Submitted By: Gregory P. Smith (greg) Assigned to: Nobody/Anonymous (nobody) Summary: the re module can perform poorly: O(2**n) versus O(n**2) Initial Comment: in short, the re module can degenerate to really really horrid performance. See this for how and why: http://swtch.com/~rsc/regexp/regexp1.html exponential decline instead of squared. I don't have a patch so i'm filing this bug as a starting point for future work. The Modules/_sre.c files implementation could be updated to use the parallel stepping Thompson approach instead of recursive backtracking. filing this as a bug until me or someone else comes up with a patch. -- >Comment By: Gregory P. Smith (greg) Date: 2007-02-22 14:30 Message: Logged In: YES user_id=413 Originator: YES yeah this is better as a feature request. certainly low priority either way. -nothing- I propose doing would change the syntax or behaviour of existing regular expressions at all. Doing so would be a disaster. Thompson NFA does not imply changing the behaviour. anyway, it's a lot more than a simple "patch" to change the re module to not use backtracking so i expect this to languish unless someone has a lot of free time and motivation all at once. :) -- Comment By: Josiah Carlson (josiahcarlson) Date: 2007-02-22 00:51 Message: Logged In: YES user_id=341410 Originator: NO I would file this under "feature request"; the current situation isn't so much buggy, as slow. 
While you can produce a segfault with the current regular expression engine (due to stack overflow), you can do the same thing with regular Python on Linux (with sys.setrecursionlimit), ctypes, etc., and none of those are considered as buggy. My only concern with such a change is that it may or may not change the semantics of the repeat operators '*' and '+', which are currently defined as "greedy". If I skimmed the article correctly late at night, switching to a Thompson family regular expression engine may result in those operators no longer being greedy. Please correct me if I am wrong. -- You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470 ___ Python-bugs-list mailing list Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
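For readers who want to see the slowdown discussed in this thread, here is a small sketch based on the a?^n a^n pattern from the linked article: n optional 'a's followed by n required 'a's, matched against a string of n 'a's. The helper names pathological_case and time_match are invented for this illustration, and the values of n are kept small on purpose. Timings depend on the machine and Python version, so none are quoted here.

```python
import re
import time


def pathological_case(n):
    """Return (pattern, text): n optional 'a's then n required 'a's, and 'a' * n."""
    return "a?" * n + "a" * n, "a" * n


def time_match(n):
    """Match the pathological pattern once and return (matched, elapsed_seconds)."""
    pattern, text = pathological_case(n)
    start = time.perf_counter()
    matched = re.match(pattern, text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    # Each step up in n gives a backtracking matcher more failed
    # combinations of the optional 'a's to try before it succeeds.
    for n in (10, 14, 18):
        matched, seconds = time_match(n)
        print("n=%d matched=%s time=%.4fs" % (n, matched, seconds))
```

The match always succeeds, so the result is the same for every n; only the running time changes, which is why the thread treats this as a performance issue rather than a correctness bug.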
[ python-Bugs-1666807 ] Incorrect file path reported by inspect.getabsfile()
Bugs item #1666807, was opened at 2007-02-23 07:08 Message generated for change (Tracker Item Submitted) made by Item Submitter You can respond by visiting: https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666807&group_id=5470 Please note that this message will contain a full copy of the comment thread, including the initial issue submission, for this request, not just the latest update. Category: Python Library Group: Python 2.5 Status: Open Resolution: None Priority: 5 Private: No Submitted By: Fernando Pérez (fer_perez) Assigned to: Nobody/Anonymous (nobody) Summary: Incorrect file path reported by inspect.getabsfile()

Initial Comment:
The following code demonstrates the problem succinctly:

###
import inspect,sys

print 'Version info:',sys.version_info
print

f1 = inspect.getabsfile(inspect)
f2 = inspect.getabsfile(inspect.iscode)
print 'File for `inspect` :',f1
print 'File for `inspect.iscode`:',f2
print 'Do these match?',f1==f2
if f1==f2:
    print 'OK'
else:
    print 'BUG - this is a bug in this version of Python'
### EOF

Running this on my system (Linux, Ubuntu Edgy) with 2.3, 2.4 and 2.5 produces:

tlon[bin]> ./python2.3 ~/code/python/inspect_bug.py
Version info: (2, 3, 6, 'final', 0)

File for `inspect` : /home/fperez/tmp/local/lib/python2.3/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/lib/python2.3/inspect.py
Do these match? True
OK

tlon[bin]> python2.4 ~/code/python/inspect_bug.py
Version info: (2, 4, 4, 'candidate', 1)

File for `inspect` : /usr/lib/python2.4/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/bin/inspect.py
Do these match? False
BUG - this is a bug in this version of Python

tlon[bin]> python2.5 ~/code/python/inspect_bug.py
Version info: (2, 5, 0, 'final', 0)

File for `inspect` : /usr/lib/python2.5/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/bin/inspect.py
Do these match? False
BUG - this is a bug in this version of Python

###

The problem arises from the fact that inspect relies, for functions (at least), on the func_code.co_filename attribute to contain a complete path. This changed between 2.3 and 2.4, but the inspect module was never updated. This code:

###
import inspect,sys

print 'Python version info:',sys.version_info
print 'File info for `inspect.iscode function`:'
print ' ',inspect.iscode.func_code.co_filename
print
### EOF

shows the problem:

tlon[bin]> ./python2.3 ~/code/python/inspect_bug_details.py
Python version info: (2, 3, 6, 'final', 0)
File info for `inspect.iscode function`:
  /home/fperez/tmp/local//lib/python2.3/inspect.py

tlon[bin]> python2.5 ~/code/python/inspect_bug_details.py
Python version info: (2, 5, 0, 'final', 0)
File info for `inspect.iscode function`:
  inspect.py

###

(2.4 has the same issue).

Basically, if the func_code.co_filename attribute now stores only the final filename without the full path, then the logic in the inspect module needs to be changed to accommodate this, so that correct paths are reported to the user as they were in the 2.3 days.
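One possible direction, sketched here only as an illustration and not as the fix adopted for the inspect module: when the file name recorded on the code object is relative, fall back to the __file__ of the module that defines the object. The function name abs_source_file is hypothetical, and the sketch is written for a modern Python, where the 2.x attribute func_code is spelled __code__ and inspect already handles this lookup internally.

```python
import inspect
import os
import sys


def abs_source_file(obj):
    """Return an absolute source path for obj.

    If the recorded file name is relative (as with the bare 'inspect.py'
    in the report), prefer the defining module's __file__ over resolving
    the name against the current working directory.
    """
    filename = inspect.getsourcefile(obj) or inspect.getfile(obj)
    if os.path.isabs(filename):
        return os.path.normcase(os.path.abspath(filename))

    module = sys.modules.get(getattr(obj, "__module__", None))
    module_file = getattr(module, "__file__", None)
    if module_file:
        # Map a compiled file name back to the source file name.
        base, ext = os.path.splitext(module_file)
        if ext in (".pyc", ".pyo"):
            module_file = base + ".py"
        return os.path.normcase(os.path.abspath(module_file))

    # Last resort: resolve against the current directory, which is
    # the behaviour that produced the wrong path in the report.
    return os.path.normcase(os.path.abspath(filename))


if __name__ == "__main__":
    print("module  :", abs_source_file(inspect))
    print("function:", abs_source_file(inspect.iscode))
```

The design choice is simply to trust the imported module's own location over the working directory whenever the code object carries no directory information.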