[ python-Bugs-1662581 ] the re module can perform poorly: O(2**n) versus O(n**2)

2007-02-22 Thread SourceForge.net
Bugs item #1662581, was opened at 2007-02-17 15:39
Message generated for change (Comment added) made by josiahcarlson
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Performance
Group: None
Status: Open
Resolution: None
Priority: 4
Private: No
Submitted By: Gregory P. Smith (greg)
Assigned to: Nobody/Anonymous (nobody)
Summary: the re module can perform poorly: O(2**n) versus O(n**2)

Initial Comment:
in short, the re module can degenerate to really really horrid performance.  
See this for how and why:

 http://swtch.com/~rsc/regexp/regexp1.html

Runtime grows exponentially instead of quadratically.

I don't have a patch, so I'm filing this bug as a starting point for future 
work.  The implementation in Modules/_sre.c could be updated to use the 
parallel-stepping Thompson approach instead of recursive backtracking.

Filing this as a bug until I or someone else comes up with a patch.
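
A minimal sketch of the blow-up, assuming the pathological pattern described in the linked article: "a?" repeated n times followed by "a" repeated n times, matched against n copies of "a". The function name time_pathological_match is made up for illustration, and the timings depend on the machine and Python version, so none are quoted here.

```python
import re
import time


def time_pathological_match(n):
    """Time re.match for the pattern (a? * n)(a * n) against 'a' * n.

    Returns (matched, seconds). A backtracking matcher may try on the
    order of 2**n ways of using the optional a's before it succeeds.
    """
    pattern = re.compile("a?" * n + "a" * n)
    text = "a" * n
    start = time.perf_counter()
    matched = pattern.match(text) is not None
    elapsed = time.perf_counter() - start
    return matched, elapsed


if __name__ == "__main__":
    # Keep n small: each extra step can roughly double the runtime.
    for n in range(10, 19, 2):
        matched, seconds = time_pathological_match(n)
        print("n=%2d matched=%s %.6f s" % (n, matched, seconds))
```

The match always succeeds; only the time needed to find it changes with n.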

--

Comment By: Josiah Carlson (josiahcarlson)
Date: 2007-02-22 00:51

Message:
Logged In: YES 
user_id=341410
Originator: NO

I would file this under "feature request"; the current situation isn't so
much buggy, as slow.  While you can produce a segfault with the current
regular expression engine (due to stack overflow), you can do the same
thing with regular Python on Linux (with sys.setrecursionlimit), ctypes,
etc., and none of those are considered as buggy.

My only concern with such a change is that it may or may not change the
semantics of the repeat operators '*' and '+', which are currently defined
as "greedy".  If I skimmed the article correctly late at night, switching
to a Thompson family regular expression engine may result in those
operators no longer being greedy.  Please correct me if I am wrong.
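
To make the concern concrete, here is a small illustration of what "greedy" means in the current re module. Any replacement engine would have to reproduce these results; whether a Thompson-style engine does so is the open question here, and this snippet does not settle it.

```python
import re

text = "aaa"

greedy = re.match(r"a+", text).group()   # '+' takes as many a's as it can
lazy = re.match(r"a+?", text).group()    # '+?' takes as few as it can

# With two capture groups, greediness decides where the split falls.
split = re.match(r"(a*)(a*)", text).groups()

print(greedy)  # aaa
print(lazy)    # a
print(split)   # ('aaa', '')
```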

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470
___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[ python-Bugs-1661745 ] finditer stuck in infinite loop

2007-02-22 Thread SourceForge.net
Bugs item #1661745, was opened at 2007-02-16 12:11
Message generated for change (Comment added) made by rhamphoryncus
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Milan (migues)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: finditer stuck in infinite loop

Initial Comment:
Using iterator on Match Object results in infinite unbreakable loop. Attached 
is sample script and sample file.

My OS: Win XP Pro.



--

Comment By: Adam Olsen (rhamphoryncus)
Date: 2007-02-22 03:44

Message:
Logged In: YES 
user_id=12364
Originator: NO

I've rewritten the test case.  It's not an infinite loop but rather
exponential runtime based on the length of the string.  When matching
against a string of the form 'x.x.', lengthening either the left or the
right run of x by one doubles the runtime.  Lengthening both quadruples it.

 0:         0.0000350475
 1:         0.0000259876
 2:         0.0000669956
 3:         0.0002369881
 4:         0.0009140968
 5:         0.0038359165
 6:         0.0148119926
 7:         0.0732769966
 8:         0.2570281029
 9:         0.9819128513
10:         3.9152498245
11:        16.4304330349
12:        64.8596510887
13:       264.2261950970

I'm not a re guru though, so I don't know if this is a real bug or just
one of those special cases re is prone to.

Now I just need to find out how to attach my file, SF doesn't want to let
me..

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470



[ python-Bugs-1661745 ] finditer stuck in infinite loop

2007-02-22 Thread SourceForge.net
Bugs item #1661745, was opened at 2007-02-16 12:11
Message generated for change (Comment added) made by rhamphoryncus
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Milan (migues)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: finditer stuck in infinite loop

Initial Comment:
Using iterator on Match Object results in infinite unbreakable loop. Attached 
is sample script and sample file.

My OS: Win XP Pro.



--

Comment By: Adam Olsen (rhamphoryncus)
Date: 2007-02-22 03:45

Message:
Logged In: YES 
user_id=12364
Originator: NO

Nope, won't let me attach a file.  Pasted instead:

#!/usr/bin/env python
import re
from timeit import Timer

reexpr = re.compile(r"(.+\n?)+?((\.\n)|(\n\n))")

def test(count):
    text = '%s.%s.' % ('x' * count, 'x' * count)
    for m in reexpr.finditer(text):
        pass

for count in range(21):
    print '%2i: %20.10f' % (count,
        Timer('test(%i)' % count, "from __main__ import test").timeit(number=1))
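
The script above is Python 2. The following is a sketch of a Python 3 port that also reports the ratio between successive timings; the helper names time_finditer and growth_table are made up for illustration, and no particular timings are claimed for current Python versions.

```python
import re
from timeit import default_timer

# Same pattern as in the script above.
reexpr = re.compile(r"(.+\n?)+?((\.\n)|(\n\n))")


def time_finditer(count):
    """Return the seconds spent scanning 'x'*count + '.' + 'x'*count + '.'."""
    text = "%s.%s." % ("x" * count, "x" * count)
    start = default_timer()
    for _ in reexpr.finditer(text):
        pass
    return default_timer() - start


def growth_table(max_count):
    """Return rows of (count, seconds, ratio to the previous timing)."""
    rows = []
    previous = None
    for count in range(max_count + 1):
        seconds = time_finditer(count)
        # The first row, or a previous timing of zero, has no ratio.
        ratio = seconds / previous if previous else None
        rows.append((count, seconds, ratio))
        previous = seconds
    return rows


if __name__ == "__main__":
    # max_count is kept small on purpose; the timings reported in this
    # thread suggest each step may cost about four times the previous one.
    for count, seconds, ratio in growth_table(8):
        shown = "%.1fx" % ratio if ratio else "-"
        print("%2i: %20.10f  %s" % (count, seconds, shown))
```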


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470



[ python-Bugs-1661745 ] finditer stuck in infinite loop

2007-02-22 Thread SourceForge.net
Bugs item #1661745, was opened at 2007-02-16 19:11
Message generated for change (Comment added) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Regular Expressions
Group: Python 2.5
>Status: Closed
>Resolution: Duplicate
Priority: 5
Private: No
Submitted By: Milan (migues)
Assigned to: Gustavo Niemeyer (niemeyer)
Summary: finditer stuck in infinite loop

Initial Comment:
Using iterator on Match Object results in infinite unbreakable loop. Attached 
is sample script and sample file.

My OS: Win XP Pro.



--

>Comment By: Georg Brandl (gbrandl)
Date: 2007-02-22 11:59

Message:
Logged In: YES 
user_id=849994
Originator: NO

I'd say this is a duplicate of #1662581.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1661745&group_id=5470



[ python-Bugs-1659171 ] Calling tparm from extension lib fails in Python 2.5

2007-02-22 Thread SourceForge.net
Bugs item #1659171, was opened at 2007-02-13 17:27
Message generated for change (Settings changed) made by gbrandl
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: None
>Status: Pending
Resolution: None
Priority: 5
Private: No
Submitted By: Richard B. Kreckel (richyk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Calling tparm from extension lib fails in Python 2.5

Initial Comment:
Attached is a little C++ module that fetches the terminal capability string for 
turning off all attributes and runs it through tparm(). (All this is done in a 
static Ctor of a class without init function, but never mind.)

Compile with:
g++ -c testlib.cc
g++ testlib.o -o testlib.so -shared -Wl,-soname,testlib.so -lncurses

On SuSE Linux 10.1 (and older), I get the expected behavior:

Python 2.4.2 (#1, Oct 13 2006, 17:11:24) 
[GCC 4.1.0 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import testlib
Terminal is "xterm"
Dump of sgr0: 1b 5b 30 6d
Dump of instance: 1b 5b 30 6d
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (inittestlib)
>>> 

However, on SuSE Linux 10.2, tparm creates a NULL pointer:
Python 2.5 (r25:51908, Jan  9 2007, 16:59:32) 
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import testlib
Terminal is "xterm"
Dump of sgr0: 1b 5b 30 6d
Rats! tparm made a NULL pointer!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (inittestlib)
>>> 

Why, oh why?


--

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-14 21:24

Message:
Logged In: YES 
user_id=21627
Originator: NO

I fail to see the bug. The exception precisely describes the error in your
code

ImportError: dynamic module does not define init function (inittestlib)

Why do you expect any meaningful behavior in the presence of this error?
Your shared library isn't an extension module.

If you think it is related to #1548092, please try out the subversion
trunk, which has fixed this bug.

--

Comment By: Richard B. Kreckel (richyk)
Date: 2007-02-14 08:52

Message:
Logged In: YES 
user_id=1718463
Originator: YES

I suspect that this is a duplicate of Bug [1548092].
Note that, there it is asserted that tparm returns NULL on certain invalid
strings.
That does not seem to be true. It returns NULL for valid trivial strings,
too.


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470



[ python-Bugs-1659171 ] Calling tparm from extension lib fails in Python 2.5

2007-02-22 Thread SourceForge.net
Bugs item #1659171, was opened at 2007-02-13 18:27
Message generated for change (Comment added) made by richyk
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Extension Modules
Group: None
>Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Richard B. Kreckel (richyk)
Assigned to: Nobody/Anonymous (nobody)
Summary: Calling tparm from extension lib fails in Python 2.5

Initial Comment:
Attached is a little C++ module that fetches the terminal capability string for 
turning off all attributes and runs it through tparm(). (All this is done in a 
static Ctor of a class without init function, but never mind.)

Compile with:
g++ -c testlib.cc
g++ testlib.o -o testlib.so -shared -Wl,-soname,testlib.so -lncurses

On SuSE Linux 10.1 (and older), I get the expected behavior:

Python 2.4.2 (#1, Oct 13 2006, 17:11:24) 
[GCC 4.1.0 (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import testlib
Terminal is "xterm"
Dump of sgr0: 1b 5b 30 6d
Dump of instance: 1b 5b 30 6d
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
ImportError: dynamic module does not define init function (inittestlib)
>>> 

However, on SuSE Linux 10.2, tparm creates a NULL pointer:
Python 2.5 (r25:51908, Jan  9 2007, 16:59:32) 
[GCC 4.1.2 20061115 (prerelease) (SUSE Linux)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import testlib
Terminal is "xterm"
Dump of sgr0: 1b 5b 30 6d
Rats! tparm made a NULL pointer!
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: dynamic module does not define init function (inittestlib)
>>> 

Why, oh why?


--

>Comment By: Richard B. Kreckel (richyk)
Date: 2007-02-22 13:25

Message:
Logged In: YES 
user_id=1718463
Originator: YES

The error message about the undefined init function is a red herring. The
example is actually a stripped-down testcase from a much larger
Boost.Python module, which of course does have an init function. The point
here is the NULL pointer returned by tparm.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1659171&group_id=5470



[ python-Bugs-1656559 ] I think, I have found this bug on time.mktime()

2007-02-22 Thread SourceForge.net
Bugs item #1656559, was opened at 2007-02-10 03:41
Message generated for change (Comment added) made by sergiomb
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1656559&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: None
Group: 3rd Party
Status: Closed
Resolution: Invalid
Priority: 5
Private: No
Submitted By: Sérgio Monteiro Basto (sergiomb)
Assigned to: Nobody/Anonymous (nobody)
Summary: I think, I have found this bug on time.mktime()

Initial Comment:
well, I think, I have found this bug on time.mktime() for dates less
than 1976-09-26

when I do stringtotime of 1976-09-25 

print "timeint %d" % time.mktime(__extract_date(m) + __extract_time(m) + (0, 0, 0))

extract date = 1976 9 25
extract time = 0 0 0
timeint 212454000
and 
timetostring(212454000) = 1976-09-24T23:00:00Z !? 

To be honest, the date that drew my attention was 1-1-1970, which
appears as 31-12-1969 after timetostring(stringtotime(date)).

I ran a test, and time.mktime has a bug when the date is earlier than
1976-09-26 
see:
for 1976-09-27T00:00:00Z time.mktime gives 212630400
for 1976-09-26T00:00:00Z time.mktime gives 212544000
for 1976-09-25T00:00:00Z time.mktime gives 212454000

212630400 - 212544000 = 86400 (seconds), one day, correct!
but
212544000 - 212454000 = 90000 (seconds), one day plus 3600 (seconds),
one hour more ?!? 
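
The differences can be checked with plain integer arithmetic on the three values reported above. This is only a check of the subtraction; it says nothing about why mktime returned these values.

```python
from datetime import timedelta

t_27 = 212630400  # reported mktime result for 1976-09-27T00:00:00
t_26 = 212544000  # reported mktime result for 1976-09-26T00:00:00
t_25 = 212454000  # reported mktime result for 1976-09-25T00:00:00

one_day = int(timedelta(days=1).total_seconds())    # 86400
one_hour = int(timedelta(hours=1).total_seconds())  # 3600

print(t_27 - t_26 == one_day)             # True
print(t_26 - t_25 == one_day + one_hour)  # True
print(t_26 - t_25)                        # 90000
```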

--
Sérgio M. B. 



--

>Comment By: Sérgio Monteiro Basto (sergiomb)
Date: 2007-02-22 16:13

Message:
Logged In: YES 
user_id=4882
Originator: YES

please forget my last comment, it is all wrong 

--

Comment By: Sérgio Monteiro Basto (sergiomb)
Date: 2007-02-21 22:34

Message:
Logged In: YES 
user_id=4882
Originator: YES

Well, I found that the bug is in ./site-packages/_xmlplus/utils/iso8601.py

 gmt = __extract_date(m) + __extract_time(m) + (0, 0, 0)   <- this is wrong
My suggestion is:
 gmt = __extract_date(m) + __extract_time(m)
 gmt = datetime(*gmt).timetuple()

The (0, 0, 0) means zero for day of the week, zero for day of the year and
zero for isdst; that is the error here.

timetuple calculates these last 3 numbers correctly,
and my problem is gone!

Reference: http://docs.python.org/lib/module-time.html
0   tm_year   (for example, 1993)
1   tm_mon    range [1,12]
2   tm_mday   range [1,31]
3   tm_hour   range [0,23]
4   tm_min    range [0,59]
5   tm_sec    range [0,61]; see (1) in strftime() description
6   tm_wday   range [0,6], Monday is 0
7   tm_yday   range [1,366]
8   tm_isdst  0, 1 or -1; see below
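
As a small illustration of the last three fields, this is what datetime.timetuple() fills in for the date under discussion. The tuple gmt is a hypothetical stand-in for __extract_date(m) + __extract_time(m); the snippet only shows what timetuple() computes and does not show that the suggested change fixes the reported problem.

```python
from datetime import datetime

# Hypothetical stand-in for __extract_date(m) + __extract_time(m).
gmt = (1976, 9, 25, 0, 0, 0)

filled = datetime(*gmt).timetuple()

print(filled.tm_wday)   # 5 (Saturday; Monday is 0)
print(filled.tm_yday)   # 269
print(filled.tm_isdst)  # -1 (unknown, for a naive datetime)
```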


--

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-13 15:54

Message:
Logged In: YES 
user_id=21627
Originator: NO

cvalente, thanks for the research. Making a second attempt at closing this
as third-party bug.

--

Comment By: Sérgio Monteiro Basto (sergiomb)
Date: 2007-02-13 14:25

Message:
Logged In: YES 
user_id=4882
Originator: YES

OK, bug opened at 
http://sources.redhat.com/bugzilla/show_bug.cgi?id=4033

--

Comment By: Claudio Valente (cvalente)
Date: 2007-02-13 12:47

Message:
Logged In: YES 
user_id=627298
Originator: NO

OK. This is almost surely NOT a Python bug but most likely a libc bug.

In C:
--
#include <stdio.h>
#include <time.h>

int main(int argc, char* argv[]){
    struct tm t1;
    struct tm t2;

    /* note: tm_isdst is left uninitialized in both structs */

    /* midnight, 26 Sep 1976 */
    t1.tm_sec  = 0;
    t1.tm_min  = 0;
    t1.tm_hour = 0;
    t1.tm_mday = 26;
    t1.tm_mon  = 8;
    t1.tm_year = 76;

    /* midnight, 25 Sep 1976 */
    t2.tm_sec  = 0;
    t2.tm_min  = 0;
    t2.tm_hour = 0;
    t2.tm_mday = 25;
    t2.tm_mon  = 8;
    t2.tm_year = 76;

    printf("%li\n", mktime(&t1)-mktime(&t2));
    printf("%li\n", mktime(&t1)-mktime(&t2));

    return 0;
}
--
Outputs:

90000
86400


In perl:
-
perl -le 'use POSIX; $t1=POSIX::mktime(0,0,0,26,8,76)
-POSIX::mktime(0,0,0,25,8,76); $t2 = POSIX::mktime(0,0,0,26,8,76)
-POSIX::mktime(0,0,0,25,8,76) ; print $t1."\n". $t2'
-

Outputs

90000
86400

-

My system is gentoo with glibc 2.4-r4
and my timezone is:
/usr/share/zoneinfo/Europe/Lisbon

When I changed this to another timezone (Say London) the problem didn't
exist.

Thank you all for your time.

--

Comment By: Sérgio Monteiro Basto (sergiomb)
Date: 2007-02-13 12:22

Message:
Logged In: YES 
user_id=4882
Originator: YES

timezone :  WET in winter WEST in summer 
I try same with timezone of NEW YORK and 
>>>
time.mktime

[ python-Bugs-1493676 ] time.strftime() %z error

2007-02-22 Thread SourceForge.net
Bugs item #1493676, was opened at 2006-05-23 15:58
Message generated for change (Comment added) made by bwooster47
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1493676&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: Python 2.4
Status: Closed
Resolution: Invalid
Priority: 5
Private: No
Submitted By: Cillian Sharkey (csharkey)
Assigned to: Nobody/Anonymous (nobody)
Summary: time.strftime() %z error

Initial Comment:
According to the time module documentation, if the time
argument for strftime() is not provided, it will use
the current time as returned by localtime().

However, when the value of localtime() is explicitly
given to strftime(), this produces an error in the
value of the timezone offset (%z) as seen here:

>>> from time import *
>>> strftime("%a %b %e %H:%M:%S %Y %Z %z")
'Tue May 23 16:28:31 2006 IST +0100'
>>> strftime("%a %b %e %H:%M:%S %Y %Z %z", localtime())
'Tue May 23 16:28:31 2006 IST +0000'

This same problem happens for other timezones (the
offset is always +0000 when localtime() is explicitly
given).

This problem is present in both these versions:

Python 2.4.2 (#2, Sep 30 2005, 21:19:01)
[GCC 4.0.2 20050808 (prerelease) (Ubuntu
4.0.1-4ubuntu8)] on linux2

Python 2.3.5 (#2, Sep  4 2005, 22:01:42)
[GCC 3.3.5 (Debian 1:3.3.5-13)] on linux2


--

Comment By: bwooster47 (bwooster47)
Date: 2007-02-22 16:22

Message:
Logged In: YES 
user_id=1209659
Originator: NO

Can we confirm whether this is really not a Python issue?
We are talking about lowercase %z, not uppercase %Z.

From the Python docs at http://docs.python.org/lib/module-time.html:
"The use of %Z is now deprecated, but the %z escape that expands to the
preferred hour/minute offset is not supported by all ANSI C libraries."

Most current C libraries support %z. It is in fact the preferred way to do
things, so it would be bad to see Python reject it.
Even then, isn't the above a bug? If it is not supported, %z should always
produce an empty string rather than printing totally incorrect data such as
+0000 for EST.
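
For comparison, a sketch of how a numeric offset can be formatted with an aware datetime object instead of time.strftime(). This assumes a much later Python than the 2.3/2.4 versions in the report (datetime.timezone was added in Python 3), and the fixed UTC+01:00 offset for IST is taken from the example output in the report. It is not a fix for time.strftime() itself.

```python
from datetime import datetime, timedelta, timezone

# Assumed offset: IST taken as UTC+01:00, as in the report's first example.
ist = timezone(timedelta(hours=1), "IST")
stamp = datetime(2006, 5, 23, 16, 28, 31, tzinfo=ist)

# %Z and %z come from the attached tzinfo, not from the C library's
# idea of the local timezone.
print(stamp.strftime("%H:%M:%S %Y %Z %z"))  # 16:28:31 2006 IST +0100
```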



--

Comment By: Brett Cannon (bcannon)
Date: 2006-05-24 21:26

Message:
Logged In: YES 
user_id=357491

Closing as invalid since, as Georg pointed out, %z is not
supported by Python.

--

Comment By: Georg Brandl (gbrandl)
Date: 2006-05-23 16:58

Message:
Logged In: YES 
user_id=849994

Note that %z isn't officially supported by Python, judging
by the docs.

--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1493676&group_id=5470



[ python-Bugs-1666318 ] shutil.copytree doesn't preserve directory permissions

2007-02-22 Thread SourceForge.net
Bugs item #1666318, was opened at 2007-02-22 11:26
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jeff McNeil (j_mcneil)
Assigned to: Nobody/Anonymous (nobody)
Summary: shutil.copytree doesn't preserve directory permissions

Initial Comment:
I am using shutil.copytree to set up new user home directories within an 
automated system.  The copy2 function is called in order to copy individual 
files and preserve stat data. 

However, copytree simply calls os.mkdir and leaves directory creation at the 
mercy of my current umask (in my case, that's daemon context - 0).

I've got to then iterate through the newly copied tree and set permissions on 
each individual subdirectory. 

Adding a simple copystat(src, dst) on line 112 of shutil.py fixes the problem. 

The result should be uniform; either preserve permissions across the board, or 
leave it to the mercy of the caller.  I know there's an enhancement request 
already open to supply a 'func=' kw argument to copytree.
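
A minimal sketch of the behaviour being asked for, written as a standalone function rather than as a patch to shutil.py. The name copytree_with_dir_stat is made up for illustration; symlinks, error collection and other details of the real copytree are deliberately left out.

```python
import os
import shutil


def copytree_with_dir_stat(src, dst):
    """Recursively copy src to dst, copying stat data for directories too.

    Illustrative sketch only, not the shutil implementation.
    """
    os.mkdir(dst)
    for name in os.listdir(src):
        src_path = os.path.join(src, name)
        dst_path = os.path.join(dst, name)
        if os.path.isdir(src_path):
            copytree_with_dir_stat(src_path, dst_path)
        else:
            # copy2 copies the file contents plus its stat data.
            shutil.copy2(src_path, dst_path)
    # Copy permission bits and timestamps last, once the directory has
    # been filled, so that a restrictive source mode cannot block the
    # copy and later writes do not disturb the copied timestamps.
    shutil.copystat(src, dst)
```

With this ordering the directory mode no longer depends on the caller's umask, which is the uniformity the report asks for.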



--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470



[ python-Bugs-1666318 ] shutil.copytree doesn't preserve directory permissions

2007-02-22 Thread SourceForge.net
Bugs item #1666318, was opened at 2007-02-22 11:26
Message generated for change (Comment added) made by j_mcneil
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Python Library
Group: None
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Jeff McNeil (j_mcneil)
Assigned to: Nobody/Anonymous (nobody)
Summary: shutil.copytree doesn't preserve directory permissions

Initial Comment:
I am using shutil.copytree to set up new user home directories within an 
automated system.  The copy2 function is called in order to copy individual 
files and preserve stat data. 

However, copytree simply calls os.mkdir and leaves directory creation at the 
mercy of my current umask (in my case, that's daemon context - 0).

I've got to then iterate through the newly copied tree and set permissions on 
each individual subdirectory. 

Adding a simple copystat(src, dst) on line 112 of shutil.py fixes the problem. 

The result should be uniform; either preserve permissions across the board, or 
leave it to the mercy of the caller.  I know there's an enhancement request 
already open to supply a 'func=' kw argument to copytree.



--

>Comment By: Jeff McNeil (j_mcneil)
Date: 2007-02-22 11:28

Message:
Logged In: YES 
user_id=1726175
Originator: YES

python -V
Python 2.4.3

on 

Linux marvin 2.6.18-1.2257.fc5smp #1 SMP Fri Dec 15 16:33:51 EST 2006 i686
i686 i386 GNU/Linux


--

You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666318&group_id=5470



[ python-Bugs-1663329 ] subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi

2007-02-22 Thread SourceForge.net
Bugs item #1663329, was opened at 2007-02-19 11:17
Message generated for change (Comment added) made by hvbargen
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1663329&group_id=5470

Please note that this message will contain a full copy of the comment thread,
including the initial issue submission, for this request,
not just the latest update.
Category: Performance
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: H. von Bargen (hvbargen)
Assigned to: Nobody/Anonymous (nobody)
Summary: subprocess/popen close_fds perform poor if SC_OPEN_MAX is hi

Initial Comment:
If the value of sysconf("SC_OPEN_MAX") is high
and you try to start a subprocess with subprocess.py or os.popen2 with 
close_fds=True, then starting the other process is very slow.
This boils down to the following code in subprocess.py:
    def _close_fds(self, but):
        for i in xrange(3, MAXFD):
            if i == but:
                continue
            try:
                os.close(i)
            except:
                pass

resp. the similar code in popen2.py:
    def _run_child(self, cmd):
        if isinstance(cmd, basestring):
            cmd = ['/bin/sh', '-c', cmd]
        for i in xrange(3, MAXFD):
            try:
                os.close(i)
            except OSError:
                pass

There has been an optimization already (range has been replaced by xrange to 
reduce memory impact), but I think the problem is that for high values of 
MAXFD, usually a high percentage of the os.close statements will fail, raising 
an exception (which is an "expensive" operation).
It has been suggested already to add a C implementation called "rclose" or 
"close_range" that tries to close all FDs in a given range (min, max) without 
the overhead of Python exception handling.

I'd like to emphasize that this is not a theoretical but a real-world problem:
We have a Python application in a production environment on Sun Solaris. Some 
other software running on the same server needed a high value of 26 for 
SC_OPEN_MAX (set with ulimit -n XXX or in some /etc/-file, I don't know which 
one).
Suddenly, calling any other process with subprocess.Popen(..., close_fds=True) 
took 14 seconds (!) instead of some microseconds.
This caused a huge performance degradation, since the subprocess itself 
needs only a few seconds.

See also:
Patches item #1607087 "popen() slow on AIX due to large FOPEN_MAX value".
This contains a fix, but only for AIX - and I think the patch does not support 
the "but" argument used in subprocess.py.
The correct solution should be coded in C, and should
do the same as the _close_fds routine in subprocess.py.
It could be optimized to make use of (operating-specific) system calls to close 
all handles from (but+1) to MAX_FD with "closefrom" or "fcntl" as proposed in 
the patch.
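
As a rough sketch of the "close a range without per-descriptor exceptions" idea: later Python versions than the 2.5 discussed here provide os.closerange(), which closes a range of descriptors and ignores errors. The helper name close_fds_except and its parameters are made up for illustration; this is not the subprocess.py implementation.

```python
import os


def close_fds_except(low, high, keep):
    """Close every descriptor in [low, high) except `keep`.

    os.closerange ignores descriptors that are not open, so no Python
    exception is raised and caught for each unused slot.
    """
    if low <= keep < high:
        os.closerange(low, keep)
        os.closerange(keep + 1, high)
    else:
        os.closerange(low, high)
```

In the terms of the code quoted above, a call would look like close_fds_except(3, MAXFD, but). The range is still walked once, so a very large limit is not free, only cheaper per descriptor.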


--

>Comment By: H. von Bargen (hvbargen)
Date: 2007-02-22 21:16

Message:
Logged In: YES 
user_id=1008979
Originator: YES

Of course I am already closing any files as soon as possible.

I know that I could use FD_CLOEXEC, but this would require that I set it
explicitly for each descriptor that I use in my program. That would be
tedious work and would require platform-specific coding all around the program.
And the whole bunch of Python library functions (e.g. the logging module)
do not use FD_CLOEXEC either.
Right now, more or less the only platform-specific code in the program is
where I call subprocesses, and I would like to keep it that way.
The same is true for the socket module. All sockets are by default
inherited by child processes.
So the only way to prevent unwanted handles from being inherited by child
processes is in fact to specify close_fds=True in subprocess.py.
If you think that a performance patch similar to patch #1607087 makes
no sense, then the close_fds argument should either be marked as deprecated
or at least the documentation should mention that the implementation is
slow for large values of SC_OPEN_MAX.


--

Comment By: Martin v. Löwis (loewis)
Date: 2007-02-21 19:18

Message:
Logged In: YES 
user_id=21627
Originator: NO

I understand you don't want the subprocess to inherit "incorrect" file
descriptors. However, there are other ways to prevent that from happening:
- you should close file descriptors as soon as you are done with the
files
- you should set the FD_CLOEXEC flag on all file descriptors you don't
want to be inherited, using fcntl(fd, F_SETFD, 1)

I understand that there are cases where neither of these strategies is
practical, but if you follow them, the performance will be much better, as
the closing of unused file descriptors is done in the exec(2) implementation
of the operating system.
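
A minimal sketch of the second suggestion, using the standard fcntl module on a Unix-like system. The helper name set_cloexec is made up for illustration; it reads the current descriptor flags first so that any other flags are preserved.

```python
import fcntl


def set_cloexec(fd):
    """Mark fd close-on-exec while preserving its other descriptor flags."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFD)
    fcntl.fcntl(fd, fcntl.F_SETFD, flags | fcntl.FD_CLOEXEC)
```

A descriptor marked this way is closed by the operating system when the child calls exec, so the parent does not have to loop over it.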


--

Comment B

[ python-Bugs-1662581 ] the re module can perform poorly: O(2**n) versus O(n**2)

2007-02-22 Thread SourceForge.net
Bugs item #1662581, was opened at 2007-02-17 15:39
Message generated for change (Comment added) made by greg
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1662581&group_id=5470

Category: Performance
>Group: Feature Request
Status: Open
Resolution: None
>Priority: 3
Private: No
Submitted By: Gregory P. Smith (greg)
Assigned to: Nobody/Anonymous (nobody)
Summary: the re module can perform poorly: O(2**n) versus O(n**2)

Initial Comment:
in short, the re module can degenerate to really really horrid performance.  
See this for how and why:

 http://swtch.com/~rsc/regexp/regexp1.html

running time grows exponentially instead of quadratically.

I don't have a patch so I'm filing this bug as a starting point for future 
work.  The implementation in Modules/_sre.c could be updated to use the 
parallel stepping Thompson approach instead of recursive backtracking.

filing this as a bug until I or someone else comes up with a patch.
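
The blow-up can be reproduced with the family of patterns used in the linked
article: n optional "a"s followed by n required "a"s, matched against a
string of n "a"s. This is a minimal sketch for a current Python 3;
pathological and time_match are hypothetical names, and the values of n are
kept small on purpose because each increment roughly doubles the work for a
backtracking matcher.

```python
import re
import time


def pathological(n):
    """Return (pattern, text): n optional a's then n required a's vs. a^n."""
    return "a?" * n + "a" * n, "a" * n


def time_match(n):
    """Match the pathological pair for n; return (matched, seconds)."""
    pattern, text = pathological(n)
    start = time.perf_counter()
    matched = re.match(pattern, text) is not None
    return matched, time.perf_counter() - start


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        matched, elapsed = time_match(n)
        print("n=%d matched=%s elapsed=%.4fs" % (n, matched, elapsed))
```

The match always succeeds, but only after the engine has tried letting the
optional "a"s consume input and backed out of those attempts.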

--

>Comment By: Gregory P. Smith (greg)
Date: 2007-02-22 14:30

Message:
Logged In: YES 
user_id=413
Originator: YES

yeah this is better as a feature request.  certainly low priority either
way.

-nothing- I propose doing would change the syntax or behaviour of existing
regular expressions at all.  Doing so would be a disaster.  thompson nfa
does not imply changing the behaviour.

anyway, it's a lot more than a simple "patch" to change the re module to
not use backtracking so i expect this to languish unless someone has a lot of
free time and motivation all at once. :)


--

Comment By: Josiah Carlson (josiahcarlson)
Date: 2007-02-22 00:51

Message:
Logged In: YES 
user_id=341410
Originator: NO

I would file this under "feature request"; the current situation isn't so
much buggy, as slow.  While you can produce a segfault with the current
regular expression engine (due to stack overflow), you can do the same
thing with regular Python on Linux (with sys.setrecursionlimit), ctypes,
etc., and none of those are considered as buggy.

My only concern with such a change is that it may or may not change the
semantics of the repeat operators '*' and '+', which are currently defined
as "greedy".  If I skimmed the article correctly late at night, switching
to a Thompson family regular expression engine may result in those
operators no longer being greedy.  Please correct me if I am wrong.
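
The semantics in question are easy to pin down with the current re module.
This is a minimal sketch for a current Python 3 that records the existing
behaviour any replacement engine would have to preserve; the variable names
are illustrative only.

```python
import re

# Greedy repeat: '*' consumes as many characters as it can.
greedy = re.match(r"a*", "aaa").group()

# Non-greedy repeat: '*?' consumes as few characters as it can.
lazy = re.match(r"a*?", "aaa").group()

# Alternation: the leftmost alternative that matches wins,
# even when a later alternative would match more text.
first_alt = re.match(r"a|ab", "ab").group()

if __name__ == "__main__":
    print(repr(greedy), repr(lazy), repr(first_alt))
```

Checks like these could serve as regression tests when comparing a
different matching strategy against the existing engine.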

--

___
Python-bugs-list mailing list 
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[ python-Bugs-1666807 ] Incorrect file path reported by inspect.getabsfile()

2007-02-22 Thread SourceForge.net
Bugs item #1666807, was opened at 2007-02-23 07:08
Message generated for change (Tracker Item Submitted) made by Item Submitter
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=1666807&group_id=5470

Category: Python Library
Group: Python 2.5
Status: Open
Resolution: None
Priority: 5
Private: No
Submitted By: Fernando Pérez (fer_perez)
Assigned to: Nobody/Anonymous (nobody)
Summary: Incorrect file path reported by inspect.getabsfile()

Initial Comment:
The following code demonstrates the problem succinctly:

###
import inspect,sys

print 'Version info:',sys.version_info
print

f1 = inspect.getabsfile(inspect)
f2 = inspect.getabsfile(inspect.iscode)
print 'File for `inspect`   :',f1
print 'File for `inspect.iscode`:',f2
print 'Do these match?',f1==f2
if f1==f2:
    print 'OK'
else:
    print 'BUG - this is a bug in this version of Python'

###  EOF

Running this on my system (Linux, Ubuntu Edgy) with 2.3, 2.4 and 2.5 produces:

tlon[bin]> ./python2.3 ~/code/python/inspect_bug.py
Version info: (2, 3, 6, 'final', 0)

File for `inspect`   : /home/fperez/tmp/local/lib/python2.3/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/lib/python2.3/inspect.py
Do these match? True
OK
tlon[bin]> python2.4 ~/code/python/inspect_bug.py
Version info: (2, 4, 4, 'candidate', 1)

File for `inspect`   : /usr/lib/python2.4/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/bin/inspect.py
Do these match? False
BUG - this is a bug in this version of Python
tlon[bin]> python2.5 ~/code/python/inspect_bug.py
Version info: (2, 5, 0, 'final', 0)

File for `inspect`   : /usr/lib/python2.5/inspect.py
File for `inspect.iscode`: /home/fperez/tmp/local/bin/inspect.py
Do these match? False
BUG - this is a bug in this version of Python


###

The problem arises from the fact that inspect relies, for functions (at least), 
on the func_code.co_filename attribute to contain a complete path.  This 
changed between 2.3 and 2.4, but the inspect module was never updated.  This 
code:

###
import inspect,sys

print 'Python version info:',sys.version_info
print 'File info for `inspect.iscode function`:'
print ' ',inspect.iscode.func_code.co_filename
print
### EOF

shows the problem:

tlon[bin]> ./python2.3 ~/code/python/inspect_bug_details.py
Python version info: (2, 3, 6, 'final', 0)
File info for `inspect.iscode function`:
  /home/fperez/tmp/local//lib/python2.3/inspect.py

tlon[bin]> python2.5 ~/code/python/inspect_bug_details.py
Python version info: (2, 5, 0, 'final', 0)
File info for `inspect.iscode function`:
  inspect.py

###

(2.4 has the same issue).

Basically, if the func_code.co_filename attribute now stores only the final 
filename without the full path, then the logic in the inspect module needs to 
be changed to accommodate this so that correct paths are reported to the user 
like they were in the 2.3 days.
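
One possible direction, shown only as a sketch and not as the fix adopted
for the inspect module, is to resolve the path through the module that
defines the object instead of through co_filename. The example is written
for a current Python 3 (where func_code is spelled __code__), and
module_based_absfile is a hypothetical helper name.

```python
import inspect
import os


def module_based_absfile(obj):
    """Hypothetical helper: absolute source path via the defining module."""
    module = inspect.getmodule(obj)
    filename = getattr(module, "__file__", None)
    if filename is None:
        # No module file available; fall back to what inspect reports,
        # which may be a relative name taken from the code object.
        filename = inspect.getfile(obj)
    return os.path.normcase(os.path.abspath(filename))


if __name__ == "__main__":
    f1 = module_based_absfile(inspect)
    f2 = module_based_absfile(inspect.iscode)
    print("File for `inspect`       :", f1)
    print("File for `inspect.iscode`:", f2)
    print("Do these match?", f1 == f2)
```

Because the module's __file__ is consulted first, the result does not depend
on the current working directory when the code object stores a bare filename.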

--
