Re: [Numpy-discussion] Re: [Python-Dev] Re: Numeric life as I see it
On 10.02.2005, at 05:36, Guido van Rossum wrote:
> And why would a Matrix need to inherit from a C-array? Wouldn't it make
> more sense from an OO POV for the Matrix to *have* a C-array without
> *being* one?

Definitely. Most array operations make no sense on matrices. And matrices are limited to two dimensions. Making Matrix a subclass of Array would be inheritance for implementation while removing 90% of the interface.

On the other hand, a Matrix object is perfectly defined by its behaviour and independent of its implementation. One could perfectly well implement one using Python lists or dictionaries, even though that would be pointless from a performance point of view.

Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25   Fax: +33-1 69 08 82 61
E-Mail: [EMAIL PROTECTED]
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
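[Konrad's has-a point can be sketched in a few lines of Python. The class names and layout below are purely illustrative, not Numeric code; plain Python lists stand in for the C storage.]

```python
class Array:
    # stand-in for the C array type; a Matrix *has* one of these
    def __init__(self, data):
        self.data = list(data)

class Matrix:
    # defined by its behaviour, not its storage: it wraps an Array
    # rather than inheriting the (mostly irrelevant) array interface
    def __init__(self, rows):
        self.shape = (len(rows), len(rows[0]))
        self._storage = Array([x for row in rows for x in row])

    def __getitem__(self, idx):
        i, j = idx
        return self._storage.data[i * self.shape[1] + j]

    def __mul__(self, other):
        # matrix product: the operation that makes a Matrix a Matrix
        n, m = self.shape
        m2, p = other.shape
        assert m == m2
        return Matrix([[sum(self[i, k] * other[k, j] for k in range(m))
                        for j in range(p)] for i in range(n)])

m = Matrix([[1, 2], [3, 4]])
i = Matrix([[1, 0], [0, 1]])
print((m * i)[1, 0])  # 3
```

Swapping the Array for a dictionary-backed or C-backed storage would leave the Matrix interface untouched, which is exactly the point about behaviour being independent of implementation.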
[Python-Dev] Re: [Numpy-discussion] Re: Numeric life as I see it
On 10.02.2005, at 05:09, Travis Oliphant wrote:
> I'm not sure I agree. The ufuncobject is the only place where this
> concern existed (should we trip OverFlow, ZeroDivision, etc. errors
> during array math). Numarray introduced and implemented the concept of
> error modes that can be pushed and popped. I believe this is the right
> solution for the ufuncobject.

Indeed. Note also that the ufunc stuff is less critical to agree on than the array data structure. Anyone unhappy with ufuncs could write their own module and use it instead. It would be the data structure and its access rules that fix the structure of all the code that uses it, so that's what needs to be acceptable to everyone.

> One question we are pursuing is could the arrayobject get into the core
> without a particular ufunc object. Most see this as sub-optimal, but
> maybe it is the only way.

Since all the arithmetic operations are in ufunc, that would be a suboptimal solution, but indeed still a workable one.

> I appreciate some of what Paul is saying here, but I'm not fully
> convinced that this is still true with Python 2.2 and up new-style
> c-types. The concerns seem to be over the fact that you have to
> re-implement everything in the sub-class because the base-class will
> always return one of its objects instead of a sub-class object.

I'd say that such discussions should be postponed until someone proposes a good use for subclassing arrays. Matrices are not one, in my opinion.

Konrad.
--
Konrad Hinsen
Laboratoire Leon Brillouin, CEA Saclay, 91191 Gif-sur-Yvette Cedex, France
Tel.: +33-1 69 08 79 25   Fax: +33-1 69 08 82 61
E-Mail: [EMAIL PROTECTED]
[Python-Dev] Re: [Numpy-discussion] Re: Numeric life as I see it
On Feb 10, 2005, at 10:30 AM, Travis Oliphant wrote:
>>> One question we are pursuing is could the arrayobject get into the
>>> core without a particular ufunc object. Most see this as sub-optimal,
>>> but maybe it is the only way.
>>
>> Since all the arithmetic operations are in ufunc, that would be a
>> suboptimal solution, but indeed still a workable one.
>
> I think replacing basic number operations of the arrayobject should be
> simple, so perhaps a default ufunc object could be worked out for
> inclusion.

I agree; getting it into the core is, among other things, intended to give it broad access, not just to hard-core numeric people. For many uses (including many of my simpler scripts) you don't need the more exotic functionality of ufuncs. You could just do with implementing the standard math functions, possibly leaving out things like reduce. That would be very easy to implement.

>>> I appreciate some of what Paul is saying here, but I'm not fully
>>> convinced that this is still true with Python 2.2 and up new-style
>>> c-types. The concerns seem to be over the fact that you have to
>>> re-implement everything in the sub-class because the base-class will
>>> always return one of its objects instead of a sub-class object.
>>
>> I'd say that such discussions should be postponed until someone
>> proposes a good use for subclassing arrays. Matrices are not one, in
>> my opinion.

Agreed. It is not critical to what I am doing, and I obviously need more understanding before tackling such things. Numeric3 uses the new c-type largely because of the nice getsets table, which is separate from the methods table. This replaces the rather ugly C functions getattr and setattr. I would agree that sub-classing arrays might not be worth the trouble.

Peter
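[The "just the standard math functions, leaving out things like reduce" idea might look like the minimal sketch below. The names are hypothetical; there is no broadcasting, no error modes, and none of the ufunc machinery — which is exactly what makes it easy.]

```python
import operator

class SimpleArray:
    # minimal elementwise arithmetic, with no ufunc machinery:
    # no broadcasting, no reduce/accumulate, no output-type control
    def __init__(self, data):
        self.data = list(data)

    def _elementwise(self, other, op):
        if isinstance(other, SimpleArray):
            return SimpleArray([op(a, b) for a, b in zip(self.data, other.data)])
        return SimpleArray([op(a, other) for a in self.data])  # scalar case

    def __add__(self, other):
        return self._elementwise(other, operator.add)

    def __mul__(self, other):
        return self._elementwise(other, operator.mul)

a = SimpleArray([1, 2, 3])
print((a + a).data)   # [2, 4, 6]
print((a * 10).data)  # [10, 20, 30]
```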
[Python-Dev] RE: [Numpy-discussion] Numeric life as I see it
Paul Dubois wrote:
> Aside: While I am at it, let me reiterate what I have said to the other
> developers privately: there is NO value to inheriting from the array
> class. Don't try to achieve that capability if it costs anything, even
> just effort, because it buys you nothing. Those of you who keep
> remarking on this as if it would simply haven't thought it through IMHO.
> It sounds so intellectually appealing that David Ascher and I had a
> version of Numeric that almost did it before we realized our folly.

To be contrarian, we did find great benefit (at least initially) in inheritance for developing the record array and character array classes, since they share so many structural operations (indexing, slicing, transposes, concatenation, etc.) with numeric arrays. It's possible that the approach that Travis is considering doesn't need to use inheritance to accomplish this (I don't know enough about the details yet), but it sure did save a lot of duplication of implementation.

I do understand what you are getting at. Any numerical array inheritance generally forces one to reimplement all ufuncs and such, and that does make it less useful in that case (though I still wonder if it isn't better than the alternatives).

Perry Greenfield
[Python-Dev] Windows Low Fragmentation Heap yields speedup of ~15%
Dear all,

I'm running a large Zope application on a 1x1GHz CPU, 1GB mem Windows XP Prof machine using Zope 2.7.3 and Py 2.3.4. The application typically builds large lists by appending and extending them. We regularly observed that using a given functionality a second time in the same process was much slower (50%) than when it ran the first time after startup. This behavior improved greatly with Python 2.3 (thanks to the improved Python object allocator, I presume).

Nevertheless, I tried to convert the heap used by Python to a Windows Low Fragmentation Heap (available on XP and 2003 Server). This improved the overall run time of a typical CPU-intensive report by about 15% (overall run time is in the 5-minute range), with the same memory consumption. I consider 15% significant enough to let you know about it.

For information about the Low Fragmentation Heap, see http://msdn.microsoft.com/library/default.asp?url=/library/en-us/memory/base/low_fragmentation_heap.asp

Best regards,
Martin

PS: Since I don't speak C, I used ctypes to convert all heaps in the process to LFH (I don't know how to determine which one is the C heap).

COMIT AG
Risk Management Systems
Pflanzschulstrasse 7
CH-8004 Zürich
Telefon +41 (44) 1 298 92 84
http://www.comit.ch
http://www.quantax.com - Quantax Trading and Risk System
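[The ctypes conversion Martin describes presumably looked roughly like this sketch. This is my reconstruction, not his code: HeapCompatibilityInformation is information class 0, and the value 2 requests the Low Fragmentation Heap; on non-Windows platforms the function below simply does nothing.]

```python
import ctypes
import sys

def enable_lfh():
    """Ask Windows to convert every heap in this process to the
    Low Fragmentation Heap. Returns the number of heaps converted,
    or 0 on non-Windows platforms (where this is a no-op)."""
    if sys.platform != "win32":
        return 0
    kernel32 = ctypes.windll.kernel32
    kernel32.GetProcessHeaps.argtypes = [ctypes.c_uint32,
                                         ctypes.POINTER(ctypes.c_void_p)]
    kernel32.HeapSetInformation.argtypes = [ctypes.c_void_p, ctypes.c_int,
                                            ctypes.c_void_p, ctypes.c_size_t]
    # first call: how many heaps does the process have?
    n = kernel32.GetProcessHeaps(0, None)
    heaps = (ctypes.c_void_p * n)()
    n = kernel32.GetProcessHeaps(n, heaps)
    HEAP_COMPATIBILITY_INFORMATION = 0  # HeapCompatibilityInformation class
    lfh = ctypes.c_ulong(2)             # 2 == Low Fragmentation Heap
    converted = 0
    for h in heaps[:n]:
        if kernel32.HeapSetInformation(h, HEAP_COMPATIBILITY_INFORMATION,
                                       ctypes.byref(lfh), ctypes.sizeof(lfh)):
            converted += 1
    return converted

print(enable_lfh())
```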
[Python-Dev] builtin_id() returns negative numbers
> Maybe it's just a wart we have to live with now; OTOH, the docs
> explicitly warn that id() may return a long, so any code relying on
> "short int"-ness has always been relying on an implementation quirk.

Well, the docs say that %x does unsigned conversion, so they've been relying on an implementation quirk as well ;)

Would it be practical to add new conversion syntax to string interpolation? Like, for example, %p as an unsigned hex number the same size as (void *). Otherwise, unless I misunderstand integer unification, one would just have to strike the distinction between, say, %d and %u.
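[Short of a new %p conversion, the unsigned, pointer-width view of id() can be had in pure Python by masking before formatting. This is only a sketch of the workaround; in current CPython id() no longer returns negative values, but the mask is harmless.]

```python
import ctypes

def hex_id(obj):
    # reinterpret a possibly-negative id() as an unsigned number of
    # pointer width, the way C's %p would print it
    mask = (1 << (8 * ctypes.sizeof(ctypes.c_void_p))) - 1
    return "0x%x" % (id(obj) & mask)

x = object()
print(hex_id(x))
```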
Re: [Python-Dev] subclassing PyCFunction_Type
Nick Rasmussen <[EMAIL PROTECTED]> writes:

[five days ago]
> Should boost::python functions be modified in some way to show up as
> builtin function types or is the right fix really to patch pydoc?

My heart leans towards the latter.

> Is PyCFunction_Type intended to be subclassable?

Doesn't look like it, does it? :) More seriously, "no".

Cheers,
mwh
--
ARTHUR: Don't ask me how it works or I'll start to whimper.
-- The Hitch-Hikers Guide to the Galaxy, Episode 11
[Python-Dev] subclassing PyCFunction_Type
tommy said that this would be the best place to ask this question.

I'm trying to get functions wrapped via boost to show up as builtin types so that pydoc includes them when documenting the module containing them. Right now boost python functions are created using a PyTypeObject such that when inspect.isbuiltin does:

    return isinstance(object, types.BuiltinFunctionType)

isinstance returns 0. Initially I had just modified a local pydoc to document all functions with unknown source modules (since the module can't be deduced for non-python functions), but I figured that the right fix was to get boost::python functions to correctly show up as builtins, so I tried setting PyCFunction_Type as the boost function type object's tp_base. This worked fine for me using linux on amd64, but when my patch was tried out on other platforms, it ran into regression test failures:

http://mail.python.org/pipermail/c++-sig/2005-February/008545.html

So I have some questions:

Should boost::python functions be modified in some way to show up as builtin function types, or is the right fix really to patch pydoc?

Is PyCFunction_Type intended to be subclassable? I noticed that it does not have Py_TPFLAGS_BASETYPE set in its tp_flags. Also, PyCFunction_Type has Py_TPFLAGS_HAVE_GC, and the assertion failures in the testsuite seemed to be centered around object allocation/garbage collection, so is there something related to subclassing a gc-aware class that needs to be happening (currently the boost type object doesn't support garbage collection)?

If subclassing PyCFunction_Type isn't the right way to make these functions be considered as builtin functions, what is?

-nick
Re: [Python-Dev] subclassing PyCFunction_Type
At 02:32 PM 2/11/05 -0800, Nick Rasmussen wrote:
> tommy said that this would be the best place to ask this question
>
> I'm trying to get functions wrapped via boost to show up as builtin
> types so that pydoc includes them when documenting the module
> containing them. Right now boost python functions are created using a
> PyTypeObject such that when inspect.isbuiltin does:
>
>     return isinstance(object, types.BuiltinFunctionType)

FYI, this may not be the "right" way to do this, but since 2.3 'isinstance()' looks at an object's __class__ rather than its type(), so you could perhaps include a '__class__' descriptor in your method type that returns BuiltinFunctionType and see if that works. It's a kludge, but it might let your code work with existing versions of Python.
Re: [Python-Dev] subclassing PyCFunction_Type
On Feb 16, 2005, at 11:02, Phillip J. Eby wrote:
At 02:32 PM 2/11/05 -0800, Nick Rasmussen wrote:
tommy said that this would be the best place to ask
this question
I'm trying to get functions wrapped via boost to show
up as builtin types so that pydoc includes them when
documenting the module containing them. Right now
boost python functions are created using a PyTypeObject
such that when inspect.isbuiltin does:
return isinstance(object, types.BuiltinFunctionType)
FYI, this may not be the "right" way to do this, but since 2.3
'isinstance()' looks at an object's __class__ rather than its type(),
so you could perhaps include a '__class__' descriptor in your method
type that returns BuiltinFunctionType and see if that works.
It's a kludge, but it might let your code work with existing versions
of Python.
It works in Python 2.3.0:
import types

class FakeBuiltin(object):
    __doc__ = property(lambda self: self.doc)
    __name__ = property(lambda self: self.name)
    __self__ = property(lambda self: None)
    __class__ = property(lambda self: types.BuiltinFunctionType)
    def __init__(self, name, doc):
        self.name = name
        self.doc = doc
>>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval"))
Help on built-in function name:
name(...)
name(foo, bar, baz) -> rval
-bob
Re: [Python-Dev] subclassing PyCFunction_Type
At 11:26 AM 2/16/05 -0500, Bob Ippolito wrote:
>>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval"))
Help on built-in function name:
name(...)
name(foo, bar, baz) -> rval
If you wanted to be even more ambitious, you could return FunctionType and
have a fake func_code so pydoc will be able to see the argument signature
directly. :)
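[Phillip's fake-func_code idea can be sketched in current Python, where func_code is spelled __code__: build a real function object with the desired signature, so introspection reads the arguments from its code object instead of falling back to "name(...)". The exec-built helper below is my illustration, not Bob's code.]

```python
import inspect

def fake_builtin(name, argspec, doc):
    # build a genuine function with the desired argument list; its
    # __code__ then carries the signature that pydoc/inspect will show
    ns = {}
    exec("def %s(%s):\n    pass" % (name, argspec), ns)
    f = ns[name]
    f.__doc__ = doc
    return f

f = fake_builtin("frob", "foo, bar, baz=None",
                 "frob(foo, bar, baz=None) -> rval")
print(inspect.signature(f))  # (foo, bar, baz=None)
```

The names `fake_builtin` and `frob` are invented for the example; the point is only that a real code object makes the argument signature visible.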
Re: [Python-Dev] subclassing PyCFunction_Type
On Feb 16, 2005, at 11:43, Phillip J. Eby wrote:
At 11:26 AM 2/16/05 -0500, Bob Ippolito wrote:
>>> help(FakeBuiltin("name", "name(foo, bar, baz) -> rval"))
Help on built-in function name:
name(...)
name(foo, bar, baz) -> rval
If you wanted to be even more ambitious, you could return FunctionType
and have a fake func_code so pydoc will be able to see the argument
signature directly. :)
I was thinking that too, but I didn't have the energy to code it in an
email :)
-bob
[Python-Dev] string find(substring) vs. substring in string
any special reason why "in" is faster if the substring is found, but
a lot slower if it's not in there?
timeit -s "s = 'not there'*100" "s.find('not there') != -1"
100 loops, best of 3: 0.749 usec per loop
timeit -s "s = 'not there'*100" "'not there' in s"
1000 loops, best of 3: 0.122 usec per loop
timeit -s "s = 'not the xyz'*100" "s.find('not there') != -1"
10 loops, best of 3: 7.03 usec per loop
timeit -s "s = 'not the xyz'*100" "'not there' in s"
1 loops, best of 3: 25.9 usec per loop
ps. btw, it's about time we did something about this:
timeit -s "s = 'not the xyz'*100" -s "import re; p = re.compile('not there')"
"p.search(s)"
10 loops, best of 3: 5.72 usec per loop
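[The command-line runs above can be reproduced from the timeit module directly. Numbers will of course differ per machine, and on modern CPython `in` and find share one search implementation, so the gap discussed in this thread no longer shows.]

```python
from timeit import timeit

found = 'not there' * 100        # needle occurs at position 0
missing = 'not the xyz' * 100    # needle never occurs

results = {}
for label, stmt, setup in [
    ("find/hit",  "s.find('not there') != -1", "s = %r" % found),
    ("in/hit",    "'not there' in s",          "s = %r" % found),
    ("find/miss", "s.find('not there') != -1", "s = %r" % missing),
    ("in/miss",   "'not there' in s",          "s = %r" % missing),
]:
    results[label] = timeit(stmt, setup, number=10000)
    print(label, results[label])
```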
RE: [Python-Dev] string find(substring) vs. substring in string
[Fredrik Lundh]
#- any special reason why "in" is faster if the substring is found, but
#- a lot slower if it's not in there?

Maybe because it stops searching when it finds it? The time seems to be very dependent on the position of the first match:

[EMAIL PROTECTED] ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'not there'*100" "'not there' in s"
100 loops, best of 3: 0.222 usec per loop
[EMAIL PROTECTED] ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'blah blah'*20 + 'not there'*100" "'not there' in s"
10 loops, best of 3: 5.54 usec per loop
[EMAIL PROTECTED] ~/ota> python /usr/local/lib/python2.3/timeit.py -s "s = 'blah blah'*40 + 'not there'*100" "'not there' in s"
10 loops, best of 3: 10.8 usec per loop

. Facundo

Bitácora De Vuelo: http://www.taniquetil.com.ar/plog
PyAr - Python Argentina: http://pyar.decode.com.ar/
Re: [Python-Dev] string find(substring) vs. substring in string
Fredrik Lundh wrote:
> any special reason why "in" is faster if the substring is found, but a
> lot slower if it's not in there?

Just guessing here, but in general I would think that it would stop searching as soon as it found it, whereas until then, it keeps looking, which takes more time. But I would also hope that it would be smart enough to know that it doesn't need to look past the 2nd character in 'not the xyz' when it is searching for 'not there' (due to the lengths of the sequences).
Re: [Python-Dev] string find(substring) vs. substring in string
On Wed, Feb 16, 2005 at 01:34:16PM -0700, Mike Brown wrote:
> time. But I would also hope that it would be smart enough to know that it
> doesn't need to look past the 2nd character in 'not the xyz' when it is
> searching for 'not there' (due to the lengths of the sequences).
Assuming stringobject.c:string_contains is the right function, the
code looks like this:
    size = PyString_GET_SIZE(el);
    rhs = PyString_AS_STRING(el);
    lhs = PyString_AS_STRING(a);

    /* optimize for a single character */
    if (size == 1)
        return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL;

    end = lhs + (PyString_GET_SIZE(a) - size);
    while (lhs <= end) {
        if (memcmp(lhs++, rhs, size) == 0)
            return 1;
    }
So it's doing a zillion memcmp()s. I don't think there's a more
efficient way to do this with ANSI C; memmem() is a GNU extension that
searches for blocks of memory. Perhaps saving some memcmps by writing
    if (*lhs++ == *rhs && memcmp(lhs - 1, rhs, size) == 0)
would help.
--amk
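[amk's first-character short-circuit, i.e. only pay for the full comparison when the first byte already matches, can be modeled in Python. This is a sketch of the idea, not the C patch.]

```python
def brute_contains(haystack, needle):
    # brute-force O(n*m) scan as in string_contains, but with the
    # first-character check: do the expensive full comparison only
    # when the first character already matches
    if len(needle) == 0:
        return True
    first = needle[0]
    for i in range(len(haystack) - len(needle) + 1):
        if haystack[i] == first and haystack[i:i + len(needle)] == needle:
            return True
    return False

print(brute_contains('not the xyz' * 100, 'not there'))  # False
print(brute_contains('not there' * 100, 'not there'))    # True
```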
Re: [Python-Dev] string find(substring) vs. substring in string
> Assuming stringobject.c:string_contains is the right function, the
> code looks like this:
>
> size = PyString_GET_SIZE(el);
> rhs = PyString_AS_STRING(el);
> lhs = PyString_AS_STRING(a);
>
> /* optimize for a single character */
> if (size == 1)
>     return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL;
>
> end = lhs + (PyString_GET_SIZE(a) - size);
> while (lhs <= end) {
>     if (memcmp(lhs++, rhs, size) == 0)
>         return 1;
> }
>
> So it's doing a zillion memcmp()s. I don't think there's a more
> efficient way to do this with ANSI C; memmem() is a GNU extension that
> searches for blocks of memory. Perhaps saving some memcmps by writing
>
> if (*lhs++ == *rhs && memcmp(lhs - 1, rhs, size) == 0)
>
> would help.
Which is exactly how s.find() wins this race. (I guess it loses when
it's found by having to do the "find" lookup.) Maybe string_contains
should just call string_find_internal()?
And then there's the question of how the re module gets to be faster
still; I suppose it doesn't bother with memcmp() at all.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] string find(substring) vs. substring in string
Mike Brown wrote:
> Fredrik Lundh wrote:
>> any special reason why "in" is faster if the substring is found, but a
>> lot slower if it's not in there?
>
> Just guessing here, but in general I would think that it would stop
> searching as soon as it found it, whereas until then, it keeps looking,
> which takes more time. But I would also hope that it would be smart
> enough to know that it doesn't need to look past the 2nd character in
> 'not the xyz' when it is searching for 'not there' (due to the lengths
> of the sequences).

There's the Boyer-Moore string search algorithm which is allegedly much faster than a simplistic scanning approach, and I also found this: http://portal.acm.org/citation.cfm?id=79184

So perhaps there's room for improvement :)

--Irmen
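[For reference, the simplified Boyer-Moore-Horspool variant of the algorithm Irmen mentions fits in a few lines of Python. This is an illustration only; a C version would need the skip table sized to the alphabet, which is the unicode objection raised later in the thread.]

```python
def horspool_find(haystack, needle):
    # Boyer-Moore-Horspool: precompute, per character, how far the
    # search window may shift when a comparison fails; typical
    # searches then skip up to len(needle) characters at a time
    m, n = len(needle), len(haystack)
    if m == 0:
        return 0
    skip = {c: m - i - 1 for i, c in enumerate(needle[:-1])}
    i = 0
    while i <= n - m:
        if haystack[i:i + m] == needle:
            return i
        i += skip.get(haystack[i + m - 1], m)
    return -1

s = 'not the xyz' * 100
print(horspool_find(s, 'not there'))  # -1
print(horspool_find(s, 'xyznot'))     # 8
```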
[Python-Dev] Re: string find(substring) vs. substring in string
A.M. Kuchling wrote:
>> time. But I would also hope that it would be smart enough to know that it
>> doesn't need to look past the 2nd character in 'not the xyz' when it is
>> searching for 'not there' (due to the lengths of the sequences).
>
> Assuming stringobject.c:string_contains is the right function, the
> code looks like this:
>
> size = PyString_GET_SIZE(el);
> rhs = PyString_AS_STRING(el);
> lhs = PyString_AS_STRING(a);
>
> /* optimize for a single character */
> if (size == 1)
>     return memchr(lhs, *rhs, PyString_GET_SIZE(a)) != NULL;
>
> end = lhs + (PyString_GET_SIZE(a) - size);
> while (lhs <= end) {
>     if (memcmp(lhs++, rhs, size) == 0)
>         return 1;
> }
>
> So it's doing a zillion memcmp()s. I don't think there's a more
> efficient way to do this with ANSI C; memmem() is a GNU extension that
> searches for blocks of memory.
oops. so whoever implemented contains didn't even bother to look at the
find implementation... (which uses the same brute-force algorithm, but a better
implementation...)
> Perhaps saving some memcmps by writing
>
> if (*lhs++ == *rhs && memcmp(lhs - 1, rhs, size) == 0)
>
> would help.
memcmp still compiles to REP CMPB on many x86 compilers, and the setup
overhead for memcmp sucks on modern x86 hardware; it's usually better to
write your own bytewise comparison...
(and the fact that we're still using brute-force search algorithms in "find" is a bit embarrassing -- note that RE outperforms "in" by a factor of five. guess it's time to finish the split/replace parts of stringlib and produce a patch... ;-)
[Python-Dev] Re: string find(substring) vs. substring in string
Mike Brown wrote:
>> any special reason why "in" is faster if the substring is found, but a
>> lot slower if it's not in there?
>
> Just guessing here, but in general I would think that it would stop
> searching as soon as it found it, whereas until then, it keeps looking,
> which takes more time.

the point was that string.find does the same thing, but is much faster in the "no match" case.

> But I would also hope that it would be smart enough to know that it
> doesn't need to look past the 2nd character in 'not the xyz' when it is
> searching for 'not there' (due to the lengths of the sequences).

note that the target string was "not the xyz"*100, so the search algorithm surely has to look past the second character ;-)

(btw, the benchmark was taken from jim hugunin's ironpython talk, and seems to be carefully designed to kill performance also for more advanced algorithms -- including boyer-moore)
[Python-Dev] Re: string find(substring) vs. substring in string
Guido van Rossum wrote:
> Which is exactly how s.find() wins this race. (I guess it loses when
> it's found by having to do the "find" lookup.) Maybe string_contains
> should just call string_find_internal()?

I somehow suspected that "in" did some extra work in case the "find" failed; guess I should have looked at the code instead... I didn't really expect anyone to use a bad implementation of a brute-force algorithm (O(nm)) when the library already contained a reasonably good version of the same algorithm.

> And then there's the question of how the re module gets to be faster
> still; I suppose it doesn't bother with memcmp() at all.

the benchmark cheats (a bit) -- it builds a state machine (KMP-style) in "compile", and uses that to search in O(n) time. that approach won't fly for "in" and find, of course, but it's definitely possible to make them run a lot faster than RE (i.e. O(n/m) for most cases)...

but refactoring the contains code to use find_internal sounds like a good first step. any takers?
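[The KMP-style approach Fredrik describes, paying O(m) once in "compile" and then scanning in O(n) with no backup, can be sketched as follows. This is my illustration, not sre's actual code.]

```python
def kmp_find(haystack, needle):
    # "compile": failure table f[i] = length of the longest proper
    # prefix of needle[:i+1] that is also a suffix of it
    m = len(needle)
    if m == 0:
        return 0
    f = [0] * m
    k = 0
    for i in range(1, m):
        while k and needle[i] != needle[k]:
            k = f[k - 1]
        if needle[i] == needle[k]:
            k += 1
        f[i] = k
    # "search": a single left-to-right pass that never re-reads haystack
    k = 0
    for i, c in enumerate(haystack):
        while k and c != needle[k]:
            k = f[k - 1]
        if c == needle[k]:
            k += 1
        if k == m:
            return i - m + 1
    return -1

s = 'not the xyz' * 100
print(kmp_find(s, 'not there'))  # -1
print(kmp_find(s, 'the'))        # 4
```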
[Python-Dev] 2.4 func.__name__ breakage
Rev 2.66 of funcobject.c made func.__name__ writable for the first time. That's great, but the patch also introduced what I'm pretty sure was an unintended incompatibility: after 2.66, func.__name__ was no longer *readable* in restricted execution mode.

I can't think of a good reason to restrict reading func.__name__, and it looks like this part of the change was an accident. So, unless someone objects soon, I intend to restore that func.__name__ is readable regardless of execution mode (but it will continue to be unwritable in restricted execution mode). Objections?

Tres Seaver filed a bug report (some Zope tests fail under 2.4 because of this): http://www.python.org/sf/1124295
Re: [Python-Dev] Re: string find(substring) vs. substring in string
> but refactoring the contains code to use find_internal sounds like a
> good first step. any takers?

I'm up for it.

Raymond Hettinger
[Python-Dev] Re: string find(substring) vs. substring in string
> memcmp still compiles to REP CMPB on many x86 compilers, and the setup
> overhead for memcmp sucks on modern x86 hardware

make that "compiles to REPE CMPSB" and "the setup overhead for REPE CMPSB"
[Python-Dev] Re: string find(substring) vs. substring in string
Irmen de Jong wrote:
> There's the Boyer-Moore string search algorithm which is allegedly much
> faster than a simplistic scanning approach, and I also found this:
> http://portal.acm.org/citation.cfm?id=79184
> So perhaps there's room for improvement :)

The problem is setup vs. run. If the question is 'ab' in 'rabcd', Boyer-Moore and other fancy searches will be swamped with prep time. In Fred's comparison with re, he does the re.compile(...) outside of the timing loop. You need to decide what the common case is. The longer the thing you are searching in, the more one-time-only overhead you can afford to reduce the per-search-character cost.

--Scott David Daniels
[EMAIL PROTECTED]
Re: [Python-Dev] Re: string find(substring) vs. substring in string
> The longer the thing you are searching in, the more one-time-only
> overhead you can afford to reduce the per-search-character cost.

Only if you don't find it close to the start.

--
--Guido van Rossum (home page: http://www.python.org/~guido/)
[Python-Dev] Re: string find(substring) vs. substring in string
Fredrik Lundh wrote:
> (btw, the benchmark was taken from jim hugunin's ironpython talk, and
> seems to be carefully designed to kill performance also for more
> advanced algorithms -- including boyer-moore)

Looking for "not there" in "not the xyz"*100 using Boyer-Moore should do about 300 probes once the table is set (the underscores below):

    not the xyznot the xyznot the xyz...
    not ther_
    not the__
    not ther_
    not the__
    not ther_
    ...

--Scott David Daniels
[EMAIL PROTECTED]
[Python-Dev] Re: string find(substring) vs. substring in string
Scott David Daniels wrote:
> Looking for "not there" in "not the xyz"*100 using Boyer-Moore should do
> about 300 probes once the table is set (the underscores below):
>
> not the xyznot the xyznot the xyz...
> not ther_
> not the__
>not ther_
> not the__
> not ther_
> ...
yup; it gets into a 9/2/9/2 rut. tweak the pattern a little, and you get better
results for BM.
("kill" is of course an understatement, but BM usually works better. but it
still
needs a sizeof(alphabet) table, so you can pretty much forget about it if you
want to support unicode...)
Re: [Python-Dev] Windows Low Fragmentation Heap yields speedup of ~15%
Gfeller Martin wrote:
> Nevertheless, I tried to convert the heap used by Python to a Windows
> Low Fragmentation Heap (available on XP and 2003 Server). This improved
> the overall run time of a typical CPU-intensive report by about 15%
> (overall run time is in the 5 minutes range), with the same memory
> consumption.

I must admit that I'm surprised. I would have expected that most allocations in Python go through obmalloc, so the heap would only see "large" allocations. It would be interesting to find out, in your application, why it is still an improvement to use the low-fragmentation heaps.

Regards,
Martin
Re: [Python-Dev] string find(substring) vs. substring in string
Boyer-Moore and variants need a bit of preprocessing on the pattern, which makes them great for long patterns but more costly for short ones.

On Wed, 16 Feb 2005, Irmen de Jong wrote:
> There's the Boyer-Moore string search algorithm which is allegedly much
> faster than a simplistic scanning approach, and I also found this:
> http://portal.acm.org/citation.cfm?id=79184
> So perhaps there's room for improvement :)
>
> --Irmen
Re: [Python-Dev] Windows Low Fragmentation Heap yields speedup of ~15%
On Feb 16, 2005, at 18:42, Martin v. Löwis wrote:
> I must admit that I'm surprised. I would have expected that most
> allocations in Python go through obmalloc, so the heap would only see
> "large" allocations. It would be interesting to find out, in your
> application, why it is still an improvement to use the low-fragmentation
> heaps.

Hmm... This is an excellent point. A grep through the Python source code shows that the following files call the native system malloc (I've excluded a few obviously platform-specific files). A quick visual inspection shows that most of these are using it to allocate some sort of array or string, so it likely *should* go through the system malloc. Gfeller, any idea if you are using any of the modules on this list? If so, it would be pretty easy to try converting them to call the obmalloc functions instead, and see how that affects the performance.

Evan Jones

Demo/pysvr/pysvr.c
Modules/_bsddb.c
Modules/_curses_panel.c
Modules/_cursesmodule.c
Modules/_hotshot.c
Modules/_sre.c
Modules/audioop.c
Modules/bsddbmodule.c
Modules/cPickle.c
Modules/cStringIO.c
Modules/getaddrinfo.c
Modules/main.c
Modules/pyexpat.c
Modules/readline.c
Modules/regexpr.c
Modules/rgbimgmodule.c
Modules/svmodule.c
Modules/timemodule.c
Modules/zlibmodule.c
PC/getpathp.c
Python/strdup.c
Python/thread.c
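The message doesn't give the exact command, but the grep described above would have looked something like this (a hypothetical reconstruction; the directory list and flags are assumptions, run from a CPython source checkout):

```shell
# List C sources that call the system malloc directly, bypassing
# obmalloc/PyMem.  -r recurses, -l prints only matching file names.
grep -rl 'malloc(' Demo Modules PC Python
```

Converting a hit then means swapping the raw `malloc`/`free` calls for the corresponding Python allocator functions, so the allocation is routed through obmalloc's small-object pools.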
Re: [Python-Dev] builtin_id() returns negative numbers
Richard Brodie wrote:
> Otherwise, unless I misunderstand integer unification, one would
> just have to strike the distinction between, say, %d and %u.

Couldn't that be done anyway? The distinction really only makes sense in C, where there's no way of knowing whether the value is signed or unsigned otherwise. In Python the value itself knows whether it's signed or not.

-- 
Greg Ewing, Computer Science Dept, +--------------------------------------+
University of Canterbury,          | A citizen of NewZealandCorp, a       |
Christchurch, New Zealand          | wholly-owned subsidiary of USA Inc.  |
[EMAIL PROTECTED]                  +--------------------------------------+
Re: [Python-Dev] builtin_id() returns negative numbers
> > Otherwise, unless I misunderstand integer unification, one would
> > just have to strike the distinction between, say, %d and %u.
>
> Couldn't that be done anyway? The distinction really only
> makes sense in C, where there's no way of knowing whether
> the value is signed or unsigned otherwise. In Python the
> value itself knows whether it's signed or not.

The time machine is at your service: in Python 2.4 there's no difference. That's integer unification for you!

-- 
--Guido van Rossum (home page: http://www.python.org/~guido/)
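The behaviour Guido describes still holds: since integer unification, `%u` formats identically to `%d` (it is now documented as an obsolete alias), because the value carries its own sign. A quick check:

```python
# %u and %d are interchangeable; the int itself knows whether it's negative.
assert ('%d' % 42) == ('%u' % 42) == '42'
assert ('%d' % -7) == ('%u' % -7) == '-7'

# The thread's original topic: id() can come out negative on some
# platforms when interpreted as a signed word.  Masking to the word
# width (64 bits assumed here) gives the unsigned view of the address.
x = object()
unsigned_id = id(x) & 0xFFFFFFFFFFFFFFFF
assert unsigned_id >= 0
```

In C the format specifier must supply the signedness; in Python it is redundant, which is exactly why striking the distinction cost nothing.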
Re: [Python-Dev] license issues with profiler.py and md5.h/md5c.c
fyi - i've updated the python sha1/md5 openssl patch. it now replaces the entire sha and md5 modules with a generic hashes module that gives access to all of the hash algorithms supported by OpenSSL (including appropriate legacy interface wrappers and falling back to the old code when compiled without openssl). https://sourceforge.net/tracker/index.php?func=detail&aid=1121611&group_id=5470&atid=305470 I don't quite like the module name 'hashes' that i chose for the generic interface (too close to the builtin hash() function). Other suggestions on a module name? 'digest' comes to mind. -greg
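(With hindsight: this work shipped in Python 2.5 under a third name, `hashlib`, neither 'hashes' nor 'digest'. The interface it settled on looks like this — named constructors plus a generic `new()` entry point reaching whatever the underlying OpenSSL build supports:)

```python
import hashlib

# Named constructors wrap the common algorithms directly.
assert hashlib.md5(b'abc').hexdigest() == '900150983cd24fb0d6963f7d28e17f72'
assert hashlib.sha1(b'abc').hexdigest() == \
    'a9993e364706816aba3e25717850c26c9cd0d89d'

# The generic interface takes an algorithm name, covering OpenSSL-provided
# algorithms that have no dedicated constructor.
h = hashlib.new('sha256')
h.update(b'abc')
print(h.hexdigest())
```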
