Re: [Python-Dev] Using and binding relative names (was Re: PEP for Better Control of Nested Lexical Scopes)
Alex Martelli wrote:
> We stole list comprehensions and genexps from Haskell
The idea predates Haskell, I think. I first saw it in Miranda, and it may have come from something even earlier -- SETL, maybe?
Greg
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bytes.from_hex()
Stephen J. Turnbull wrote:
> Greg> I'd be perfectly happy with ascii characters, but in Py3k,
> Greg> the most natural place to keep ascii characters will be in
> Greg> character strings, not byte arrays.
>
> Natural != practical.
That seems to be another thing we disagree about -- to me it seems both natural *and* practical.
The whole business of stuffing binary data down a text channel is a practicality-beats-purity kind of thing. You wouldn't do it if you had a real binary channel available, but if you don't, it's better than nothing.
> The base64 string is a representation of an object
> that doesn't have text semantics.
But the base64 string itself *does* have text semantics. That's the whole point of base64 -- to represent a non-text object *using* text.
To me this is no different than using a string of decimal digit characters to represent an integer, or a string of hexadecimal digit characters to represent a bit pattern. Would you say that those are not text, either?
What about XML? What would you consider the proper data type for an XML document to be inside a Python program -- bytes or text? I'm genuinely interested in your answer to that, because I'm trying to understand where you draw the line between text and non-text.
You seem to want to reserve the term "text" for data that doesn't ever have to be understood even a little bit by a computer program, but that seems far too restrictive to me, and a long way from established usage.
> Nor do base64 strings have text semantics: they can't even
> be concatenated as text ... So if you
> wish to concatenate the underlying objects, the base64 strings must be
> decoded, concatenated, and re-encoded in the general case.
You can't add two integers by concatenating their base-10 character representation, either, but I wouldn't take that as an argument against putting decimal numbers into text files.
Also, even if we follow your suggestion and store our base64-encoded data in byte arrays, we *still* wouldn't be able to concatenate the original data just by concatenating those byte arrays. So this argument makes no sense either way.
> IMO it's not worth preserving the very superficial
> coincidence of "character representation"
I disagree entirely that it's superficial. On the contrary, it seems to me to be the very essence of what base64 is all about.
If there's any "coincidence of representation" it's in the idea of storing the result as ASCII bit patterns in a byte array, on the assumption that that's probably how they're going to end up being represented down the line.
That assumption could be very wrong. What happens if it turns out they really need to be encoded as UTF-16, or as EBCDIC? All hell breaks loose, as far as I can see, unless the programmer has kept very firmly in mind that there is an implicit ASCII encoding involved.
It's exactly to avoid the need for those kinds of mental gymnastics that Py3k will have a unified, encoding-agnostic data type for all character strings.
> I think the fact that favoring the coincidence of representation
> leads you to also deprecate the very natural use of the codec API to
> implement and understand base64 is indicative of a deep problem with
> the idea of implementing base64 as bytes->unicode.
Not sure I'm following you. I don't object to implementing base64 as a codec, only to exposing it via the same interface as the "real" unicode codecs like utf8, etc. I thought we were in agreement about that.
If you're thinking that the mere fact its input type is bytes and its output type is characters is going to lead to its mistakenly appearing via that interface, that would be a bug or design flaw in the mechanism that controls which codecs appear via that interface. It needs to be controlled by something more than just the input and output types.
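The concatenation point above can be checked with a small sketch. It assumes a present-day Python 3 base64 module, where b64encode() returns bytes, so the helper b64_text() (a hypothetical name, not part of any proposal in this thread) decodes the result to a character string. The variable names are illustrative only.

```python
import base64


def b64_text(data: bytes) -> str:
    """Hypothetical helper: base64-encode bytes and return a character string."""
    return base64.b64encode(data).decode("ascii")


first, second = b"a", b"b"

# Concatenating the two encodings is not the encoding of the concatenation,
# whether the encodings are held as text or as byte arrays.
joined_text = b64_text(first) + b64_text(second)
encoded_whole = b64_text(first + second)

# To combine the underlying data: decode, concatenate, re-encode.
recombined = b64_text(
    base64.b64decode(b64_text(first)) + base64.b64decode(b64_text(second))
)

print(joined_text, encoded_whole, recombined)
```

The same mismatch appears if the encoded values are kept as bytes instead of text, which is the sense in which the concatenation argument cuts neither way.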
Greg
Re: [Python-Dev] defaultdict and on_missing()
Greg Ewing wrote:
> Raymond Hettinger wrote:
>> Code that
>> uses next() is more understandable, friendly, and readable without the
>> walls of underscores.
>
> There wouldn't be any walls of underscores, because
>
>     y = x.next()
>
> would become
>
>     y = next(x)
>
> The only time you would need to write underscores is
> when defining a __next__ method. That would be no worse
> than defining an __init__ or any other special method,
> and has the advantage that it clearly marks the method
> as being special.
I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected some day, such that the signature for the special method was "__next__(self, input)" and for the builtin "next(iterator, input=None)"
That would go hand in hand with the idea of allowing the continue statement to accept an argument though.
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
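A rough sketch of what "next(iterator, input=None)" could feel like, written against present-day Python 3, where generators accept input through send(). The helper next_with_input() and the generator running_total() are hypothetical names made up for illustration; this is not the API that was proposed or adopted, only an approximation of the idea.

```python
def running_total():
    """Generator that yields a running sum of the values sent into it."""
    total = 0
    while True:
        value = yield total
        if value is not None:
            total += value


def next_with_input(iterator, value=None):
    """Hypothetical stand-in for next(iterator, input=None).

    Assumes the iterator is a generator; the first call must pass None,
    because a just-started generator cannot receive a value.
    """
    return iterator.send(value)


gen = running_total()
start = next_with_input(gen)        # primes the generator
after_five = next_with_input(gen, 5)
after_three = next_with_input(gen, 3)
unchanged = next_with_input(gen)    # no input: behaves like plain next()
```

With no input the helper behaves like the ordinary builtin, which is what the "input=None" default in the quoted signature suggests.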
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [Was:Re: release plan for 2.5 ?]
Just van Rossum wrote:
> > If bytes support the buffer interface, we get another interesting
> > issue -- regular expressions over bytes. Brr.
>
> We already have that:
>
> >>> import re, array
> >>> re.search('\2', array.array('B', [1, 2, 3, 4])).group()
> array('B', [2])
> >>>
>
> Not sure whether to blame array or re, though...
SRE. iirc, the design rationale was to support RE over mmap'ed regions.
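For illustration, a minimal sketch of a regular expression search over a memory-mapped region, assuming present-day Python 3, where the pattern must be a bytes pattern. An anonymous mapping stands in for a mapped file so the example needs nothing on disk; the variable names are illustrative.

```python
import mmap
import re

# Anonymous 4-byte mapping standing in for a memory-mapped file
# (an assumption made so the example is self-contained).
with mmap.mmap(-1, 4) as region:
    region.write(bytes([1, 2, 3, 4]))

    # The engine reads the mapping through the buffer interface,
    # without first copying it into a bytes object.
    match = re.search(rb"\x02", region)
    position = match.start()
    found = bytes(region[match.start():match.end()])

print(position, found)
```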
Re: [Python-Dev] defaultdict and on_missing()
On 2/27/06, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected
> some day, such that the signature for the special method was "__next__(self,
> input)" and for the builtin "next(iterator, input=None)"
>
> That would go hand in hand with the idea of allowing the continue statement to
> accept an argument though.
Yup. The continue thing we might add in 2.6. The __next__ API in 3.0.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] Current trunk test failures
On Sun, Feb 26, 2006 at 11:36:20PM -0600, Tim Peters wrote:
> The buildbot shows that the debug-build test_grammar is dying with a C
> assert failure on all boxes.
>
> In case it helps, in a Windows release build test_transformer is also failing:
All build/test failures introduced by the PEP 308 patch should be fixed (thanks, Martin!)
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] bytes.from_hex()
> If implementing a mime packer is really the only use case
> for base64, then it might as well be removed from the
> standard library, since 99.9% of all programmers will
> never touch it. Those that do will need to have boned up
I use it quite a bit for image processing (converting to and from the "data:" URL form), and various checksum applications (converting SHA into a string).
Bill
[Python-Dev] Switch to MS VC++ 2005 ?!
Microsoft has recently released their express version of the Visual C++. Given that this version is free for everyone, wouldn't it make sense to ship Python 2.5 compiled with this version ?!
http://msdn.microsoft.com/vstudio/express/default.aspx
I suppose this would make compiling extensions easier for people who don't have a standard VC++ .NET installed.
Note: This is just a thought - I haven't looked into the consequences of building with VC8 yet, e.g. from the list of pre-requisites, it's possible that .NET 2.0 would become a requirement.
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Feb 27 2006)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
On 2/27/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Microsoft has recently released their express version of the Visual C++.
> Given that this version is free for everyone, wouldn't it make sense
> to ship Python 2.5 compiled with this version ?!
>
> http://msdn.microsoft.com/vstudio/express/default.aspx
>
> I suppose this would make compiling extensions easier for people
> who don't have a standard VC++ .NET installed.
It would sure be nice for people like me with "occasional dabbler in Windows" status, so, selfishly, I'd be all in favor. However...:
What I hear from the rumor mill (not perhaps a reliable source) is a bit discouraging about the stability of VS2005 (e.g. internal rebellion at MS in which groups which need to ship a lot of code pushed back against any attempt to make them use VS2005, and managed to win the internal fight and stick with VS2003), but I don't know if any such worry applies to something as simple as the mere compilation of C code...
> Note: This is just a thought - I haven't looked into the consequences
> of building with VC8 yet, e.g. from the list of pre-requisites,
> it's possible that .NET 2.0 would become a requirement.
You mean, to RUN vc8-compiled Python?! That would be perhaps the first C compiler ever unable to produce "native", stand-alone code, wouldn't it?
Alex
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
On Monday 27 February 2006 5:51 pm, Alex Martelli wrote:
> On 2/27/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> > Microsoft has recently released their express version of the Visual C++.
> > Given that this version is free for everyone, wouldn't it make sense
> > to ship Python 2.5 compiled with this version ?!
> >
> > http://msdn.microsoft.com/vstudio/express/default.aspx
> >
> > I suppose this would make compiling extensions easier for people
> > who don't have a standard VC++ .NET installed.
>
> It would sure be nice for people like me with "occasional dabbler in
> Windows" status, so, selfishly, I'd be all in favor. However...:
>
> What I hear from the rumor mill (not perhaps a reliable source) is a
> bit discouraging about the stability of VS2005 (e.g. internal
> rebellion at MS in which groups which need to ship a lot of code
> pushed back against any attempt to make them use VS2005, and managed
> to win the internal fight and stick with VS2003), but I don't know if
> any such worry applies to something as simple as the mere compilation
> of C code...
...but some extension modules are 500,000 lines of C++.
Phil
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
M.-A. Lemburg wrote:
> Microsoft has recently released their express version of the Visual C++.
> Given that this version is free for everyone, wouldn't it make sense
> to ship Python 2.5 compiled with this version ?!
>
> http://msdn.microsoft.com/vstudio/express/default.aspx
The express editions are only "free" until November 7th:
http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing
--
Benji York
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Quoting "M.-A. Lemburg" <[EMAIL PROTECTED]>:
> Microsoft has recently released their express version of the Visual C++.
> Given that this version is free for everyone, wouldn't it make sense
> to ship Python 2.5 compiled with this version ?!
Not in my opinion. People have also commented that they want to continue with this version (i.e. 7.1).
I actually hope that Python can skip VS 2005, and go right away to the next version.
Regards,
Martin
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Alex Martelli wrote:
> On 2/27/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
>> Microsoft has recently released their express version of the Visual C++.
>> Given that this version is free for everyone, wouldn't it make sense
>> to ship Python 2.5 compiled with this version ?!
>>
>> http://msdn.microsoft.com/vstudio/express/default.aspx
>>
>> I suppose this would make compiling extensions easier for people
>> who don't have a standard VC++ .NET installed.
>
> It would sure be nice for people like me with "occasional dabbler in
> Windows" status, so, selfishly, I'd be all in favor. However...:
>
> What I hear from the rumor mill (not perhaps a reliable source) is a
> bit discouraging about the stability of VS2005 (e.g. internal
> rebellion at MS in which groups which need to ship a lot of code
> pushed back against any attempt to make them use VS2005, and managed
> to win the internal fight and stick with VS2003), but I don't know if
> any such worry applies to something as simple as the mere compilation
> of C code...
Should I read this as: VC8 is unstable ? Perhaps that's the reason they decided to give it away for free for the first year.
>> Note: This is just a thought - I haven't looked into the consequences
>> of building with VC8 yet, e.g. from the list of pre-requisites,
>> it's possible that .NET 2.0 would become a requirement.
>
> You mean, to RUN vc8-compiled Python?! That would be perhaps the
> first C compiler ever unable to produce "native", stand-alone code,
> wouldn't it?
Well, the code that VC7 generates relies on MSVCR71.DLL which appears to be part of .NET 1.1. It's hard to tell since I don't have a system around without .NET on it.
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Feb 27 2006)
>>> Python/Zope Consulting and Support ... http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ... http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Quoting "M.-A. Lemburg" <[EMAIL PROTECTED]>:
> > What I hear from the rumor mill (not perhaps a reliable source) is a
> > bit discouraging about the stability of VS2005 (e.g. internal
> > rebellion at MS in which groups which need to ship a lot of code
> > pushed back against any attempt to make them use VS2005, and managed
> > to win the internal fight and stick with VS2003), but I don't know if
> > any such worry applies to something as simple as the mere compilation
> > of C code...
>
> Should I read this as: VC8 is unstable ?
Not sure how Alex interprets this; I think that one of the good reasons not to use VS2005 is that they managed to "break" the C library: change it from standard C in an incompatible way that they think is better for the end user. One of these changes broke Python; we now have a work-around for this breakage.
In addition to changing the library behaviour, they also produce tons of warnings about perfectly correct code.
> Well, the code that VC7 generates relies on MSVCR71.DLL
> which appears to be part of .NET 1.1. It's hard to tell
> since I don't have a system around without .NET on it.
I don't believe .NET 1.1 ships msvcr71.dll. Actually, Microsoft discourages installing msvcr into system32, so that would be against their own guidelines.
Regards,
Martin
Re: [Python-Dev] Pre-PEP: The "bytes" object
Neil Schemenauer wrote:
> Ron Adam <[EMAIL PROTECTED]> wrote:
>> Why was it decided that the unicode encoding argument should be ignored
>> if the first argument is a string? Wouldn't an exception be better
>> rather than give the impression it does something when it doesn't?
>
> From the PEP:
>
> There is no sane meaning that the encoding can have in that
> case. str objects *are* byte arrays and they know nothing about
> the encoding of character data they contain. We need to assume
> that the programmer has provided a str object that already uses
> the desired encoding.
>
> Raising an exception would be a valid option. However, passing the
> string through unchanged makes the transition from str to bytes
> easier.
Does it?
I am quite certain the bytes PEP is dead wrong on this. It should be changed.
Suppose I have code like this:
    def faz(s):
        return s.encode('utf-16be')
If I want to transition from str to bytes, how should I change this code?
    def faz(s):
        return bytes(s, 'utf-16be')  # OOPS - subtle bug
This silently does the wrong thing when s is a str. If I hadn't read
the PEP, I would confidently assume that bytes(str, encoding) ==
bytes(unicode, encoding), modulo the default encoding. I'd be wrong.
But there's a really good reason to think this. Wherever a unicode
argument is expected in Python 2.x, you can pass a str and it'll be
silently decoded. This is an extremely strong convention. It's even
embedded in PyArg_ParseTuple(). I can't think of any exceptions to
the rule, offhand.
Is this special case special enough to break the rules? Arguable. I
suspect not. But even if so, allowing the breakage to pass silently
is surely a mistake. It should just refuse the temptation to guess,
and throw an exception--right?
Now you may be thinking: the str/unicode duality of text, and the
bytes/text duality of the "str" type, are *bad* things, and we're
trying to get rid of them. True. My view is, we'll be rid of them in
3.0 regardless. In the meantime, there is no point trying to pretend
that 2.x "str" is bytes and not text. It just ain't so; you'll only
succeed in confusing people and causing bugs. (And in 3.0 you're
going to turn around and tell them "str" *is* text!)
Good APIs make simple, sensible, comprehensible promises. I like
these promises:
- bytes(arg) works like array.array('b', arg)
- bytes(arg1, arg2) works like bytes(arg1.encode(arg2))
I dislike these promises:
- bytes(s, [ignored]), where s is a str, works like array.array('b', s)
- bytes(u, [encoding]), where u is a unicode,
works like bytes(u.encode(encoding))
It seems more Pythonic to differentiate based on the number of
arguments, rather than the type.
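A minimal sketch of the promises I like, dispatching on the number of arguments only. It assumes a present-day Python 3 interpreter, where the str/unicode duality is already gone, so it only illustrates the shape of the dispatch; make_bytes() is a hypothetical name, not the constructor from the PEP.

```python
def make_bytes(arg, encoding=None):
    """Hypothetical constructor that dispatches on argument count, not type."""
    if encoding is None:
        # One argument: treat arg as a sequence of integers in range(256),
        # much like array.array('B', arg).
        return bytes(arg)
    # Two arguments: always encode, whatever the type of arg. An object
    # with no encode() method fails loudly instead of being passed through.
    return bytes(arg.encode(encoding))


from_ints = make_bytes([1, 2, 3])
from_text = make_bytes("hi", "utf-16be")

try:
    make_bytes(b"hi", "utf-16be")
except AttributeError:
    refused = True
else:
    refused = False
```

The point of the last call is that an encoding passed alongside something that cannot be encoded raises an exception rather than being silently ignored.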
-j
P.S. As someone who gets a bit agitated when the word "Pythonic" or
the Zen of Python is taken in vain, I'd like to know if anyone feels
I've done so here, so I can properly apologize. Thanks.
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
[Alex Martelli wrote]
> What I hear from the rumor mill (not perhaps a reliable source) is a
> bit discouraging about the stability of VS2005 (e.g. internal
> rebellion at MS in which groups which need to ship a lot of code
> pushed back against any attempt to make them use VS2005, and managed
> to win the internal fight and stick with VS2003), but I don't know if
> any such worry applies to something as simple as the mere compilation
> of C code...
As a (perhaps significant) datapoint: the Mozilla guys are moving to building with VS2005. That's lots of C++ and widely run -- though probably not the C runtime so much.
Trent
--
Trent Mick
[EMAIL PROTECTED]
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
M.-A. Lemburg wrote:
> Microsoft has recently released their express version of the Visual C++.
> Given that this version is free for everyone, wouldn't it make sense
> to ship Python 2.5 compiled with this version ?!
>
> http://msdn.microsoft.com/vstudio/express/default.aspx
>
> I suppose this would make compiling extensions easier for people
> who don't have a standard VC++ .NET installed.
it also causes more work for those of us who provide ready-made Windows binaries for more than just the latest and greatest Python release.
if I could choose, I'd use the same compiler for at least one more release...
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Quoting Fredrik Lundh <[EMAIL PROTECTED]>:
> it also causes more work for those of us who provide ready-made Windows
> binaries for more than just the latest and greatest Python release.
>
> if I could choose, I'd use the same compiler for at least one more release...
I find this argument convincing.
Regards,
Martin
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
> if I could choose, I'd use the same compiler for at least one more release...
to clarify, the guideline should be "does the new compiler version add something important ?", rather than just "is there a new version ?"
Re: [Python-Dev] Translating docs
Facundo Batista wrote:
> After a short talk with Raymond yesterday at breakfast, I
> proposed in PyAr the idea of starting to translate the Library Reference.
>
> You'll agree with me that this is a BIG effort. But not only big, it's
> dynamic!
>
> So, we decided that we need a system that provides management of
> the translations. And it'd be a good idea for the system to be available
> for translations into other languages.
>
> One of the guys proposed to use Launchpad (https://launchpad.net/).
>
> The question is, is it ok to use a third party system for this
> initiative? Or do you (we) prefer to host it in-house? Has someone already
> thought of this?
localized editions (with editing support) are definitely within the scope
for a more dynamic library reference platform [1].
with a more granular structure, you can easily track changes on a
method/function level, and dynamically generate pages that suit
the reader ("official english for version X.Y", "experimental norwegian",
"mixed latest english/german", etc).
(but until we get there (if ever), I see no reason not to use an existing
infrastructure, of course).
1) http://effbot.org/zone/pyref.htm
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Quoting Fredrik Lundh <[EMAIL PROTECTED]>:
> to clarify, the guideline should be "does the new compiler version add something
> important ?", rather than just "is there a new version ?"
In this specific case, the new thing added is the availability of Visual Studio Express. Whether this is important, and outweighs the disadvantages, I don't know.
In addition, I'm uncertain whether this is a new feature. I thought you could get the VS 2003 compiler (VC 7.1) with the .NET 1.1 SDK. But maybe I'm misremembering.
Regards,
Martin
[Python-Dev] Making ascii the default encoding
PEP 263 states that in Phase 2 the default encoding will be set to ASCII. Although the PEP is marked final, this isn't actually implemented. The warning about using non-ASCII characters started in 2.3.
Does anyone think we shouldn't enforce the default being ASCII? This means if an # -*- coding: ... -*- is not set and non-ASCII characters are used, an error will be generated.
n
Re: [Python-Dev] str.count is slow
From comp.lang.python:
[EMAIL PROTECTED] wrote:
> It seems to me that str.count is awfully slow. Is there some reason
> for this?
> Evidence:
>
> str.count time test
> import string
> import time
> import array
>
> s = string.printable * int(1e5) # 10**7 character string
> a = array.array('c', s)
> u = unicode(s)
> RIGHT_ANSWER = s.count('a')
>
> def main():
>     print 'str:', time_call(s.count, 'a')
>     print 'array: ', time_call(a.count, 'a')
>     print 'unicode:', time_call(u.count, 'a')
>
> def time_call(f, *a):
>     start = time.clock()
>     assert RIGHT_ANSWER == f(*a)
>     return time.clock()-start
>
> if __name__ == '__main__':
>     main()
>
> ## end
>
> On my machine, the output is:
>
> str: 0.29365715475
> array: 0.448095498171
> unicode: 0.0243757237303
>
> If a unicode object can count characters so fast, why should an str
> object be ten times slower? Just curious, really - it's still fast
> enough for me (so far).
>
> This is with Python 2.4.1 on WinXP.
>
>
> Chris Perkins
Your evidence points to some unoptimized code in the underlying C
implementation of Python. As such, this should probably go to the
python-dev list (http://mail.python.org/mailman/listinfo/python-dev).
The problem is that the C library function memcmp is slow, and
str.count calls it frequently. See lines 2165+ in stringobject.c
(inside function string_count):
    r = 0;
    while (i < m) {
        if (!memcmp(s+i, sub, n)) {
            r++;
            i += n;
        } else {
            i++;
        }
    }
This could be optimized as:
    r = 0;
    while (i < m) {
        if (s[i] == *sub && !memcmp(s+i, sub, n)) {
            r++;
            i += n;
        } else {
            i++;
        }
    }
This tactic typically avoids most (sometimes all) of the calls to
memcmp. Other string search functions, including unicode.count,
unicode.index, and str.index, use this tactic, which is why you see
unicode.count performing better than str.count.
The above might be optimized further for cases such as yours, where a
single character appears many times in the string:
    r = 0;
    if (n == 1) {
        /* optimize for a single character */
        while (i < m) {
            if (s[i] == *sub)
                r++;
            i++;
        }
    } else {
        while (i < m) {
            if (s[i] == *sub && !memcmp(s+i, sub, n)) {
                r++;
                i += n;
            } else {
                i++;
            }
        }
    }
Note that there might be some subtle reason, one that I'm unaware of,
why neither of these optimizations is done... in which case a comment
in the C source would help. :-)
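To see how many full comparisons the first-character check saves, here is a small Python model of the second loop above. It is only a sketch: a slice comparison stands in for memcmp, the loop bound is my own choice rather than the one in stringobject.c, and count_with_prefilter() is a hypothetical name.

```python
def count_with_prefilter(s: bytes, sub: bytes):
    """Count non-overlapping occurrences of sub in s.

    Returns (count, full_compares), where full_compares is the number of
    slice comparisons performed (the stand-in for memcmp calls).
    """
    if not sub:
        raise ValueError("sub must be non-empty")
    n = len(sub)
    last = len(s) - n          # last index where sub could still start
    i = count = full_compares = 0
    while i <= last:
        # Cheap first-byte test; only fall through to the full compare
        # when the first byte already matches.
        if s[i] == sub[0]:
            full_compares += 1
            if s[i:i + n] == sub:
                count += 1
                i += n
                continue
        i += 1
    return count, full_compares


text = b"hello world" * 3
hits, compares = count_with_prefilter(text, b"o")
print(hits, compares, len(text))
```

For a 33-byte input, the loop without the check would run a full comparison at every one of the 33 positions; with the check it runs one only where the first byte matches.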
--Ben
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
"Benji York" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
>> http://msdn.microsoft.com/vstudio/express/default.aspx
>
> The express editions are only "free" until November 7th:
> http://msdn.microsoft.com/vstudio/express/support/faq/default.aspx#pricing
One can keep using any version downloaded before that date, but I would not be surprised to see a bugfix sometime after.
There is also this:
"
2.What can I do with the Express Editions?
...
Evaluate the .NET Framework for Windows and Web development.
"
and this
"
13.Can I develop applications using the Visual Studio Express Editions to target the .NET Framework 1.1?
No, each release of Visual Studio is tied to a specific version of the .NET Framework. The Express Editions can only be used to create applications that run on the .NET Framework 2.0.
"
'Free' is not always free. This appears to be a .NET 2 promotion.
Perhaps the Firefox people are using the professional version, without such a limitation?
Terry Jan Reedy
Re: [Python-Dev] str.count is slow
(manually cross-posting from comp.lang.python)
Ben Cartwright wrote:
> Your evidence points to some unoptimized code in the underlying C
> implementation of Python. As such, this should probably go to the
> python-dev list (http://mail.python.org/mailman/listinfo/python-dev).
> This tactic typically avoids most (sometimes all) of the calls to
> memcmp. Other string search functions, including unicode.count,
> unicode.index, and str.index, use this tactic, which is why you see
> unicode.count performing better than str.count.
it's about time that someone sat down and merged the string and unicode implementations into a single "stringlib" code base (see the SRE sources for an efficient way to do this in plain C). [1]
moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is anyone still stuck with pure C89 these days ?
1) anyone want me to start working on this ?
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
"M.-A. Lemburg" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED]
> Note: This is just a thought - I haven't looked into the consequences
> of building with VC8 yet, e.g. from the list of pre-requisites,
> it's possible that .NET 2.0 would become a requirement.
From the FAQ (see other reply), it appears that this *is* a requirement for the Express editions.
Re: [Python-Dev] str.count is slow
Quoting Fredrik Lundh <[EMAIL PROTECTED]>:
> it's about time that someone sat down and merged the string and unicode
> implementations into a single "stringlib" code base (see the SRE sources for
> an efficient way to do this in plain C). [1]
[...]
> 1) anyone want me to start working on this ?
This would be a waste of time: In Python 3, the string type will be gone (or, rather, the unicode type, depending on the point of view).
Regards,
Martin
Re: [Python-Dev] str.count is slow
[EMAIL PROTECTED] wrote:
> > it's about time that someone sat down and merged the string and unicode
> > implementations into a single "stringlib" code base (see the SRE sources for
> > an efficient way to do this in plain C). [1]
> [...]
> > 1) anyone want me to start working on this ?
>
> This would be a waste of time: In Python 3, the string type will be
> gone (or, rather, the unicode type, depending on the point of view).
no matter what ends up in Python 3, you'll still need to perform operations on both 8-bit buffers and Unicode buffers.
(not to mention that a byte type that doesn't support find/split/count etc is pretty useless).
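As an illustration of the point about needing the same operations on both kinds of buffer, here is a minimal sketch using Python 3 as it was eventually released, where both `bytes` and `str` provide `count`, `find`, and `split`. The sample data and variable names are invented for this example and do not come from the thread.

```python
# Sample data is invented for illustration.
raw = b"spam,eggs,spam"   # 8-bit buffer
text = "spam,eggs,spam"   # Unicode text

# The same three operations, once per buffer type.
byte_hits = raw.count(b"spam")
text_hits = text.count("spam")

byte_pos = raw.find(b"eggs")
text_pos = text.find("eggs")

byte_parts = raw.split(b",")
text_parts = text.split(",")
```

Each operation returns results of the matching type: splitting `bytes` yields `bytes` pieces, and splitting `str` yields `str` pieces.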
Re: [Python-Dev] Switch to MS VC++ 2005 ?!
Quoting Terry Reedy <[EMAIL PROTECTED]>:
> "
> 2.What can I do with the Express Editions?
> ...
> Evaluate the .NET Framework for Windows and Web development.
> "
> and this
>
> "
Yes, but also this:
"""4. Can I use Express Editions for commercial use?
Yes, there are no licensing restrictions for applications built using the Express Editions.
"""
> 13.Can I develop applications using the Visual Studio Express Editions to
> target the .NET Framework 1.1?
> No ...
> 'Free' is not always free. This appears to be a .NET 2 promotion.
Well, this is completely irrelevant for Python. Python does not use any .NET whatsoever (except for IronPython, of course). What framework version the C# links with is irrelevant for the to-native C compiler.
> Perhaps the Firefox people are using the professional version, without such
> a limitation?
I guess the Express version can also build Firefox just fine.
Regards,
Martin
[Python-Dev] quick status report
I made a few more minor revisions to the AST on the plane this afternoon. I'll check them in tomorrow when I get a chance to do a full test run.
* Remove asdl_seq_APPEND. All uses replaced with set
* Fix set_context() comments and check return value everywhere.
* Reimplement real arena for pyarena.c
Jeremy
Re: [Python-Dev] Long-time shy failure in test_socket_ssl
[1/24/06, Tim Peters]
>> ...
>> test_rude_shutdown() is dicey, relying on a sleep() instead of proper
>> synchronization to make it probable that the `listener` thread goes
>> away before the main thread tries to connect, but while that race may
>> account for bogus TestFailed deaths, it doesn't seem possible that it
>> could account for the kind of failure above.
[Tim Peters]
> Well, since it's silly to try to guess about one weird failure when a
> clear cause for another kind of weird failure is known, I checked in
> changes to do "proper" thread synchronization and termination in that
> test. Hasn't failed here since, but that's not surprising (it was
> always a "once in a light blue moon" kind of thing).
Neal plugged another hole later, but-- alas --I have seen the same shy failure since then on WinXP. One of the most recent buildbot test runs saw it too, on a non-Windows box:
http://www.python.org/dev/buildbot/trunk/g5%20osx.3%20trunk/builds/204/step-test/0
test_socket_ssl
test test_socket_ssl crashed -- exceptions.TypeError: 'NoneType' object is not callable
in the second test run there. Still no theory! Maybe we can spend the next 3 days sprinting on it :-)
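The difference between relying on sleep() and synchronizing explicitly can be sketched roughly as follows. This is not the change that was checked in to test_socket_ssl; the function name `run_with_event` and the `log` list are hypothetical, and the sketch only shows the general pattern of waiting on a `threading.Event` and then joining the thread.

```python
import threading


def run_with_event():
    """Hypothetical helper: wait on an Event instead of sleeping."""
    ready = threading.Event()
    log = []

    def listener():
        log.append("listener ready")
        ready.set()  # tell the main thread explicitly that this step is done

    worker = threading.Thread(target=listener)
    worker.start()

    # Block until the listener signals, rather than guessing with time.sleep().
    signalled = ready.wait(timeout=5.0)
    log.append("main proceeds")

    # join() guarantees the listener thread has really gone away.
    worker.join()
    return signalled, log
```

With the event in place, the ordering no longer depends on how long either thread happens to take, and `join()` covers the termination half of the problem.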
Re: [Python-Dev] defaultdict and on_missing()
Nick Coghlan wrote:
> I wouldn't mind seeing one of the early ideas from PEP 340 being resurrected
> some day, such that the signature for the special method was "__next__(self,
> input)" and for the builtin "next(iterator, input=None)"
Aren't we getting an argument to next() anyway? Or was that idea dropped?
Greg
Re: [Python-Dev] bytes.from_hex()
Bill Janssen wrote:
> I use it quite a bit for image processing (converting to and from the
> "data:" URL form), and various checksum applications (converting SHA
> into a string).
Aha! We have a customer!
For those cases, would you find it more convenient for the result to be text or bytes in Py3k?
Greg
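For reference, the two use cases mentioned (a "data:" URL and a SHA checksum as a string) look roughly like this in Python 3 as it was eventually released, where `hashlib`'s `hexdigest()` returns text while `base64.b64encode()` returns bytes. The payload and the `image/png` media type are invented for this sketch.

```python
import base64
import hashlib

# Invented payload standing in for real image data.
payload = b"\x89PNG\r\n"

# Checksum as a string: hexdigest() returns str.
digest_hex = hashlib.sha1(payload).hexdigest()

# Base64 for a data: URL: b64encode() returns bytes, so decode to build text.
b64_bytes = base64.b64encode(payload)
data_url = "data:image/png;base64," + b64_bytes.decode("ascii")

# Going back from hex text to raw bytes.
round_trip = bytes.fromhex(digest_hex)
```

In this sketch the hex form arrives as text and the base64 form as bytes, so the data: URL case needs an explicit `decode("ascii")` before it can be joined with the rest of the URL.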
Re: [Python-Dev] Pre-PEP: The "bytes" object
Jason Orendorff wrote:
> I like these promises:
> - bytes(arg) works like array.array('b', arg)
> - bytes(arg1, arg2) works like bytes(arg1.encode(arg2))
+1. That's exactly how I think it should work, too.
> I dislike these promises:
> - bytes(s, [ignored]), where s is a str, works like array.array('b', s)
> - bytes(u, [encoding]), where u is a unicode,
> works like bytes(u.encode(encoding))
Agreed.
Greg
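The two promises that were liked can be checked against the `bytes` type that eventually shipped in Python 3. This is a sketch for illustration, not part of the pre-PEP: the sample values are invented, and it uses the unsigned typecode 'B' rather than 'b' because Python 3 `bytes` holds values 0..255.

```python
import array

# Sample values are invented for illustration.
# bytes(arg) built from integers, compared with an unsigned byte array.
from_ints = bytes([104, 105])
from_array = array.array("B", [104, 105]).tobytes()

# bytes(arg1, arg2) compared with arg1.encode(arg2).
encoded_via_ctor = bytes("caf\u00e9", "utf-8")
encoded_via_method = "caf\u00e9".encode("utf-8")

# A text string without an encoding is rejected rather than silently converted.
try:
    bytes("abc")
    rejected = False
except TypeError:
    rejected = True
```

The last case corresponds to the disliked promise: with no encoding given, Python 3 raises `TypeError` instead of treating the string as raw bytes.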
Re: [Python-Dev] str.count is slow
Fredrik Lundh wrote:
> moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is anyone
> still stuck with pure C89 these days ?
Some of us actually *prefer* working with plain C when we have a choice, and don't consider ourselves "stuck" with it.
My personal goal in life right now is to stay as far away from C++ as I can get.
If CPython becomes C++-based (C++Python?) I will find it quite distressing, because my most favourite language will then be built on top of my least favourite language.
Greg
Re: [Python-Dev] str.count is slow
Greg Ewing wrote:
> Fredrik Lundh wrote:
>
> > moving to (basic) C++ might also be a good idea (in 3.0, perhaps). is anyone
> > still stuck with pure C89 these days ?
>
> Some of us actually *prefer* working with plain C
> when we have a choice, and don't consider ourselves
> "stuck" with it.
perhaps, but polymorphic code is a lot easier to write in C++ than in C.
> My personal goal in life right now is to stay as
> far away from C++ as I can get.
so what C compiler are you using ?
