Re: [Python-Dev] PEP 383: Non-decodable Bytes in System Character Interfaces
2009/4/22 "Martin v. Löwis":
> To convert non-decodable bytes, a new error handler "python-escape" is
> introduced, which decodes non-decodable bytes using into a private-use
> character U+F01xx, which is believed to not conflict with private-use
> characters that currently exist in Python codecs.

Why not use U+DCxx for non-UTF-8 encodings too?

Overall I like the PEP: I think it's the best proposal so far that doesn't put a heavy burden on applications that only want to do simple things with the API.

--
Lino Mastrodomenico
--
http://mail.python.org/mailman/listinfo/python-list
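For illustration, here is what the U+DCxx approach looks like in practice. This is a small sketch, assuming Python 3.1 or later, where the handler is spelled "surrogateescape" rather than the "python-escape" name used in the draft quoted above. The file name is made up for the example.

```python
# Assumption: "surrogateescape" (Python 3.1+) stands in for the draft's
# "python-escape" handler. The byte string below is an invented example.
raw = b"caf\xe9.txt"  # Latin-1 encoded name; 0xE9 is not valid UTF-8 here

as_utf8 = raw.decode("utf-8", "surrogateescape")
as_ascii = raw.decode("ascii", "surrogateescape")

# Both codecs map the undecodable byte 0xE9 to the lone surrogate U+DCE9.
assert as_utf8 == "caf\udce9.txt"
assert as_ascii == "caf\udce9.txt"

# Encoding with the same handler restores the original bytes.
assert as_utf8.encode("utf-8", "surrogateescape") == raw
assert as_ascii.encode("ascii", "surrogateescape") == raw
```

The same U+DCxx mapping is used for both codecs here, which is the behaviour the question above asks about.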
Re: Code that ought to run fast, but can't due to Python limitations.
2009/7/5 Hendrik van Rooyen:
> I cannot see how you could avoid a python function call - even if he
> bites the bullet and implements my laborious scheme, he would still
> have to fetch the next character to test against, inside the current state.
>
> So if it is the function calls that is slowing him down, I cannot
> imagine a solution using less than one per character, in which
> case he is screwed no matter what he does.

A simple solution may be to read the whole input HTML file into a string. This potentially requires lots of memory, but I suspect that by far the most common use case for this parser is to build a DOM (or DOM-like) tree of the whole document. This tree usually requires much more memory than the HTML source itself.

So, if the code duplication is acceptable, I suggest keeping this implementation for cases where the input is extremely big *AND* the whole program will work on it in a streaming fashion, not just the parser itself.

Then write a simpler and faster parser for the more common case when the data is not huge *OR* the user will keep the whole document in memory anyway (e.g. in a tree).

Also: profile, profile a lot. HTML pages are very strange beasts and the bottlenecks may be in innocent-looking places!

--
Lino Mastrodomenico
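To make the "whole document in one string" idea concrete, here is a rough sketch. The regular expression and the tokenize() helper are hypothetical and deliberately naive; they are not the parser discussed in this thread. The point is only that the scan happens inside the regex engine, so there is no Python-level call per character, and that cProfile can then show where the time really goes.

```python
import cProfile
import io
import pstats
import re

# Hypothetical, deliberately naive pattern: it only separates tags from text
# and ignores comments, CDATA, scripts, and malformed markup.
TOKEN_RE = re.compile(r"<[^>]*>|[^<]+")


def tokenize(html):
    """Return (kind, text) pairs for a document already held in one string."""
    # findall() scans inside the regex engine, so Python-level work happens
    # once per token rather than once per character.
    return [("tag" if tok.startswith("<") else "text", tok)
            for tok in TOKEN_RE.findall(html)]


sample = "<p>Hello <b>world</b></p>"
tokens = tokenize(sample)

# Profile the call on a larger input to see where the time actually goes.
profiler = cProfile.Profile()
profiler.enable()
tokenize(sample * 10000)
profiler.disable()

report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

A streaming parser would still be needed for inputs that do not fit in memory; this sketch covers only the in-memory case.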
Re: memoization module?
2009/7/5 kj:
> Is there a memoization module for Python? I'm looking for something
> like Mark Jason Dominus' handy Memoize module for Perl.

Check out the "memoized" class example here:

<http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize>

--
Lino Mastrodomenico
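For readers who want the general shape without following the link, here is a minimal sketch of a memoizing decorator. It is written as a function rather than a class and is not a copy of the wiki recipe; the names memoized, fib and calls are only illustrative, and it handles positional arguments only.

```python
import functools


def memoized(func):
    """Cache results of func, keyed by its positional arguments (sketch)."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args):
        try:
            return cache[args]
        except KeyError:
            # First call with these arguments: compute and remember the result.
            result = cache[args] = func(*args)
            return result
        except TypeError:
            # Unhashable arguments (e.g. a list) cannot be dict keys,
            # so fall back to calling the function without caching.
            return func(*args)

    return wrapper


calls = []  # records every real (non-cached) invocation


@memoized
def fib(n):
    calls.append(n)
    return n if n < 2 else fib(n - 1) + fib(n - 2)


assert fib(30) == 832040
assert len(calls) == 31  # each n from 0 to 30 is computed exactly once
```

On Python 3.2 and later, functools.lru_cache in the standard library covers the same need without any hand-written code.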
Re: missing 'xor' Boolean operator
2009/7/16 Hendrik van Rooyen:
> "Hrvoje Niksic" wrote:
>
>> Note that in Python A or B is in fact not equivalent to not(not A and
>> not B).
>
> De Morgan would turn in his grave.

If it makes him any happier, in Python (not (not a and not b)) *is* equivalent to bool(a or b). (Modulo crazy things like redefining "bool" or having a __bool__ with side effects.)

In the first expression you implicitly request a bool because you use "not"; in the second one you do this with an explicit "bool".

--
Lino Mastrodomenico
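The claim is easy to check with a few lines. This is just a quick sketch; the helper name de_morgan_or and the list of sample values are made up for the example.

```python
import itertools


def de_morgan_or(a, b):
    """The De Morgan form of "a or b" (hypothetical helper name)."""
    return not (not a and not b)


# A mix of truthy and falsy values of different types.
values = [0, 1, "", "x", None, [], [0]]

for a, b in itertools.product(values, repeat=2):
    # The De Morgan form always yields a bool and matches bool(a or b).
    assert de_morgan_or(a, b) == bool(a or b)
    assert type(de_morgan_or(a, b)) is bool

# "or" itself returns one of its operands, which is why the plain forms differ.
assert ("x" or 0) == "x"
assert (0 or []) == []
assert de_morgan_or("x", 0) is True
```

So the two forms agree on truth value and differ only in that "or" hands back an operand instead of a bool.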