Re: [Python-Dev] bytes type discussion
Adam Olsen wrote:
> My assumption is these would become errors in 3.x. bytes(str) is only
> needed so you can do bytes(u"abc".encode('utf-8')) and have it work in
> 2.x and 3.x.
I think the proposal for bytes(seq) to mean bytes(map(ord, seq))
was meant to be valid for both 2.x and 3.x, on the grounds that
you should be able to write byte string constants in the same
way in all versions.
> (I wonder if maybe they should be an error in 2.x as well. Source
> encoding is for unicode literals, not str literals.)
Source encoding applies to the entire source code, including (byte)
string literals, comments, identifiers, and keywords. IOW, if you
declare your source encoding is utf-8, the keyword "print" must
be represented with the bytes that represent the Unicode letters
for "p","r","i","n", and "t" in UTF-8.
Regards,
Martin
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe:
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
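The constructor spellings debated above can be tried against the bytes type that eventually shipped in Python 3. The sketch below assumes Python 3 and is only an illustration; it does not reproduce the 2006 proposal, and in particular bytes(str) without an encoding is an error there rather than a shorthand for bytes(map(ord, seq)). The helper name constructor_error is made up for this example.

```python
# Assumes Python 3; the proposals discussed in this thread differ in places.
from_ints = bytes([97, 98, 99])              # sequence of integers in range(256)
from_ords = bytes(map(ord, "abc"))           # the bytes(map(ord, seq)) spelling
from_text = bytes("abc", "utf-8")            # text plus an explicit encoding
from_encode = bytes("abc".encode("utf-8"))   # spelling meant to work in 2.x and 3.x


def constructor_error(*args):
    """Return the exception type raised by bytes(*args), or None."""
    try:
        bytes(*args)
    except (TypeError, ValueError) as exc:
        return type(exc)
    return None


no_encoding = constructor_error("abc")       # str without an encoding
out_of_range = constructor_error([256])      # integer outside 0..255
```

All four spellings produce the same three bytes; the two error cases show where Python 3 refuses to guess.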
Re: [Python-Dev] bytes type discussion
Greg Ewing wrote:
> If the protocol has been sensibly designed, that shouldn't
> happen, since everything up to the coding marker should
> be ascii (or some other protocol-defined initial coding).
XML, for one protocol, requires you to start over. The initial
sequence could be UTF-16, or it could be EBCDIC. You read a few bytes
(up to four), then know which of these it is. Then you start over,
reading further if it looks like an ASCII superset, to find out the
real encoding. You normally then start over, although switching at
that point could also work.
> For protocols that are not sensibly designed (or if you're
> just trying to guess) what you suggest may be needed. But
> it would be good to have a nicer way of going about it
> for when the protocol is sensible.
There might be buffering of decoded strings already (i.e. beyond
the point to which you have read), so you would need to unbuffer
these, and reinterpret them. To support that, you really need to
buffer both the original bytes and the decoded ones, since the
encoding might not roundtrip.
Regards,
Martin
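The "read up to four bytes, then start over" procedure can be sketched roughly as follows. This is a simplified illustration, not a conforming XML parser: the byte patterns follow the autodetection appendix of the XML 1.0 specification, UTF-32 is ignored, the whole document is assumed to be in memory, and the names sniff_family and detect_encoding are hypothetical.

```python
import re

# First-bytes signatures, checked in order (UTF-32 deliberately left out).
_SIGNATURES = [
    (b"\xef\xbb\xbf", "utf-8"),        # UTF-8 byte order mark
    (b"\xff\xfe", "utf-16-le"),        # UTF-16 little-endian BOM
    (b"\xfe\xff", "utf-16-be"),        # UTF-16 big-endian BOM
    (b"<\x00?\x00", "utf-16-le"),      # "<?" in UTF-16-LE without a BOM
    (b"\x00<\x00?", "utf-16-be"),      # "<?" in UTF-16-BE without a BOM
    (b"\x4c\x6f\xa7\x94", "cp037"),    # "<?xm" in one EBCDIC flavour
]

_DECLARED = re.compile(r"encoding\s*=\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']")


def sniff_family(head):
    """Classify the first (up to four) bytes of a document."""
    for signature, family in _SIGNATURES:
        if head.startswith(signature):
            return family
    return "ascii"  # assume some ASCII superset


def detect_encoding(data):
    """Guess the encoding of an XML document held entirely in `data`."""
    family = sniff_family(data[:4])
    # Start over from the original bytes: decode just enough with the
    # sniffed family to read the encoding declaration, if there is one.
    prefix = data[:200].decode(family, errors="replace")
    match = _DECLARED.search(prefix)
    if match:
        return match.group(1).lower()
    # With no declaration and no other hint, XML defaults to UTF-8.
    return "utf-8" if family == "ascii" else family
```

Because the function always works from the original bytes, nothing decoded under the provisional encoding has to be "unbuffered"; a streaming reader would need to keep those bytes around for the same reason.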
Re: [Python-Dev] 2.5 release schedule
Neal Norwitz wrote:
> What do people think about that? There are still a lot of features we
> want to add. Is this ok with everyone? Do you think it's realistic?
My view on schedules is that they need to exist, whether they are
followed or not. So having one is orders of magnitude better than
having none. This specific one "looks right" also.
> We still need a release manager. No one has heard from Anthony. If
> he isn't interested is someone else interested in trying their hand at
> it?
He might be on vacation, no need to worry yet. If he doesn't want to
do it, I would.
Regards,
Martin
Re: [Python-Dev] 2.5 PEP
Hi,
2 questions:
- is (c)ElementTree still planned for inclusion?
- isn't the current implementation of itertools.tee (a cache of
  previously generated values) incompatible with the new possibility
  to feed a generator (PEP 342)?
Regards
Neal Norwitz wrote:
> Attached is the 2.5 release PEP 356. It's also available from:
> http://www.python.org/peps/pep-0356.html
>
> Does anyone have any comments? Is this good or bad? Feel free to
> send to me comments.
>
> We need to ensure that PEPs 308, 328, and 343 are implemented. We
> have possible volunteers for 308 and 343, but not 328. Brett is doing
> 352 and Martin is doing 353.
>
> We also need to resolve a bunch of other implementation details about
> providing the C AST to Python, bdist_* issues and a few more possible
> stdlib modules. Don't be shy, tell the world what you think about
> these.
>
> Can someone go through PEP 4 and 11 and determine what work needs to be
> done?
>
> The more we distribute the work, the easier it will be on everyone.
> You don't really want to listen to me whine any more do you? ;-)
>
> Thank you,
Re: [Python-Dev] bdist_* to stdlib?
Bob Ippolito wrote:
> ** The exception is scripts. Scripts go wherever --install-scripts=
> point to, and AFAIK there is no means to ensure that the scripts from
> one egg do not interfere with the scripts for another egg or anything
> else on the PATH. I'm also not sure what the uninstallation story
> with scripts is.
Hopefully PEP 338 will go some way towards fixing that - in Python 2.5,
the '-m' switch should be able to run modules inside eggs as scripts,
reducing the need to install them directly into the filesystem.
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
[Python-Dev] C AST to Python discussion
As per Neal's prodding email, here is a thread to discuss where we
want to go with the C AST to Python stuff and what I think are the
core issues at the moment.
First issue is the ast-objects branch. Work is being done on it, but
it still leaks some references (Neal or Martin can correct me if I am
wrong). We really should choose either this branch or the current
solution before really diving into coding stuff for exposing the AST
so as to not waste too much time.
Basically the issues are that the current solution will require using
a serialization form to go from C to Python and back again. The
PyObjects solution in the branch won't need this. Serialization
protects us from ending up with an unusable AST, since it can keep the
original AST around, and if the version passed back in from Python
code is junk it can be tossed and the original version used. The
PyObjects branch most likely won't have this since the actual AST
will most likely be passed to Python code. But there are performance
issues with all of this serialization compared to a simple PyObject
pointer into Pythonland. Jeremy supports the serialization option. I
am personally indifferent while leaning towards the serialization.
Then there is the API. First we need to decide if AST modification is
allowed or not. It has been argued on my blog by someone (see
http://sayspy.blogspot.com/2006/02/possibilities-of-ast.html for the
entry on this whole topic which highly mirrors this email) that Guido
won't okay AST transformations since it can lead to control flow
changes behind the scenes. I say that is fine as long as it is
sufficiently obvious that AST transformations are occurring. I say
allow transformations.
Once that is settled, I see three places for possible access to the
AST. One is the command line like -m. Totally obvious to the user as
long as they are not just working off of the .pyc files. Next is
something like sys.ast_transformations that is a list of functions
that are passed in the AST (and return a new version if modifications
are allowed). This could allow chaining of AST transformations by
feeding the next function with the output of the previous one. Next
is per-object AST access. This could get expensive since if we don't
keep a copy of the AST with the code objects (which we probably
shouldn't since that is wasted memory if the AST is not used a lot) we
will need to read the code a second time to get the AST regenerated.
I personally think we should choose an initial global access API to
the AST as a starting API. I like the sys.ast_transformations idea
since it is simple and gives enough access that whether read-only or
read-write is allowed something like PyChecker can get the access it
needs. It also allows for simple Python scripts that can install the
desired functions and then compile or check the passed-in files.
Obviously write access would be needed for optimization stuff (such
as if the peepholer was rewritten in Python and used by default), but
we can also expose this later if we want.
In terms of 2.5, I think we really need to settle on the fate of the
ast-objects branch. If we can get the very basic API for exposing the
AST to Python code in 2.5 that would be great, but I don't view that
as being as critical as choosing the final AST implementation style,
since wasting work on a version that will disappear would just plain
suck. It would be great to resolve this before the PyCon sprints
since a good chunk of the AST-caring folk will be there for at least
part of the time.
-Brett
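The chaining idea behind sys.ast_transformations can be sketched with the ast module found in later Python versions. No such sys attribute exists; the module-level list below is a hypothetical stand-in, and DoubleIntegers and compile_with_transformations are names invented for this illustration. It assumes Python 3.8 or newer (for ast.Constant).

```python
import ast

# Hypothetical stand-in for the proposed sys.ast_transformations list.
ast_transformations = []


class DoubleIntegers(ast.NodeTransformer):
    """Toy transformation: replace every integer constant n with 2 * n."""

    def visit_Constant(self, node):
        if isinstance(node.value, int) and not isinstance(node.value, bool):
            return ast.copy_location(ast.Constant(value=node.value * 2), node)
        return node


def compile_with_transformations(source, filename="<sketch>"):
    """Parse source, run every registered transformation, then compile."""
    tree = ast.parse(source, filename)
    for transform in ast_transformations:
        tree = transform(tree)  # each function receives the previous output
    ast.fix_missing_locations(tree)
    return compile(tree, filename, "exec")


ast_transformations.append(DoubleIntegers().visit)
namespace = {}
exec(compile_with_transformations("x = 1 + 20"), namespace)
```

A read-only tool in the PyChecker mould would register a function that inspects the tree and returns it unchanged.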
Re: [Python-Dev] bytes type discussion
On 2/15/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Adam Olsen wrote:
> > (I wonder if maybe they should be an error in 2.x as well. Source
> > encoding is for unicode literals, not str literals.)
>
> Source encoding applies to the entire source code, including (byte)
> string literals, comments, identifiers, and keywords. IOW, if you
> declare your source encoding is utf-8, the keyword "print" must
> be represented with the bytes that represent the Unicode letters
> for "p","r","i","n", and "t" in UTF-8.
Although it does apply to the entire source file, I think this is more
for convenience (try telling an editor that only a single line is
Shift_JIS!) than to allow 8-bit (or 16-bit?!) str literals. Indeed,
you could have arbitrary 8-bit str literals long before the source
encoding was added. Keywords and identifiers continue to be limited
to ascii characters (even if they make a roundtrip through other
encodings), and comments continue to be ignored.
Source encoding exists so that you can write u"123" with the encoding
stated once at the top of the file, rather than "123".decode('utf-8')
with the encoding repeated everywhere.
Making it an error to have 8-bit str literals in 2.x would help
educate the user that they will change behavior in 3.0 and not be
8-bit str literals anymore.
--
Adam Olsen, aka Rhamphoryncus
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
> "M" == "M.-A. Lemburg" <[EMAIL PROTECTED]> writes:
M> James Y Knight wrote:
>> Nice and simple.
M> Albeit, too simple.
M> The above approach would basically remove the possibility to
M> easily create bytes() from literals in Py3k, since literals in
M> Py3k create Unicode objects, e.g. bytes("123") would not work
M> in Py3k.
No, it just rules out a builtin easy way to create bytes() from
literals.
But who needs to do that? codec writers and people implementing wire
protocols with bytes() that look like character strings but aren't.
OK, so this makes life hard on codec writers. But those implementing
wire protocols can use existing codecs, presumably 'ascii' will do 99%
of the time:
    def make_wire_token(unicode_string, encoding='ascii'):
        return bytes(unicode_string.encode(encoding))
Everybody else is just asking for trouble by using bytes() for
character strings. It would really be desirable to have "string" be a
Unicode literal in Py3k, and u"string" a syntax error.
M> To prevent [people from learning to write "bytes('string')" in
M> 2.x and expecting that to work in Py3k], you'd have to outrule
M> bytes() construction from strings altogether, which doesn't
M> look like a viable option either.
Why not? Either bytes() are the same as strings, in which case why
change the name? or they're not, in which case we ask people to jump
through the required hoops to create them. Maybe I'm missing some
huge use case, of course, but it looks to me like the use cases are
pretty specialized, and are likely to involve explicit coding anyway.
--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Ask not how you can "do" free software business;
ask what your business can "do for" free software.
Re: [Python-Dev] Please comment on PEP 357 -- adding nb_index slot to PyNumberMethods
Travis E. Oliphant wrote:
> 3) A new C-API function PyNumber_Index will be added with signature
>
>    Py_ssize_t PyNumber_index (PyObject *obj)
>
There's a typo in the function name here.
Other than that, the PEP looks pretty much fine to me. About the only
other quibble is that it could arguably do with a link to the thread
where we discussed (and discarded) 'discrete' and 'ordinal' as
alternative names (you mention the discussion, but don't give a
reference).
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
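On the Python side, the slot under review surfaces as the __index__ special method. A minimal sketch, assuming a Python version in which PEP 357 is implemented; the Ordinal class is hypothetical and exists only for this example.

```python
import operator


class Ordinal:
    """Hypothetical integer-like type that is not an int subclass."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        # Called wherever Python needs a true integer index.
        return self.value


letters = list("abcdef")
picked = letters[Ordinal(2)]                # plain indexing
sliced = letters[Ordinal(1):Ordinal(4)]     # slice bounds
as_int = operator.index(Ordinal(5))         # explicit conversion


def float_error():
    """Floats do not define __index__, so they are rejected."""
    try:
        operator.index(2.0)
    except TypeError as exc:
        return type(exc)
    return None
```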
Re: [Python-Dev] 2.5 release schedule
> We still need a release manager. No one has heard from Anthony.
It is the peak of the summer down here. Perhaps he is lucky enough to
be enjoying it away from computers for a while?
=Tony.Meyer
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Guido van Rossum wrote:
> If bytes support the buffer interface, we get another interesting
> issue -- regular expressions over bytes. Brr.
We already have that:
>>> import re, array
>>> re.search('\2', array.array('B', [1, 2, 3, 4])).group()
array('B', [2])
>>>
Not sure whether to blame array or re, though...
Just
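For comparison, regular expressions over bytes did end up supported in Python 3, provided the pattern is itself a bytes object. A small sketch assuming Python 3; it uses a bytes subject rather than array.array so that the result type is unambiguous, and mixed_types_error is a name made up for this example.

```python
import re

data = bytes([1, 2, 3, 4])
match = re.search(b"\x02", data)   # bytes pattern over bytes data
found = match.group()
position = match.start()


def mixed_types_error():
    """A str pattern applied to bytes is rejected outright."""
    try:
        re.search("\x02", data)
    except TypeError as exc:
        return type(exc)
    return None
```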
Re: [Python-Dev] C AST to Python discussion
Brett Cannon wrote:
> One protects us from ending up with an unusable AST since
> the seralization can keep the original AST around and if the version
> passed back in from Python code is junk it can be tossed and the
> original version used.
I don't understand why this is an issue. If Python code produces junk
and tries to use it as an AST, then it's buggy and deserves what it
gets. All the AST compiler should be responsible for is to try not to
crash the interpreter under those conditions. But that's true
whatever method is used for passing ASTs from Python to the compiler.
> The PyObjects branch most likely won't have
> this since the actual AST will most likely be passed to Python code.
> But there is performance issues with all of this seralization compared
> to a simple Pyobject pointer into Pythonland. Jeremy supports the
> serialization option. I am personally indifferent while leaning
> towards the serialization.
>
> Then there is the API. First we need to decide if AST modification is
> allowed or not. It has been argued on my blog by someone (see
> http://sayspy.blogspot.com/2006/02/possibilities-of-ast.html for the
> entry on this whole topic which highly mirrors this email) that Guido
> won't okay AST transformations since it can lead to control flow
> changes behind the scenes. I say that is fine as long as knowing that
> AST transformations are occurring are sufficiently obvious. I say
> allow transformations.
>
> Once that is settled, I see three places for possible access to the
> AST. One is the command line like -m. Totally obvious to the user as
> long as they are not just working off of the .pyc files. Next is
> something like sys.ast_transformations that is a list of functions
> that are passed in the AST (and return a new version if modifications
> are allowed). This could allow chaining of AST transformations by
> feeding the next function with another one. Next is per-object AST
> access. This could get expensive since if we don't keep a copy of the
> AST with the code objects (which we probably shouldn't since that is
> wasted memory if the AST is not used a lot) we will need to read the
> code a second time to get the AST regenerated.
>
> I personally think we should choose an initial global access API to
> the AST as a starting API. I like the sys.ast_transformations idea
> since it is simple and gives enough access that whether read-only or
> read-write is allowed something like PyChecker can get the access it
> needs. It also allows for simple Python scripts that can install the
> desired functions and then compile or check the passed-in files.
> Obviously write accesss would be needed for optimization stuff (such
> as if the peepholer was rewritten in Python and used by default), but
> we can also expose this later if we want.
>
> In terms of 2.5, I think we really need to settle on the fate of the
> ast-objects branch. If we can get the very basic API for exposing the
> AST to Python code in 2.5 that would be great, but I don't view that
> as critical as choosing on the final AST implementation style since
> wasting work on a version that will disappear would just plain suck.
> It would be great to resolve this before the PyCon sprints since a
> good chunk of the AST-caring folk will be there for at least part of
> the time.
>
> -Brett
Re: [Python-Dev] bytes type discussion
Bob Ippolito wrote:
> On Feb 14, 2006, at 4:17 PM, Guido van Rossum wrote:
>> (Why would you even think about views here? They are evil.)
>
> I mention views because that's what numpy/Numeric/numarray/etc.
> do... It's certainly convenient at times to have that functionality,
> for example, to work with only the alpha channel in an RGBA image.
> Probably too magical for the bytes type.
The key difference between numpy arrays and normal sequences is that
the length of a sequence can change, but the shape of a numpy array is
essentially fixed.
So view behaviour can be reserved for a dimensioned array type (if the
numpy folks ever find the time to finish writing their PEP. . .)
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
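The alpha-channel example and the "fixed shape" point can both be seen with memoryview, which later Python versions provide for this purpose. This is a sketch assuming Python 3.3 or newer, using a tiny two-pixel buffer invented for the example; it is not the design under discussion in this thread.

```python
# Two RGBA pixels stored as a flat, mutable byte buffer.
pixels = bytearray([10, 20, 30, 255,     # pixel 0: R, G, B, A
                    40, 50, 60, 128])    # pixel 1: R, G, B, A

alpha = memoryview(pixels)[3::4]         # strided view of every 4th byte, no copy
before = alpha.tolist()

pixels[3] = 0                            # change the underlying buffer...
after = alpha.tolist()                   # ...and the view sees the change


def resize_error():
    """While a view is exported, the bytearray may not change length."""
    try:
        pixels.append(0)
    except BufferError as exc:
        return type(exc)
    return None
```

The BufferError is the practical form of the distinction drawn above: a view is only safe while the object underneath keeps its size.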
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On Tue, 14 Feb 2006 12:31:07 -0700, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
>On Mon, Feb 13, 2006 at 08:07:49PM -0800, Guido van Rossum wrote:
>> On 2/13/06, Neil Schemenauer <[EMAIL PROTECTED]> wrote:
>> > "\x80".encode('latin-1')
>>
>> But in 2.5 we can't change that to return a bytes object without
>> creating HUGE incompatibilities.
>
>People could spell it bytes(s.encode('latin-1')) in order to make it
>work in 2.X. That spelling would provide a way of ensuring the type
>of the return value.
UIAM spelling it
bytes(map(ord, s))
or
bytes(s) # (bytes would do above internally)
would work for str or unicode and would be forward compatible.
or
bytes(s, encoding_name) # if standard mapping is not desired
BTW, ord(u'x') has the effect of u'x'.encode('latin-1')
Note:
>>> s256 = ''.join(chr(i) for i in xrange(256))
>>> assert s256.decode('latin-1') == u''.join(unichr(ord(c)) for c in s256)
>>> assert map(ord, s256.decode('latin-1')) == map(ord, s256) == range(256)
But this does *not* mean bytes has an implicit encoding!! It just means
there is a useful 1:1 mapping between the possible bytes values and the
first 256 unicode *characters*, remembering that the latter are *characters*
quite apart from whatever encoding the code source may have.
This is a nice safe 1:1 abstract correspondence ISTM.
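The same correspondence can be checked against the str and bytes types of Python 3. A sketch assuming Python 3, where str is already unicode; the variable names are arbitrary.

```python
all_bytes = bytes(range(256))

as_text = all_bytes.decode("latin-1")         # byte value n -> character U+00nn
code_points = [ord(ch) for ch in as_text]     # back to integers via ord()
round_trip = bytes(map(ord, as_text))         # the bytes(map(ord, s)) spelling
via_codec = as_text.encode("latin-1")         # same result through the codec
```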
>
>> You missed the part where I said that introducing the bytes type
>> *without* a literal seems to be a good first step. A new type, even
>> built-in, is much less drastic than a new literal (which requires
>> lexer and parser support in addition to everything else).
>
>Are you concerned about the implementation effort? If so, I don't
>think that's justified since adding a new string prefix should be
>pretty straightforward (relative to rest of the effort involved).
>Are you comfortable with the proposed syntax?
>
I'm -1 on special literal at this point. I think a special text-like literal
would be misleading, because it suggests that bytes is somehow in the
string family of types, which IMO it really isn't.
IMO it's semantically more of a builtin array.array('B').
If we adopt the ord/unichr mappings for strings to/from bytes, and
of course init also from a suitable integer sequence, we AGNI, I think.
Using non-ascii non-escaped characters in string literals for specifying
str ord values (as opposed to characters) is bad practice, but escaped
ascii-in-whatever-source-encoding and
native_literal_in_source_encoding.decode(source_encoding)
seem to work:
>>> for enc in 'cp437 latin-1 utf-8'.split():
... print '\n< %s >'%enc
... print mkretesc(enc, 0xf6)[1].decode(enc)
... print repr(mkretesc(enc, 0xf6)[1])
... print mkretesc(enc, 0xf6)[0]()
... t = mkretesc(enc, 0xf6)[0]()
... print t[0], t[1], t[2]
... print
...
< cp437 >
# -*- coding: cp437 -*-
def foof6(): return '\xf6', 'ö', 'ö'.decode('cp437')
"# -*- coding: cp437 -*-\ndef foof6(): return '\\xf6', '\x94',
'\x94'.decode('cp437')\n"
('\xf6', '\x94', u'\xf6')
÷ ö ö
< latin-1 >
# -*- coding: latin-1 -*-
def foof6(): return '\xf6', 'ö', 'ö'.decode('latin-1')
"# -*- coding: latin-1 -*-\ndef foof6(): return '\\xf6', '\xf6',
'\xf6'.decode('latin-1')\n"
('\xf6', '\xf6', u'\xf6')
÷ ÷ ö
< utf-8 >
# -*- coding: utf-8 -*-
def foof6(): return '\xf6', 'ö', 'ö'.decode('utf-8')
"# -*- coding: utf-8 -*-\ndef foof6(): return '\\xf6', '\xc3\xb6',
'\xc3\xb6'.decode('utf-8')\n"
('\xf6', '\xc3\xb6', u'\xf6')
÷ +¦ ö
The source looks the same viewed as characters, but you can see the differences
in the repr values.
But the consequence of source-encoding ord values determining str values is
that if e.g. you imported
this foo function from variously encoded sources, only the escaped and unicode
have the proper ord value.
The middle one comes from the native literal source encoding.
So until str becomes unicode, ascii or ascii escapes are a must for
ord-specifying. After str becomes unicode,
escapes will still work, but the unichr/ord symmetry will allow using the full
first 256 unicode characters
to specify byte type values if desired. (This happens to correspond to latin-1,
but don't mention it ;-)
It would make possible a round-trippable repr as bytes('...')
using ascii+escaped ascii, and full-256 unicode string literals
backwards-compatibly after py3k.
Have I missed a pitfall? Hope the output got through to your screen. The first
and last in the 3-character
lines should always be division sign and umlaut o. The problematical middle
ones should be cp437 translations
of the middle hex values, since that is the screen I copied from (umlaut o,
division sign, and plus, vertical_bar
for the translation of the utf-8 encoding pair. That one illustrates the
problem of returning a "character"
encoded in utf-8 thinking single-byte ord value.).
BTW, should bytes be freezable?
Regards,
Bengt Richter
Re: [Python-Dev] how to upload new MacPython web page?
On Tue, Feb 14, 2006 at 09:32:09PM -0800, Bill Janssen wrote:
> We (the pythonmac-sig mailing list) seem to have converged (almost --
> still talking about the logo) on a new download page for MacPython, to
> replace the page currently at
> http://www.python.org/download/download_mac.html. The strawman can be
> seen at http://bill.janssen.org/mac/new-macpython-page.html.
>
> How do I get the bits changed on python.org (when we're finished)?
[EMAIL PROTECTED] is probably the right email address (although most
of them are on here as well.)
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] C AST to Python discussion
Greg Ewing wrote:
> Brett Cannon wrote:
>> One protects us from ending up with an unusable AST since
>> the seralization can keep the original AST around and if the version
>> passed back in from Python code is junk it can be tossed and the
>> original version used.
>
> I don't understand why this is an issue. If Python code
> produces junk and tries to use it as an AST, then it's
> buggy and deserves what it gets. All the AST compiler
> should be responsible for is to try not to crash the
> interpreter under those conditions. But that's true
> whatever method is used for passing ASTs from Python
> to the compiler.
I'd prefer the AST nodes be real Python objects. The arena approach
seems to be working reasonably well, but I still don't see a good
reason for using a specialised memory allocation scheme when it really
isn't necessary and we have a perfectly good memory management system
for PyObject's.
On the 'unusable AST' front, if AST transformation code creates
illegal output, then the main thing is to raise an exception
complaining about what's wrong with it. I believe that may need a
change to the compiler whether the modified AST was serialised or not.
In terms of reverting back to the untransformed AST if the
transformation fails, then that option is up to the code doing the
transformation. Instead of serialising all the time (even for cases
where the AST is just being inspected instead of transformed), we can
either let the AST objects support the copy/deepcopy protocol, or else
provide a method to clone a tree before trying to transform it.
A unified representation means we only have one API to learn, that is
accessible from both Python and C. It also eliminates any need to
either implement features twice (once in Python and once in C) or else
let the Python and C API's diverge to the point where what you can do
with one differs from what you can do with the other.
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
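The clone-before-transform option can be sketched with the ast module of later Python versions, whose nodes support copy.deepcopy. This is an illustration under that assumption (Python 3.8 or newer), not the design being debated; compile_transformed and junk are hypothetical names, and the fallback policy shown is only one choice the transforming code could make.

```python
import ast
import copy


def compile_transformed(tree, transform, filename="<sketch>"):
    """Compile transform(clone of tree); fall back to the untouched tree.

    Returns (code, used_transformed).
    """
    candidate = transform(copy.deepcopy(tree))   # the original is never mutated
    try:
        ast.fix_missing_locations(candidate)
        return compile(candidate, filename, "exec"), True
    except (TypeError, ValueError, SyntaxError, AttributeError):
        # The compiler complained about the transformed tree: revert.
        return compile(tree, filename, "exec"), False


def junk(tree):
    # Deliberately wrong: "exec" mode needs a Module, not an Expression.
    return ast.Expression(body=ast.Constant(value=0))


original = ast.parse("y = 7")
code, used_transformed = compile_transformed(original, junk)
namespace = {}
exec(code, namespace)
```

Whether to revert silently, as here, or to let the exception propagate is up to the code doing the transformation, which is the point made above.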
Re: [Python-Dev] 2.5 PEP
On Tue, Feb 14, 2006 at 09:58:46PM -0800, Neal Norwitz wrote:
> We need to ensure that PEPs 308, 328, and 343 are implemented. We
> have possible volunteers for 308 and 343, but not 328. Brett is doing
> 352 and Martin is doing 353.
I can volunteer for 328 if no one else wants it, I've messed with the
import mechanism before (and besides, it's fun.) I've also written an
unfinished 308 implementation to get myself acquainted with the AST
code more. 'Unfinished' means that it works completely, except for
some cases of ambiguous syntax. I can fix that in a few days if the
deadline nears and there's no working patch. (Naively adding if/else
expressions broke list comprehensions with an 'if' clause, and fixing
that broke list comprehensions with 'for x in lambda:0, lambda:1', and
fixing that broke list comprehensions altogether... I added "clean up
Grammar file" to the PyCon core sprint topics for that reason. I guess
308 wasn't as much a trainer implementation as people thought ;)
The syntax part of 328 is probably easier (but the rest isn't.)
> Access to C AST from Python
If this still needs work when I finish grokking the AST code and the
PyObj branch of it, I can help. I should have more than enough spare
time to finish these things before alpha 1.
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] C AST to Python discussion
On Wed, Feb 15, 2006 at 07:28:36PM +1000, Nick Coghlan wrote:
> On the 'unusable AST' front, if AST transformation code creates illegal
> output, then the main thing is to raise an exception complaining about
> what's wrong with it. I believe that may need a change to the compiler
> whether the modified AST was serialised or not.
I would personally prefer the AST validation to be a separate part of
the compiler. It means the one or the other can be out of sync, but it
also means it can be accessed directly (validating AST before sending
it to the compiler) and the compiler (or CFG generator, or something
between AST and CFG) can decide not to validate internally generated
AST for non-debug builds, for instance. I like both those reasons.
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] str object going in Py3K
Guido van Rossum wrote:
> But somehow I still like the 'open' verb. It has a long and rich
> tradition. And it also nicely conveys that it is a factory function
> which may return objects of different types (though similar in API)
> based upon either additional arguments (e.g. buffering) or the
> environment (e.g. encodings) or even inspection of the file being
> opened.
If we went with longer names, a slight variation on the
opentext/openbinary idea would be to use opentext and opendata.
That is, "give me something that looks like a text file (it contains
characters)", or "give me something that looks like a data file (it
contains bytes)".
"opentext" would map to "codecs.open" (that is, accepting an encoding
argument).
"opendata" would map to the standard "open", but with the 'b' in the
mode string added automatically.
So the mode choices common to both would be:
  'r'/'w'/'a' - read/write/append (default 'r')
  ''/'+'      - update (IOError if file does not already exist) (default '')
opentext would allow the additional option:
  ''/'U'      - universal newlines (default '')
Neither of them would accept a 'b' in the mode string.
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
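As a rough sketch of the proposal, the two factories could be written as thin wrappers over the builtin open of Python 3. Neither function exists in the standard library; the default encoding of 'utf-8' and the helper _split_mode are assumptions made for this illustration, and 'U' is simply accepted and dropped because universal newlines are already the default when reading text in Python 3.

```python
def _split_mode(mode, extras):
    """Validate a mode string and return it with the optional flags removed."""
    if "b" in mode or "t" in mode:
        raise ValueError("mode must not contain 'b' or 't': %r" % (mode,))
    base = "".join(ch for ch in mode if ch not in extras)
    if base not in ("r", "w", "a", "r+", "w+", "a+"):
        raise ValueError("unsupported mode: %r" % (mode,))
    return base


def opendata(path, mode="r"):
    """Open a file of bytes; the 'b' is added automatically."""
    return open(path, _split_mode(mode, "") + "b")


def opentext(path, mode="r", encoding="utf-8"):
    """Open a file of characters using an explicit encoding."""
    return open(path, _split_mode(mode, "U"), encoding=encoding)
```

Writing with opentext and reading back with opendata returns the encoded bytes, which is the distinction the two names are meant to make visible.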
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Guido van Rossum wrote:
> (Now that I work for Google I realize more than ever before the
> importance of keeping URLs stable; PageRank(tm) numbers don't get
> transferred as quickly as contents. I have this worry too in the
> context of the python.org redesign; 301 permanent redirect is *not*
> going to help PageRank of the new page.)
Hi Guido,
Could you expand on why 301 redirects won't help with the transfer of
page rank (if you're allowed)? We've done exactly this on many sites
and the pagerank (or more relevantly the search rankings on specific
terms) has transferred almost overnight. The bigger pagerank updates
(both algorithm changes and overhauls in approach) seem to only happen
every few months and these also seem to take notice of 301 redirects
(they generally clear up any supplemental results).
The addition of docs.python.org was also intended (I thought) to be
used in the google customised search (the google page you go to when
you search from python.org). I'm not sure if that got lost in
implementation but the idea was that the google box would have a radio
button for docs.python.org.
I agree that docs.python.org should only be the current documentation;
however, what about the large number of people who use 2.3 as
standard? Perhaps docs23.python.org makes sense.
In terms of pagerank for the different versions of the docs, would it
make sense to 'hide' the older versions of the docs with a noindex so
that general google searches will only return the current docs?
(Google seems to have a policy of ranking 'long standing' links with a
higher pagerank weighting, hence older versions of python docs ranking
higher.) Hence keeping a single 'current' set of docs and having all
inbound links pointing to them (e.g. docs.python.org) will gradually
build up the search ranking.
+1 on docs.python.org only containing current (with the caveat that
there be an equivalent for users of specific versions, e.g. 2.3 users)
Tim Parkin
p.s. All my knowledge of how google works is gained through personal
research so the terminology, techniques and results may be completely
wrong (and also may vary from time to time) - however they do reflect
direct experience.
p.p.s. regarding 'site:', 'allinurl:' and other google modifiers: it
would seem a good idea to create a single page that helped site users
make such searches without having to learn how the modifiers work. It
maybe should be noted that you can also add 'temporary redirects'
(302's), which google takes to mean "leave the original search
results in place". This has also worked for us (old urls remain the
same as far as google is concerned).
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Greg Ewing wrote:
> Ron Adam wrote:
>
>> My first impression and thoughts were: (and seems incorrect now)
>>
>>     bytes(object) -> byte sequence of objects value
>>
>> Basically a "memory dump" of objects value.
>
> As I understand the current intentions, this is correct.
> The bytes constructor would have two different signatures:
>
> (1) bytes(seq) --> interprets seq as a sequence of
>                    integers in the range 0..255,
>                    exception otherwise
>
> (2a) bytes(str, encoding)     --> encodes the characters of
> (2b) bytes(unicode, encoding)     the string using the specified
>                                   encoding
>
> In (2a) the string would be interpreted as containing
> ascii characters, with an exception otherwise. In 3.0,
> (2a) will disappear leaving only (1) and (2b).

I was presuming it would be done in C code and it will just need a
pointer to the first byte, memchr(), and then read n bytes directly into
a new memory range via memcpy(). But I don't know if that's possible
with Python's object model. (My C skills are a bit rusty as well)

However, if it's done with a Python iterator and then each item is
translated to bytes in a sequence (much slower), an encoding will need
to be known for it to work correctly.

Unfortunately Unicode strings don't set an attribute to indicate their
own encoding. So bytes() can't just do encoding = s.encoding to find
out; it would need to be specified in this case.

And that should give you a byte object that is equivalent to the bytes
in memory, providing Python doesn't compress data internally to save
space. (?, I don't think it does)

I'd prefer the first version *if possible* because of the performance.

>> And I was thinking a bytes argument of more than one item would indicate
>> a byte sequence.
>>
>>     bytes(1,2,3) -> bytes([1,2,3])
>
> But then you have to test the argument in the one-argument
> case and try to guess whether it should be interpreted as
> a sequence or an integer. Best to avoid having to do that.

Yes, I agree.

>> Which is fine... so ???
>>
>>     b = bytes(0L) -> bytes([0,0,0,0])
>
> No, bytes(0L) --> TypeError because 0L doesn't implement
> the iterator protocol or the buffer interface.

It wouldn't need it if it was a direct C memory copy.

> I suppose long integers might be enhanced to support the
> buffer interface in 3.0, but that doesn't seem like a good
> idea, because the bytes you got that way would depend on
> the internal representation of long integers. In particular,

Since some longs will be of different length, yes a bytes(0L) could give
differing results on different platforms, but it will always give the
same result on the platform it is run on. I actually think this is a
plus and not a problem. If you are using Python to implement a byte
interface you need to *know* it is different, not have it hidden.

    bytesize = len(bytes(0L))   # find how long a long is

Cheers,
Ronald Adam
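For readers following the signature discussion above, here is a minimal sketch of how the two constructor forms behave in Python 3 as it was later released. This is an editorial illustration, not part of the 2006 proposal; the helper name `raised` is hypothetical, and the integer-argument behaviour shown is what Python 3 settled on rather than anything agreed in this thread.

```python
# Signature (1): a sequence of integers in the range 0..255.
from_ints = bytes([104, 105])

# Signature (2b): a text string plus an explicit encoding.
from_text = bytes("hi", "ascii")

# A single integer is treated as a length, not as a memory dump of the number.
zero_filled = bytes(4)


def raised(exc_type, *args):
    """Hypothetical helper: report whether bytes(*args) raises exc_type."""
    try:
        bytes(*args)
    except exc_type:
        return True
    return False


out_of_range = raised(ValueError, [256])                   # item outside 0..255
missing_encoding = raised(TypeError, "hi")                 # text without an encoding
non_ascii = raised(UnicodeEncodeError, "\u00e9", "ascii")  # not encodable as ascii
```

The integer case is the one relevant to the bytes(0L) question: the result depends only on the value passed, not on the interpreter's internal representation of the number.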
Re: [Python-Dev] C AST to Python discussion
Thomas Wouters wrote:
> On Wed, Feb 15, 2006 at 07:28:36PM +1000, Nick Coghlan wrote:
>
>> On the 'unusable AST' front, if AST transformation code creates illegal
>> output, then the main thing is to raise an exception complaining about
>> what's wrong with it. I believe that may need a change to the compiler
>> whether the modified AST was serialised or not.
>
> I would personally prefer the AST validation to be a separate part of the
> compiler. It means the one or the other can be out of sync, but it also
> means it can be accessed directly (validating AST before sending it to the
> compiler) and the compiler (or CFG generator, or something between AST and
> CFG) can decide not to validate internally generated AST for non-debug
> builds, for instance.
>
> I like both those reasons.

Aye, I was thinking much the same thing.

Cheers,
Nick.

--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
[Python-Dev] Generalizing *args and **kwargs
I've been thinking about generalization of the *args/**kwargs syntax for
quite a while, and even though I'm pretty sure Guido (and many people) will
consider it overgeneralization, I am finally going to suggest it. This whole
idea is not something dear to my heart, although I obviously would like to
see it happen. If the general vote is 'no', I'll write a small PEP or add it
to PEP 13 and be done with it.
The grand total of the generalization would be something like this:
Allow 'unpacking' of arbitrary iterables in sequences:
>>> iterable = (1, 2)
>>> ['a', 'b', *iterable, 'c']
['a', 'b', 1, 2, 'c']
>>> ('a', 'b', *iterable, 'c')
('a', 'b', 1, 2, 'c')
Possibly also allow 'unpacking' in list comprehensions and genexps:
>>> [ *subseq for subseq in [(1, 2), (3, 4)] ]
[1, 2, 3, 4]
(You can already do this by adding an extra 'for' loop inside the LC)
Allow 'unpacking' of mapping types (anything supporting 'items' or
'iteritems') in dictionaries:
>>> args = {'verbose': 1}
>>> defaults = {'verbose': 0}
>>> {**defaults, **args, 'fixedopt': 1}
{'verbose': 1, 'fixedopt': 1}
Allow 'packing' in assignment, stuffing left-over items in a list.
>>> a, b, *rest = range(5)
>>> a, b, rest
(0, 1, [2, 3, 4])
>>> a, b, *rest = range(2)
>>> a, b, rest
(0, 1, [])
(A list because you can't always take the type of the RHS and it's the right
Python type for 'an arbitrary length homogeneous sequence'.)
While generalizing that, it may also make sense to allow:
>>> def spam(*args, **kwargs):
... return args, kwargs
...
>>> args = (1, 2); kwargs = {'eggs': 'no'}
>>> spam(*args, 3)
((1, 2, 3), {})
>>> spam(*args, 3, **kwargs, spam='extra', eggs='yes')
((1, 2, 3), {'spam': 'extra', 'eggs': 'yes'})
(In spite of the fact that both are already possible by fiddling args/kwargs
beforehand or doing '*(args + (3,))'.)
Maybe it also makes sense on the defining side, particularly for keyword
arguments to indicate 'keyword-only arguments'. Maybe with a '**' without a
name attached:
>>> def spam(pos1, pos2, **, kwarg1=.., kwarg2=..)
But I dunno yet.
Although I've made it look like I have a working implementation, I haven't.
I know exactly how to do it, though, except for the AST part ;) Once I
figure out how to properly work with the AST code I'll probably write this
patch whether it's a definite 'no' or not, just to see if I can. I wouldn't
mind if people gave their opinion, though.
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
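As the message itself points out, most of the proposed forms can already be spelled with existing syntax. The sketch below collects those existing spellings next to the proposed results, as an editorial illustration only; variable names such as `as_list`, `merged`, and `overrides` are hypothetical and do not come from the original post.

```python
# ['a', 'b', *iterable, 'c'] spelled with concatenation.
iterable = (1, 2)
as_list = ['a', 'b'] + list(iterable) + ['c']

# [*subseq for subseq in ...] spelled with an extra 'for' loop.
flat = [item for subseq in [(1, 2), (3, 4)] for item in subseq]

# {**defaults, **args, 'fixedopt': 1} spelled with copy-and-update.
defaults = {'verbose': 0}
overrides = {'verbose': 1}
merged = dict(defaults)
merged.update(overrides)
merged['fixedopt'] = 1

# a, b, *rest = range(5) spelled with an explicit iterator.
items = iter(range(5))
a = next(items)
b = next(items)
rest = list(items)


def spam(*args, **kwargs):
    return args, kwargs


# spam(*args, 3) and spam(*args, 3, **kwargs, spam='extra', eggs='yes')
# spelled by fiddling args/kwargs beforehand.
args = (1, 2)
kwargs = {'eggs': 'no'}
call_positional = spam(*(args + (3,)))
call_mixed = spam(*(args + (3,)), **dict(kwargs, spam='extra', eggs='yes'))
```

Each workaround needs a temporary or a nested expression, which is the verbosity the proposed generalization is meant to remove.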
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On 2/15/06, Ron Adam <[EMAIL PROTECTED]> wrote:
> Greg Ewing wrote:
> > Ron Adam wrote:
> >> b = bytes(0L) -> bytes([0,0,0,0])
> >
> > No, bytes(0L) --> TypeError because 0L doesn't implement
> > the iterator protocol or the buffer interface.
>
> It wouldn't need it if it was a direct C memory copy.
>
> > I suppose long integers might be enhanced to support the
> > buffer interface in 3.0, but that doesn't seem like a good
> > idea, because the bytes you got that way would depend on
> > the internal representation of long integers. In particular,
>
> Since some longs will be of different length, yes a bytes(0L) could give
> differing results on different platforms, but it will always give the
> same result on the platform it is run on. I actually think this is a
> plus and not a problem. If you are using Python to implement a byte
> interface you need to *know* it is different, not have it hidden.
>
> bytesize = len(bytes(0L)) # find how long a long is
I believe you're confusing a C long with a Python long. A Python long
is implemented as an array and has variable size.
In any case we already have the struct module:
>>> import struct
>>> struct.calcsize('l')
4
--
Adam Olsen, aka Rhamphoryncus
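A short sketch of the distinction drawn above between a fixed-size C long and a variable-size Python integer, written for Python 3 as an editorial illustration. The names `native_long`, `standard_long`, and `big` are hypothetical; the native size depends on the platform, so only the standard-size result is stated as fixed.

```python
import struct

# Native mode: the size of a C long on this platform (commonly 4 or 8).
native_long = struct.calcsize("l")

# Standard mode (explicit byte order): 'l' is always 4 bytes.
standard_long = struct.calcsize("<l")

# A well-defined 4-byte encoding of zero, independent of the interpreter's
# internal integer representation.
packed_zero = struct.pack("<l", 0)

# Python integers grow as needed, so there is no single "size of a long".
big = 2 ** 100
big_len = (big.bit_length() + 7) // 8   # bytes needed for this value
big_bytes = big.to_bytes(big_len, "little")
```

Asking struct for the size keeps the answer tied to a declared format rather than to whatever layout the interpreter uses internally.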
Re: [Python-Dev] C AST to Python discussion
On Wed, 15 Feb 2006 00:34:35 -0800
Brett Cannon <[EMAIL PROTECTED]> wrote:

> As per Neal's prodding email, here is a thread to discuss where we
> want to go with the C AST to Python stuff and what I think are the
> core issues at the moment.
>
> First issue is the ast-objects branch. Work is being done on it, but
> it still leaks some references (Neal or Martin can correct me if I am
> wrong).

I've been doing the heavy lifting on ast-objects the last few weeks.
Today it finally passed the python test suite. The last thing to do is
the addition of XDECREF's, so yes, it is leaking a lot of references.

I won't make it to PyCon (it's a long way for me to come), but gee I've
left all the fun stuff for you to do ! :)

Even if AST transforms are not allowed, I see it as the strongest form
of code reflection, and long over-due in python.

Simon.

--
Simon Burton, B.Sc.
Licensed PO Box 8066
ANU Canberra 2601
Australia
Ph. 61 02 6249 6940
http://arrowtheory.com
Re: [Python-Dev] bytes type discussion
> "Fred" == Fred L Drake, <[EMAIL PROTECTED]> writes:

    Fred> On Tuesday 14 February 2006 22:34, Greg Ewing wrote:

    >> Seems to me this is a case where you want to be able to change
    >> encodings in the middle of reading the stream. You start off
    >> reading the data as ascii, and once you've figured out the
    >> encoding, you switch to that and carry on reading.

    Fred> Not quite. The proper response in this case is often to
    Fred> re-start decoding with the correct encoding, since some of
    Fred> the data extracted so far may have been decoded incorrectly.
    Fred> A very carefully constructed application may be able to go
    Fred> back and re-decode any data saved from the stream with the
    Fred> previous encoding, but that seems like it would be pretty
    Fred> fragile in practice.

I believe GNU Emacs is currently doing this. AIUI, they save
annotations where the codec is known to be non-invertible (eg, two
charset-changing escape sequences in a row). I do think this is
fragile, and a robust application really should buffer everything it's
not sure of decoding correctly.

    Fred> There may be cases where switching encoding on the fly makes
    Fred> sense, but I'm not aware of any actual examples of where
    Fred> that approach would be required.

This is exactly what ISO 2022 formalizes: switching encodings on the
fly. mboxes of Japanese mail often contain random and unsignaled
encoding changes. A terminal emulator may need to switch when logging
in to a remote system.

--
School of Systems and Information Engineering http://turnbull.sk.tsukuba.ac.jp
University of Tsukuba                    Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
               Ask not how you can "do" free software business;
              ask what your business can "do for" free software.
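The "buffer the original bytes and restart" approach recommended above can be sketched as follows. This is an editorial illustration for Python 3 under stated assumptions: the helper `decode_restarting`, the PEP 263-style declaration pattern, and the 200-byte sniffing window are all hypothetical choices, not something specified in the thread.

```python
import re

# Assumed marker format: a PEP 263-style "coding: name" declaration.
CODING_RE = re.compile(r"coding[:=]\s*([-\w.]+)")


def decode_restarting(raw, fallback="ascii"):
    """Sniff a declared encoding, then decode again from the first byte."""
    # latin-1 maps every byte to one character, so sniffing cannot fail
    # and never consumes or alters the buffered bytes.
    head = raw[:200].decode("latin-1")
    match = CODING_RE.search(head)
    encoding = match.group(1) if match else fallback
    # Restart from the original bytes instead of re-interpreting the
    # already-decoded text, since the first decode might not round-trip.
    return raw.decode(encoding), encoding


raw = "# -*- coding: utf-8 -*-\nname = '\u00e9'\n".encode("utf-8")
text, encoding = decode_restarting(raw)
```

Keeping `raw` around is the point: nothing decoded with the provisional encoding is ever trusted, which sidesteps the non-invertible cases mentioned above. It does not cover ISO 2022 style streams, where the encoding legitimately changes mid-stream.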
Re: [Python-Dev] nice()
I am reluctantly posting here since this is of less intense interest
than other things being discussed right now, but this is related to the
areclose proposal that was discussed here recently.

The following discussion ends with things that python-dev might want to
consider in terms of adding a function that allows something other than
the default 12- and 17-digit precision representations of numbers that
str() and repr() give. Such a function (like nice(), perhaps named
trim()?) would provide a way to convert fp numbers that are being used
in comparisons into a precision that reflects the user's preference.

Everyone knows that fp numbers must be compared with caution, but there
is a void in the relative-error department for exercising such caution,
thus the proposal for something like 'areclose'. The problem with
areclose(), however, is that it only solves one part of the problem that
needs to be solved if two fp's *are* going to be compared: if you are
going to check if a < b you would need to do something like

    not areclose(a,b) and a < b

With something like trim() (a.k.a nice()) you could do

    trim(a) < trim(b)

to get the comparison to 12-digit default precision or arbitrary
precision with optional arguments, e.g. to 3 digits of precision:

    trim(a,3) < trim(b,3)

From a search on the documentation, I don't see that the name trim() is
taken yet.

OK, comments responding to Greg follow.

| From: Greg Ewing [EMAIL PROTECTED]
| Smith wrote:
|
|| computing the bin boundaries for a histogram
|| where bins are a width of 0.1:
||
|| >>> for i in range(20):
|| ...  if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)):
|| ...   print i,repr(i*.1),repr(i/10.),i*.1,i/10.
|
| I don't see how that has any relevance to the way bin boundaries
| would be used in practice, which is to say something like
|
|    i = int(value / 0.1)
|    bin[i] += 1 # modulo appropriate range checks

This is just masking the issue by converting numbers to integers. The
fact remains that two mathematically equal numbers can have two
different internal representations with one being slightly larger than
the exact integer value and one smaller:

>>> a=(23*.1)*10;a
23.000000000000004
>>> b=2.3/.1;b
22.999999999999996
>>> int(a/.1),int(b/.1)
(230, 229)

Part of the answer in this context is to use round() rather than int so
you are getting to the closest integer.

|| For, say, garden variety numbers that aren't full of garbage digits
|| resulting from fp computation, the boundaries computed as 0.1*i are
|| not going to agree with such simple numbers as 1.4 and 0.7.
|
| Because the arithmetic is binary rather than decimal. But even using
| decimal, you get the same sort of problems using a bin width of
| 1.0/3.0. The solution is to use an algorithm that isn't sensitive
| to those problems, then it doesn't matter what base your arithmetic
| is done in.

Agreed.

|
|| I understand that the above really is just a patch over the problem,
|| but I'm wondering if it moves the problem far enough away that most
|| users wouldn't have to worry about it.
|
| No, it doesn't. The problems are not conveniently grouped together
| in some place you can get away from; they're scattered all over the
| place where you can stumble upon one at any time.
|

Yes, even a simple computation of the wrong type can lead to unexpected
results. I agree.

|| So perhaps this brings us back to the original comment that "fp
|| issues are a learning opportunity." They are. The question I have is
|| "how soon do they need to run into them?" Is decreasing the likelihood
|| that they will see the problem (but not eliminate it) a good thing
|| for the python community or not?
|
| I don't think you're doing anyone any favours by trying to protect
| them from having to know about these things, because they *need* to
| know about them if they're not to write algorithms that seem to
| work fine on tests but mysteriously start producing garbage when
| run on real data, possibly without it even being obvious that it is
| garbage.

Mostly I agree, but if you go to the extreme then why don't we just drop
floating point comparisons altogether and force the programmer to
convert everything to integers and make their own bias evident (like
converting to int rather than nearest int). Or we drop the fp comparison
operators and introduce fp comparison functions that require the use of
tolerance terms to again make the assumptions transparent:

    def lt(x, y, rel_err = 1e-5, abs_err = 1e-8):
        return not areclose(x,y,rel_err,abs_err) and int(x-y)<=0

    print lt(a,b,0,1e-10) --> False (they are equal to that tolerance)
    print lt(a,b,0,1e-20) --> True (a is less than b at that tolerance)

The fact is, we make things easier and let the programmer shoot
themselves in the foot if they want to by providing things like fp
comparisons and even functions like sum that do dumb-sums (though
Raymond Hettinger's Python Recipe at ASPN provides a smart-sum). I
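The int() versus round() point made in the bin-index exchange above can be checked directly. This is a small editorial sketch for Python 3, using the same two values from the message; the names `truncated` and `rounded` are hypothetical.

```python
# Two expressions that are both mathematically equal to 23.
a = (23 * 0.1) * 10   # lands slightly above 23
b = 2.3 / 0.1         # lands slightly below 23

# int() truncates toward zero, so the tiny error below the integer
# drops b into the neighbouring bin.
truncated = (int(a / 0.1), int(b / 0.1))

# round() goes to the nearest integer, so both land in the same bin.
rounded = (round(a / 0.1), round(b / 0.1))
```

Rounding only moves the sensitive spot from the integers to the half-way points between them, which is consistent with Greg's objection that the problem is relocated rather than removed.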
Re: [Python-Dev] str object going in Py3K
Adam Olsen wrote:
On 2/14/06, Just van Rossum <[EMAIL PROTECTED]> wrote:

+1 for two functions. My choice would be open() for binary and
opentext() for text. I don't find that backwards at all: the text
function is going to be more different from the current open() function
than the binary function would be since in many ways the str type is
closer to bytes than to unicode.

Maybe it's even better to use opentext() AND openbinary(), and deprecate
plain open(). We could even introduce them at the same time as bytes()
(and leave the open() deprecation for 3.0).

Thus providing us with a transition period, even with warnings on use
of the old function.

[snip..]

I personally like the move towards all unicode strings, basically any
text where you don't know the encoding used is 'random binary data'.
This works fine, so long as you are in control of the text source.
*However*, it leaves the following problem :

The current situation (treating byte-sequences as text and assuming they
are an ascii-superset encoded text-string) *works* (albeit with many
breakages), simply because this assumption is usually correct.

Forcing the programmer to be aware of encodings, also pushes the same
requirement onto the user (who is often the source of the text in
question).

Currently you can read a text file and process it - making sure that any
changes/requirements only use ascii characters. It therefore doesn't
matter what 8 bit ascii-superset encoding is used in the original. If
you force the programmer to specify the encoding in order to read the
file, they would have to pass that requirement onto their user. Their
user is even less likely to be encoding aware than the programmer.

What this means, is that for simple programs where the programmer
doesn't want to have to worry about encoding, or can't force the user to
be aware, they will read in the file as bytes. Modules will quickly and
inevitably be created implementing all the 'string methods' for bytes.
New programmers will gravitate to these and the old mess will continue,
but with a more awkward hybrid than before. (String manipulations of
byte sequences will no longer be a core part of the language - and so be
harder to use.)

Not sure what we can do to obviate this of course... but is this change
actually going to improve the situation or make it worse ?

All the best,

Michael Foord
Re: [Python-Dev] nice()
[Smith]
> The following discussion ends with things that python-dev might want to
> consider in terms of adding a function that allows something other than the
> default 12- and 17-digit precision representations of numbers that str() and
> repr() give. Such a function (like nice(), perhaps named trim()?) would
> provide a way to convert fp numbers that are being used in comparisons into a
> precision that reflects the user's preference.

-1 See posts by Greg, Terry, and myself which recommend against trim(),
nice(), or other variants. For the purpose of precision sensitive
comparisons, these constructs are unfit for their intended purpose --
they are error-prone and do not belong in Python. They may have some
legitimate uses, but those tend to be dominated by the existing round()
function.

If anything, then some variant of is_close() can go in the math module.
BUT, the justification should not be for newbies to ignore issues with
floating-point equality comparisons. The justification would have to be
that folks with some numerical sophistication have a recurring need for
the function (with sophistication meaning that they know how to come up
with relative and absolute tolerances that make their application
succeed over the full domain of possible inputs).

Raymond


relevant posts from Greg and Terry

[Greg Ewing]
>> I don't think you're doing anyone any favours by trying to protect
>> them from having to know about these things, because they *need* to
>> know about them if they're not to write algorithms that seem to
>> work fine on tests but mysteriously start producing garbage when
>> run on real data,

[Terry Reedy]
> I agree. Here was my 'kick-in-the-butt' lesson (from 20+ years ago): the
> 'simplified for computation' formula for standard deviation, found in too
> many statistics books without a warning as to its danger, and specialized
> for three data points, is sqrt( ((a*a+b*b+c*c)-(a+b+c)**2/3.0) /2.0).
> After 1000s of ok calculations, the data were something like a,b,c =
> 10005,10006,10007. The correct answer is 1.0 but with numbers rounded to 7
> digits, the computed answer is sqrt(-.5) == CRASH. I was aware that
> subtraction lost precision but not how rounding could make a theoretically
> guaranteed non-negative difference negative.
>
> Of course, Python floats being C doubles makes such glitches much rarer.
> Not exposing C floats is a major newbie (and journeyman) protection
> feature.

[Greg Ewing]
> I don't think you're doing anyone any favours by trying to protect
> them from having to know about these things, because they *need* to
> know about them if they're not to write algorithms that seem to
> work fine on tests but mysteriously start producing garbage when
> run on real data,

I recommend rejecting trim(), nice(), areclose(), and all variants.
Greg, Terry, and myself have

>
> OK, comments responding to Greg follow.
>
>
> | From: Greg Ewing [EMAIL PROTECTED]
> | Smith wrote:
> |
> || computing the bin boundaries for a histogram
> || where bins are a width of 0.1:
> ||
> || >>> for i in range(20):
> || ...  if (i*.1==i/10.)<>(nice(i*.1)==nice(i/10.)):
> || ...   print i,repr(i*.1),repr(i/10.),i*.1,i/10.
> |
> | I don't see how that has any relevance to the way bin boundaries
> | would be used in practice, which is to say something like
> |
> |    i = int(value / 0.1)
> |    bin[i] += 1 # modulo appropriate range checks
>
> This is just masking the issue by converting numbers to integers. The fact
> remains that two mathematically equal numbers can have two different internal
> representations with one being slightly larger than the exact integer value
> and one smaller:
>
> >>> a=(23*.1)*10;a
> 23.000000000000004
> >>> b=2.3/.1;b
> 22.999999999999996
> >>> int(a/.1),int(b/.1)
> (230, 229)
>
> Part of the answer in this context is to use round() rather than int so you
> are getting to the closest integer.
>
>
> || For, say, garden variety numbers that aren't full of garbage digits
> || resulting from fp computation, the boundaries computed as 0.1*i are
> || not going to agree with such simple numbers as 1.4 and 0.7.
> |
> | Because the arithmetic is binary rather than decimal. But even using
> | decimal, you get the same sort of problems using a bin width of
> | 1.0/3.0. The solution is to use an algorithm that isn't sensitive
> | to those problems, then it doesn't matter what base your arithmetic
> | is done in.
>
> Agreed.
>
> |
> || I understand that the above really is just a patch over the problem,
> || but I'm wondering if it moves the problem far enough away that most
> || users wouldn't have to worry about it.
> |
> | No, it doesn't. The problems are not conveniently grouped together
> | in some place you can get away from; they're scattered all over the
> | place where you can stumble upon one at any time.
> |
>
> Yes, even a simple computation of the wrong type can lead to unexpected
> results. I agree.
>
> || So perha
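Terry Reedy's standard-deviation anecdote quoted above can be reproduced by simulating 7-digit arithmetic with the decimal module. This is an editorial sketch for Python 3: the function names are hypothetical, and the decimal context only approximates whatever 7-digit machine was originally involved, so the exact negative value may differ from the one he remembers.

```python
import math
from decimal import Decimal, localcontext


def variance_shortcut(a, b, c):
    """The 'simplified for computation' form, before the square root."""
    total = a + b + c
    return ((a * a + b * b + c * c) - total * total / 3) / 2


def variance_two_pass(a, b, c):
    """Subtract the mean first, so the squares stay small."""
    mean = (a + b + c) / 3
    return ((a - mean) ** 2 + (b - mean) ** 2 + (c - mean) ** 2) / 2


# Every intermediate result is rounded to 7 significant digits.
with localcontext() as ctx:
    ctx.prec = 7
    seven_digit = variance_shortcut(Decimal(10005), Decimal(10006), Decimal(10007))

# The same data in ordinary double precision, using the two-pass form.
two_pass = math.sqrt(variance_two_pass(10005.0, 10006.0, 10007.0))
```

With 7 digits the shortcut form produces a negative "variance", so taking its square root fails, while the two-pass form returns the expected 1.0. This matches Greg's point that the robust fix is a different algorithm, not a rounded comparison.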
Re: [Python-Dev] Generalizing *args and **kwargs
> I've been thinking about generalization of the *args/**kwargs syntax for
> quite a while, and even though I'm pretty sure Guido (and many people) will
> consider it overgeneralization, I am finally going to suggest it. This whole
> idea is not something dear to my heart, although I obviously would like to
> see it happen. If the general vote is 'no', I'll write a small PEP or add it
> to PEP 13 and be done with it.

A PEP would be great, even if not accepted. At least we'll have it
discussed in a single place and avoid rediscussing it every time
someone figures out it's a nice idea.

Have a look for the subject "Extending tuple unpacking" in the mailing
list for a recent discussion on the topic.

--
Gustavo Niemeyer
http://niemeyer.net
Re: [Python-Dev] bdist_* to stdlib?
On Wed, 15-02-2006 at 14:00 +1300, Greg Ewing wrote:
> I'm disappointed that the various Linux distributions
> still don't seem to have caught onto the very simple
> idea of *not* scattering files all over the place when
> installing something.
>
> MacOSX seems to be the only system so far that has got
> this right -- organising the system so that everything
> related to a given application or library can be kept
> under a single directory, clearly labelled with a
> version number.

Those directories might be mounted on entirely different hardware (even
over a network), often with different characteristics (access speed,
writeability, etc.).

--
Jan Claeys
Re: [Python-Dev] how to upload new MacPython web page?
Thomas Wouters wrote:
> On Tue, Feb 14, 2006 at 09:32:09PM -0800, Bill Janssen wrote:
>
>> We (the pythonmac-sig mailing list) seem to have converged (almost --
>> still talking about the logo) on a new download page for MacPython, to
>> replace the page currently at
>> http://www.python.org/download/download_mac.html. The strawman can be
>> seen at http://bill.janssen.org/mac/new-macpython-page.html.
>>
>> How do I get the bits changed on python.org (when we're finished)?
>
> [EMAIL PROTECTED] is probably the right email address (although most of
> them are on here as well.)

I'm happy to upload the pages when you're ready.

Tim
Re: [Python-Dev] C AST to Python discussion
I am still -1 on the ast-objects branch. It adds a lot of boilerplate
code and it makes complicated what is now simple. I'll see if I can get
a rough cut of the marshal code ready today, so there will be a complete
implementation of my original plan.

I also think we should keep the transformation api simple. If we provide
an extension module, along the lines of the parser module, users can
write transformations with that module. They can also write their own
wrapper script that runs a script after applying transformations.

I agree that the question of saved bytecode files still needs to be
resolved. I'm not sure that extending the bytecode format to record
modifications is enough, since you also have a filename problem: How do
you manage two versions of a module, one compiled with transformation
and one compiled without?

How about we arrange for some open space time at PyCon to discuss?
Unfortunately, the compiler talk isn't until the last day and I can't
stay for sprints. It would be better to have the talk, then the open
space, then the sprint.

Jeremy

On 2/15/06, Simon Burton <[EMAIL PROTECTED]> wrote:
> On Wed, 15 Feb 2006 00:34:35 -0800
> Brett Cannon <[EMAIL PROTECTED]> wrote:
>
> > As per Neal's prodding email, here is a thread to discuss where we
> > want to go with the C AST to Python stuff and what I think are the
> > core issues at the moment.
> >
> > First issue is the ast-objects branch. Work is being done on it, but
> > it still leaks some references (Neal or Martin can correct me if I am
> > wrong).
>
> I've been doing the heavy lifting on ast-objects the last few weeks.
> Today it finally passed the python test suite. The last thing to do is
> the addition of XDECREF's, so yes, it is leaking a lot of references.
>
> I won't make it to PyCon (it's a long way for me to come), but gee I've left
> all the fun stuff for you to do !
>
> :)
>
> Even if AST transforms are not allowed, I see it as the strongest form of
> code reflection, and long over-due in python.
>
> Simon.
>
>
> --
> Simon Burton, B.Sc.
> Licensed PO Box 8066
> ANU Canberra 2601
> Australia
> Ph. 61 02 6249 6940
> http://arrowtheory.com
Re: [Python-Dev] 2.5 PEP
On Wed, Feb 15, 2006, Thomas Wouters wrote:
>
> I can volunteer for 328 if no one else wants it, I've messed with the import
> mechanism before (and besides, it's fun.) I've also written an unfinished
> 308 implementation to get myself acquainted with the AST code more.
> 'Unfinished' means that it works completely, except for some cases of
> ambiguous syntax. I can fix that in a few days if the deadline nears and
> there's no working patch.

If you want to also take over the PEP328 editing, please be my guest. I
keep making time for it that gets overridden by other things.

--
Aahz ([EMAIL PROTECTED])           <*>         http://www.pythoncraft.com/

"19. A language that doesn't affect the way you think about programming,
is not worth knowing."  --Alan Perlis
Re: [Python-Dev] str object going in Py3K
On Feb 15, 2006, at 7:19 AM, Fuzzyman wrote:
> [snip..]
>
> I personally like the move towards all unicode strings, basically
> any text where you don't know the encoding used is 'random binary
> data'. This works fine, so long as you are in control of the text
> source. *However*, it leaves the following problem :
>
> The current situation (treating byte-sequences as text and assuming
> they are an ascii-superset encoded text-string) *works* (albeit
> with many breakages), simply because this assumption is usually
> correct.
>
> Forcing the programmer to be aware of encodings, also pushes the
> same requirement onto the user (who is often the source of the text
> in question).
>
> Currently you can read a text file and process it - making sure
> that any changes/requirements only use ascii characters. It
> therefore doesn't matter what 8 bit ascii-superset encoding is used
> in the original. If you force the programmer to specify the
> encoding in order to read the file, they would have to pass that
> requirement onto their user. Their user is even less likely to be
> encoding aware than the programmer.

Or the programmer can just use "iso-8859-1" and call it done. That will
get you the same "I don't care" behavior as now.

James
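The "just use iso-8859-1" suggestion above works because that codec maps each of the 256 byte values to exactly one character, so decoding never fails and re-encoding restores the original bytes. A minimal editorial sketch for Python 3 follows; the sample data and variable names are hypothetical.

```python
# Every possible byte value, standing in for a file of unknown 8-bit encoding.
raw = bytes(range(256))

# Decoding as iso-8859-1 cannot fail and yields one character per byte.
text = raw.decode("iso-8859-1")

# ASCII-only edits work as they do on byte strings today.
edited = text.replace("A", "a")

# Encoding back restores every byte that was not deliberately changed.
round_trip = text.encode("iso-8859-1")
edited_bytes = edited.encode("iso-8859-1")
```

The non-ASCII characters in `text` are only meaningful if the data really was Latin-1; for any other 8-bit encoding they are placeholders that survive the round trip unchanged, which is the "I don't care" behaviour described.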
Re: [Python-Dev] math.areclose ...?
A problem that I pointed out with the proposed areclose() function is
that it has within it a fp comparison. If such a function is to have
greater utility, it should allow the user to specify how significant to
consider the computed error. A natural extension of being able to tell
if 2 fp numbers are close is to make a more general comparison. For that
purpose, a proposed fpcmp function is appended. From that, fp boolean
comparison operators (le, gt, ...) are easily constructed.
Python allows fp comparison. This is a significant source of surprises
and learning experiences. Are any of these proposals of interest for
providing tools to more intelligently make the fp comparisons?
###
#new proposal for the areclose() function
def areclose(x,y,atol=1e-8,rtol=1e-5,prec=12):
    """Return False if |x-y| is greater than atol and also greater than
    rtol times the absolute value of the larger of x and y, otherwise
    True. The comparison is made by computing a difference that should
    be 0 if the two numbers satisfy either condition; prec controls the
    precision of the value that is obtained, e.g. 8.3__e-17 is obtained
    for (2.1-2)-.1. But rounding to the 12th digit (the default
    precision) the value of 0.0 is returned indicating that for that
    precision there is no (significant) error."""
    diff = abs(x-y)
    return round(diff-atol,prec)<=0 or \
           round(diff-rtol*max(abs(x),abs(y)),prec)<=0
#fp cmp
def fpcmp(x,y,atol=1e-8,rtol=1e-5,prec=12):
    """Return 0 if x and y are close in the absolute or relative sense.
    If not, then return -1 if x < y or +1 if x > y.
    Note: prec controls how many digits of the error are retained when
    checking for closeness."""
    if areclose(x,y,atol,rtol,prec):
        return 0
    else:
        return cmp(x,y)
# fp comparisons functions
def lt(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec)==-1
def le(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec) in (-1,0)
def eq(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec)==0
def gt(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec)==1
def ge(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec) in (0,1)
def ne(x,y,atol=1e-8,rtol=1e-5,prec=12):
    return fpcmp(x, y, atol, rtol, prec)!=0
###
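For readers trying the proposal today, here is a hedged port of areclose() and fpcmp() to Python 3, where the cmp() builtin no longer exists. The tolerance logic is copied from the message; the `(x > y) - (x < y)` expression is my substitute for cmp(), and the math.isclose() call at the end is the later standard-library check, which uses different defaults and is not the function proposed here.

```python
import math

def areclose(x, y, atol=1e-8, rtol=1e-5, prec=12):
    """True if x and y agree within atol, or within rtol of the larger magnitude."""
    diff = abs(x - y)
    return (round(diff - atol, prec) <= 0
            or round(diff - rtol * max(abs(x), abs(y)), prec) <= 0)

def fpcmp(x, y, atol=1e-8, rtol=1e-5, prec=12):
    """Three-way comparison: 0 if close, otherwise -1 or +1."""
    if areclose(x, y, atol, rtol, prec):
        return 0
    return (x > y) - (x < y)  # stands in for Python 2's cmp()

residue = (2.1 - 2) - 0.1            # tiny nonzero value left over by rounding
print(residue == 0.0)                # exact comparison
print(fpcmp(residue, 0.0))           # tolerance-based comparison
print(fpcmp(1.0, 2.0), fpcmp(2.0, 1.0))
print(math.isclose(0.1 + 0.2, 0.3))  # stdlib check with its own defaults
```

The exact comparison and the tolerance-based one disagree on `residue`, which is the surprise the message is describing.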
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On Tue, 14 Feb 2006 15:14:07 -0800, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>On 2/14/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
>> Guido van Rossum wrote:
>> > As Phillip guessed, I was indeed thinking about introducing bytes()
>> > sooner than that, perhaps even in 2.5 (though I don't want anything
>> > rushed).
>>
>> Hmm, that is probably going to be too early. As the thread shows
>> there are lots of things to take into account, esp. since if you
>> plan to introduce bytes() in 2.x, the upgrade path to 3.x would
>> have to be carefully planned. Otherwise, we end up introducing
>> a feature which is meant to prepare for 3.x and then we end up
>> causing breakage when the move is finally implemented.
>
>You make a good point. Someone probably needs to write up a new PEP
>summarizing this discussion (or rather, consolidating the agreement
>that is slowly emerging, where there is agreement, and summarizing the
>key open questions).
>
>> > Even in Py3k though, the encoding issue stands -- what if the file
>> > encoding is Unicode? Then using Latin-1 to encode bytes by default
>> > might not be what the user expected. Or what if the file encoding is
>> > something totally different? (Cyrillic, Greek, Japanese, Klingon.)
>> > Anything default but ASCII isn't going to work as expected. ASCII
>> > isn't going to work as expected either, but it will complain loudly
>> > (by throwing a UnicodeError) whenever you try it, rather than causing
>> > subtle bugs later.
>>
>> I think there's a misunderstanding here: in Py3k, all "string"
>> literals will be converted from the source code encoding to
>> Unicode. There are no ambiguities - a Klingon character will still
>> map to the same ordinal used to create the byte content regardless
>> of whether the source file is encoded in UTF-8, UTF-16 or
>> some Klingon charset (are there any ?).
>
>OK, so a string (literal or otherwise) containing a Klingon character
>won't be acceptable to the bytes() constructor in 3.0. It shouldn't be
>in 2.x either then.
>
>I still think that someone who types a file in Latin-1 and enters
>non-ASCII Latin-1 characters in a string literal and then passes it to
>the bytes() constructor might expect to get bytes encoded in Latin-1,
>and someone who types a file in UTF-8 and enters non-ASCII Unicode
>characters might expect to get UTF-8-encoded bytes. Since they can't
>both get what they want, we should disallow both, and only allow
>ASCII.
ISTM this is a good rule for backwards compatibility for the
'...' => u'...' py3k transition. I don't know if you saw my other post,
but I was suggesting that bytes(s_or_u) should be mapped to the integer
values by the current definition of ord for either str or unicode.
UIAM this works when you convert ASCII and will work if you convert
the ASCII string to unicode.
It will also let you use unicode _currently_ to get past the ASCII restriction,
since ord(u) works for all of the first 256 unicode characters.
Using those characters in bytes(u'...') works even if your source encoding is
utf-8 and contains ascii escapes, e.g.
>>> utfsrc = """\
... # -*- coding: utf-8 -*-
... umlaut_os, values = u'\xf6\\xf6', map(ord, u'\xf6\\xf6')
... """.decode('latin-1').encode('utf-8')
Hopefully showing on your screen properly:
>>> print utfsrc.decode('utf-8')
# -*- coding: utf-8 -*-
umlaut_os, values = u'ö\xf6', map(ord, u'ö\xf6')
And the repr, where you can see the two-byte utf-8 sequence for each ö and the \\xf6
ascii escape:
>>> print repr(utfsrc)
"# -*- coding: utf-8 -*-\numlaut_os, values = u'\xc3\xb6\\xf6', map(ord, u'\xc3\xb6\\xf6')\n"
compiling the utf-8 source and executing it:
>>> exec compile(utfsrc,'','exec')
Good results:
>>> umlaut_os, map(hex, values)
(u'\xf6\xf6', ['0xf6', '0xf6'])
>>> print umlaut_os
öö
So map(ord, s_or_u) works predictably now, and will not break after py3k
unless you use non-ascii in _plain_ str strings now. But in unicode it
should be ok even now.
I think ord is a consistent and handy mapping of characters to bytes,
and the fact that it works for unicode for all 256 characters seems to me
a boon. (So long as no one gets upset that ord(u) _happens_
to match ord(u.encode('latin-1')) ;-)
I didn't see yet where you had ruled against ord mapping of unicode to bytes,
so I am hopeful that you will consider it.
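The ord-based mapping argued for above can be checked directly in Python 3 (which postdates this message). This is a sketch, not the API that was eventually adopted; `bytes_from_ord` is a hypothetical helper name standing in for the proposed bytes(u'...') behavior.

```python
def bytes_from_ord(text):
    """Hypothetical helper: map each character to one byte via ord()."""
    return bytes(map(ord, text))

s = "caf\u00e9 \u00f6"                          # every code point is below 256
assert bytes_from_ord(s) == s.encode("latin-1")  # ord() happens to match Latin-1
assert bytes_from_ord(s) != s.encode("utf-8")    # UTF-8 needs two bytes for each of these
assert bytes_from_ord("abc") == b"abc"           # for ASCII all three agree

# Code points of 256 and above have no single-byte value.
try:
    bytes_from_ord("\u20ac")                     # euro sign, U+20AC
except ValueError as exc:
    print("rejected:", exc)
```

This is the coincidence the message jokes about: ord(u) and u.encode('latin-1') give the same numbers for the first 256 characters, independent of the source file's encoding.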
>> Furthermore, by restricting to ASCII you'd also outrule hex escapes
>> which seem to be the natural choice for presenting binary data in
>> literals - the Unicode representation would then only be an
>> implementation detail of the way Python treats "string" literals
>> and a user would certainly expect to find e.g. \x88 in the bytes object
>> if she writes bytes('\x88').
>
>I guess we'll just have to disappoint her. Too bad for the person who
>wrote bytes("\x12\x34\x56\x78\x9a\xbc\xde\xf0") -- they'll have to
>write bytes([0x12,0x34,0x56,0x78,0x9a,0xbc,0xde,0xf0]). Not so bad IMO
>and certainly easier than a *mixture* of hex and ASCII like
>'\xabc\xdef'.
>
>>
Re: [Python-Dev] str object going in Py3K
On 2/15/06, Nick Coghlan <[EMAIL PROTECTED]> wrote:
> If we went with longer names, a slight variation on the opentext/openbinary
> idea would be to use opentext and opendata.
After some thinking I don't like opendata any more -- often data is
text, so the term is wrong. openbinary is fine but long. So how about
openbytes? This clearly links the resulting object with the bytes
type, which is mutually reassuring.
Regarding open vs. opentext, I'm still not sure. I don't want to
generalize from the openbytes precedent to openstr or openunicode
(especially since the former is wrong in 2.x and the latter is wrong
in 3.0). I'm tempted to hold out for open() since it's most
compatible.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] str object going in Py3K
On 2/15/06, Fuzzyman <[EMAIL PROTECTED]> wrote:
> Forcing the programmer to be aware of encodings, also pushes the same
> requirement onto the user (who is often the source of the text in question).
The programmer shouldn't have to be aware of encodings most of the
time -- it's the job of the I/O library to determine the end user's
(as opposed to the language's) default encoding dynamically and act
accordingly. Users who use non-ASCII characters without informing the
OS of their encoding are in a world of pain, *unless* they use the OS
default encoding (which may vary per locale). If the OS can figure out
the default encoding, so can the Python I/O library. Many apps won't
have to go beyond this at all.
Note that I don't want to use this OS/user default encoding as the
default encoding between bytes and strings; once you are reading bytes
you are writing "grown-up" code and you will have to be explicit. It's
only the I/O library that should automatically encode on write and
decode on read.
> Currently you can read a text file and process it - making sure that any
> changes/requirements only use ascii characters. It therefore doesn't matter
> what 8 bit ascii-superset encoding is used in the original. If you force the
> programmer to specify the encoding in order to read the file, they would
> have to pass that requirement onto their user. Their user is even less
> likely to be encoding aware than the programmer.
I disagree -- the user most likely has set or received a default
encoding when they first got the computer, and that's all they are
using. If other tools (notepad, wordpad, emacs, vi etc.) can figure
out the encoding, so can Python's I/O library.
> What this means, is that for simple programs where the programmer doesn't
> want to have to worry about encoding, or can't force the user to be aware,
> they will read in the file as bytes.
Of course not!
> Modules will quickly and inevitably be
> created implementing all the 'string methods' for bytes. New programmers
> will gravitate to these and the old mess will continue, but with a more
> awkward hybrid than before. (String manipulations of byte sequences will no
> longer be a core part of the language - and so be harder to use.)
This seems an unlikely development if we do the conversions in the I/O library.
> Not sure what we can do to obviate this of course... but is this change
> actually going to improve the situation or make it worse ?
I'm not worried about this scenario. "What if all the programmers in
the world suddenly became dumb?"
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
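The division of labor described above (the I/O library encodes on write and decodes on read, while code that reads bytes must be explicit) is roughly how Python 3 ended up working. A small sketch under that assumption; the file name and sample text are invented, and the encoding is passed explicitly only to keep the result the same on every machine:

```python
import locale
import os
import tempfile

# The encoding the I/O layer would choose if none were given.
default_encoding = locale.getpreferredencoding(False)
print("platform default:", default_encoding)

path = os.path.join(tempfile.mkdtemp(), "note.txt")

# Text mode: str in, str out; encoding and decoding happen in the I/O layer.
with open(path, "w", encoding="utf-8") as f:
    f.write("gr\u00fc\u00df dich")
with open(path, "r", encoding="utf-8") as f:
    as_text = f.read()

# Binary mode: the program sees raw bytes and has to decode them itself.
with open(path, "rb") as f:
    as_bytes = f.read()

assert as_text == as_bytes.decode("utf-8")
```

Leaving out `encoding=` in the text-mode calls makes the I/O layer fall back to a platform-dependent default, which is the "many apps won't have to go beyond this" case.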
Re: [Python-Dev] str object going in Py3K
Guido van Rossum wrote:
> On 2/15/06, Nick Coghlan <[EMAIL PROTECTED]> wrote:
>> If we went with longer names, a slight variation on the opentext/openbinary
>> idea would be to use opentext and opendata.
>
> After some thinking I don't like opendata any more -- often data is
> text, so the term is wrong. openbinary is fine but long. So how about
> openbytes? This clearly links the resulting object with the bytes
> type, which is mutually reassuring.
>
> Regarding open vs. opentext, I'm still not sure. I don't want to
> generalize from the openbytes precedent to openstr or openunicode
> (especially since the former is wrong in 2.x and the latter is wrong
> in 3.0). I'm tempted to hold out for open() since it's most
> compatible.
Maybe a weird idea, but why not use static methods on the
bytes and str type objects for this ?!
E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0
renamed to str.openfile())
After all, you are in a certain way constructing objects of the
given types - only that the input to these constructors happens
to be files in the file system.
--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] str object going in Py3K
On Wed, 2006-02-15 at 09:17 -0800, Guido van Rossum wrote:
> Regarding open vs. opentext, I'm still not sure. I don't want to
> generalize from the openbytes precedent to openstr or openunicode
> (especially since the former is wrong in 2.x and the latter is wrong
> in 3.0). I'm tempted to hold out for open() since it's most
> compatible.
If we go with two functions, I'd much rather hang them off of the file
type object than add two new builtins. I really do think file.bytes()
and file.text() (a.k.a. open.bytes() and open.text()) is better than
opentext() or openbytes().
-Barry
Re: [Python-Dev] str object going in Py3K
On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote:
> Maybe a weird idea, but why not use static methods on the
> bytes and str type objects for this ?!
>
> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0
> renamed to str.openfile())
That's also not a bad idea, but I'd leave off one or the other of the
redundant "open" and "file" parts. E.g. bytes.open() and unicode.open()
seem fine to me (we all know what 'open' means, right? :).
-Barry
Re: [Python-Dev] C AST to Python discussion
On Wed, Feb 15, 2006 at 10:29:38AM -0500, Jeremy Hylton wrote:
> Unfortunately, the compiler talk isn't until the last day and I can't
> stay for sprints. It would be better to have the talk, then the open
> space, then the sprint.
If you mean "Implementation of the Python Bytecode Compiler", that's
on Saturday at 10:50, so you have a whole day in which to fit an open
space event. Unfortunately there are already a lot of open space
events on that day, and the next open slot is at 3:15PM. But if you
don't need a room to talk in, I'm sure you can find a comfortable
place for 5 or 6 people to chat.
--amk
Re: [Python-Dev] C AST to Python discussion
On Wed, 2006-02-15 at 00:34 -0800, Brett Cannon wrote:
> I personally think we should choose an initial global access API to
> the AST as a starting API. I like the sys.ast_transformations idea
> since it is simple and gives enough access that whether read-only or
> read-write is allowed something like PyChecker can get the access it
> needs.
I haven't been following the AST stuff closely enough, but I'm not crazy
about putting access to this in the sys module. It seems like it
clutters that up with a name that will be rarely used by the average
Python programmer.
-Barry
Re: [Python-Dev] str object going in Py3K
Barry Warsaw wrote:
> On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote:
>
>> Maybe a weird idea, but why not use static methods on the
>> bytes and str type objects for this ?!
>>
>> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0
>> renamed to str.openfile())
>
> That's also not a bad idea, but I'd leave off one or the other of the
> redundant "open" and "file" parts. E.g. bytes.open() and unicode.open()
> seem fine to me (we all know what 'open' means, right? :).
Thinking about it, I like your idea better (file.bytes()
and file.text()).
Anyway, as long as we don't start adding openthis() and openthat()
I guess I'm happy ;-)
--
Marc-Andre Lemburg
eGenix.com
Re: [Python-Dev] 2.5 release schedule
On Tue, 2006-02-14 at 21:24 -0800, Neal Norwitz wrote:
> We still need a release manager. No one has heard from Anthony. If
> he isn't interested is someone else interested in trying their hand at
> it? There are many changes necessary in PEP 101 because since the
> last release both python and pydotorg have transitioned from CVS to
> SVN. Creosote also moved.
I would definitely like to see a PEP 101 update as part of the 2.5 RM's
responsibilities, and I think it could be done while spinning the first
alpha release. I know others have volunteered, but in a pinch I'd be
happy to dust off my RM hat and help out too.
-Barry
Re: [Python-Dev] str object going in Py3K
On Wed, 2006-02-15 at 19:02 +0100, M.-A. Lemburg wrote:
> Anyway, as long as we don't start adding openthis() and openthat()
> I guess I'm happy ;-)
Me too! :)
-Barry
Re: [Python-Dev] str object going in Py3K
On 2/15/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Barry Warsaw wrote:
> > On Wed, 2006-02-15 at 18:29 +0100, M.-A. Lemburg wrote:
> >
> >> Maybe a weird idea, but why not use static methods on the
> >> bytes and str type objects for this ?!
> >>
> >> E.g. bytes.openfile(...) and unicode.openfile(...) (in 3.0
> >> renamed to str.openfile())
> >
> > That's also not a bad idea, but I'd leave off one or the other of the
> > redundant "open" and "file" parts. E.g. bytes.open() and unicode.open()
> > seem fine to me (we all know what 'open' means, right? :).
>
> Thinking about it, I like your idea better (file.bytes()
> and file.text()).
This is better than making it a static/class method on file (which has
the problem that it might return something that's not a file at all --
file is a particular stream implementation, there may be others) but I
don't like the tight coupling it creates between a data type and an
I/O library. I still think that having global (i.e. built-in) factory
functions for creating various stream types makes the most sense.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
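To make the "factory function" point concrete: in Python 3 as it later shipped, the built-in open() hands back different stream classes depending on the mode, and other stream implementations share the same interfaces without being files. A short sketch; the temporary file is only scaffolding for the example:

```python
import io
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "data.bin")
with open(path, "wb") as f:
    f.write(b"\x00\x01abc")

# The same built-in returns different stream classes depending on the mode.
with open(path, "rb") as binary_stream:
    binary_type = type(binary_stream)
with open(path, "r", encoding="latin-1") as text_stream:
    text_type = type(text_stream)

print(binary_type.__name__, text_type.__name__)
assert issubclass(binary_type, io.BufferedIOBase)
assert issubclass(text_type, io.TextIOBase)

# Other stream implementations share those interfaces with no file involved.
assert isinstance(io.BytesIO(b"abc"), io.BufferedIOBase)
assert isinstance(io.StringIO("abc"), io.TextIOBase)
```

Because callers depend on the interface rather than on a concrete file class, neither bytes nor str has to know anything about the I/O library.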
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On 2/14/06, Neil Schemenauer wrote:
> People could spell it bytes(s.encode('latin-1'))
Guido wrote:
> At the cost of an extra copying step.
I asked:
> ... why not just add some smarts to the bytes constructor?
Guido wrote:
> ... the VM usually keeps an extra reference
> on the stack so the refcount is never 1. But
> you can't rely on that
I did miss this, but _PyString_Resize seems to
work around it, and I'm not sure that the bytes
object can't be just as intimate.
Even if that is insurmountable, bytes objects
could recognize two states -- one normal, and
one for "I'm delegating to a string, and have to
copy to my own buffer before I actually mutate
anything."
Then a new bytes object would still need its
own header, but the data copying could often
be avoided.
But back to the possibility of not creating
even a new object header...
> the str's underlying array is allocated inline
> with the str header, this requires str and
> bytes to have the same object layout. But
> since bytes are mutable, they can't.
Looking at the arraymodule, the only extra
fields in an array are weakrefs, description
(which will no longer be needed) and tracking
for the indirection. There are even a few extra
bytes leftover that could be used to indicate
that ob_item was redirected later, the way
tables do with small_table.
-jJ
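The two-state idea above (delegate to an immutable string until the first mutation, then copy) is ordinary copy-on-write. A pure-Python sketch of the behavior only; the real proposal concerns the C object layout, and `LazyBytes` with its attributes is a hypothetical name invented for this illustration:

```python
class LazyBytes:
    """Hypothetical mutable byte buffer that borrows an immutable source."""

    def __init__(self, source):
        self._shared = bytes(source)  # CPython returns the same object for an exact bytes argument
        self._own = None              # private bytearray, created on the first write

    def _data(self):
        return self._shared if self._own is None else self._own

    def __getitem__(self, index):
        return self._data()[index]

    def __len__(self):
        return len(self._data())

    def __setitem__(self, index, value):
        if self._own is None:
            self._own = bytearray(self._shared)  # the one and only copy
            self._shared = None
        self._own[index] = value

    def has_copied(self):
        return self._own is not None


original = b"hello"
buf = LazyBytes(original)
assert buf[0] == ord("h") and not buf.has_copied()  # reads need no copy
buf[0] = ord("j")
assert buf.has_copied() and bytes(buf._data()) == b"jello"
assert original == b"hello"                         # the source is untouched
```

Read-only users of such an object never pay for the copy, which is the saving the message is after.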
Re: [Python-Dev] str object going in Py3K
> If we go with two functions, I'd much rather hang them off of the file
> type object than add two new builtins. I really do think file.bytes()
> and file.text() (a.k.a. open.bytes() and open.text()) is better than
> opentext() or openbytes().
+1. The default behavior of the current open() in opening files as text
is particularly grating. This would make things much clearer.
Bill
[Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Instead of byte literals, how about a classmethod bytes.from_hex(), which works like this:
# two equivalent things
expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106])
It's just a nicety; the former fits my brain a little better. This would work fine both in 2.5 and in 3.0.
I thought about unicode.encode('hex'), but obviously it will continue
to return a str in 2.x, not bytes. Also the pseudo-encodings
('hex', 'rot13', 'zip', 'uu', etc.) generally scare me. And now
that bytes and text are going to be two very different types, they're
even weirder than before. Consider:
text.encode('utf-8') ==> bytes
text.encode('rot13') ==> text
bytes.encode('zip') ==> bytes
bytes.encode('uu') ==> text (?)
This state of affairs seems kind of crazy to me.
Actually users trying to figure out Unicode would probably be better served if bytes.encode() and text.decode() did not exist.
-j
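Python 3 later gained a classmethod along these lines, spelled bytes.fromhex() (no underscore). The following sketch uses that later spelling, not the from_hex() name proposed here, to confirm that the two forms in the message really are equivalent:

```python
expected_md5_hash = bytes.fromhex("5c535024cac5199153e3834fe5c92e6a")
same_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145,
                   83, 227, 131, 79, 229, 201, 46, 106])

assert expected_md5_hash == same_hash
assert len(expected_md5_hash) == 16

# hex() goes the other way, back to the text form.
assert expected_md5_hash.hex() == "5c535024cac5199153e3834fe5c92e6a"
```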
Re: [Python-Dev] bytes type discussion
Adam Olsen wrote:
> Making it an error to have 8-bit str literals in 2.x would help
> educate the user that they will change behavior in 3.0 and not be
> 8-bit str literals anymore.
You would like to ban string literals from the language? Remember:
all string literals are currently 8-bit (byte) strings.
Regards,
Martin
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
On 2/15/06, Jason Orendorff <[EMAIL PROTECTED]> wrote:
> Instead of byte literals, how about a classmethod bytes.from_hex(), which
> works like this:
>
>    # two equivalent things
>    expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
>    expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227, 131, 79, 229, 201, 46, 106])
>
> It's just a nicety; the former fits my brain a little better. This would
> work fine both in 2.5 and in 3.0.
Yes, this looks nice.
> I thought about unicode.encode('hex'), but obviously it will continue to
> return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13',
> 'zip', 'uu', etc.) generally scare me. And now that bytes and text are
> going to be two very different types, they're even weirder than before.
> Consider:
>
>text.encode('utf-8') ==> bytes
>text.encode('rot13') ==> text
>bytes.encode('zip') ==> bytes
>bytes.encode('uu') ==> text (?)
>
> This state of affairs seems kind of crazy to me.
>
> Actually users trying to figure out Unicode would probably be better served
> if bytes.encode() and text.decode() did not exist.
Yeah, the pseudogeneralizations seem to be a mistake -- they are
almost universally frowned upon. I'll happily send them to their
grave in Py3k.
It would be better if the signature of text.encode() always returned a
bytes object. But why deny the bytes object a decode() method if text
objects have an encode() method?
I'd say there are two "symmetric" API flavors possible (t and b are
text and bytes objects, respectively, where text is a string type,
either str or unicode; enc is an encoding name):
- b.decode(enc) -> t; t.encode(enc) -> b
- b = bytes(t, enc); t = text(b, enc)
I'm not sure why one flavor would be preferred over the other,
although having both would probably be a mistake.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
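Both "symmetric" flavors listed above can be tried in Python 3 as it was eventually released, which ended up offering the method form and the constructor form side by side (with str in the role of text). A short sketch; the sample string is arbitrary:

```python
t = "na\u00efve"
enc = "utf-8"

# Flavor 1: methods on the objects.
b = t.encode(enc)
assert b.decode(enc) == t

# Flavor 2: constructors that take an encoding.
assert bytes(t, enc) == b
assert str(b, enc) == t

# Without an encoding the bytes constructor refuses to guess.
try:
    bytes(t)
except TypeError as exc:
    print("bytes(t) ->", exc)
```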
Re: [Python-Dev] bdist_* to stdlib?
[Bob Ippolito wrote]
>...
> >/Library/Frameworks/Python.framework/...
> >/Applications/MacPython-2.4/... # just MacPython does this
>
> ActivePython doesn't install app bundles for IDLE or anything?
It does, but puts them under here instead:
/Library/Frameworks/Python.framework/Versions/X.Y/Resources/
>...
> >Also, a receipt of the installation ends up here:
> >
> >/Library/Receipts/$package_name/...
> >
> >though Apple does not provide tools for uninstallation using those
> >receipts.
>
> That stuff is really behind the scenes stuff that's wholly managed by
> Installer.app and is pretty much irrelevant.
Sure.
> Single apps are better than OK. Download them by whatever means you
> want, put them wherever you want, and run them. You can run any
> well-behaved application from a DMG (or a CD, or a USB key, or any other
> readable media).
For naive or new-to-mac users it is a confusing process to get the .app
bundle to an appropriate place and then start running it. Why else have
various app distributors out there come up with myriad slick background
images for their DMG's trying to instruct users what to do with the
icons in the mounted DMG's Finder window?
On Windows you download an MSI (it ends up in your browser downloads
folder), it starts the installation, and at the end of the installation
it starts the app for you. The app is nicely in Program Files. No need
to eject something. No need to find somewhere to drag the icon. I'll
grant that having the whole thing in one bundle is cool/handy/cute.
...anyway this is getting seriously OT for python-dev. :)
Trent
--
Trent Mick
[EMAIL PROTECTED]
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Jason Orendorff wrote:
> Instead of byte literals, how about a classmethod bytes.from_hex(), which
> works like this:
>
> # two equivalent things
> expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
I hope this will also be equivalent:
> expected_md5_hash = bytes.from_hex('5c 53 50 24 ca c5 19 91 53 e3 83 4f e5 c9 2e 6a')
Thomas
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Ron Adam <[EMAIL PROTECTED]> wrote:
> Greg Ewing wrote:
> > Ron Adam wrote:
> >> b = bytes(0L) -> bytes([0,0,0,0])
> >
> > No, bytes(0L) --> TypeError because 0L doesn't implement
> > the iterator protocol or the buffer interface.
>
> It wouldn't need it if it was a direct C memory copy.
Yes it would. Python long integers are stored as arrays of signed
16-bit short ints. See longintrepr.h from the source.
- Josiah
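The point that an integer's internal storage is not a ready-made byte string still holds in Python 3, where the conversion has to be requested with an explicit width and byte order. A brief sketch using the later int.to_bytes() API, which did not exist when this was written; note that the constructor call bytes(n) came to mean something else entirely:

```python
import sys

# CPython stores ints as arrays of fixed-size "digits" (15 or 30 bits each).
print("bits per digit:", sys.int_info.bits_per_digit)

# An explicit conversion states the width and the byte order.
assert (0).to_bytes(4, "little") == bytes([0, 0, 0, 0])
assert (258).to_bytes(4, "little") == bytes([2, 1, 0, 0])
assert (258).to_bytes(4, "big") == bytes([0, 0, 1, 2])
assert int.from_bytes(bytes([2, 1, 0, 0]), "little") == 258

# In Python 3, bytes(n) means "n zero bytes", not "the bytes of n".
assert bytes(4) == b"\x00\x00\x00\x00"
assert bytes(0) == b""
```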
Re: [Python-Dev] bdist_* to stdlib?
On Feb 15, 2006, at 4:49 AM, Jan Claeys wrote:
> On Wed, 15-02-2006 at 14:00 +1300, Greg Ewing wrote:
>> I'm disappointed that the various Linux distributions
>> still don't seem to have caught onto the very simple
>> idea of *not* scattering files all over the place when
>> installing something.
>>
>> MacOSX seems to be the only system so far that has got
>> this right -- organising the system so that everything
>> related to a given application or library can be kept
>> under a single directory, clearly labelled with a
>> version number.
>
> Those directories might be mounted on entirely different hardware (even
> over a network), often with different characteristics (access speed,
> writeability, etc.).
Huh? What does that have to do with anything? I've never seen a system
where /usr/include, /usr/lib, /usr/bin, etc. are not all on the same
mount. It's not really any different with OS X either.
-bob
Re: [Python-Dev] 2.5 PEP
Alain Poirier wrote:
> - is (c)ElementTree still planned for inclusion ?
It is included already.
Regards,
Martin
Re: [Python-Dev] C AST to Python discussion
Thomas Wouters wrote:
> I would personally prefer the AST validation to be a separate part of the
> compiler. It means the one or the other can be out of sync, but it also
> means it can be accessed directly (validating AST before sending it to the
> compiler) and the compiler (or CFG generator, or something between AST and
> CFG) can decide not to validate internally generated AST for non-debug
> builds, for instance.
That's how the ast-objects branch currently works. There is a method
checking that the tree actually conforms to the grammar.
Regards,
Martin
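For a feel of what validating a tree against the grammar looks like from Python code, here is a sketch using the ast module of present-day CPython 3, where compile() checks a tree handed to it before generating code. This is not the ast-objects branch discussed here, and the exact exception type and message are an assumption on my part, so the example catches both plausible types:

```python
import ast

tree = ast.parse("x = 1")
code = compile(tree, "<ast>", "exec")  # a well-formed tree compiles
namespace = {}
exec(code, namespace)
assert namespace["x"] == 1

# Break the tree: an assignment needs at least one target.
broken = ast.parse("x = 1")
broken.body[0].targets = []
try:
    compile(broken, "<ast>", "exec")
    rejected = False
except (TypeError, ValueError) as exc:
    rejected = True
    print("rejected:", exc)
assert rejected
```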
Re: [Python-Dev] bdist_* to stdlib?
[Greg Ewing wrote]
> It's not perfect, but it's still a lot better than the
> situation on any other unix I've seen so far.
Better than Unix, sure. But you *can* (and ActivePython does do) install
everything under:
/opt/$app_name/...
> > open DMG, don't run the app from here, drag it to your
> > Applications folder, then eject this window/disk, then run it from
> > /Applications,
>
> A decently-designed application should be runnable from
> anywhere, including a dmg, if the user wants to do that.
> If an app refuses to run from a dmg, I consider that a
> bug in the application.
Yes, but the typical user probably *wants* to run the app from their
/Applications folder (or somewhere else on their harddrive). When they
start running from the mounted DMG, they can't then unmount the DMG to
clean up.
Actually the typical non-geek user doesn't care where they run the app
from. They don't want to worry about those details.
Trent
--
Trent Mick
[EMAIL PROTECTED]
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Jason Orendorff wrote:
> expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
This looks good, although it duplicates
expected_md5_hash = binascii.unhexlify('5c535024cac5199153e3834fe5c92e6a')
Regards,
Martin
Re: [Python-Dev] bytes type discussion
On Tue, 14 Feb 2006 15:13:25 -0800, Guido van Rossum <[EMAIL PROTECTED]> wrote:
>I'm about to send 6 or 8 replies to various salient messages in the
>PEP 332 revival thread. That's probably a sign that there's still a
>lot to be sorted out. In the mean time, to save you reading through
>all those responses, here's a summary of where I believe I stand.
>Let's continue the discussion in this new thread unless there are
>specific hairs to be split in the other thread that aren't addressed
>below or by later posts.
>
>Non-controversial (or almost):
>
>- we need a new PEP; PEP 332 won't cut it
>
>- no b"..." literal
>
>- bytes objects are mutable
>
>- bytes objects are composed of ints in range(256)
>
>- you can pass any iterable of ints to the bytes constructor, as long
>as they are in range(256)
>
>- longs or anything with an __index__ method should do, too
>
>- when you index a bytes object, you get a plain int
>
>- repr(bytes([10, 20, 30])) == 'bytes([10, 20, 30])'
>
>Somewhat controversial:
>
>- it's probably too big to attempt to rush this into 2.5
>
>- bytes("abc") == bytes(map(ord, "abc"))
>
>- bytes("\x80\xff") == bytes(map(ord, "\x80\xff")) == bytes([128, 255])
>
>Very controversial:
>
Given that ord/unichr and ord/chr work as encoding-agnostic function pairs,
symmetrically mapping between unicode and int or str and int, please consider
the effect of this API, as illustrated by how it works with the examples:
>>> def bytes(arg, encoding=None):
...     if isinstance(arg, str):
...         if encoding: b = map(ord, arg.decode(encoding))
...         else: b = map(ord, arg)
...     elif isinstance(arg, unicode):
...         if encoding: raise ValueError(
...             'Use bytes(%r.encode(%r)) to avoid PY 3000 breakage'%(arg, encoding))
...         b = map(ord, arg)
...     else:
...         b = map(int, arg)
...     if sum(1 for x in b if x<0 or x>255) > 0:
...         raise ValueError('byte out of range')
...     return 'bytes(%r)'%b
...
Then
>- bytes("abc", "encoding") == bytes("abc") # ignores the "encoding" argument
(Use the encoding; the only requirement is that all the resulting ord values
be in range(0,256).)
>>> bytes("abc\xf6", 'latin-1')
'bytes([97, 98, 99, 246])'
>>> print unichr(246)
ö
>>> bytes("abc\xf6", 'cp437')
'bytes([97, 98, 99, 247])'
>>> print unichr(247)
÷
>
>- bytes(u"abc") == bytes("abc") # for ASCII at least
>>> bytes(u"abc")
'bytes([97, 98, 99])'
>
>- bytes(u"\x80\xff") raises UnicodeError
>>> bytes(u"\x80\xff")
'bytes([128, 255])'
>
>- bytes(u"\x80\xff", "latin-1") == bytes("\x80\xff")
>>> bytes(u"\x80\xff", "latin-1")
Traceback (most recent call last):
File "", line 1, in ?
File "", line 6, in bytes
ValueError: Use bytes(u'\x80\xff'.encode('latin-1')) to avoid PY 3000 breakage
>>> bytes(u'\x80\xff'.encode('latin-1'))
'bytes([128, 255])'
(If the characters exist in the encoding specified, it will work, otherwise
raises exception. Assumes PY 3000 string encode results in bytes, so it should
work there too ;-)
of course,
>>> bytes(u'\u1234')
Traceback (most recent call last):
File "", line 1, in ?
File "", line 12, in bytes
ValueError: byte out of range
and
>>> bytes([1,2])
'bytes([1, 2])'
>>> bytes([1,-1])
Traceback (most recent call last):
File "", line 1, in ?
File "", line 12, in bytes
ValueError: byte out of range
>>> bytes([1,256])
Traceback (most recent call last):
File "", line 1, in ?
File "", line 12, in bytes
ValueError: byte out of range
Interestingly, the internal map int on a sequence permits
>>> bytes(["1", 2, 3L, True, 5.6])
'bytes([1, 2, 3, 1, 5])'
IOW, any sequence of objects that will convert themselves
to int in range(0,256) will do.
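For readers following along on Python 3, here is a minimal sketch of the same idea: decode with an explicit encoding first, then map ord over the resulting characters. The helper name ordinals is hypothetical and not part of any proposal in this thread; it only mirrors the range check in the session above.

```python
# Python 3 sketch; "ordinals" is an illustrative helper, not a proposed API.
def ordinals(text):
    """Map each character to its code point, rejecting values outside range(256)."""
    values = [ord(ch) for ch in text]
    if any(v > 255 for v in values):
        raise ValueError('byte out of range')
    return values

raw = b"abc\xf6"
via_latin1 = ordinals(raw.decode('latin-1'))  # 0xf6 decodes to U+00F6
via_cp437 = ordinals(raw.decode('cp437'))     # 0xf6 decodes to U+00F7
print(via_latin1)
print(via_cp437)
```

The two results differ only in the last element, which is the point of the encoding argument: the same source byte names a different character under each code page.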
>
>Martin von Loewis's alternative for the "very controversial" set is to
>disallow an encoding argument and (I believe) also to disallow Unicode
>arguments. In 3.0 this would leave us with s.encode() as the
>only way to convert a string (which is always unicode) to bytes. The
>problem with this is that there's no code that works in both 2.x and
>3.0.
>
I hope Martin will reconsider, considering ord/unichr as a symmetric
pair of functions mapping 1:1 to unicode (and ignoring the fact that
this also happens to be the latin-1 mapping ;-)
A test class should be easy, except deciding on appropriate methods
and how the type should be defined. It's the same peculiar problem
as str, i.e., length one would be compatible with int, but not other lengths.
How do we do that?
Regards,
Bengt Richter
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
On 2/15/06, Tim Parkin <[EMAIL PROTECTED]> wrote: > Guido van Rossum wrote: > > > (Now that I work for Google I realize more than ever before the > > importance of keeping URLs stable; PageRank(tm) numbers don't get > > transferred as quickly as contents. I have this worry too in the > > context of the python.org redesign; 301 permanent redirect is *not* > > going to help PageRank of the new page.) > Could you expand on why 301 redirects won't help with the transfer of > page rank (if you're allowed)? We've done exactly this on many sites and > the pagerank (or more relevantly the search rankings on specific terms) > has transferred almost overnight. The bigger pagerank updates (both > algorithm changes and overhauls in approach) seem to only happen every > few months and these also seem to take notice of 301 redirects (they > generally clear up any supplemental results). OK, perhaps I stand corrected. I don't actually know that much about PageRank! I still don't like docs.python.org, and adding more like it seems a mistake; but it's possible that this is because of a poor execution of the idea (there's no "search docs" button near the search button on the old python.org). -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] Generalizing *args and **kwargs
On 2/15/06, Thomas Wouters <[EMAIL PROTECTED]> wrote: > I've been thinking about generalization of the *args/**kwargs syntax for > quite a while, and even though I'm pretty sure Guido (and many people) will > consider it overgeneralization, I am finally going to suggest it. This whole > idea is not something dear to my heart, although I obviously would like to > see it happen. If the general vote is 'no', I'll write a small PEP or add it > to PEP 13 and be done with it. Feel free to write a PEP so that at least we have a concrete proposal where all the nuts and bolts have been thought through. I'm currently not able to give much thought to any more new proposals, so don't expect me to look at it any time soon. Unless a miracle occurs it's off the table for 2.5 so there's no hurry. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] str object going in Py3K
On 2/15/06, Bill Janssen <[EMAIL PROTECTED]> wrote: > The default behavior of the current open() in opening files as text is > particularly grating. Why? Are you perhaps one of those rare folks who read more binary data than text? -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Guido van Rossum wrote: > On 2/15/06, Tim Parkin <[EMAIL PROTECTED]> wrote: > >>Guido van Rossum wrote: >> >>>I have this worry too in the >>>context of the python.org redesign; 301 permanent redirect is *not* >>>going to help PageRank of the new page.) >>Could you expand on why 301 redirects won't help with the transfer of >>page rank (if you're allowed)? We've done exactly this on many sites and >>the pagerank (or more relevantly the search rankings on specific terms) >>has transferred almost overnight. The bigger pagerank updates (both >>algorithm changes and overhauls in approach) seem to only happen every >>few months and these also seem to take notice of 301 redirects (they >>generally clear up any supplemental results). > > OK, perhaps I stand corrected. I don't actually know that much about PageRank! > No problem, I don't think that many people do and the general consensus seems to be that, although the calculations behind pagerank may be one of the core parts of the google algorithm, there are so many additional algorithms* that affect searches on a case by case and day by day basis that the value from is almost meaningless (apart from possibly 0-2 may be a problem 3-5 is normal, 6-9 is generally good and 10 I've not seen) * (for instance, patents on working out the value of inbound links based on there age, how many other inbound links appeared around the same time, the status of the originating site as an 'authority' site, the text contained in the inbound link and title attributes, etc and the general relation between the inbound links and the 'theme' of the target site ['theme' == the distribution of important keywords across the site]) > I still don't like docs.python.org, and adding more like it seems a > mistake; but it's possible that this is because of a poor execution of > the idea (there's no "search docs" button near the search button on > the old python.org). I'll try and make a more functional/usable google search page on the new site. 
Tim Parkin p.s. I hope you didn't think I was digging for 'insider info'.. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
As I said in an earlier message, there's no need to have a separate domain to restrict queries to just the doc/current part of python.org. Just type "site:python.org/doc/current your query here" If there isn't any other rationale, maybe we can redirects docs.python.org back to www.python.org? Jeremy On 2/15/06, Guido van Rossum <[EMAIL PROTECTED]> wrote: > On 2/15/06, Tim Parkin <[EMAIL PROTECTED]> wrote: > > Guido van Rossum wrote: > > > > > (Now that I work for Google I realize more than ever before the > > > importance of keeping URLs stable; PageRank(tm) numbers don't get > > > transferred as quickly as contents. I have this worry too in the > > > context of the python.org redesign; 301 permanent redirect is *not* > > > going to help PageRank of the new page.) > > > Could you expand on why 301 redirects won't help with the transfer of > > page rank (if you're allowed)? We've done exactly this on many sites and > > the pagerank (or more relevantly the search rankings on specific terms) > > has transferred almost overnight. The bigger pagerank updates (both > > algorithm changes and overhauls in approach) seem to only happen every > > few months and these also seem to take notice of 301 redirects (they > > generally clear up any supplemental results). > > OK, perhaps I stand corrected. I don't actually know that much about PageRank! > > I still don't like docs.python.org, and adding more like it seems a > mistake; but it's possible that this is because of a poor execution of > the idea (there's no "search docs" button near the search button on > the old python.org). 
> > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) > ___ > Python-Dev mailing list > [email protected] > http://mail.python.org/mailman/listinfo/python-dev > Unsubscribe: > http://mail.python.org/mailman/options/python-dev/jeremy%40alum.mit.edu > ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Jeremy Hylton wrote: > As I said in an earlier message, there's no need to have a separate > domain to restrict queries to just the doc/current part of python.org. > Just type > "site:python.org/doc/current your query here" > > If there isn't any other rationale, maybe we can redirects > docs.python.org back to www.python.org? If something like Fredrik's new doc system is adopted, it would be extremely convenient to refer someone to just docs.python.org/os.path.join without looking up how the page is actually named. Georg ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bytes type discussion
On 2/14/06, Greg Ewing <[EMAIL PROTECTED]> wrote: > Fred L. Drake, Jr. wrote: > > > The proper response in this case is often to re-start decoding > > with the correct encoding, since some of the data extracted so far may have > > been decoded incorrectly. > > If the protocol has been sensibly designed, that shouldn't > happen, since everything up to the coding marker should > be ascii (or some other protocol-defined initial coding). > > For protocols that are not sensibly designed (or if you're > just trying to guess) what you suggest may be needed. But > it would be good to have a nicer way of going about it > for when the protocol is sensible. I think that the implementation of encoding-guessing or auto-encoding-upgrade techniques should be left out of the standard library design for now. I know that XML does something like this, but fortunately we employ dedicated C code to parse XML so that particular case should be taken care of without complicating the rest of the standard I/O library. As far as searching bytes objects, that shouldn't be a problem as long as the search 'string' is also specified as a bytes object. -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Jeremy Hylton wrote: > As I said in an earlier message, there's no need to have a separate > domain to restrict queries to just the doc/current part of python.org. > Just type > "site:python.org/doc/current your query here" > > If there isn't any other rationale, maybe we can redirects > docs.python.org back to www.python.org? One possible reason, I'd like to be able to serve the docs up integrated with the new design (with a full hierarchical navigation). I had planned on leaving the docs.python.org as the raw tex2html conversion. If we got rid of the docs.python.org would we still want the www.python.org in the current style? Personally I was hoping that nearly all of the site could be in the new html structure and design for consistency and usability reasons. Tim Parkin ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bytes type discussion
On Tue, 14 Feb 2006 19:41:07 -0500, "Raymond Hettinger" <[EMAIL PROTECTED]> wrote:
>[Guido van Rossum]
>> Somewhat controversial:
>>
>> - bytes("abc") == bytes(map(ord, "abc"))
>
>At first glance, this seems obvious and necessary, so if it's somewhat
>controversial, then I'm missing something. What's the issue?
>
ord("x") gets the source encoding's ord value of "x", but if that is not
unicode or latin-1, it will break when PY 3000 makes "x" unicode.
This means that until PY 3000, plain str string literals have to use ascii and
escapes in order to preserve the meaning when "x" == u"x".
But the good news is that bytes(map(ord, u"x")) works fine for any source
encoding, now or after PY 3000. You just have to type characters into your
editor between the quotes that look on the screen like any of the first 256
unicode characters (or use ascii escapes for unshowables). The u"x" translates
x into unicode according to the *character* of x, whatever the source encoding,
so all you have to do is choose characters from the first 256 unicode code
points. This happens to be latin-1, but you can ignore that unless you are
interested in the actual byte values. If they have byte meaning, escapes are
clearer anyway, and they work in a unicode string (where
"x".decode(source_encoding) might fail on an illegal character).
The solution is to use u"x" for now, or use ascii-only with escapes, and just
map ord on either kind of string. This should work when u"x" becomes
equivalent to "x". The unicode that comes from a current u"x" string defines a
*character* sequence. If you use legal latin-1 *characters* in whatever source
encoding your editor and coding cookie say, you will get the *characters* you
see inside the quotes in the u"..." literal translated to unicode, and the
first 256 characters of unicode happen to be the latin-1 set, so map ord just
works. With a unicode string you don't have to think about encoding, just use
ord/unichr in range(0,256). Hex escapes within unicode strings work as
expected, so IMO it's pretty clean.
I think I have shown this in a couple of other posts in the original thread
(where I created and compiled source code in several encodings including
utf-8, compiled with coding cookies, and exec'd the result).
I could always have overlooked something, but I am hopeful.
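The claim that the first 256 unicode code points coincide with latin-1 can be checked directly. This is a small Python 3 sketch (an assumption on my part, since the thread itself uses 2.x syntax), where chr() plays the role of unichr():

```python
# Python 3 sketch: for code points below 256, mapping ord agrees with latin-1.
text = "caf\xe9 \x80\xff"
by_ord = bytes(map(ord, text))
by_latin1 = text.encode('latin-1')
print(by_ord == by_latin1)

# Round trip back to characters with chr(), the 3.x spelling of unichr().
round_trip = ''.join(map(chr, by_ord))
print(round_trip == text)
```

Both comparisons hold only while every character stays below U+0100; anything above that makes bytes() raise ValueError, which matches the "byte out of range" case shown earlier in the thread.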
Regards,
Bengt Richter
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Georg Brandl wrote: > If something like Fredrik's new doc system is adopted, it would be extremely > convenient to refer someone to just > > docs.python.org/os.path.join > > without looking up how the page is actually named. you could of course reserve a toplevel directory for that purpose; e.g. http://python.org/lib/os.path.join or perhaps http://python.org/tag/os.path.join http://python.org/tag/print etc. ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] bytes type needs a new champion
Skip has mentioned in private email that he's not available to update PEP 332. I've therefore rejected that PEP; the current ideas are rather different so we might as well start a new PEP. Anyway, we need a new PEP author who can take the current discussion and turn it into a coherent PEP. I've tried to keep up with the current thread but it takes too much time to organize it all and I need to start focusing on the 2.5 release schedule. Any volunteers? -- --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] 2.5 PEP
Martin v. Löwis wrote:
> > - is (c)ElementTree still planned for inclusion ?
>
> It is included already.
in the xml.etree package, in case someone's looking for it in the usual
place. that is,
import xml.etree.ElementTree as ET
import xml.etree.cElementTree as ET
will work in any 2.5 that has a working pyexpat.
(is the xmlplus/xmlcore issue still an issue, btw?)
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
Jason Orendorff wrote:
> Instead of byte literals, how about a classmethod bytes.from_hex(), which
> works like this:
>
> # two equivalent things
> expected_md5_hash = bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
> expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83, 227,
> 131, 79, 229, 201, 46, 106])
>
> It's just a nicety; the former fits my brain a little better. This would
> work fine both in 2.5 and in 3.0.
>
> I thought about unicode.encode('hex'), but obviously it will continue to
> return a str in 2.x, not bytes. Also the pseudo-encodings ('hex', 'rot13',
> 'zip', 'uu', etc.) generally scare me.
Those are not pseudo-encodings, they are regular codecs.
It's a common misunderstanding that codecs are only seen as serving
the purpose of converting between Unicode and strings.
The codec system is deliberately designed to be general enough
to also work with many other types, e.g. it is easily possible to
write a codec that converts between the hex literal sequence you
have above and a list of ordinals:
""" Hex string codec
    Converts between a list of ordinals and a two byte hex literal
    string.
    Usage:
    >>> codecs.encode([1,2,3], 'hexstring')
    '010203'
    >>> codecs.decode(_, 'hexstring')
    [1, 2, 3]
    (c) 2006, Marc-Andre Lemburg.
"""
import codecs
class Codec(codecs.Codec):
    def encode(self, input, errors='strict'):
        """ Convert hex ordinal list to hex literal string.
        """
        if not isinstance(input, list):
            raise TypeError('expected list of integers')
        return (
            ''.join(['%02x' % x for x in input]),
            len(input))
    def decode(self, input, errors='strict'):
        """ Convert hex literal string to hex ordinal list.
        """
        if not isinstance(input, str):
            raise TypeError('expected string of hex literals')
        size = len(input)
        if not size % 2 == 0:
            raise TypeError('input string has uneven length')
        return (
            [int(input[(i<<1):(i<<1)+2], 16)
             for i in range(size >> 1)],
            size)
class StreamWriter(Codec, codecs.StreamWriter):
    pass
class StreamReader(Codec, codecs.StreamReader):
    pass
def getregentry():
    return (Codec().encode, Codec().decode, StreamReader, StreamWriter)
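The module above is written to be dropped into the encodings package. As a rough sketch of the same idea for Python 3 (my assumption, not part of the original message), a codec can also be registered at runtime through a search function. The names hex_encode, hex_decode and search are hypothetical; codecs.register, codecs.CodecInfo, codecs.encode and codecs.decode are the standard library calls.

```python
import codecs

def hex_encode(ordinals, errors='strict'):
    """Encode a list of ints in range(256) as a hex literal string."""
    return ''.join('%02x' % x for x in ordinals), len(ordinals)

def hex_decode(text, errors='strict'):
    """Decode a hex literal string into a list of ints."""
    if len(text) % 2:
        raise TypeError('input string has uneven length')
    return [int(text[i:i + 2], 16) for i in range(0, len(text), 2)], len(text)

def search(name):
    # 'hexstring' is the illustrative codec name used in the message above.
    if name == 'hexstring':
        return codecs.CodecInfo(hex_encode, hex_decode, name='hexstring')
    return None

codecs.register(search)

encoded = codecs.encode([1, 2, 3], 'hexstring')
decoded = codecs.decode(encoded, 'hexstring')
print(encoded, decoded)
```

Note that this goes through codecs.encode() and codecs.decode() rather than the .encode()/.decode() methods, since the input here is a list rather than a string.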
> And now that bytes and text are
> going to be two very different types, they're even weirder than before.
> Consider:
>
> text.encode('utf-8') ==> bytes
> text.encode('rot13') ==> text
> bytes.encode('zip') ==> bytes
> bytes.encode('uu') ==> text (?)
>
> This state of affairs seems kind of crazy to me.
Really ?
It all depends on what you use the codecs for. The above
usages through the .encode() and .decode() methods are
not the only way you can make use of them.
To get full access to the codecs, you'll have to use
the codecs module.
> Actually users trying to figure out Unicode would probably be better served
> if bytes.encode() and text.decode() did not exist.
You're missing the point: the .encode() and .decode() methods
are merely interfaces to the registered codecs. Whether they
make sense for a certain codec depends on the codec, not the
methods that interface to it, and again, codecs do not
only exist to convert between Unicode and strings.
--
Marc-Andre Lemburg
eGenix.com
Professional Python Services directly from the Source (#1, Feb 15 2006)
>>> Python/Zope Consulting and Support ...http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ... http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/
::: Try mxODBC.Zope.DA for Windows,Linux,Solaris,FreeBSD for free !
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
On Wed, Feb 15, 2006 at 01:38:41PM -0500, Jim Jewett wrote:
> On 2/14/06, Neil Schemenauer wrote:
> > People could spell it bytes(s.encode('latin-1'))
>
> Guido wrote:
> > At the cost of an extra copying step.
>
> I asked:
> > ... why not just add some smarts to the bytes constructor?
>
> Guido wrote:
>
> > ... the VM usually keeps an extra reference
> > on the stack so the refcount is never 1. But
> > you can't rely on that
>
> I did miss this, but _PyString_Resize seems to
> work around it, and I'm not sure that the bytes
> object can't be just as intimate.
No, _PyString_Resize doesn't work around it. _PyString_Resize only works if
the refcount is exactly one: only the caller has a reference. And by
'caller', I mean 'the calling C function'. Besides that, the caller takes
care to only use _PyString_Resize on strings it created itself.
Theoretically it could 'steal' a reference from someplace else, but I
haven't seen _PyString_Resize-using code do that, and it would be a recipe
for disaster.
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] str object going in Py3K
Well, I probably am, but that's not the reason. Reading has nothing to do with it. The default mode (text) corrupts data on write on a certain platform (Windows) by inserting extra bytes in the data stream. This bug particularly exhibits itself when programs developed on Linux or Mac OS X are then run on a Windows platform. I think it's a bug to default to a mode which modifies the data stream. The default mode should be 'binary'; people interested in exploiting the obsolete Windows distinction between "text" and "binary" should have to use a mode switch (I suggest "t") to put a file stream in 'text' mode. Bill > On 2/15/06, Bill Janssen <[EMAIL PROTECTED]> wrote: > > The default behavior of the current open() in opening files as text is > > particularly grating. > > Why? Are you perhaps one of those rare folks who read more binary data > than text? > > -- > --Guido van Rossum (home page: http://www.python.org/~guido/) ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] str object going in Py3K
On 2/15/06, Bill Janssen <[EMAIL PROTECTED]> wrote:
> Well, I probably am, but that's not the reason. Reading has nothing
> to do with it.
Actually if you read binary data in text mode on Windows you also get
corrupt (and often truncated) data, unless you're lucky enough that
the binary data contains neither ^Z (EOF) nor CRLF.
> The default mode (text) corrupts data on write on a certain platform
> (Windows) by inserting extra bytes in the data stream. This bug
> particularly exhibits itself when programs developed on Linux or Mac
> OS X are then run on a Windows platform. I think it's a bug to
> default to a mode which modifies the data stream. The default mode
> should be 'binary'; people interested in exploiting the obsolete
> Windows distinction between "text" and "binary" should have to use a
> mode switch (I suggest "t") to put a file stream in 'text' mode.
This might have been a possibility in Python 2.x where binary reads
return strings. In Python 3000 binary files will return bytes objects
while text files will return strings (which are unicode, decoded from
bytes using an encoding that's determined when the file is opened,
taking into account system and user settings as well as possible
overrides passed to open()). I expect that the APIs for reading and
writing binary data will be sufficiently different from that for
reading/writing text that even staunch Unix programmers won't make the
mistake of using the text API for creating binary files.
I realize that's not the answer you're looking for, but for backwards
compatibility we can't change the default on Windows in Python 2.x, so
the point is moot until 3.0 or until a new binary file API is added to
2.x.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
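The write-side corruption Bill describes can be reproduced on any platform in Python 3 by asking open() for the Windows newline translation explicitly. This is only a sketch under that assumption (newline='\r\n' standing in for the Windows text-mode default); the file names are arbitrary.

```python
import os
import tempfile

payload = b'\x00\n\x01'

with tempfile.TemporaryDirectory() as tmp:
    binary_path = os.path.join(tmp, 'binary.dat')
    text_path = os.path.join(tmp, 'text.dat')

    # Binary mode writes the bytes untouched.
    with open(binary_path, 'wb') as f:
        f.write(payload)

    # newline='\r\n' mimics Windows text mode: every '\n' gains a '\r'.
    with open(text_path, 'w', encoding='latin-1', newline='\r\n') as f:
        f.write(payload.decode('latin-1'))

    with open(binary_path, 'rb') as f:
        binary_result = f.read()
    with open(text_path, 'rb') as f:
        text_result = f.read()

print(binary_result == payload)
print(text_result)
```

The text-mode file ends up one byte longer than the payload, which is exactly the "extra bytes in the data stream" complaint.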
Re: [Python-Dev] math.areclose ...?
Please, I don't much care about the fine points of the function's semantics, but PLEASE rename that function to are_close. Every time I see this subject in my email client I have to think for a few seconds what the hell 'areclose' means. This time it's not just because of the new PEP 8, 'areclose' is really really hard to read. -- Gustavo J. A. M. Carneiro <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> The universe is always one step beyond logic ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
On Wed, 2006-02-15 at 14:01 -0500, Jason Orendorff wrote:
> Instead of byte literals, how about a classmethod bytes.from_hex(),
> which works like this:
>
> # two equivalent things
> expected_md5_hash =
> bytes.from_hex('5c535024cac5199153e3834fe5c92e6a')
> expected_md5_hash = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83,
> 227, 131, 79, 229, 201, 46, 106])
Kind of like binascii.unhexlify() but returning a bytes object.
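For comparison, Python 3 has a classmethod of this kind spelled bytes.fromhex() (no underscore). A short sketch, assuming Python 3, showing that it agrees with binascii.unhexlify() and with the integer list quoted above:

```python
import binascii

hex_digest = '5c535024cac5199153e3834fe5c92e6a'

from_method = bytes.fromhex(hex_digest)
from_binascii = binascii.unhexlify(hex_digest)
from_ints = bytes([92, 83, 80, 36, 202, 197, 25, 145, 83,
                   227, 131, 79, 229, 201, 46, 106])

print(from_method == from_binascii == from_ints)
print(from_method.hex())
```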
-Barry
Re: [Python-Dev] Generalizing *args and **kwargs
Thomas Wouters wrote: > Although I've made it look like I have a working implementation, I haven't. > I know exactly how to do it, though, except for the AST part ;) Once I > figure out how to properly work with the AST code I'll probably write this > patch whether it's a definite 'no' or not, just to see if I can. I wouldn't > mind if people gave their opinion, though. A phase 1 for Python 2.5 that allowed keyword args to go between "*args" and "**kwds" at the call site would be nice (Guido even approved the concept already, it's that it hasn't irritated anyone enough to actually tweak the grammar. . .) Cheers, Nick. -- Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia --- http://www.boredomandlaziness.org ___ Python-Dev mailing list [email protected] http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
[Python-Dev] how bugfixes are handled?
Hi,
How are bugfixes handled?
I posted a bug and a patch + test case for a quite common issue (see google; the problem has been mentioned on this ml) a long time ago, and nothing has happened with it:
http://sourceforge.net/tracker/index.php?func=detail&aid=1380952&group_id=5470&atid=305470
Is anyone reviewing fixes on a regular basis? Or are just some bugfixes reviewed + committed, depending on the interest of committers?
Thanks,
--
Arkadiusz Miśkiewicz PLD/Linux Team
http://www.t17.ds.pwr.wroc.pl/~misiek/ http://ftp.pld-linux.org/
Re: [Python-Dev] bytes.from_hex() [Was: PEP 332 revival in coordination with pep 349?]
On 2/15/06, M.-A. Lemburg <[EMAIL PROTECTED]> wrote:
> Jason Orendorff wrote:
> > Also the pseudo-encodings ('hex', 'rot13',
> > 'zip', 'uu', etc.) generally scare me.
>
> Those are not pseudo-encodings, they are regular codecs.
>
> It's a common misunderstanding that codecs are only seen as serving
> the purpose of converting between Unicode and strings.
>
> The codec system is deliberately designed to be general enough
> to also work with many other types, e.g. it is easily possible to
> write a codec that convert between the hex literal sequence you
> have above to a list of ordinals:
It's fine that the codec system supports this. However it's
questionable that these encodings are invoked using the standard
encode() and decode() APIs; and it will be more questionable once
encode() returns a bytes object. Methods that return different types
depending on the value of an argument are generally a bad idea. (Hence
the movement to have separate opentext and openbinary or openbytes
functions.)
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
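The concern about return types can be seen in a small sketch. It assumes a current Python 3 interpreter, where the non-text codecs are reached through codecs.encode() and codecs.decode() rather than through str.encode(); it is not the 2.x API under discussion in this thread. The codec names hex_codec and rot_13 are the standard library's module names for these codecs.

```python
import codecs

# bytes -> bytes: the hex codec maps each byte to two hex digits.
hex_bytes = codecs.encode(b"abc", "hex_codec")

# str -> str: rot13 never leaves the text domain.
rot13_text = codecs.encode("abc", "rot_13")

# str -> bytes: an ordinary text encoding.
utf8_bytes = "abc".encode("utf-8")

# Decoding the hex form gives back the original bytes.
round_trip = codecs.decode(hex_bytes, "hex_codec")

for value in (hex_bytes, rot13_text, utf8_bytes, round_trip):
    print(type(value).__name__, repr(value))
```

One function, three different input/output type pairs, selected only by the codec name: that is the pattern the message above calls questionable.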
Re: [Python-Dev] 2.5 PEP
Neal Norwitz wrote:
> Attached is the 2.5 release PEP 356. It's also available from:
> http://www.python.org/peps/pep-0356.html
>
> Does anyone have any comments? Is this good or bad? Feel free to
> send to me comments.
>
> We need to ensure that PEPs 308, 328, and 343 are implemented. We
> have possible volunteers for 308 and 343, but not 328. Brett is doing
> 352 and Martin is doing 353.
PEP 338 is pretty much ready to go, too - just waiting on Guido's review
and pronouncement on the specific API used in the latest update (his last
PEP parade said he was OK with the general concept, but I only posted the
PEP 302 compliant version after that).
Cheers,
Nick.
--
Nick Coghlan | [EMAIL PROTECTED] | Brisbane, Australia
---
http://www.boredomandlaziness.org
[Python-Dev] A codecs nit (was Re: bytes.from_hex())
On Wed, 2006-02-15 at 22:07 +0100, M.-A. Lemburg wrote:
> Those are not pseudo-encodings, they are regular codecs.
>
> It's a common misunderstanding that codecs are only seen as serving
> the purpose of converting between Unicode and strings.
>
> The codec system is deliberately designed to be general enough
> to also work with many other types, e.g. it is easily possible to
> write a codec that convert between the hex literal sequence you
> have above to a list of ordinals:
Slightly off-topic, but one thing that's always bothered me about the
current codecs implementation is that str.encode() (and friends)
implicitly treats its argument as a module name and imports it, even if the
module doesn't live in the encodings package. That seems like a mistake
to me (and a potential security problem if the import has side-effects).
I don't know whether at the very least restricting the imports to the
encodings package would make sense or would break things.
>>> import sys
>>> sys.modules['smtplib']
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
KeyError: 'smtplib'
>>> ''.encode('smtplib')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
LookupError: unknown encoding: smtplib
>>> sys.modules['smtplib']
I can't see any reason for allowing any randomly importable module to
act like an encoding.
-Barry
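The experiment in the session above can be wrapped in a small check that reports whether a codec lookup leaves a module newly imported. This is a sketch only: the helper name lookup_imports_module is made up for illustration, and the result depends on the interpreter it is run on. The session above shows the 2006 behaviour; on current Python 3 the lookup is expected to stay inside the encodings package.

```python
import codecs
import sys


def lookup_imports_module(name):
    """Return True if codecs.lookup(name) leaves `name` newly in sys.modules."""
    was_loaded = name in sys.modules
    try:
        codecs.lookup(name)
    except LookupError:
        # An unknown encoding is the expected outcome for a non-codec name.
        pass
    return (not was_loaded) and name in sys.modules


print(lookup_imports_module("smtplib"))
```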
Re: [Python-Dev] how bugfixes are handled?
We're all volunteers here, and we get a large volume of bugs.
Unfortunately, bugfixes are reviewed on a voluntary basis. Are you aware
of the standing offer that if you review 5 bugs/patches some of the
developers will pay attention to your bug/patch?
On 2/15/06, Arkadiusz Miskiewicz <[EMAIL PROTECTED]> wrote:
> Hi,
>
> How are bugfixes handled?
>
> I've posted a bug and a patch + test case for a quite common issue (see
> google, problem mentioned on this ml) a long time ago and nothing has
> happened with it:
> http://sourceforge.net/tracker/index.php?func=detail&aid=1380952&group_id=5470&atid=305470
>
> Is anyone reviewing fixes on a regular basis? Or are just some bugfixes
> reviewed + committed depending on the interest of committers?
>
> Thanks,
> --
> Arkadiusz Miśkiewicz    PLD/Linux Team
> http://www.t17.ds.pwr.wroc.pl/~misiek/    http://ftp.pld-linux.org/
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Georg Brandl wrote:
> If something like Fredrik's new doc system is adopted
don't hold your breath, by the way. it's clear that the current
PSF-sponsored site overhaul won't lead to anything remotely close to a
best-of-breed python-powered site, and I'm beginning to think that I
should spend my time on other stuff.
I find it a bit sad that we'll end up with a butt-ugly static and boring
python.org site when we have so much talent in the python universe, but
I guess that's inevitable at this stage in Python's evolution.
[Python-Dev] ssize_t branch merged
Just in case you haven't noticed, I just merged
the ssize_t branch (PEP 353).
If you have any corrections to the code to make which
you would consider bug fixes, just go ahead.
If you are uncertain how specific problems should be resolved,
feel free to ask.
If you think certain API changes should be made, please
discuss them here - they would need to be reflected in the
PEP as well.
Regards,
Martin
Re: [Python-Dev] bytes type discussion
Guido van Rossum wrote:
> - it's probably too big to attempt to rush this into 2.5
After reading some of the discussion, and seen some of the arguments,
I'm beginning to feel that we need working code to get this right.
It would be nice if we could get a bytes() type into the first alpha, so
the design can get some real-world exposure in real-world apps/libs
before 2.5 final.
Re: [Python-Dev] bytes type discussion
On Wed, Feb 15, 2006 at 11:28:59PM +0100, Fredrik Lundh wrote:
> After reading some of the discussion, and seen some of the arguments,
> I'm beginning to feel that we need working code to get this right.
>
> It would be nice if we could get a bytes() type into the first alpha, so
> the design can get some real-world exposure in real-world apps/libs
> before 2.5 final.
I agree that working code would be nice, but I don't see why it should be in
an alpha release. IMHO it shouldn't be in an alpha release until it at least
looks good enough for the developers, and good enough to put in a PEP.
--
Thomas Wouters <[EMAIL PROTECTED]>
Hi! I'm a .signature virus! copy me into your .signature file to help me spread!
Re: [Python-Dev] bytes type discussion
Thomas Wouters wrote:
> > After reading some of the discussion, and seen some of the arguments,
> > I'm beginning to feel that we need working code to get this right.
> >
> > It would be nice if we could get a bytes() type into the first alpha, so
> > the design can get some real-world exposure in real-world apps/libs
> > before 2.5 final.
>
> I agree that working code would be nice, but I don't see why it should be in
> an alpha release. IMHO it shouldn't be in an alpha release until it at least
> looks good enough for the developers, and good enough to put in a PEP.
I'm not convinced that the PEP will be good enough without experience
from using a bytes type in *real-world* (i.e. *existing*) byte-crunching
applications.
if we put it in an early alpha, we can use it with real code, fix any issues
that arises, and even remove it if necessary, before 2.5 final. if it goes in
late, we'll be stuck with whatever the PEP says.
Re: [Python-Dev] str object going in Py3K
Guido van Rossum wrote:
> On 2/15/06, Fuzzyman <[EMAIL PROTECTED]> wrote:
>
>> Forcing the programmer to be aware of encodings, also pushes the same
>> requirement onto the user (who is often the source of the text in question).
>
> The programmer shouldn't have to be aware of encodings most of the
> time -- it's the job of the I/O library to determine the end user's
> (as opposed to the language's) default encoding dynamically and act
> accordingly. Users who use non-ASCII characters without informing the
> OS of their encoding are in a world of pain, *unless* they use the OS
> default encoding (which may vary per locale). If the OS can figure out
> the default encoding, so can the Python I/O library. Many apps won't
> have to go beyond this at all.
>
> Note that I don't want to use this OS/user default encoding as the
> default encoding between bytes and strings; once you are reading bytes
> you are writing "grown-up" code and you will have to be explicit. It's
> only the I/O library that should automatically encode on write and
> decode on read.
>
>> Currently you can read a text file and process it - making sure that any
>> changes/requirements only use ascii characters. It therefore doesn't matter
>> what 8 bit ascii-superset encoding is used in the original. If you force the
>> programmer to specify the encoding in order to read the file, they would
>> have to pass that requirement onto their user. Their user is even less
>> likely to be encoding aware than the programmer.
>
> I disagree -- the user most likely has set or received a default
> encoding when they first got the computer, and that's all they are
> using. If other tools (notepad, wordpad, emacs, vi etc.) can figure
> out the encoding, so can Python's I/O library.
I'm intrigued by the encoding guessing techniques you envisage. I
currently use a modified version of something contained within docutils.
I read the file in binary and first check for a UTF8 or UTF16 BOM.
Then I try to decode the text using the following encodings (in this
order):
    ascii
    UTF8
    locale.nl_langinfo(locale.CODESET)
    locale.getlocale()[1]
    locale.getdefaultlocale()[1]
    ISO8859-1
    cp1252
(The encodings returned by the locale calls are only used on platforms
for which they exist.)
The first decode that doesn't blow up, I assume is correct. The problem
I have is that I usually (for the application I have in mind anyway)
then want to re-encode into a consistent encoding rather than back into
the original encoding. If the encoding of the original (usually
unspecified) is any arbitrary 8-bit ascii superset (as it usually is),
then it will probably not blow up if decoded with any other arbitrary
8 bit encoding. This means I sometimes get junk.
I'm curious if there are any extra things I could do? This is possibly
beyond the scope of this discussion (in which case I apologise), but we
are discussing the techniques the I/O layer would use to 'guess' the
encoding of a file opened in text mode - so maybe it's not so off topic.
There is also the following cookbook recipe that uses a heuristic to
guess encoding:
http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/163743
XML, HTML, or other text streams may also contain additional information
about their encoding - which may be unreliable. :-)
All the best,
Michael Foord
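The procedure Michael describes (check for a BOM, then try a fixed list of encodings and keep the first that decodes) might look roughly like the sketch below. It is not his docutils-derived code. It assumes Python 3, the name guess_decode is made up, and locale.getpreferredencoding(False) stands in for the three locale calls in his list.

```python
import codecs
import locale


def guess_decode(data):
    """Return (text, encoding_name) for the first candidate that decodes."""
    # Step 1: a BOM settles the question without any guessing.
    boms = (
        (codecs.BOM_UTF8, "utf-8-sig"),
        (codecs.BOM_UTF16_LE, "utf-16"),
        (codecs.BOM_UTF16_BE, "utf-16"),
    )
    for bom, name in boms:
        if data.startswith(bom):
            return data.decode(name), name

    # Step 2: try the candidates in order; the first success wins.
    candidates = [
        "ascii",
        "utf-8",
        locale.getpreferredencoding(False),  # assumption: stand-in for the locale calls
        "iso8859-1",  # accepts every byte string, so cp1252 is never reached
        "cp1252",
    ]
    for name in candidates:
        try:
            return data.decode(name), name
        except (UnicodeDecodeError, LookupError):
            continue
    raise ValueError("no candidate encoding could decode the data")


print(guess_decode(b"plain ascii"))
print(guess_decode("caf\u00e9".encode("utf-8")))
```

The ISO8859-1 step shows the weakness he mentions: an 8-bit encoding that accepts any input never "blows up", so a successful decode says little about whether the guess was right.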
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Fredrik Lundh wrote:
> Georg Brandl wrote:
>> If something like Fredrik's new doc system is adopted
>
> don't hold your breath, by the way. it's clear that the current
> PSF-sponsored site overhaul won't lead to anything remotely close to a
> best-of-breed python-powered site, and I'm beginning to think that I
> should spend my time on other stuff.
>
> I find it a bit sad that we'll end up with a butt-ugly static and boring
> python.org site when we have so much talent in the python universe, but
> I guess that's inevitable at this stage in Python's evolution.
Some very large sites - and some may say some very interesting, very
large sites - are delivered as static html (for some time the two
biggest sites in the uk were both delivered as static html, one of which
was bbc.co.uk and the other was sportinglife.com for which I used to be
the main web developer. As far as I know the bbc and sporting life still
both use static html for a large portion of their content).
Regarding the python site, it was a conscious decision to deliver the
pages as static html. This was for many reasons, of which a prominent
one (but by no means the only major one) was mirroring.
One of the advantages of a semantically structured website that uses css
for layout and style is that, as far as design goes, you are welcome to
re-style the html using css; we can also offer it as an alternate
stylesheet (just as I've added a 'large font' style and a 'default font
settings' style).
However, design is a subjective thing - I've spent quite a bit of time
reacting to the majority of constructive feedback (probably far too much
time when I should have been getting content migrated) but obviously it
won't please everyone :-)
As for cutting edge, it's using twisted, restructured text, nevow, clean
urls, xhtml, semantic markup, css2, interfaces, adaption, eggs, the path
module, moinmoin, yaml (to avoid xml), etc - just because it's
generating all of the html up front rather than at runtime doesn't mean
that it's not best-of-breed (although I'm not sure what best-of-breed
is; I'm presuming it's some sort of accolade for excellence in python
programming; something I don't think I would be qualified to judge,
never mind receive).
However, back to Georg's comment, we could use mod_rewrite to map:
    /lib/sets
to:
    /doc/lib/module-sets.html
with
    RewriteRule ^/lib/(.*)$ /doc/lib/module-$1.html [L,R=301]
(not tested)
Whether that is a good idea or not is another matter.
Tim Parkin
Re: [Python-Dev] http://www.python.org/dev/doc/devel still available
Tim Parkin wrote:
> As for cutting edge, it's using twisted, restructured text, nevow, clean
> urls, xhtml, semantic markup, css2, interfaces, adaption, eggs, the path
> module, moinmoin, yaml (to avoid xml),
that's not cutting edge, that's buzzword bingo.
> something I don't think I would be qualified to judge, never mind receive).
no, you're not qualified. yet, someone gave you total control over the
future of python.org, and there's no way to make you give it up, despite
the fact that you're over a year late and the stuff you've delivered this
far is massively underwhelming. that's the problem.
Re: [Python-Dev] bytes type needs a new champion
Guido van Rossum <[EMAIL PROTECTED]> wrote:
> Anyway, we need a new PEP author who can take the current
> discussion and turn it into a coherent PEP.
I'm not sure that I have time to be the official champion. Right
now I'm spending some time to collect all the ideas presented in the
email messages and put them into a draft PEP. Hopefully that will
be useful.
Neil
Re: [Python-Dev] str object going in Py3K
On 2/15/06, Michael Foord <[EMAIL PROTECTED]> wrote:
> I'm intrigued by the encoding guessing techniques you envisage.
Don't hold your breath. *I* am not very interested in guessing
encodings -- I was just commenting on posts by others that mentioned
difficulties caused by this approach. My position is that the standard
library (with the exception of XML processing code perhaps) shouldn't
be *guessing* encodings but simply using the encoding specified by the
user (or the OS default) in the environment or some such place. (It is
OS dependent how to retrieve this information but my hypothesis is
that every OS with any kind of text support has a way to get this info
-- even if it's as rudimentary as "it's always ASCII" (v7 Unix :-) or
"it's always UTF-8" (I am hoping this will eventually be the answer in
the distant future).
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] bytes type discussion
I'm actually assuming to put this off until 2.6 anyway.
On 2/15/06, Fredrik Lundh <[EMAIL PROTECTED]> wrote:
> Thomas Wouters wrote:
>
> > > After reading some of the discussion, and seen some of the arguments,
> > > I'm beginning to feel that we need working code to get this right.
> > >
> > > It would be nice if we could get a bytes() type into the first alpha, so
> > > the design can get some real-world exposure in real-world apps/libs
> > > before 2.5 final.
> >
> > I agree that working code would be nice, but I don't see why it should be in
> > an alpha release. IMHO it shouldn't be in an alpha release until it at least
> > looks good enough for the developers, and good enough to put in a PEP.
>
> I'm not convinced that the PEP will be good enough without experience
> from using a bytes type in *real-world* (i.e. *existing*) byte-crunching
> applications.
>
> if we put it in an early alpha, we can use it with real code, fix any issues
> that arises, and even remove it if necessary, before 2.5 final. if it goes in
> late, we'll be stuck with whatever the PEP says.
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] ssize_t branch merged
Great! I'll mark the PEP as accepted. (Which doesn't mean you can't
update it if changes are found necessary.)
--Guido
On 2/15/06, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> Just in case you haven't noticed, I just merged
> the ssize_t branch (PEP 353).
>
> If you have any corrections to the code to make which
> you would consider bug fixes, just go ahead.
>
> If you are uncertain how specific problems should be resolved,
> feel free to ask.
>
> If you think certain API changes should be made, please
> discuss them here - they would need to be reflected in the
> PEP as well.
>
> Regards,
> Martin
--
--Guido van Rossum (home page: http://www.python.org/~guido/)
Re: [Python-Dev] PEP 332 revival in coordination with pep 349? [ Was:Re: release plan for 2.5 ?]
Ron Adam wrote:
> I was presuming it would be done in C code and it will just need a
> pointer to the first byte, memchr(), and then read n bytes directly into
> a new memory range via memcpy().
If the object supports the buffer interface, it can be
done that way. But if not, it would seem to make sense
to fall back on the iterator protocol.
> However, if it's done with a Python iterator and then each item is
> translated to bytes in a sequence, (much slower), an encoding will need
> to be known for it to work correctly.
No, it won't. When using the bytes(x) form, encoding has
nothing to do with it. It's purely a conversion from one
representation of an array of 0..255 to another.
When you *do* want to perform encoding, you use
bytes(u, encoding) and say what encoding you want
to use.
> Unfortunately Unicode strings
> don't set an attribute to indicate it's own encoding.
I think you don't understand what an encoding is. Unicode
strings don't *have* an encoding, because they're not encoded!
Encoding is what happens when you go from a unicode string
to something else.
> Since some longs will be of different length, yes a bytes(0L) could give
> differing results on different platforms,
It's not just a matter of length. I'm not sure of the
details, but I believe longs are currently stored as an
array of 16-bit chunks, of which only 15 bits are used.
I'm having trouble imagining a use for low-level access
to that format, other than just treating it as an opaque
lump of data for turning back into a long later -- in
which case why not just leave it as a long in the first
place.
Greg
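The distinction Greg draws between bytes(x) and bytes(u, encoding) can be illustrated with the bytes type as it exists in Python 3 today. This is a sketch of present-day behaviour only, not of the design being debated in this thread, and the variable names are made up.

```python
# bytes(x): a sequence of integers in 0..255 is copied as-is.
# No encoding is involved at any point.
ordinals = [72, 105]
raw = bytes(ordinals)

# bytes(u, encoding): text has no byte representation until an
# encoding is chosen, and different encodings give different bytes.
text = "h\u00e9"
as_utf8 = bytes(text, "utf-8")
as_latin1 = bytes(text, "latin-1")

print(raw, as_utf8, as_latin1)

# Asking for bytes from text without naming an encoding is an error.
try:
    bytes(text)
except TypeError as exc:
    print("TypeError:", exc)
```

The same two-character string becomes three bytes under UTF-8 and two under Latin-1, which is why the text itself cannot be said to "have" an encoding.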
Re: [Python-Dev] bytes type discussion
Guido wrote:
> I'm actually assuming to put this off until 2.6 anyway.
makes sense.
(but will there be a 2.6? isn't it time to start hacking on 3.0?)
