[Python-Dev] 2.5b3, commit r46372 regressed PEP 302 machinery (sf not letting me post)

2006-08-07 Thread Robin Bryce

Hi,

Apologies for the lack of an sf#. I tried to submit this there but
couldn't. (sf is logging me out each time I visit a new page and it is
refusing my attempt to post anonymously).

Python 2.5b3 (trunk:51136M, Aug  7 2006, 10:48:15)
[GCC 4.0.3 (Ubuntu 4.0.3-1ubuntu5)] on linux2


The need for speed patch committed in revision r46372 included a change
whose intent was to reduce the number of open calls. The `continue`
statement at line 1283 in import.c:r51136 has the effect of skipping
the builtin import machinery if the find_module method of a custom
importer returns None.

In Python 2.4.3, if find_module returned None the builtin machinery was
allowed to process the path tail.
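
A minimal sketch of the fallback contract described above, written against the modern Python 3 `importlib` API rather than the 2.5-era `find_module` hook, so it illustrates the expected behaviour and does not reproduce the regression itself. `DecliningFinder` is a hypothetical name, and `colorsys` is only used as a convenient standard-library module to import.

```python
import importlib
import importlib.abc
import sys


class DecliningFinder(importlib.abc.MetaPathFinder):
    """Hypothetical custom finder that declines every import request."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path=None, target=None):
        # Record the request, then return None to signal "not mine";
        # the remaining finders on sys.meta_path should take over.
        self.seen.append(fullname)
        return None


finder = DecliningFinder()
sys.meta_path.insert(0, finder)
try:
    # Drop any cached copy so the import system really runs the finders.
    sys.modules.pop("colorsys", None)
    module = importlib.import_module("colorsys")
finally:
    sys.meta_path.remove(finder)

print("finder was consulted:", "colorsys" in finder.seen)
print("standard machinery still imported it:", module.__name__)
```

The point of the sketch is only that a finder answering None is a polite refusal, not a failed import.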

In my particular case I am working on an importer that deals with kid
templates that may or may not exist as .py[c] files.

The short of it is that in Python 2.4.3 this produces a usable module
``__import__('foo.a/templateuri')`` whereas in 2.5b3 I get an import
error. The Python 2.4.3 implementation *allows* module paths that are
not separated with '.'. Python 2.5b3 does not allow this, and it does
not look like this was an intentional change. I believe this point
about 'illegal' module paths is actually independent of the regression
I am asserting. Detailed session logs are attached (following the sf
guidance even though I'm posting to py-dev).

The 'use case' for the importer is: Robin wants to package a default
template file as a normal Python module and provide a custom importer
that allows users of his package to reference both their own
templates and html files on the file system in arbitrary locations AND
the stock templates provided as Python modules under the same name
space. He would like to leave normal imports to the standard
machinery.

Cheers,

Robin


bugreport.rst
Description: Binary data
___
Python-Dev mailing list
[email protected]
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: 
http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
M.-A. Lemburg wrote:
>> There's no disputing that an exception should be raised
>> if the string *must* be interpretable as characters in
>> order to continue. But that's not true here if you allow
>> for the interpretation that they're simply objects of
>> different (duck) type and therefore unequal.
> 
> Hmm, given that interpretation, 1 == 1.0 would have to be
> False.

No, but 1 == 1.5 would have to be False (and actually is).
In that analogy, int relates to float as ascii-bytes to
Unicode: some values are shared between int and float (e.g.
1 and 1.0), other values are not shared (e.g. 1.5 has no
equivalent in int). An int equals a float only if both
values originate from the shared subset.

Now, int is a (nearly) true subset of float, so there are
no ints with no float equivalent (actually, there are, but
Python ignores that).

> Note that you do have to interpret the string as characters
> if you compare it to Unicode and there's nothing wrong with
> that.

Consider this:
py> int(3+4j)
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
TypeError: can't convert complex to int; use int(abs(z))
py> 3 == 3+4j
False

So even though the conversion raises an exception, the
values are determined to be not equal. Again, because int
is a nearly true subset of complex, the conversion goes
the other way, but *if* it would use the complex->int
conversion, then the TypeError should be taken as
a guarantee that the objects don't compare equal.

Expanding this view to Unicode should mean that a unicode
string U equals a byte string B if
U.encode(system_encoding) == B or B.decode(system_encoding) == U,
and that they don't equal otherwise (e.g. if the conversion
fails with a "not convertible" exception). Which of the
two conversions is selected is arbitrary; we should, of
course, continue to use the one we always used (for
"ascii", there is no difference between the two).
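
A minimal sketch of the comparison rule proposed above, assuming Python 3 spelling (`str` for the Unicode string, `bytes` for the byte string). `loosely_equal` is a hypothetical helper, not an existing API, and the `encoding` parameter stands in for the system encoding.

```python
def loosely_equal(u, b, encoding="ascii"):
    """Compare text `u` with bytes `b` under one fixed encoding.

    A failed conversion is treated as "not equal" instead of
    propagating the exception to the caller.
    """
    try:
        return b.decode(encoding) == u
    except UnicodeDecodeError:
        return False


print(loosely_equal("abc", b"abc"))                 # both sides are plain ASCII
print(loosely_equal("abc", b"\xe4bc"))              # bytes are not valid ASCII
print(loosely_equal("\xe4bc", b"\xe4bc", "latin-1"))
```

This picks the decode direction purely for illustration; the message above argues that either direction would do for "ascii".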

Regards,
Martin


Re: [Python-Dev] Dicts are broken Was: unicode hell/mixing str and unicode asdictionarykeys

2006-08-07 Thread Martin v. Löwis
M.-A. Lemburg wrote:
> Python just doesn't know the encoding of the 8-bit string, so can't
> make any assumptions on it. As result, it raises an exception to inform
> the programmer.

Oh, Python does make an assumption what the encoding is: it assumes
it is the system encoding (i.e. "ascii"). Then invoking the ascii
codec raises an exception, because the string clearly isn't ascii.

> It is well possible that the string uses an encoding where the
> Unicode string is indeed equal to the string, assuming this
> encoding

So what? Python uses the system encoding for this operation.
What does it matter that the result would be different if it
had used a different encoding?

The strings are unequal under the system encoding; it's irrelevant
that they might be equal under a different encoding.

The same holds for the ASCII part (i.e. where you don't get an
exception):

py> u"foo" == "sbb"
False
py> u"foo".encode("rot13") == "sbb"
True

So the strings compare as unequal, even though they compare
equal if treated as rot13. That doesn't stop Python from considering
them unequal.
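
For readers on Python 3, a small sketch of the same rot13 point, assuming the `rot_13` text codec reachable through `codecs.encode` (the `str.encode("rot13")` spelling used above is Python 2 only).

```python
import codecs

text = "foo"
other = "sbb"

# Compared as they stand, the two strings differ.
plain_equal = text == other

# Reinterpreted through rot13, they match.
rot13_equal = codecs.encode(text, "rot_13") == other

print(plain_equal, rot13_equal)
```

Equality is decided under one fixed interpretation; the existence of some other interpretation that would make the values match does not change the answer.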

Regards,
Martin


Re: [Python-Dev] Python 2.5b3 and AIX 4.3 - It Works

2006-08-07 Thread Martin v. Löwis
Michael Kent wrote:
> Because of a requirement to remain compatible with AIX 4.3, I have been
> forced to stay with Python 2.3, because while 2.4 would compile under
> AIX 4.3, it would segfault immediately when run.
>
> I'm happy to report that Python 2.5b3 compiles and runs fine under AIX 4.3,
> and passes most of its test suite.  However, there are a few test failures.
> I realize AIX 4.3 is long obsolete, so is there any interest on the list
> for me to report which tests failed and how?

As Neal says: There would be interest in receiving patches. Interest in
receiving bug reports for systems we don't have access to is low.

Regards,
Martin


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Martin v. Löwis
Gregory P. Smith wrote:
> Whoever knows how the windows build process works and controls the
> python 2.5 windows release builds could you please make sure the
> hashlib module gets built + linked with OpenSSL rather than falling
> back to its much slower builtin implementations.

If the project files are changed in that direction, then my build will
pick that up automatically. I can't promise to change the files myself.

I'm somewhat worried about yet another size increase in pythonxy.dll
(actually, I'm personally not worried, but I anticipate complaints about
 such a change).

What should happen to the existing sha* modules?

I believe that the performance of the OpenSSL routines depends on
the way OpenSSL was built, e.g. whether the assembler implementations
are used or not. Somebody would have to check, but I doubt they are.

So in short: I'm very doubtful that such a change can still be made,
and if it is, that it will be "right". I'm accepting patches
regardless.

Regards,
Martin


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Michael Foord
Martin v. Löwis wrote:
> [snip..]
> Expanding this view to Unicode should mean that a unicode
> string U equals a byte string B if
> U.encode(system_encoding) == B or B.decode(system_encoding) == U,
> and that they don't equal otherwise (e.g. if the conversion
> fails with a "not convertible" exception). Which of the
> two conversions is selected is arbitrary; we should, of
> course, continue to use the one we always used (for
> "ascii", there is no difference between the two).
>
>   
+1

This seems the most (only ?) logical solution.

Michael Foord

> Regards,
> Martin


Re: [Python-Dev] 2.5 status

2006-08-07 Thread [EMAIL PROTECTED]
"Neal Norwitz" <[EMAIL PROTECTED]> wrote:

> Things are getting better, but we still have some really important
> outstanding issues.  PLEASE CONTINUE TESTING AS MUCH AS POSSIBLE.

I've run into a problem with a big application that I wasn't able to
reproduce with a small example. I've submitted a bug report to
Sourceforge (Id 1536021)*.

As Sourceforge seems to mangle the code I used to describe the
problem, I'll include it here once more:

The code (exception handler added to demonstrate and work around
the problem):

    try:
        h = hash(p)
    except OverflowError, e:
        print type(p), p, id(p), e
        h = id(p) & 0x0FFF

prints the following output:


>
   3066797028 long int too large to convert to int

This happens with Python 2.5b3, but didn't happen with Python 2.4.3.

I assume that the hash-function for function/methods returns the
`id` of the function. The following code demonstrates the same
problem with a Python class whose `__hash__` returns the `id` of
the object:

$ python2.4
Python 2.4.3 (#1, Jun 30 2006, 10:02:59)
[GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class X(object):
...   def __hash__(self): return id(self)
...
>>> hash (X())
-1211078036
$ python2.5
Python 2.5b3 (r25b3:51041, Aug  7 2006, 15:35:35)
[GCC 3.4.6 (Gentoo 3.4.6-r1, ssp-3.4.5-1.0, pie-8.7.9)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> class X(object):
...   def __hash__(self): return id(self)
...
>>> hash (X())
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: long int too large to convert to int

(*) 
http://sourceforge.net/tracker/index.php?func=detail&aid=1536021&group_id=5470&atid=105470


-- 
Christian Tanzer                                    http://www.c-tanzer.at/



Re: [Python-Dev] 2.5 status

2006-08-07 Thread Martin v. Löwis
[EMAIL PROTECTED] wrote:
> The code (exception handler added to demonstrate and work around
> the problem):
> 
> try:
>     h = hash(p)
> except OverflowError, e:
>     print type(p), p, id(p), e
>     h = id(p) & 0x0FFF
> 
> prints the following output:
> 
> 
> >
>3066797028 long int too large to convert to int
> 
> This happens with Python 2.5b3, but didn't happen with Python 2.4.3.
> 
> I assume that the hash-function for function/methods returns the
> `id` of the function.

No (not really). Instead, it combines the hash of the target object
with the address of the function object. The hash function of the
method object, in itself, cannot raise this overflow error.

However, it involves hash(p.im_self). So if Script_Category.__hash__
is implemented as you show below, this error might occur.

> >>> class X(object):
> ...   def __hash__(self): return id(self)
> ...
> >>> hash (X())
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> OverflowError: long int too large to convert to int

Yes, this comes from id() now always returning positive integers,
which might be a long if the object pointer is > MAXINT.

I think both instance_hash and slot_tp_hash should be changed
to just truncate long ints to the range LONG_MIN..LONG_MAX.

Notice that this error could have occurred already in 2.4,
on a 64-bit system where sizeof(void*) > sizeof(long) (i.e.
on Win64).

Regards,
Martin


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread David Hopwood
Michael Foord wrote:
> Martin v. Löwis wrote:
> 
>>[snip..]
>>Expanding this view to Unicode should mean that a unicode
>>string U equals a byte string B if
>>U.encode(system_encoding) == B or B.decode(system_encoding) == U,
>>and that they don't equal otherwise (e.g. if the conversion
>>fails with a "not convertible" exception).

I disagree. Unicode strings should always be considered distinct from
non-ASCII byte strings. Implicitly encoding or decoding in order to
perform a comparison is a bad idea; it is expensive and will often do
the wrong thing.

The programmer should explicitly encode the Unicode string or decode
the byte string before comparison (which one of these is correct is
application-dependent).

>>Which of the two conversions is selected is arbitrary; [...]

It would not be arbitrary. In the common case where the byte encoding
uses "precomposed" characters, using "U.encode(system_encoding) == B"
will tend to succeed in more cases than "B.decode(system_encoding) == U",
because alternative representations of the same abstract character in
Unicode will be mapped to the same precomposed character.

(Whether these are cases in which the comparison *should* succeed is,
as I said above, application-dependent.)

The special case of considering US-ASCII strings to compare equal to
the corresponding Unicode string, is more reasonable than this would be
for a general byte encoding, because:

 - it can be done with no (or only a trivial) conversion,
 - US-ASCII has no precomposed characters or combining marks, so it
   does not have multiple encodings for the same abstract character,
 - Unicode has a US-ASCII subset that uses exactly the same encoding
   model as US-ASCII (whereas in general, a byte encoding might use
   an arbitrarily different encoding model to Unicode, as for example
   is the case for ISCII).

>>we should, of course, continue to use the one we always used (for
>>"ascii", there is no difference between the two).
> 
> +1
> 
> This seems the most (only ?) logical solution.

No; always considering Unicode and non-ASCII byte strings to be distinct
is just as logical.

-- 
David Hopwood <[EMAIL PROTECTED]>





Re: [Python-Dev] 2.5 status

2006-08-07 Thread Nick Coghlan
Martin v. Löwis wrote:
> [EMAIL PROTECTED] wrote:
>> >>> class X(object):
>> ...   def __hash__(self): return id(self)
>> ...
>> >>> hash (X())
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> OverflowError: long int too large to convert to int
> 
> Yes, this comes from id() now always returning positive integers,
> which might be a long if the object pointer is > MAXINT
> 
> I think both instance_hash and slot_tp_hash should be changed
> to just truncate long ints to the range LONG_MIN..LONG_MAX

Couldn't they be changed to invoke long's own hash method when a long object 
is returned from __hash__?
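
A small sketch of what this suggestion looks like from the user's side, assuming a Python 3 interpreter, where (as far as I can tell) an oversized `__hash__` result is reduced through the integer's own hash instead of raising OverflowError. The constant `2 ** 70` merely stands in for an `id()` value too large for the hash type.

```python
import sys


class X(object):
    def __hash__(self):
        # Deliberately larger than any machine-sized hash value.
        return 2 ** 70


value = hash(X())

# The result is folded into the valid hash range ...
in_range = -sys.maxsize - 1 <= value <= sys.maxsize
# ... and matches what hashing the large integer directly gives.
same_as_int_hash = value == hash(2 ** 70)

print(in_range, same_as_int_hash)
```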

Cheers,
Nick.

-- 
Nick Coghlan   |   [EMAIL PROTECTED]   |   Brisbane, Australia
---
 http://www.boredomandlaziness.org


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
David Hopwood wrote:
> I disagree. Unicode strings should always be considered distinct from
> non-ASCII byte strings. Implicitly encoding or decoding in order to
> perform a comparison is a bad idea; it is expensive and will often do
> the wrong thing.

That's a pretty irrelevant position at this point; Python has had
the notion of a system encoding since Unicode was introduced,
and we are not going to remove that just before a release candidate
of Python 2.5.

The question at hand is not whether certain objects should compare
unequal, but whether comparing them should raise an exception.

>>> Which of the two conversions is selected is arbitrary; [...]
>
> It would not be arbitrary. In the common case where the byte encoding
> uses "precomposed" characters, using "U.encode(system_encoding) == B"
> will tend to succeed in more cases than "B.decode(system_encoding) == U",
> because alternative representations of the same abstract character in
> Unicode will be mapped to the same precomposed character.

No, they won't (although they should, perhaps):

py> u'o\u0308'.encode("latin-1")
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
UnicodeEncodeError: 'latin-1' codec can't encode character u'\u0308' in
position 1: ordinal not in range(256)

In addition, it's also possible to find encodings (e.g. iso-2022) where
different byte sequences decode to the same Unicode string.
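
To make the precomposed-character point concrete, here is a short sketch in Python 3 syntax. It assumes the standard `unicodedata` module; `try_encode` is a hypothetical helper added only for the demonstration. The codec does not compose characters on its own, but an explicit NFC normalization step does.

```python
import unicodedata

decomposed = "o\u0308"  # 'o' followed by COMBINING DIAERESIS
precomposed = unicodedata.normalize("NFC", decomposed)  # single code point U+00F6


def try_encode(text, encoding="latin-1"):
    """Return the encoded bytes, or None if the codec rejects the text."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


print(try_encode(decomposed))   # the combining mark has no latin-1 equivalent
print(try_encode(precomposed))  # the composed character does
```

So whether the two representations end up equal depends on a normalization step that the application has to request explicitly.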

Regards,
Martin



Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Armin Rigo
Hi,

On Thu, Aug 03, 2006 at 07:53:11PM +0200, M.-A. Lemburg wrote:
> > I thought I'd heard (from Guido here or on the py3k list) that it was only 
> > 1 < u'abc' that would raise an exception, and that 1 == u'abc' would still 
> > evaluate to False.  Did I misunderstand?
> 
> Could be that I'm wrong.

I also seem to remember that TypeErrors should only signal ordering
nonsense, not equality.  In this case, I'm of the opinion that unicode
objects and completely unrelated strings of random bytes should
successfully compare as unequal, but I'm not enough of a unicode user to
be sure.


See you soon,

Armin.


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Gregory P. Smith
On Mon, Aug 07, 2006 at 03:16:22PM +0200, "Martin v. Löwis" wrote:
> Gregory P. Smith wrote:
> > Whoever knows how the windows build process works and controls the
> > python 2.5 windows release builds could you please make sure the
> > hashlib module gets built + linked with OpenSSL rather than falling
> > back to its much slower builtin implementations.
> 
> If the project files are changed in that direction, then my build will
> pick that up automatically. I can't promise to change the files myself.
> 
> I'm somewhat worried about yet another size increase in pythonxy.dll
> (actually, I'm personally not worried, but I anticipate complaints about
>  such a change).
> 
> What should happen to the existing sha* modules?

Actually, this change will either:

 (a) leave pythonxy.dll the same size.
   OR
 (b) *reduce* the size of pythonxy.dll.
 
hashlib's OpenSSL implementation on windows comes in the form of a
300k _hashlib.pyd library.  Build that and pythonxy.dll won't change.
If you want to reduce the pythonxy.dll size you can remove _md5, _sha,
_sha256 and _sha512 from the builtins list so long as _hashlib.pyd is
built.
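
As a caller-side illustration, and only a sketch: the `hashlib` interface is the same whichever implementation backs it, so code like the following should not need to care whether the OpenSSL-based extension was built. The input `b"abc"` is just a sample value.

```python
import hashlib

data = b"abc"

# Named constructor and generic constructor should agree,
# regardless of which backend provides the algorithm.
direct = hashlib.sha256(data).hexdigest()
by_name = hashlib.new("sha256", data).hexdigest()

print(direct == by_name)
```

Only the speed is expected to differ between backends, not the digests.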

There is OpenSSL library code duplication between the _hashlib (300k)
and _ssl (650k) modules the way things are linked today (static).  If
you wanted absolute smallest distro size and code reuse we'd need to
change things to use an OpenSSL.dll.  I'm not proposing that (though
it is a good idea).

> I believe that the performance of the OpenSSL routines depends on
> the way OpenSSL was built, e.g. whether the assembler implementations
> are used or not. Somebody would have to check, but I doubt they are.

That'd be unfortunate as that negatively impacts the socket _ssl
module as well.  OpenSSL should always be built with the assembler
implementations.



Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Michael Foord
David Hopwood wrote:[snip..]
>
>   
>>> we should, of course, continue to use the one we always used (for
>>> "ascii", there is no difference between the two).
>>>   
>> +1
>>
>> This seems the most (only ?) logical solution.
>> 
>
> No; always considering Unicode and non-ASCII byte strings to be distinct
> is just as logical.
>   
Except there has been an implicit promise in Python for years now that 
ascii byte-strings will compare equally to the unicode equivalent: lots 
of code assumes this. Breaking this is fine in principle - but for Py3K 
not Py 2.x.

That means Martin's solution is the best for the current problem. (IMHO 
of course...)

Michael
http://www.voidspace.org.uk/python/index.shtml




Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Anthony Baxter
I'm nervous about this change being made at this stage of the release process. 
It seems to me to have a chance of causing breakages - admittedly a small 
chance, but one that's higher than I'd like.

I'd also like to make sure that the PCBuild8 directory is updated at the same 
time - with the recent disappearance of the previous free MS compiler 
version, I think this will become more important over the 18 months or so of 
Python 2.5's life.

Anthony


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Gregory P. Smith
On Tue, Aug 08, 2006 at 03:25:46AM +1000, Anthony Baxter wrote:
> I'm nervous about this change being made at this stage of the release 
> process. 
> It seems to me to have a chance of causing breakages - admittedly a small 
> chance, but one that's higher than I'd like.

Sigh.  Half the reason I did the hashlib work was to get much faster
optimized versions of the hash algorithms into python.  I'll be
disappointed if that doesn't happen.

hashlib passes its test suite with or without openssl.  If I make the
windows project file updates to simply build and include _hashlib.pyd
in the windows installer what harm is that going to cause?  IMHO the
windows python 2.5 build as it is is missing a feature by not
including this.

> I'd also like to make sure that the PCBuild8 directory is updated at the same 
> time - with the recent disappearance of the previous free MS compiler 
> version, I think this will become more important over the 18 months or so of 
> Python 2.5's life.

agreed.

frustrated..
-greg



Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread David Hopwood
Michael Foord wrote:
> David Hopwood wrote:[snip..]
> 
>>>> we should, of course, continue to use the one we always used (for
>>>> "ascii", there is no difference between the two).
>>>
>>> +1
>>>
>>> This seems the most (only ?) logical solution.
>>
>> No; always considering Unicode and non-ASCII byte strings to be distinct
>> is just as logical.
> 
> Except there has been an implicit promise in Python for years now that
> ascii byte-strings will compare equally to the unicode equivalent: lots
> of code assumes this.

I think you must have misread my comment:

  No; always considering Unicode and *non-ASCII* byte strings to be distinct
  is just as logical.

This says nothing about comparing Unicode and ASCII byte strings.

-- 
David Hopwood <[EMAIL PROTECTED]>




Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Ron Adam
Michael Foord wrote:
> David Hopwood wrote:[snip..]
>>   
>>>> we should, of course, continue to use the one we always used (for
>>>> "ascii", there is no difference between the two).
>>>>
>>> +1
>>>
>>> This seems the most (only ?) logical solution.
>>> 
>> No; always considering Unicode and non-ASCII byte strings to be distinct
>> is just as logical.

Yes, that's true.  (But can't be done prior to P3k of course.) Consider 
the comparison of ...

[3] == (3,)   ->  False

These are not the same thing even though it may be trivial to treat them 
as being equivalent.  So how smart should an equivalence comparison be? 
I think testing for interchangeability and/or taking into account 
context is going down a very difficult road.  Which is what the string 
to Unicode comparison does by making an assumption that the string type 
is in the default encoding, which it may not be.

Purity in this would insist that comparing floats and integers always 
return False, but there is little ambiguity when it comes to whether 
numerical values are equivalent or not.  The rules for their comparisons 
are fairly well established.  So numerical equivalence can be the 
exception when comparing values of differing types and it's the expected 
behavior as well as the established practice in programming.


> Except there has been an implicit promise in Python for years now that 
> ascii byte-strings will compare equally to the unicode equivalent: lots 
> of code assumes this. Breaking this is fine in principle - but for Py3K 
> not Py 2.x.

Also true.  And I hope that a bytes to Unicode comparison in Py3k will 
always return False just like [3] == (3,) always returns False.


> That means Martin's solution is the best for the current problem. (IMHO 
> of course...)

I think (IMHO) in this particular case, maintaining "backwards 
compatibility" should take precedence (until Py3k) and be the stated 
reason for the continued behavior in the documents as well.  And so 
Unicode to String comparisons should be the second exception to not 
doing data form conversions when comparing two objects.  At least for 
pre-Py3k.

Are there other cases where different types of objects compare equal? 
(Not including those where the user writes or overrides a method to get 
that functionality of course.)
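
A few built-in and standard-library cases can be checked directly. This is a sketch under Python 3 semantics (so it does not show the 2.x str/unicode behaviour under discussion), and the dictionary of labelled comparisons is only a convenience for printing.

```python
from decimal import Decimal
from fractions import Fraction

comparisons = {
    "1 == 1.0": 1 == 1.0,
    "1 == 1 + 0j": 1 == 1 + 0j,
    "True == 1": True == 1,
    "Decimal(1) == 1": Decimal(1) == 1,
    "Fraction(1, 2) == 0.5": Fraction(1, 2) == 0.5,
    "{1} == frozenset({1})": {1} == frozenset({1}),
    # Containers of different types stay unequal.
    "[3] == (3,)": [3] == (3,),
    # In Python 3, bytes and text never compare equal.
    "b'abc' == 'abc'": b"abc" == "abc",
}

for label, result in comparisons.items():
    print(label, "->", result)
```

The numeric types and the two set types compare across types; sequences and bytes versus text do not.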


Cheers,
Ron




Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
David Hopwood wrote:
> Michael Foord wrote:
>> David Hopwood wrote:[snip..]
>>
>>>>> we should, of course, continue to use the one we always used (for
>>>>> "ascii", there is no difference between the two).
>>>> +1
>>>>
>>>> This seems the most (only ?) logical solution.
>>> No; always considering Unicode and non-ASCII byte strings to be distinct
>>> is just as logical.
> 
> I think you must have misread my comment:

Indeed. The misunderstanding originates from your sentence starting with
"no", when, in fact, you seem to be supporting the proposal I made.

Regards,
Martin


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Martin v. Löwis
Gregory P. Smith wrote:
> hashlib's OpenSSL implementation on windows comes in the form of a
> 300k _hashlib.pyd library.

What do you mean by "comes"? I can't find any _hashlib.vcproj file
inside the PCbuild directory.

>> I believe that the performance of the OpenSSL routines depends on
>> the way OpenSSL was built, e.g. whether the assembler implementations
>> are used or not. Somebody would have to check, but I doubt they are.
> 
> That'd be unfortunate as that negatively impacts the socket _ssl
> module as well.  OpenSSL should always be built with the assembler
> implementations.

But Visual Studio doesn't ship with an assembler! So how could I
build it?

Regards,
Martin


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Gregory P. Smith
On Tue, Aug 08, 2006 at 01:46:13AM +0200, "Martin v. Löwis" wrote:
> Gregory P. Smith wrote:
> > hashlib's OpenSSL implementation on windows comes in the form of a
> > 300k _hashlib.pyd library.
> 
> What do you mean by "comes"? I can't find any _hashlib.vcproj file
> inside the PCbuild directory.

I'll see about creating one later today when I'm reunited with my laptop.

> >> I believe that the performance of the OpenSSL routines depends on
> >> the way OpenSSL was built, e.g. whether the assembler implementations
> >> are used or not. Somebody would have to check, but I doubt they are.
> > 
> > That'd be unfortunate as that negatively impacts the socket _ssl
> > module as well.  OpenSSL should always be built with the assembler
> > implementations.
> 
> But Visual Studio doesn't ship with an assembler! So how could I
> build it?

Yes, it does.  Visual Studio comes with MASM (ml) and OpenSSL ships
with build scripts to use it.  See openssl's INSTALL.W32 file.  Also,
a free assembler (NASM) is available that OpenSSL is also capable of
building with if for some reason you don't have masm installed.

Looking into how the windows python build process works (honestly
something I've not looked at since 2.0) it appears that
PCbuild/build_ssl.py handles the compilation of OpenSSL.

I haven't tested this yet (I'll try it tonight) but I believe this
patch is all that's needed to enable the openssl assembly build:

--- build_ssl.py  (revision 51136)
+++ build_ssl.py  (working copy)
@@ -98,6 +98,8 @@
         if not cmd: break
         if cmd.strip()[:5].lower() == "nmake":
             continue
+        if re.match("set OPTS=no-asm", cmd):
+            continue
         temp_bat.write(cmd)
     in_bat.close()
     temp_bat.close()

Alternatively just modifying build_ssl.py to run "ms\do_masm.bat"
before it runs "nmake -f ms\32.mak" should work.

-g



Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Martin v. Löwis
Gregory P. Smith wrote:
> Sigh.  Half the reason I did the hashlib work was to get much faster
> optimized versions of the hash algorithms into python.  I'll be
> disappointed if that doesn't happen.

Sad as it sounds, it appears you just did half of the work, then
(omitting the Windows build process).

> hashlib passes its test suite with or without openssl.  If I make the
> windows project file updates to simply build and include _hashlib.pyd
> in the windows installer what harm is that going to cause?

None, if you do it correctly (where correctly includes AMD64 and IA64
builds), add text to PCbuild/readme.txt, and edit msi.py properly.

That is, assuming hashlib then still works correctly on Windows (which
is hard to tell).
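
One rough way to gain some confidence is to compare digests against
published test vectors.  The sketch below is only an illustration of
that idea, not the hashlib test suite; the helper name is hypothetical,
and the expected values are the standard SHA-1 and SHA-256 digests of
the message "abc".  It is written for a current Python, where the
message must be a bytes object.

```python
import hashlib

# Standard test-vector digests for the three-byte message b"abc".
KNOWN_DIGESTS = {
    "sha1": "a9993e364706816aba3e25717850c26c9cd0d89d",
    "sha256": "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad",
}


def check_digests(message=b"abc"):
    """Return the names of algorithms whose digest does not match the vector."""
    failures = []
    for name, expected in sorted(KNOWN_DIGESTS.items()):
        # hashlib.new() hides whether an OpenSSL-backed or builtin
        # implementation ends up doing the work.
        actual = hashlib.new(name, message).hexdigest()
        if actual != expected:
            failures.append(name)
    return failures


failures = check_digests()
```

An empty list means both algorithms agree with the vectors, whichever
backend is in use.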

> IMHO the
> windows python 2.5 build as it is is missing a feature by not
> including this.

Wrong incantation :-) We are in feature freeze now, so adding a feature
is a big no-no. You should have argued this is a bug <0.5 wink>.

Regards,
Martin


Re: [Python-Dev] unicode hell/mixing str and unicode as dictionary keys

2006-08-07 Thread Martin v. Löwis
Armin Rigo wrote:
> I also seem to remember that TypeErrors should only signal ordering
> nonsense, not equality.  In this case, I'm of the opinion that unicode
> objects and completely-unrelated strings of random bytes should
> successfully compare as unequal, but I'm not enough of a unicode user to
> be sure.

I believe this was the original intent for raising TypeErrors here in
the first place: string-vs-unicode comparison predates rich comparisons,
and there is no way to implement __cmp__ meaningfully if the strings
don't convert successfully under the system encoding (if they are
unequal, you wouldn't be able to tell which one is smaller).

With rich comparisons available, I see no reason to keep raising that
exception.
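
To make the distinction concrete, here is a small sketch using a
hypothetical Text class (a stand-in, not Python's actual unicode
implementation), written for a current Python.  With rich comparisons,
equality can simply answer "unequal" when the byte string does not
decode, while ordering can still raise TypeError because no meaningful
answer exists.  The choice of ASCII as the decoding is an assumption
made for the illustration.

```python
class Text:
    """Hypothetical stand-in for a unicode string type."""

    def __init__(self, value):
        self.value = value

    def _as_text(self, other):
        if isinstance(other, Text):
            return other.value
        # other is bytes here; raises UnicodeDecodeError for non-ASCII data.
        return other.decode("ascii")

    def __eq__(self, other):
        if not isinstance(other, (Text, bytes)):
            return NotImplemented
        try:
            return self.value == self._as_text(other)
        except UnicodeDecodeError:
            # Undecodable bytes cannot equal any text: report "unequal".
            return False

    def __hash__(self):
        # Hashes the text value only; consistency with bytes keys in a
        # dictionary is not addressed by this sketch.
        return hash(self.value)

    def __lt__(self, other):
        if not isinstance(other, (Text, bytes)):
            return NotImplemented
        try:
            return self.value < self._as_text(other)
        except UnicodeDecodeError:
            # There is no way to say which operand is smaller.
            raise TypeError("cannot order text against undecodable bytes")


same = Text("abc") == b"abc"
different = Text("abc") == b"\xff\xfe"
try:
    Text("abc") < b"\xff\xfe"
    ordering_raised = False
except TypeError:
    ordering_raised = True
```

Here the equality test never raises, and only the ordering comparison
against undecodable bytes does.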

As for unicode users: As others have said, they should avoid mixing
unicode and ascii strings. We provide a fallback for a limited case
(ascii); beyond that, Python assumes that non-ascii strings represent
uninterpreted bytes, not characters.

Regards,
Martin


Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Gregory P. Smith
On Tue, Aug 08, 2006 at 02:23:02AM +0200, "Martin v. Löwis" wrote:
> Gregory P. Smith wrote:
> > Sigh.  Half the reason I did the hashlib work was to get much faster
> > optimized versions of the hash algorithms into python.  I'll be
> > disappointed if that doesn't happen.
> 
> Sad as it sounds, it appears you just did half of the work, then
> (omitting the Windows build process).

I had no access to a windows build environment at the time (as many
python developers do not).  Apparently I neglected to bribe someone
who did to do it after I checked the module in. ;)

So is it worth my time doing this in a hurry for 2.5 or do other
people really just not care if python for windows uses a slow OpenSSL?

Widely deployed popular applications use python for both large scale
hashing and ssl communications.

If not, can this go into 2.5.1?  It's not an API change.

-g



Re: [Python-Dev] windows 2.5 build: use OpenSSL for hashlib [bug 1535502]

2006-08-07 Thread Martin v. Löwis
Gregory P. Smith wrote:
> So is it worth my time doing this in a hurry for 2.5 or do other
> people really just not care if python for windows uses a slow OpenSSL?

As I said: I would accept patches. If you arrange for a separate
_hashlib.pyd file, most of my concerns would go away. So please
do produce a patch, so we have something to review.

> Widely deployed popular applications use python for both large scale
> hashing and ssl communications.

Yet, nobody has worried about performance in all these years to notice
that the assembler code isn't being used. So it can't be that bad.
For SSL specifically, the usage of hashing is minimal, as the actual
communication uses symmetric encryption.
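
Anyone who wants numbers rather than impressions could measure hashing
throughput directly.  The following is a rough sketch for a current
Python, with a hypothetical helper name and arbitrarily chosen buffer
size and round count; it makes no claim about what the results would be
on any particular build.

```python
import hashlib
import time


def throughput_mib_per_s(name, size_mib=1, rounds=5):
    """Hash `rounds` buffers of `size_mib` MiB; return MiB hashed per second."""
    data = b"\0" * (size_mib * 1024 * 1024)
    start = time.perf_counter()
    for _ in range(rounds):
        hashlib.new(name, data).digest()
    elapsed = time.perf_counter() - start
    return (size_mib * rounds) / elapsed


# Algorithm choice is arbitrary; any name accepted by hashlib.new() works.
results = {name: throughput_mib_per_s(name) for name in ("sha1", "sha256")}
for name, rate in sorted(results.items()):
    print("%-8s %10.1f MiB/s" % (name, rate))
```

Running the same script against builds with and without the assembler
routines would show whether the difference matters for a given workload.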

Regards,
Martin