Re: "import" not working?

2009-02-21 Thread Gabriel Genellina
On Fri, 20 Feb 2009 22:40:03 -0200, Lionel wrote:



Okay, moving the wx example into the same directory containing the
first example that was working fixed it. This directory only contains
these two modules and nothing else. The old directory which contained
the example that wasn't working did not contain a module with the same
name as the one I was trying to import, so i don't know why this "fix"
worked.


Just play safe:

- Don't use "from xxx import *", least of all from two places at the same time.
The wx package has a short name on purpose - use "import wx" and then
wx.Frame, etc.


- Don't play with sys.path if you don't have to; you can put your own
modules in a place already listed (like Lib\site-packages). Or, use a .pth
file if you want to add a completely separate directory like
c:\DataFileTypes
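
A minimal sketch of the .pth approach, assuming Python 3 and using temporary directories in place of site-packages and c:\DataFileTypes (the names fake_site, DataFileTypes and datafiletypes.pth are made up for this demo). Normally the interpreter processes .pth files found in site-packages at startup; site.addsitedir() is called here only to trigger the same processing on a throwaway directory.

```python
import os
import site
import sys
import tempfile

# Stand-ins for Lib\site-packages and a separate module directory.
root = tempfile.mkdtemp()
fake_site = os.path.join(root, "fake_site")
extra_dir = os.path.join(root, "DataFileTypes")
os.mkdir(fake_site)
os.mkdir(extra_dir)

# A .pth file is plain text: one directory per line.
with open(os.path.join(fake_site, "datafiletypes.pth"), "w") as pth:
    pth.write(extra_dir + "\n")

# Process the directory the way site-packages is processed at startup.
site.addsitedir(fake_site)

# The directory named in the .pth file is now importable from.
print(os.path.abspath(extra_dir) in sys.path)
```

With a real installation, dropping the .pth file into site-packages is enough; no code has to touch sys.path.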


--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread Gabriel Genellina

On Sat, 21 Feb 2009 01:14:02 -0200, odeits wrote:

On Feb 15, 11:31 pm, odeits  wrote:



It seems what you are actually testing for is if the intersection of
the two sets is not empty where the first set is the characters in
your word and the second set is the characters in your defined string.


To expand on what I was saying I thought i should provide a code
snippet:

WORD = 'g' * 100
WORD2 = 'g' * 50 + 'U'
VOWELS = 'aeiouAEIOU'
BIGWORD = 'g' * 1 + 'U'

def set_test(vowels, word):

    vowels = set( iter(vowels))
    letters = set( iter(word) )

    if letters & vowels:
        return True
    else:
        return False

with python 2.5 I got 1.30 usec/pass against the BIGWORD


You could make it slightly faster by removing the iter() call: letters =  
set(word)

And (if vowels are really constant) you could pre-build the vowels set.
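
A small sketch of both suggestions, assuming the vowels really are constant. The name has_vowel is made up here, and set.isdisjoint (Python 2.6 and later) is swapped in for the "&" test; it is not what was timed earlier in the thread.

```python
VOWELS = frozenset('aeiouAEIOU')  # built once, when the module is loaded

def has_vowel(word):
    # isdisjoint() accepts any iterable, so no set is built for `word`.
    return not VOWELS.isdisjoint(word)

print(has_vowel('g' * 50 + 'U'))
print(has_vowel('g' * 100))
```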

--
Gabriel Genellina

--
http://mail.python.org/mailman/listinfo/python-list


python contextmanagers and ruby blocks

2009-02-21 Thread Alia Khouri
As an exercise, I recently translated one of my python scripts
(http://code.activestate.com/recipes/576643/) to haskell (a penultimate
version exists at
http://groups.google.com/group/comp.lang.haskell/browse_thread/thread/fb1ebd986b44244e#
in case anyone is interested) with the result that haskell has now
become my second favourite language (after python of course :-)

Just to change mental gears a bit, I'd now like to do the same and
create a ruby version. As I've progressed on the latter, I've been
struck by how pervasive the use of blocks is in ruby. For example:

class Builder
  attr_accessor :name
  def machine &block
    @name = "m1"
    block.call
  end

  def build(x, &block)
    puts x
    block.call
  end
end

builder = Builder.new

builder.machine do
  puts "hello #{builder.name}"
end

builder.build "hello" do
  puts "world"
end

which should print out:
hello m1
hello
world

Now, python's relatively new contextmanagers seem to provide something
similar such that one can write:

from __future__ import with_statement
from contextlib import contextmanager

class Builder:
    @contextmanager
    def machine(self):
        self.name = "m1"
        yield

    @contextmanager
    def build(self, x):
        print x
        yield

builder = Builder()

with builder.machine():
    print 'hello %s' % builder.name

with builder.build("hello"):
    print 'world'

Which brings me to my questions:

1. To what extent are python's contextmanagers similar or equivalent
to ruby's blocks?

2. If there is a gap in power or expressiveness in python's context
managers relative to ruby's blocks, what are possible (syntactic and
non-syntactic) proposals to bridge this gap?

Thank you for your responses.

AK
--
http://mail.python.org/mailman/listinfo/python-list


Re: function factory question: embed current values of object attributes

2009-02-21 Thread Alan Isaac
Terry Reedy wrote: 
You are now describing a function closure.  Here is an example that 
might help.



It does.
Thanks,
Alan
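
For readers following the thread, a closure that embeds the current attribute values might look like the sketch below. The example referred to above is not reproduced in this archive, so Params, make_line and line are hypothetical names chosen for illustration.

```python
class Params(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

def make_line(obj):
    # Copy the attribute values now; later changes to obj are not seen.
    a, b = obj.a, obj.b
    def line(t):
        return a * t + b
    return line

x = Params(2, 1)
f = make_line(x)
x.a = 100          # does not affect f, which captured a == 2
print(f(3))
```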
--
http://mail.python.org/mailman/listinfo/python-list


Re: function factory question: embed current values of object attributes

2009-02-21 Thread Alan Isaac
Gabriel Genellina wrote: 
If you want a "frozen" function (that is, a function already set-up with  
the parameters taken from the current values of x.a, x.b) use  
functools.partial:



OK, that's also a nice idea.
Thanks!
Alan
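
A sketch of the functools.partial idea, under the assumption that x is any object with attributes a and b; Params and line are hypothetical names, since the original code is not reproduced in this archive.

```python
from functools import partial

class Params(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b

def line(a, b, t):
    return a * t + b

x = Params(2, 1)
# x.a and x.b are evaluated here, once, and stored inside the partial.
frozen = partial(line, x.a, x.b)
x.a = 100          # too late to influence `frozen`
print(frozen(3))
```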
--
http://mail.python.org/mailman/listinfo/python-list


Re: Using clock() in threading on Windows

2009-02-21 Thread Martin v. Löwis
> Would it not be better to use time.clock() instead? 

If you really want to reconsider this implementation, I think it
would be best to use relative timeouts all the way down to the
system. In the specific case of Windows, WaitForSingleObject
expects a relative number of milliseconds (i.e. a wait duration).
As this is also what the Python script passes (in seconds),
it is best to leave issues of timer resolution to the operating
system (which we have to trust anyway).

As a consequence, the half-busy loops could go away, at least
on systems where lock timeouts can be given to the system.
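
As an illustration of what a relative timeout looks like from the script's side: in Python 3.2 and later, Lock.acquire() accepts a timeout in seconds, so the caller never computes a deadline or polls. This is a sketch on a later interpreter than the one discussed here, not a description of the 2.x implementation.

```python
import threading
import time

lock = threading.Lock()
lock.acquire()                       # hold the lock so the next acquire must wait

start = time.monotonic()
got_it = lock.acquire(timeout=0.2)   # relative wait, in seconds
elapsed = time.monotonic() - start

print(got_it)                        # False: the lock was still held
lock.release()
```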

Regards,
Martin
--
http://mail.python.org/mailman/listinfo/python-list


Python dictionary size/entry limit?

2009-02-21 Thread intelliminer
I wrote a script to process textual data and extract phrases from
them, storing these phrases in a dictionary. It encounters a
MemoryError when there are about 11.18M keys in the dictionary, and
the size is about 1.5GB. I tried multiple times, and the error occurs
every time at exactly the same place (with the same number of keys in
the dict). I then split the dictionary into two using a simple
algorithm:

if str[0]<='m':
    dict=dict1
else:
    dict=dict2

#use dict...

And it worked fine. The total size of the two dictionaries well
exceeded 2GB yet no MemoryError occurred.

I have 1GB of physical memory and 3GB in pagefile. Is there a limit to
the size or number of entries that a single dictionary can possess? By
searching on the web I can't find a clue why this problem occurs.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python dictionary size/entry limit?

2009-02-21 Thread Tino Wildenhain

intellimi...@gmail.com wrote:

I wrote a script to process textual data and extract phrases from
them, storing these phrases in a dictionary. It encounters a
MemoryError when there are about 11.18M keys in the dictionary, and
the size is about 1.5GB. I tried multiple times, and the error occurs
every time at exactly the same place (with the same number of keys in
the dict). I then split the dictionary into two using a simple
algorithm:

if str[0]<='m':
    dict=dict1
else:
    dict=dict2

#use dict...

And it worked fine. The total size of the two dictionaries well
exceeded 2GB yet no MemoryError occurred.

I have 1GB of physical memory and 3GB in pagefile. Is there a limit to
the size or number of entries that a single dictionary can possess? By
searching on the web I can't find a clue why this problem occurs.


From what can be deduced from the headers of your message:
X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.1;...
you are using Windows?
It seems that either Python's or Windows' memory management somehow prevents
the use of contiguous memory areas that large.
We had a similar example somewhere down the list
(IIRC it was a large string in memory) which ran perfectly
on Linux. You could try it yourself, perhaps by installing Ubuntu
on the same host. (If you feel up to it you can even skip the install
and run it off a live CD, but then you need to fiddle a little to
get swap space on disk.)

Regards
Tino




--
http://mail.python.org/mailman/listinfo/python-list






Re: Python dictionary size/entry limit?

2009-02-21 Thread intelliminer
On Feb 21, 6:25 pm, Tino Wildenhain  wrote:
> intellimi...@gmail.com wrote:
> > I wrote a script to process textual data and extract phrases from
> > them, storing these phrases in a dictionary. It encounters a
> > MemoryError when there are about 11.18M keys in the dictionary, and
> > the size is about 1.5GB. I tried multiple times, and the error occurs
> > every time at exactly the same place (with the same number of keys in
> > the dict). I then split the dictionary into two using a simple
> > algorithm:
>
> > if str[0]<='m':
> >     dict=dict1
> > else:
> >     dict=dict2
>
> > #use dict...
>
> > And it worked fine. The total size of the two dictionaries well
> > exceeded 2GB yet no MemoryError occurred.
>
> > I have 1GB of physical memory and 3GB in pagefile. Is there a limit to
> > the size or number of entries that a single dictionary can possess? By
> > searching on the web I can't find a clue why this problem occurs.
>
>  From what can be deduced from the headers of your message:
> X-HTTP-UserAgent: Mozilla/5.0 (Windows; U; Windows NT 5.1;...
> you are using Windows?
> It seems that either Python's or Windows' memory management somehow prevents
> the use of contiguous memory areas that large.
> We had a similar example somewhere down the list
> (IIRC it was a large string in memory) which ran perfectly
> on Linux. You could try it yourself, perhaps by installing Ubuntu
> on the same host. (If you feel up to it you can even skip the install
> and run it off a live CD, but then you need to fiddle a little to
> get swap space on disk.)
>
> Regards
> Tino
>
> > --
> >http://mail.python.org/mailman/listinfo/python-list

Yes, it's winxp, I forgot to mention it.
Thanks for the reply.
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python dictionary size/entry limit?

2009-02-21 Thread Stefan Behnel
intellimi...@gmail.com wrote:
> I wrote a script to process textual data and extract phrases from
> them, storing these phrases in a dictionary. It encounters a
> MemoryError when there are about 11.18M keys in the dictionary, and
> the size is about 1.5GB.
> [...]
> I have 1GB of physical memory and 3GB in pagefile. Is there a limit to
> the size or number of entries that a single dictionary can possess? By
> searching on the web I can't find a clue why this problem occurs.

Python dicts are only limited by what your OS returns as free memory.
However, when a dict grows, it needs to resize, which means that it has to
create a bigger copy of itself and redistribute the keys. For a dict that
is already 1.5GB big, this can temporarily eat a lot more memory than you
have, at least more than two times as much as the size of the dict itself.

You may be better served with one of the dbm databases that come with
Python. They live on-disk but do the usual in-memory caching. They'll
likely perform a lot better than your OS level swap file.
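
A rough sketch of the dbm suggestion, written for Python 3, where the generic module is called dbm (in Python 2 it was anydbm). The function names, the file name "phrases" and the sample data are made up for illustration; which on-disk backend gets used depends on what the installation provides.

```python
import dbm
import os
import tempfile

def count_phrases(phrases, path):
    """Count phrase occurrences in an on-disk dbm file instead of a dict."""
    with dbm.open(path, "c") as db:       # "c": create the file if missing
        for phrase in phrases:
            key = phrase.encode("utf-8")  # dbm keys and values are bytes
            current = int(db[key]) if key in db else 0
            db[key] = str(current + 1).encode("ascii")

def lookup(path, phrase):
    """Return the stored count for a phrase, or 0 if it was never seen."""
    with dbm.open(path, "r") as db:
        key = phrase.encode("utf-8")
        return int(db[key]) if key in db else 0

path = os.path.join(tempfile.mkdtemp(), "phrases")
count_phrases(["red car", "blue car", "red car"], path)
print(lookup(path, "red car"), lookup(path, "green car"))
```

Only the entries being touched need to be in memory at any time, so the table can grow past what a single in-process dict could hold.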

Stefan
--
http://mail.python.org/mailman/listinfo/python-list


Re: python contextmanagers and ruby blocks

2009-02-21 Thread Francesco Bochicchio
On Sat, 21 Feb 2009 00:46:08 -0800, Alia Khouri wrote:

> As an exercise, I recently translated one of my python scripts (http://
> code.activestate.com/recipes/576643/) to haskell (a penultimate
> version exists at 
> http://groups.google.com/group/comp.lang.haskell/browse_thread/thread/fb1ebd986b44244e#
> in case anyone is interested) with the result that haskell has now
> become my second favourite language (after python of course :-)
> 
> Just to change mental gears a bit, I'd now like to do the same and
> create a ruby version. As I've progressed on the latter, I've been
> struck by how pervasive the use of blocks is in ruby. For example:
>

... ruby code that shows the most twisted 'Hello world' example I have
ever seen :-) ...

 
> 
> Now, python's relatively new contextmanagers seem to provide something
> similar such that one can write:
> 

... python code doing the same thing - apparently -
as the previous ruby code, using context managers in a way that I believe the
authors of the contextlib module never thought of.


> 
> Which brings me to my questions:
> 
> 1. To what extent are python's contextmanagers similar or equivalent
> to ruby's blocks?
> 

AFAIK, context managers are nothing like ruby blocks.
Context managers have a very specific purpose: to let people
abstract the code that one writes to 'enter a context'
(e.g. open a file, start a transaction, ...) and 'leave a context'
(e.g. close a file, commit or roll back the transaction, ...),
so that you can separate context-handling code from the code that performs
actions inside that context, factoring out the former for reuse and better
code maintenance.

Ruby blocks are blocks of code which can be passed as
objects for a number of different uses - for instance to do context
management. If I had to compare them to something in Python, I
would say they are 'lambda on steroids' or 'nameless functions'. And -
personally - I don't like them, just as I don't like lambdas in python for
anything but one-liners and I don't like anonymous functions in haskell
(which I am painfully trying to learn). They may be cool to write, but
they don't look very readable to me - but maybe this is just me.
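
To make the distinction concrete, here is a small sketch; transaction, repeat and log are hypothetical names. The context manager wraps enter/leave code around a body that runs exactly once, while a function that receives a callable (the nearest Python analogue to a block) decides itself how often to call it and with what arguments.

```python
from contextlib import contextmanager

log = []

@contextmanager
def transaction(name):
    log.append("begin " + name)          # 'enter the context'
    try:
        yield                            # the with-body runs here, once
    except Exception:
        log.append("rollback " + name)   # 'leave the context' on failure
        raise
    else:
        log.append("commit " + name)     # 'leave the context' on success

with transaction("t1"):
    log.append("work in t1")

def repeat(times, block):
    # The callee controls how many times the passed-in code runs.
    for i in range(times):
        block(i)

repeat(2, lambda i: log.append("block call %d" % i))
print(log)
```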

Ciao

FB  


 


Ruby blocks - for the little I know of ruby - are anonymous block of
codes


> 2. If there is a gap in power or expressiveness in python's context
> managers relative to ruby's blocks, what are possible (syntactic and
> non-syntactic) proposals to bridge this gap?
> 
> Thank you for your responses.
> 
> AK

--
http://mail.python.org/mailman/listinfo/python-list


Re: python contextmanagers and ruby blocks

2009-02-21 Thread Alia K
Francesco wrote:

> ... ruby code that shows the most twisted 'Hello world' example I have
> ever seen :-) ...

and I was gunning for the simplest possible example (-:


> ... python code doing the same thing - apparently -
> as the previous ruby code, using context managers in a way that I believe the
> authors of the contextlib module never thought of.

Methinks they probably thought of such usage.

> > 1. To what extent are python's contextmanagers similar or equivalent
> > to ruby's blocks?

> AFAIK, context managers are nothing like ruby blocks.
> Context managers have a very specific purpose: to let people
> abstract the code that one writes to 'enter a context'
> (e.g. open a file, start a transaction, ...) and 'leave a context'
> (e.g. close a file, commit or roll back the transaction, ...),
> so that you can separate context-handling code from the code that performs
> actions inside that context, factoring out the former for reuse and better
> code maintenance.

Thinking about it: I suppose one could say in python a contextmanager
defines the functional context of a block of code and makes it a first
class construct in the language, whereas in ruby the block itself is a
first class citizen -- contextmanagers are like the inverse of blocks.

> Ruby blocks are blocks of code which can be passed as
> objects for a number of different usage - for instance to make context
> management stuff. If I have to compare them to something in Python, I
> would say they are 'lambda on steroids' or 'nameless functions'.

Agreed, but also they are more tightly integrated e.g. the &block
construct which can be passed into functions...

> personally - I don't like them just as I don't like lambdas in python for
> anything but one-liners and I don't like anonymous functions in haskell
> (which I am painfully trying to learn ). They may be cool to write, but
> they look not very readable to me - but maybe this is just me.

In case you are learning haskell, here are some excellent guides
(order is important) :

* Learn you a haskell: http://learnyouahaskell.com/chapters
* Real World Haskell: http://book.realworldhaskell.org/
* Typeclassopedia: 
http://byorgey.wordpress.com/2009/02/16/the-typeclassopedia-request-for-feedback/

(I'm personally still scratching the surface of it all...)

back to the subject...

I suppose because contextmanagers (indeed decorators) are so
relatively new to python, it will probably take a little while for
these constructs to comprehensively penetrate the stdlib. It's already
happened with files, locks, and db transactions but I'm sure there are
many instances where one could benefit from using the with statement.

Nevertheless, I remain curious about whether one can use the
contextmanager in python to achieve the full power of ruby's blocks...

Best,

AK

--
http://mail.python.org/mailman/listinfo/python-list


TypeError: descriptor 'replace' requires a 'str' object but received a 'unicode'

2009-02-21 Thread Jaap van Wingerde

# -*- coding: utf_8 -*-
Omschrijving = u'priv? assuranti?n' # string from a bank.csv
Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
Omschrijving = str.replace(Omschrijving, "Assuranti?n", 'Assurantiën')
print Omschrijving

When I run this script I get the following message.

"Traceback (most recent call last):
  File "/home/jaap/Desktop/unicode.py", line 3, in <module>
    Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
TypeError: descriptor 'replace' requires a 'str' object but received a 
'unicode'"


How can I solve this?




--
Jaap van Wingerde
e-mail: 1234567...@vanwingerde.net
web: http://jaap.vanwingerde.net/
--
http://mail.python.org/mailman/listinfo/python-list


Re: TypeError: descriptor 'replace' requires a 'str' object but received a 'unicode'

2009-02-21 Thread Stefan Behnel
Jaap van Wingerde wrote:
> # -*- coding: utf_8 -*-
> Omschrijving = u'priv? assuranti?n' # string from a bank.csv
> Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
> Omschrijving = str.replace(Omschrijving, "Assuranti?n", 'Assurantiën')
> print Omschrijving
> 
> When I run this script I get the following message.
> 
> "Traceback (most recent call last):
>   File "/home/jaap/Desktop/unicode.py", line 3, in <module>
>     Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
> TypeError: descriptor 'replace' requires a 'str' object but received a
> 'unicode'"
> 
> How can I solve this?

By using unicode.replace() instead of str.replace(), i.e.

Omschrijving = Omschrijving.replace("priv?", 'privé')

Stefan
--
http://mail.python.org/mailman/listinfo/python-list


ordinal not in range

2009-02-21 Thread Jaap van Wingerde

Stefan Behnel wrote:

Omschrijving = Omschrijving.replace("priv?", 'privé')


Thank you, this works now, but I get a new error message.


import codecs
file = "postbank.csv"
output = "%s.eb" % file
outfile = codecs.open(output, "w", "utf_8")
Omschrijving = u'priv? assuranti?n' # string from postbank.csv
Omschrijving = Omschrijving.replace("priv?", 'privé')
Omschrijving = Omschrijving.replace("Assuranti?n", 'Assurantiën')
outfile.write (Omschrijving)

"Traceback (most recent call last):
  File "/home/jaap/Desktop/unicode.py", line 9, in <module>
    outfile.write (Omschrijving)
  File "/usr/lib/python2.5/codecs.py", line 638, in write
    return self.writer.write(data)
  File "/usr/lib/python2.5/codecs.py", line 303, in write
    data, consumed = self.encode(object, self.errors)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4: 
ordinal not in range(128)"






--
http://mail.python.org/mailman/listinfo/python-list


Re: Uploading big files wit cherrypy

2009-02-21 Thread Aahz
In article <366595b2-226c-48e4-961d-85bd0ce4b...@h16g2000yqj.googlegroups.com>,
Farsheed Ashouri   wrote:
>
>But I couldn't upload files bigger than 100Mb.  Why and what is
>workaround?

What happens when you upload a file larger than 100MB?
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.
--
http://mail.python.org/mailman/listinfo/python-list


Re: "Byte" type?

2009-02-21 Thread Steve Holden
John Nagle wrote:
> Steve Holden wrote:
>> John Nagle wrote:
>>> Benjamin Kaplan wrote:
>>>> On Sun, Feb 15, 2009 at 11:57 AM, John Nagle wrote:

> ...Re "bytes" not behaving as documented in 2.6:
> 
>>>That's indeed how Python 2.6 works.  But that's not how
>>> PEP 3137 says it's supposed to work.
>>>
>>> Guido:
>>>
>>>  "I propose the following type names at the Python level:
>>>
>>> * bytes is an immutable array of bytes (PyString)
>>> * bytearray is a mutable array of bytes (PyBytes)"
> ...
>>> (Not true in Python 2.6.)
>>> Is this a bug, a feature, a documentation error, or bad design?
>>>
>> It's a feature. In fact all that was done to accommodate easier
>> migration to 3.x is easily shown in one statement:
>>
>> >>> str is bytes
>> True
>>
>> So that's why bytes works the way it does in 2.6 ... hence my contested
>> description of it as an "ugly hack". I am happy to withdraw "ugly", but
>> I think "hack" could still be held to apply.
> 
>Agreed.  But is this a 2.6 thing, making 2.6 incompatible with 3.0, or
> what?  How will 3.x do it?  The PEP 3137 way, or the Python 2.6 way?
> 
>The way it works in 2.6 makes it necessary to do "ord" conversions
> where they shouldn't be required.
> 
Yes, the hack was to achieve a modicum of compatibility with 3.0 without
having to turn the world upside down.

I haven't used 3.0 enough to say whether bytearray has been correctly
implemented. But I believe the intention is that 3.0 should fully
implement PEP 3137.
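
For anyone who wants to check this on a Python 3 interpreter, the short sketch below shows the PEP 3137 behaviour there: bytes is a distinct immutable type whose items are integers, and bytearray is its mutable counterpart. It assumes Python 3 and says nothing new about 2.6.

```python
data = b"abc"
first = data[0]              # indexing bytes yields an int in Python 3

buf = bytearray(data)
buf[0] = ord("x")            # bytearray supports item assignment

try:
    data[0] = 120            # bytes does not
    bytes_is_mutable = True
except TypeError:
    bytes_is_mutable = False

print(str is bytes, first, buf, bytes_is_mutable)
```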

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/

--
http://mail.python.org/mailman/listinfo/python-list


Re: ordinal not in range

2009-02-21 Thread Stefan Behnel
Jaap van Wingerde wrote:
> Stefan Behnel wrote:
>> Omschrijving = Omschrijving.replace("priv?", 'privé')

actually, make that

Omschrijving = Omschrijving.replace(u"priv?", u'privé')

(mind the u"...")


> 
> import codecs
> file = "postbank.csv"
> output = "%s.eb" % file
> outfile = codecs.open(output, "w", "utf_8")
> Omschrijving = u'priv? assuranti?n' # string from postbank.csv
> Omschrijving = Omschrijving.replace("priv?", 'privé')
> Omschrijving = Omschrijving.replace("Assuranti?n", 'Assurantiën')

I guess you mixed up the case here.


> outfile.write (Omschrijving)
> 
> "Traceback (most recent call last):
>   File "/home/jaap/Desktop/unicode.py", line 9, in <module>
>     outfile.write (Omschrijving)
>   File "/usr/lib/python2.5/codecs.py", line 638, in write
>     return self.writer.write(data)
>   File "/usr/lib/python2.5/codecs.py", line 303, in write
>     data, consumed = self.encode(object, self.errors)
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 4:
> ordinal not in range(128)"

Does this help?

outfile = codecs.open(output, "wb", encoding="UTF-8")

(mind the "wb" for 'write binary/bytes')

Looks like you'd be happier with Python 3.0, BTW...
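
A sketch of the combined advice in this thread: keep every literal as unicode and let the file object do the encoding. It uses io.open, which exists in Python 2.6 and later as well as in Python 3, as a swapped-in alternative to codecs.open; the temporary output path is an assumption made so the example can run anywhere.

```python
# -*- coding: utf-8 -*-
import io
import os
import tempfile

omschrijving = u'priv? assuranti?n'            # sample string from the thread
omschrijving = omschrijving.replace(u'priv?', u'privé')
omschrijving = omschrijving.replace(u'assuranti?n', u'assurantiën')

path = os.path.join(tempfile.mkdtemp(), 'postbank.csv.eb')
with io.open(path, 'w', encoding='utf-8') as outfile:
    outfile.write(omschrijving)                # text in, UTF-8 bytes on disk

with io.open(path, 'rb') as infile:
    raw = infile.read()                        # the encoded bytes as stored
```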

Stefan
--
http://mail.python.org/mailman/listinfo/python-list


Re: Change in cgi handling of POST requests

2009-02-21 Thread Aahz
[posted & e-mailed]

In article ,
Mac   wrote:
>
>We just upgraded Python to 2.6 on some of our servers and a number of
>our CGI scripts broke because the cgi module has changed the way it
>handles POST requests.  When the 'action' attribute was not present in
>the form element on an HTML page the module behaved as if the value of
>the attribute was the URL which brought the user to the page with the
>form, but without the query (?x=y...) part.  Now FieldStorage.getvalue
>() is giving the script a list of two copies of the value for some of
>the parameters (folding in the parameters from the previous request)
>instead of the single string it used to return for each.  I searched
>this newsgroup looking for a discussion of the proposal to impose this
>change of behavior, and perhaps I wasn't using the right phrases in my
>search, but I didn't find anything.  

Interesting.  Nobody has responded, so I suggest first filing a report
using bugs.python.org and then asking on python-dev (with reference to
your report).
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.
--
http://mail.python.org/mailman/listinfo/python-list


Python C-API Object Allocation

2009-02-21 Thread William Newbery

I've been learning the C-API lately so I can write Python extensions for some of
my C++ stuff.

I want to use the new and delete operators for creating and destroying my
objects.

The problem is that Python seems to break this into several stages: tp_new, tp_init
and tp_alloc for creation, and tp_del, tp_free and tp_dealloc for destruction.
However, C++ just has new, which allocates and fully constructs the object, and
delete, which destructs and deallocates the object.


Which of the Python tp_* methods do I need to provide, and what must they do to
be compatible with Python subclassing?

Also, I want to be able to create the object directly in C++, e.g. "PyObject *obj =
new MyExtensionObject(args);"



--
http://mail.python.org/mailman/listinfo/python-list


Re: TypeError: descriptor 'replace' requires a 'str' object but received a 'unicode'

2009-02-21 Thread Steve Holden
Jaap van Wingerde wrote:
> # -*- coding: utf_8 -*-
> Omschrijving = u'priv? assuranti?n' # string from a bank.csv
> Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
> Omschrijving = str.replace(Omschrijving, "Assuranti?n", 'Assurantiën')
> print Omschrijving
> 
> When I run this script I get the following message.
> 
> "Traceback (most recent call last):
>   File "/home/jaap/Desktop/unicode.py", line 3, in <module>
>     Omschrijving = str.replace(Omschrijving, "priv?", 'privé')
> TypeError: descriptor 'replace' requires a 'str' object but received a
> 'unicode'"
> 
> How can I solve this?
> 

First of all, use the methods of the unicode type, not the str type.
Secondly, call the methods on an instance, not on the type (the instance
is passed automatically). Thirdly, use Unicode arguments to replace.

Omschrijving = u'priv? assuranti?n' # string from a bank.csv
Omschrijving = Omschrijving.replace(u"priv?", u'privé')
Omschrijving = Omschrijving.replace(u"assuranti?n", u'assurantiën')
print Omschrijving

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/

--
http://mail.python.org/mailman/listinfo/python-list


Re: ordinal not in range

2009-02-21 Thread Jaap van Wingerde

Stefan Behnel wrote:

Omschrijving = Omschrijving.replace(u"priv?", u'privé')
(mind the u"...")
outfile = codecs.open(output, "wb", encoding="UTF-8")
(mind the "wb" for 'write binary/bytes')

It works now!


Looks like you'd be happier with Python 3.0, BTW...

Python 3 is not in Debian Lenny.

With your help I made my first Python script. This script saves me
hours of dumb work.

Thanks a lot!!




--
Jaap van Wingerde
e-mail: 1234567...@vanwingerde.net


--
http://mail.python.org/mailman/listinfo/python-list


Re: how to assert that method accepts specific types

2009-02-21 Thread Scott David Daniels

Rhodri James wrote:

On Sat, 21 Feb 2009 01:12:01 -, Darren Dale  wrote:

I would like to assert that a method accepts certain types
from functools import wraps
def accepts(*types):
    def check_accepts(f): ...
class Test(object):
    @accepts(int)
    def check(self, obj):
        print obj
but now I want Test.check to accept an instance of Test as well


An icky but relatively clear way to get around this is to gratuitously
subclass Test:
class AcceptableTest(object):
    pass

class Test(AcceptableTest):
    @accepts(int, AcceptableTest)
    def check(self, obj):
        print obj


To increase the ick, while lowering the name pollution
(and making the _source_ read more clearly):

class Test():
    pass # Just a placeholder for type checking

class Test(Test):
    @accepts(int, Test)
    def check(self, obj):
        print obj
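
For anyone who wants to run the first variant, here is a self-contained sketch. The body of accepts was elided in the thread, so the isinstance check below is a hypothetical stand-in, and check returns its argument instead of printing it so the result is easy to inspect.

```python
from functools import wraps

def accepts(*types):
    """Hypothetical stand-in for the decorator elided in the thread."""
    def check_accepts(f):
        @wraps(f)
        def wrapper(self, obj):
            if not isinstance(obj, types):
                raise TypeError("%s() got %s, expected one of: %s" % (
                    f.__name__, type(obj).__name__,
                    ", ".join(t.__name__ for t in types)))
            return f(self, obj)
        return wrapper
    return check_accepts

class AcceptableTest(object):
    pass  # exists only so the decorator has a name to refer to

class Test(AcceptableTest):
    @accepts(int, AcceptableTest)
    def check(self, obj):
        return obj

t = Test()
print(t.check(3))                        # an int is accepted
print(isinstance(t.check(Test()), Test)) # so is a Test instance
```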


--Scott David Daniels
scott.dani...@acm.org
--
http://mail.python.org/mailman/listinfo/python-list


Re: "Byte" type?

2009-02-21 Thread John Nagle

Steve Holden wrote:

John Nagle wrote:

Steve Holden wrote:

John Nagle wrote:

Benjamin Kaplan wrote:

On Sun, Feb 15, 2009 at 11:57 AM, John Nagle  wrote:


...Re "bytes" not behaving as documented in 2.6:


   That's indeed how Python 2.6 works.  But that's not how
PEP 3137 says it's supposed to work.

Guido:

 "I propose the following type names at the Python level:

* bytes is an immutable array of bytes (PyString)
* bytearray is a mutable array of bytes (PyBytes)"

...

(Not true in Python 2.6.)
Is this a bug, a feature, a documentation error, or bad design?


It's a feature. In fact all that was done to accommodate easier
migration to 3.x is easily shown in one statement:


str is bytes

True

So that's why bytes works the way it does in 2.6 ... hence my contested
description of it as an "ugly hack". I am happy to withdraw "ugly", but
I think "hack" could still be held to apply.

   Agreed.  But is this a 2.6 thing, making 2.6 incompatible with 3.0, or
what?  How will 3.x do it?  The PEP 3137 way, or the Python 2.6 way?

   The way it works in 2.6 makes it necessary to do "ord" conversions
where they shouldn't be required.


Yes, the hack was to achieve a modicum of compatibility with 3.0 without
having to turn the world upside down.

I haven't used 3.0 enough to say whether bytearray has been correctly
implemented. But I believe the intention is that 3.0 should fully
implement PEP 3137.


   If "bytes", a new keyword, works differently in 2.6 and 3.0, that was really
dumb.  There's no old code using "bytes".  So converting code to 2.6 means
it has to be converted AGAIN for 3.0.  That's a good reason to ignore 2.6 as
defective.

John Nagle
--
http://mail.python.org/mailman/listinfo/python-list


Re: Find the location of a loaded module

2009-02-21 Thread rdmurray
"Gabriel Genellina"  wrote:
> On Fri, 20 Feb 2009 20:44:21 -0200, Aaron Scott wrote:
> 
> > So, the problem lies with how Python cached the modules in memory.
> > Yes, the modules were in two different locations and yes, the one that
> > I specified using its direct path should be the one loaded. The
> > problem is, the module isn't always loaded -- if it's already in
> > memory, it'll use that instead. And since the modules had the same
> > name, Python wouldn't distinguish between them, even though they
> > weren't exactly the same.
> 
> Yes, that's how import works. It's barely documented, and you finally
> learned it the hard way...

I'd argue a little bit with "barely documented".  In the reference, the
discussion of the import statement starts off saying:

Import statements are executed in two steps: (1) find a module,
and initialize it if necessary; (2) define a name or names in the
local namespace (of the scope where the import statement occurs).

The third paragraph then says:

The system maintains a table of modules that have been or are being
initialized, indexed by module name. This table is accessible as
sys.modules. When a module name is found in this table, step (1)
is finished.

That is pretty up front and unambiguous documentation.

However, the consequences of that statement won't be immediately clear
on first reading.  I think it would have cleared up Aaron's confusion
if he'd happened to think to read it.  But since he knew the syntax
of the import statement already, I'm not surprised he did not read it.

The Tutorial, in the section on modules and import, says:

A module can contain executable statements as well as function
definitions. These statements are intended to initialize the
module. They are executed only the first time the module is imported
somewhere.

This is considerably less precise, if more superficially understandable.
I wonder if it would be worth expanding on that statement to mention
that the module is not even looked for on disk if a module by that
name has already been imported.
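
The behaviour is easy to demonstrate. The sketch below (Python 3; the module name samename_demo and the two temporary directories are made up for the demo) creates two same-named modules, imports one, and shows that a second import returns the cached module even after the search path changes.

```python
import importlib
import os
import sys
import tempfile

NAME = "samename_demo"   # hypothetical module name used only for this demo

def make_module_dir(label):
    """Create a directory holding NAME.py that records where it lives."""
    d = tempfile.mkdtemp()
    with open(os.path.join(d, NAME + ".py"), "w") as f:
        f.write("WHERE = %r\n" % label)
    return d

dir1 = make_module_dir("dir1")
dir2 = make_module_dir("dir2")

sys.path.insert(0, dir1)
importlib.invalidate_caches()
first = importlib.import_module(NAME)    # found on disk in dir1

sys.path.insert(0, dir2)                 # dir2 now comes first on the path...
second = importlib.import_module(NAME)   # ...but the sys.modules entry wins

del sys.modules[NAME]                    # forget the cached module
importlib.invalidate_caches()
third = importlib.import_module(NAME)    # now the path is searched again

print(first.WHERE, second is first, third.WHERE)
```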

--RDM

--
http://mail.python.org/mailman/listinfo/python-list


Re: Wanted: Online Python Course for Credit

2009-02-21 Thread Scott David Daniels

jsidell wrote:

I'm a high school game development teacher and I have recently
discovered Python to be a great way to introduce computer
programming.  I intend to study Python on my own but I can get
professional development credit at my job for taking a Python course.
So I'm looking for an online class that I can get a certificate,
college credit, or something to that effect.  Any suggestions would be
greatly appreciated


I'd strongly suggest going to PyCon, (sign up for tutorials as well):
http://us.pycon.org/2009/about/
I know, you want on-line, but think of the chance to meet with a group of
others; this PyCon will likely have the best collection of teachers and
Python.

For example, there is a tutorial:
Python for Teachers (Urner/Holden)

Also two strongly relevant sections in the scheduled talks:
8.  Learning and Teaching Python Programming: The Crunchy Way
Dr. André Roberge
88.  Seven ways to use Python's new turtle module
Mr. Gregor Lingl

But make no mistake, you need to mingle when you are there to get the
inspiration you'll want.

--Scott David Daniels
scott.dani...@acm.org


Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
"Martin v. Löwis"   wrote:
>I don't think that was the complaint. Instead, the complaint was
>that the OP's original message did not have a Content-type header,
>and that it was thus impossible to tell what the byte in front of
>"Wiki" meant. To properly post either MICRO SIGN or GREEK SMALL LETTER
>MU in a usenet or email message, you really must use MIME. (As both
>your article and Thorsten's did, by choosing UTF-8)

MIME only applies to Internet e-mail messages.  RFC 1036 doesn't require
nor give a meaning to a Content-Type header in a Usenet message, so
there's nothing wrong with the original poster's newsreader.

In any case what the original poster really should do is come up with
a better name for his program

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: "Byte" type?

2009-02-21 Thread Christian Heimes
John Nagle wrote
>If "bytes", a new keyword, works differently in 2.6 and 3.0, that was
> really
> dumb.  There's no old code using "bytes".  So converting code to 2.6 means
> it has to be converted AGAIN for 3.0.  That's a good reason to ignore
> 2.6 as
> defective.

Please don't call something dumb that you don't fully understand. It
offends the people who have spent lots of time developing Python --
personal, unpaid and voluntary time!
I can assure you, the bytes alias and b'' alias have their right to exist.

Christian



Re: python contextmanagers and ruby blocks

2009-02-21 Thread Aahz
In article ,
Alia K   wrote:
>
>Nevertheless, I remain curious about whether one can use the
>contextmanager in python to achieve the full power of ruby's blocks...

Short answer: no

Longer answer: the way in Python to achieve the full power of Ruby
blocks is to write a function.
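
A small sketch of what "write a function" can look like in practice. The names each_with_index and remember are invented for this illustration (they are not from the thread), and the code is written for Python 3. The callee takes the "block" as an ordinary function argument and calls it where a Ruby method would yield:

```python
def each_with_index(items, block):
    """Call block(index, item) for every item, where Ruby would yield to a block."""
    for index, item in enumerate(items):
        block(index, item)


seen = []


def remember(index, item):
    # This named function plays the role of the Ruby block body.
    seen.append("%d:%s" % (index, item))


each_with_index(["a", "b"], remember)
print(seen)
```

Because remember is a full function, it can contain statements, be reused, and be tested on its own, which a with block cannot offer.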
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: can multi-core improve single funciton?

2009-02-21 Thread Aahz
In article ,
Grant Edwards   wrote:
>On 2009-02-20, Aahz  wrote:
>> Steven D'Aprano   wrote:
>>>
>>> As I understand it, there's very little benefit to multi-cores in
>>> Python due to the GIL.
>>
>> As phrased, your statement is completely wrong.  Here's a more
>> correct phrasing: "For threaded compute-bound applications written
>> in pure Python, there's very little benefit to multiple cores." But
>> threaded I/O-bound applications do receive some benefit from multiple
>> cores, and using multiple processes certainly leverages multiple
>> cores.  If boosting the performance of a threaded compute-bound
>> application is important, one can always write the critical parts in
>> C/C++.
>
>Do the crunchy bits of scipy/numpy, scientific python, vtk and other
>compute-intensive libraries tend to release the GIL while they're busy
>"computing"?
>
>[Perhaps using them doesn't count as "pure Python", but...]

They definitely do not count as pure Python -- but I probably should have
mentioned that there are pre-existing libraries that can be used.  It's
just that I/O is about the only area where there is a concerted effort to
ensure that GIL gets released.
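
A rough sketch of the I/O case, assuming a current Python 3 (concurrent.futures did not exist when this thread was written). time.sleep() stands in for a blocking read or socket call; like real blocking I/O it releases the GIL while waiting, so the four waits overlap instead of running back to back. The function names are made up for the example:

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_io(delay):
    """Pretend to wait on a file or socket; sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def run_threaded(n_tasks=4, delay=0.2):
    """Run n_tasks blocking waits in threads and report the wall-clock time."""
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=n_tasks) as pool:
        results = list(pool.map(fake_io, [delay] * n_tasks))
    return results, time.monotonic() - start


results, elapsed = run_threaded()
# Run serially, four waits of 0.2 s would need about 0.8 s.
print("4 waits of 0.2s took %.2fs with threads" % elapsed)
```

A compute-bound function written in pure Python would not show this overlap, which is the distinction Aahz draws above.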
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Aahz
In article <499f397c.7030...@v.loewis.de>,
"Martin v. Löwis"   wrote:
>> Yes, I know that.  But every concrete representation of a unicode string 
>> has to have an encoding associated with it, including unicode strings 
>> produced by the Python parser when it parses the ascii string "u'\xb5'"
>> 
>> My question is: what is that encoding?
>
>The internal representation is either UTF-16, or UTF-32; which one is
>a compile-time choice (i.e. when the Python interpreter is built).

Wait, I thought it was UCS-2 or UCS-4?  Or am I misremembering the
countless threads about the distinction between UTF and UCS?
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 12:22:36 -0500)
> "Martin v. Löwis"   wrote:
> >I don't think that was the complaint. Instead, the complaint was
> >that the OP's original message did not have a Content-type header,
> >and that it was thus impossible to tell what the byte in front of
> >"Wiki" meant. To properly post either MICRO SIGN or GREEK SMALL LETTER
> >MU in a usenet or email message, you really must use MIME. (As both
> >your article and Thorsten's did, by choosing UTF-8)
> 
> MIME only applies to Internet e-mail messages.

No, it doesn't: "MIME's use, however, has grown beyond describing the 
content of e-mail to describing content type in general. [...]

The content types defined by MIME standards are also of importance 
outside of e-mail, such as in communication protocols like HTTP [...]"

http://en.wikipedia.org/wiki/MIME

> RFC 1036 doesn't require nor give a meaning to a Content-Type header
> in a Usenet message

Well, /maybe/ the reason for that is that RFC 1036 was written in 1987 
and the first MIME RFC in 1992...? The "Son of RFC 1036" mentions MIME 
more often than you can count.

> so there's nothing wrong with the original poster's newsreader.

If you follow RFC 1036 (which was written before anyone even thought of 
MIME) then all content has to be ASCII. The OP used non-ASCII letters.

It's all about declaring your charset. In Python as well as in your 
newsreader. If you don't declare your charset it's ASCII for you - in 
Python as well as in your newsreader.
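
To make the point concrete, here is a short sketch in Python 3 syntax. It assumes, as discussed earlier in the thread, that the problematic byte in front of "Wiki" was 0xB5; the helper try_decode is made up for the example. The same bytes mean different things, or nothing at all, depending on which charset the reader assumes:

```python
raw = b"\xb5Wiki"   # one byte 0xB5 followed by plain ASCII text


def try_decode(data, charset):
    """Return the decoded text, or None if the bytes are invalid in that charset."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError:
        return None


as_latin1 = try_decode(raw, "latin-1")   # 0xB5 is MICRO SIGN in ISO 8859-1
as_ascii = try_decode(raw, "ascii")      # 0xB5 is outside the 7-bit range
as_utf8 = try_decode(raw, "utf-8")       # a lone 0xB5 is not valid UTF-8

print(repr(as_latin1), as_ascii, as_utf8)
```

Without a declared charset the reader has to pick one of these interpretations, which is the guessing being argued about.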

Thorsten


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Thorsten Kampe
* "Martin v. Löwis" (Sat, 21 Feb 2009 00:15:08 +0100)
> > Yes, I know that. But every concrete representation of a unicode
> > string has to have an encoding associated with it, including unicode
> > strings produced by the Python parser when it parses the ascii
> > string "u'\xb5'"
> > 
> > My question is: what is that encoding?
> 
> The internal representation is either UTF-16, or UTF-32; which one is
> a compile-time choice (i.e. when the Python interpreter is built).

I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a 
slight difference to UTF-16/UTF-32).

Thorsten


Ctypes debug of dll function

2009-02-21 Thread Massi
Hi everyone, I'm pretty new to the ctypes module and I'm encountering
a problem. I'm working under windows xp with python 2.5 and in my
script I use ctypes to call from a dll some functions I wrote in C.
When I call one of these functions it happens that my script crashes
raising the following error:
WindowsError: exception: access violation writing 0x
My question is...is there a way to debug my C function from the dll
while it is run from the python script?
Thanks in advance for the help.


Re: "Byte" type?

2009-02-21 Thread Benjamin Kaplan
On Sat, Feb 21, 2009 at 12:21 PM, John Nagle  wrote:

> Steve Holden wrote:
>
>> John Nagle wrote:
>>
>>> Steve Holden wrote:
>>>
 John Nagle wrote:

> Benjamin Kaplan wrote:
>
>> On Sun, Feb 15, 2009 at 11:57 AM, John Nagle 
>> wrote:
>>
>>  ...Re "bytes" not behaving as documented in 2.6:
>>>
>>>That's indeed how Python 2.6 works.  But that's not how
> PEP 3137 says it's supposed to work.
>
> Guido:
>
>  "I propose the following type names at the Python level:
>
>* bytes is an immutable array of bytes (PyString)
>* bytearray is a mutable array of bytes (PyBytes)"
>
 ...
>>>
 (Not true in Python 2.6
> Is this a bug, a feature, a documentation error, or bad design?
>
>  It's a feature. In fact all that was done to accommodate easier
 migration to 3.x is easily shown in one statement:

  str is bytes
>>>
>> True

 So that's why bytes works the way it does in 2.6 ... hence my contested
 description of it as an "ugly hack". I am happy to withdraw "ugly", but
 I think "hack" could still be held to apply.

>>>   Agreed.  But is this a 2.6 thing, making 2.6 incompatible with 3.0, or
>>> what?  How will 3.x do it?  The PEP 3137 way, or the Python 2.6 way?
>>>
>>>   The way it works in 2.6 makes it necessary to do "ord" conversions
>>> where they shouldn't be required.
>>>
>>>  Yes, the hack was to achieve a modicum of compatibility with 3.0 without
>> having to turn the world upside down.
>>
>> I haven't used 3.0 enough to say whether bytearray has been correctly
>> implemented. But I believe the intention is that 3.0 should fully
>> implement PEP 3137.
>>
>
>   If "bytes", a new keyword, works differently in 2.6 and 3.0, that was
> really
> dumb.  There's no old code using "bytes".  So converting code to 2.6 means
> it has to be converted AGAIN for 3.0.  That's a good reason to ignore 2.6
> as
> defective.
>

"""

The primary use of bytes in 2.6 will be to write tests of object type such
as isinstance(x, bytes). This will help the 2to3 converter, which can't tell
whether 2.x code intends strings to contain either characters or 8-bit
bytes; you can now use either bytes or str to represent your intention
exactly, and the resulting code will also be correct in Python 3.0.

"""

The reason for putting bytes (a new type, not a keyword) into 2.6 was purely
for use in the 2to3 tool. It is designed to break less code on the
conversion from 2.x to 3.x, not to add extra features to 2.6+.
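
A short sketch of the 3.x side of this, runnable on a current Python 3. The helper name kind is invented for the example. Under 2.6, where bytes is just another name for str as shown earlier in the thread, the same isinstance test would also match ordinary strings, and indexing would give a one-character string rather than an integer:

```python
def kind(value):
    """Classify a value the way an isinstance(x, bytes) test does in Python 3."""
    if isinstance(value, bytes):
        return "bytes"
    if isinstance(value, str):
        return "text"
    return "other"


data = b"abc"
first = data[0]                 # an int in 3.x, so no ord() call is needed

buf = bytearray(data)           # the mutable counterpart from PEP 3137
buf[0] = ord("A")

print(kind(data), kind("abc"), first, bytes(buf))
```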


>
>John Nagle
> --
> http://mail.python.org/mailman/listinfo/python-list
>


Re: Change in cgi handling of POST requests

2009-02-21 Thread Bob Kline

Aahz wrote:


Interesting.  Nobody has responded, so I suggest first filing a report
using bugs.python.org and then asking on python-dev (with reference to
your report).


http://bugs.python.org/issue5340

Cheers,
Bob


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Denis Kasak
On Sat, Feb 21, 2009 at 7:24 PM, Thorsten Kampe
 wrote:
>
> I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a
> slight difference to UTF-16/UTF-32).

I wouldn't call the difference that slight, especially between UTF-16
and UCS-2, since the former can encode all Unicode code points, while
the latter can only encode those in the BMP.

--
Denis Kasak


Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Thorsten Kampe   wrote:
>> RFC 1036 doesn't require nor give a meaning to a Content-Type header
>> in a Usenet message
>
>Well, /maybe/ the reason for that is that RFC 1036 was written in 1987 
>and the first MIME RFC in 1992...?

Obviously.

>"Son of RFC 1036" mentions MIME more often than you can count.

Since it was never submitted and accepted, RFC 1036 remains current.

>> so there's nothing wrong with the original poster's newsreader.
>
>If you follow RFC 1036 (which was written before anyone even thought of 
>MIME) then all content has to be ASCII. The OP used non ASCII letters.

RFC 1036 doesn't place any restrictions on the content on the body of
an article.  On the other hand "Son of RFC 1036" does have restrictions
on characters used in the body of message:

Articles MUST not contain any octet with value exceeding 127,
i.e. any octet that is not an ASCII character

Which means that merely adding a Content-Encoding header wouldn't
be enough to conform to "Son of RFC 1036", the original poster would
also have had to either switch to a 7-bit character set or use a 7-bit
compatible transfer encoding.  If you're trying to claim that "Son of RFC
1036" is the new defacto standard, then that would mean your newsreader
is broken too.
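
For illustration only, this is roughly what "declare the charset and use a 7-bit compatible transfer encoding" looks like when the standard email package of a current Python 3 builds the message. The subject and body text are made up; the choice of base64 is the package's default for UTF-8 text, not something the thread prescribes:

```python
from email.mime.text import MIMEText

# The body contains MICRO SIGN (U+00B5), which is not an ASCII character.
msg = MIMEText("\u00b5Wiki is a small wiki", "plain", "utf-8")
msg["Subject"] = "charset demo"

wire = msg.as_string()          # what would actually be sent
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

Every octet of the serialized message is then below 128, while the Content-Type header still tells the reader how to turn the body back into text.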

>It's all about declaring your charset. In Python as well as in your 
>newsreader. If you don't declare your charset it's ASCII for you - in 
>Python as well as in your newsreader.

Except in practice unlike Python, many newsreaders don't assume ASCII.
The original article displayed fine for me.  Google Groups displays it
correctly too:

http://groups.google.com/group/comp.lang.python/msg/828fefd7040238bc

I could just as easily argue that assuming ISO 8859-1 is the defacto
standard, and that it's your newsreader that's broken.  The reality however
is that RFC 1036 is the only standard for Usenet messages, defacto or
otherwise, and so there's nothing wrong with anyone's newsreader.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Martin v. Löwis
>>> My question is: what is that encoding?
>> The internal representation is either UTF-16, or UTF-32; which one is
>> a compile-time choice (i.e. when the Python interpreter is built).
> 
> Wait, I thought it was UCS-2 or UCS-4?  Or am I misremembering the
> countless threads about the distinction between UTF and UCS?

You are not misremembering. I personally never found them conclusive,
and, with PEP 261, I think, calling the 2-byte version "UCS-2" is
incorrect.

Regards,
Martin


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Martin v. Löwis
>> I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a
>> slight difference to UTF-16/UTF-32).
> 
> I wouldn't call the difference that slight, especially between UTF-16
> and UCS-2, since the former can encode all Unicode code points, while
> the latter can only encode those in the BMP.

Indeed. As Python *can* encode all characters even in 2-byte mode
(since PEP 261), it seems clear that Python's Unicode representation
is *not* strictly UCS-2 anymore.
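
A sketch of the arithmetic behind that, in Python 3 syntax. The function name to_surrogates is made up; the formula is the standard UTF-16 one. A code point outside the BMP is split into two 16-bit code units, which is what a 2-byte build has to store and what plain UCS-2 has no way to express:

```python
import struct


def to_surrogates(code_point):
    """Split a supplementary-plane code point into its UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)      # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)     # bottom 10 bits
    return high, low


clef = "\U0001D11E"                     # MUSICAL SYMBOL G CLEF, outside the BMP
pair = to_surrogates(ord(clef))

# Cross-check against the codec: two big-endian 16-bit units.
encoded = struct.unpack(">HH", clef.encode("utf-16-be"))
print([hex(unit) for unit in pair])
```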

Regards,
Martin


Re: `high overhead of multiple Python processes' (was: Will multithreading make python less popular?)

2009-02-21 Thread Joshua Judson Rosen
Paul Rubin  writes:
>
> Right, that's basically the issue here: the cost of using multiple
> Python processes is unnecessarily high.  If that cost were lower then
> we could more easily use multiple cores to make our apps faster.

What cost is that? At least on unix systems, fork() tends to have
*trivial* overhead in terms of both time and space, because the
processes use lazy copy-on-write memory underneath, so the real costs
of resource-consumption for spawning a new process vs. spawning a new
thread should be comparable.

Are you referring to overhead associated with inter-process
communication? If so, what overhead is that?

-- 
Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr.


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Denis Kasak
On Sat, Feb 21, 2009 at 9:10 PM, "Martin v. Löwis"  wrote:
>>> I'm pretty much sure it is UCS-2 or UCS-4. (Yes, I know there is only a
>>> slight difference to UTF-16/UTF-32).
>>
>> I wouldn't call the difference that slight, especially between UTF-16
>> and UCS-2, since the former can encode all Unicode code points, while
>> the latter can only encode those in the BMP.
>
> Indeed. As Python *can* encode all characters even in 2-byte mode
> (since PEP 261), it seems clear that Python's Unicode representation
> is *not* strictly UCS-2 anymore.

Since we're already discussing this, I'm curious - why was UCS-2
chosen over plain UTF-16 or UTF-8 in the first place for Python's
internal storage?

-- 
Denis Kasak


urllib2 login help

2009-02-21 Thread john . weatherwax
Hello,

I'm having trouble using urllib2 (maybe) when trying to log into a web
site that requires a user to enter a login name and a password
(authentication).  I've tried many things but none seem to work and
have become stuck recently and was hoping to get a hint from those
much more knowledgeable than myself.

I want to automate logging on to the investopedia stock simulator web
site.

http://simulator.investopedia.com/authorization/login.aspx

I can't seem to automate this successfully.

My python script is below:

import os,re
import urllib, urllib2

theurl = "http://simulator.investopedia.com/authorization/login.aspx"

post_dict = { "ct100$MainPlaceHolder$usernameText": "XXX",
              "ct100$MainPlaceHolder$passwordText": "XXX",
              "ct100$MainPlaceHolder$loginButton": "Sign In",
              "ct100$MainPlaceHolder$rememberMeCheckBox": "on" }

post_data = urllib.urlencode( post_dict )

headers = { 'User-agent': 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)',
            'Host': 'simulator.investopedia.com',
            'Referer': 'http://simulator.investopedia.com/authorization/login.aspx',
          }

req = urllib2.Request( theurl, post_data, headers )
response = urllib2.urlopen(req)
the_page = response.read()

The problem is that this page returned seems to be the same as the
login page which I connected to initially.  When I perform the login
process manually things work as expected.  What am I doing
incorrectly?

*ANY* hints/suggestions/directions would be very appreciated since
I've run out of ideas of things to try at this point.

Thanks very much




Re: urllib2 login help

2009-02-21 Thread Stephen Hansen
>
> *ANY* hints/suggestions/directions would be very appreciated since
> I've run out of ideas of things to try at this point.
>

The last time I heard something like this, I suggested that the problem
might be cookies -- and it ended up working for the person I believe.

http://groups.google.com/group/comp.lang.python/browse_thread/thread/4d78de927ee1e549
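
If cookies are the problem, a sketch along these lines may help. It uses the Python 3 module names (urllib.request and http.cookiejar; in 2.x the equivalents are urllib2 and cookielib). The URL and form field names below are placeholders, not the real ones for that site, and whether the site needs anything beyond cookies is an assumption I have not checked:

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_session():
    """Build an opener that stores cookies and sends them back on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return jar, opener


def build_login_request(url, fields):
    """Encode the form fields as a POST body for the given URL."""
    data = urllib.parse.urlencode(fields).encode("ascii")
    return urllib.request.Request(url, data=data)


jar, opener = make_session()
req = build_login_request("http://example.invalid/login",
                          {"user": "XXX", "password": "XXX"})

# Intended use (not executed here, since it needs the network):
#   opener.open(login_page_url)   # first GET, lets the server set its cookies
#   opener.open(req)              # the POST now carries those cookies
print(req.get_method(), req.data)
```

The important part is that the same opener is used for the initial GET and for the POST, so the session cookie set by the first response is sent with the second request.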

--S


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Adam Olsen
On Feb 21, 10:48 am, a...@pythoncraft.com (Aahz) wrote:
> In article <499f397c.7030...@v.loewis.de>,
>
> "Martin v. Löwis"   wrote:
> >> Yes, I know that.  But every concrete representation of a unicode string
> >> has to have an encoding associated with it, including unicode strings
> >> produced by the Python parser when it parses the ascii string "u'\xb5'"
>
> >> My question is: what is that encoding?
>
> >The internal representation is either UTF-16, or UTF-32; which one is
> >a compile-time choice (i.e. when the Python interpreter is built).
>
> Wait, I thought it was UCS-2 or UCS-4?  Or am I misremembering the
> countless threads about the distinction between UTF and UCS?

Nope, that's partly mislabeling and partly a bug.  UCS-2/UCS-4 refer
to Unicode 1.1 and earlier, with no surrogates.  We target Unicode
5.1.

If you naively encode UCS-2 as UTF-8 you really end up with CESU-8.
You miss the step where you combine surrogate pairs (which only exist
in UTF-16) into a single supplementary character.  Lo and behold,
that's actually what current python does in some places.  It's not
pretty.

See bugs #3297 and #3672.
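
A sketch of the difference, for a current Python 3, where a str literal can hold lone surrogates. The "surrogatepass" error handler is used here only to reproduce the naive behaviour on purpose; it is my choice for the demonstration, not what the bug reports use:

```python
clef = "\U0001D11E"                       # one supplementary character

# Proper UTF-8: the single code point becomes one 4-byte sequence.
utf8 = clef.encode("utf-8")

# Naive route: treat the two UTF-16 surrogates as if they were characters
# and encode each one separately. The result is 6 bytes, CESU-8 style.
surrogates = "\ud834\udd1e"
cesu8_like = surrogates.encode("utf-8", "surrogatepass")

# A strict UTF-8 encoder refuses lone surrogates outright.
try:
    surrogates.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

print(utf8, cesu8_like, strict_ok)
```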


Re: What encoding does u'...' syntax use?

2009-02-21 Thread Martin v. Löwis
>> Indeed. As Python *can* encode all characters even in 2-byte mode
>> (since PEP 261), it seems clear that Python's Unicode representation
>> is *not* strictly UCS-2 anymore.
> 
> Since we're already discussing this, I'm curious - why was UCS-2
> chosen over plain UTF-16 or UTF-8 in the first place for Python's
> internal storage?

You mean, originally? Originally, the choice was only between UCS-2
and UCS-4; choice was in favor of UCS-2 because of size concerns.
UTF-8 was ruled out easily because it doesn't allow constant-size
indexing; UTF-16 essentially for the same reason (plus there was
no point to UTF-16, since there were no assigned characters outside
the BMP).
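
To illustrate the indexing argument, here is a sketch (Python 3 syntax, function name invented) of what finding the n-th character costs when the storage is UTF-8. With a fixed-width form the offset is simply n times the unit size; with UTF-8 the bytes have to be scanned from the start:

```python
def utf8_offset(data, index):
    """Byte offset of code point number `index` in UTF-8 `data` (linear scan)."""
    seen = 0
    for offset, byte in enumerate(data):
        if byte & 0xC0 != 0x80:         # not a continuation byte: a new code point
            if seen == index:
                return offset
            seen += 1
    raise IndexError(index)


# 'a', MICRO SIGN, 'b', EURO SIGN, 'c' take 1, 2, 1, 3 and 1 bytes.
data = "a\u00b5b\u20acc".encode("utf-8")
offsets = [utf8_offset(data, i) for i in range(5)]
print(offsets)
```

The same scanning problem applies to UTF-16 as soon as surrogate pairs may appear, which is the "essentially for the same reason" above.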

Regards,
Martin





Re: What encoding does u'...' syntax use?

2009-02-21 Thread Denis Kasak
On Sat, Feb 21, 2009 at 9:45 PM, "Martin v. Löwis"  wrote:
>>> Indeed. As Python *can* encode all characters even in 2-byte mode
>>> (since PEP 261), it seems clear that Python's Unicode representation
>>> is *not* strictly UCS-2 anymore.
>>
>> Since we're already discussing this, I'm curious - why was UCS-2
>> chosen over plain UTF-16 or UTF-8 in the first place for Python's
>> internal storage?
>
> You mean, originally? Originally, the choice was only between UCS-2
> and UCS-4; choice was in favor of UCS-2 because of size concerns.
> UTF-8 was ruled out easily because it doesn't allow constant-size
> indexing; UTF-16 essentially for the same reason (plus there was
> no point to UTF-16, since there were no assigned characters outside
> the BMP).

Yes, I failed to realise how long ago the unicode data type was
implemented originally. :-)
Thanks for the explanation.

-- 
Denis Kasak


Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500)
> Thorsten Kampe   wrote:
>> It's all about declaring your charset. In Python as well as in your
>> newsreader. If you don't declare your charset it's ASCII for you - in
>> Python as well as in your newsreader.
> 
> Except in practice unlike Python, many newsreaders don't assume ASCII.

They assume ASCII - unless you declare your charset (the exception being 
Outlook Express and a few Windows newsreaders). Everything else is 
"guessing".

> The original article displayed fine for me. Google Groups displays it
> correctly too:
> 
>   http://groups.google.com/group/comp.lang.python/msg/828fefd7040238bc

Your understanding of the principles of Unicode is at least as
nonexistent as the OP's.
 
> I could just as easily argue that assuming ISO 8859-1 is the defacto
> standard, and that it's your newsreader that's broken.

There is no "standard" in regard to guessing (this is what you call 
"assuming"). The need for explicit declaration of an encoding is exactly 
the same in Python as in any Usenet article.

> The reality however is that RFC 1036 is the only standard for Usenet
> messages, defacto or otherwise, and so there's nothing wrong with
> anyone's newsreader.

The reality is that all non-broken newsreaders use MIME headers to 
declare and interpret the charset being used. I suggest you read at 
least http://www.joelonsoftware.com/articles/Unicode.html to get an idea 
of Unicode and associated topics.

Thorsten


Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread odeits
On Feb 21, 12:47 am, "Gabriel Genellina" 
wrote:
> En Sat, 21 Feb 2009 01:14:02 -0200, odeits  escribió:
>
>
>
> > On Feb 15, 11:31 pm, odeits  wrote:
> >> It seems what you are actually testing for is if the intersection of
> >> the two sets is not empty where the first set is the characters in
> >> your word and the second set is the characters in your defined string.
>
> > To expand on what I was saying I thought i should provide a code
> > snippet:
>
> > WORD = 'g' * 100
> > WORD2 = 'g' * 50 + 'U'
> > VOWELS = 'aeiouAEIOU'
> > BIGWORD = 'g' * 1 + 'U'
>
> > def set_test(vowels, word):
>
> >     vowels = set( iter(vowels))
> >     letters = set( iter(word) )
>
> >     if letters & vowels:
> >         return True
> >     else:
> >         return False
>
> > with python 2.5 I got 1.30 usec/pass against the BIGWORD
>
> You could make it slightly faster by removing the iter() call: letters =  
> set(word)
> And (if vowels are really constant) you could pre-build the vowels set.
>
> --
> Gabriel Genellina

set(word) = set{[word]} meaning a set with one element, the string
the call to iter makes it set of the letters making up the word.


Re: python contextmanagers and ruby blocks

2009-02-21 Thread Alia K
Aahz wrote:
> Longer answer: the way in Python to achieve the full power of Ruby
> blocks is to write a function.

You are most likely right... there is probably no need to introduce
ruby-like blocks to python where iteration comes naturally with list
comprehensions and generators. But for the simple case of entering a
block of code as one does with @contextmanager I suppose it would be
nice to make a generator with a single yield statement a
contextmanager by default such that:

>>> def g():
...     print "a"
...     yield
...     print "b"
...
>>> with g():
...     print "c"
...
a
c
b

would be equivalent to

>>> from __future__ import with_statement
>>> from contextlib import contextmanager
>>> @contextmanager
... def g():
...     print "a"
...     yield
...     print "b"
...
>>> with g():
...     print "c"
...
a
c
b

but then again, I suppose "explicit is better than implicit"...
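
For reference, a variant of the explicit form in Python 3 syntax, recording events in a list instead of printing so the order can be checked. The try/finally is an addition of mine, not part of the proposal above; it makes the "b" step run even when the body of the with block raises:

```python
from contextlib import contextmanager

events = []


@contextmanager
def g():
    events.append("a")          # runs on entry to the with block
    try:
        yield
    finally:
        events.append("b")      # runs on exit, even after an exception


with g():
    events.append("c")

print(events)
```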

AK



Re: `high overhead of multiple Python processes' (was: Will multithreading make python less popular?)

2009-02-21 Thread Paul Rubin
Joshua Judson Rosen  writes:
> > Right, that's basically the issue here: the cost of using multiple
> > Python processes is unnecessarily high.
> What cost is that? 

The cost of messing with the multiprocessing module instead of having
threads work properly, and the overhead of serializing Python data
structures to send to another process by IPC, instead of using the
same object in two threads.  Also, one way I often use threading is by
sending function objects to another thread through a Queue, so the
other thread can evaluate the function.  I don't think multiprocessing
gives a way to serialize functions, though maybe something like it
can be done at higher nuisance using classes.
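
A small sketch of where the limit sits, assuming a current Python 3 and the standard pickle module, which is what multiprocessing uses to move objects between processes. The helper can_pickle is made up for the example. Functions that can be found by module and name are pickled as a reference; anonymous functions are rejected:

```python
import math
import pickle


def can_pickle(obj):
    """Report whether pickle accepts the object at all."""
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


# An importable function travels as "module plus name", not as code.
by_name = pickle.dumps(math.sqrt)
restored = pickle.loads(by_name)

named_ok = can_pickle(math.sqrt)
lambda_ok = can_pickle(lambda x: x * 2)   # no importable name to refer to

print(named_ok, lambda_ok, restored is math.sqrt)
```

So sending a function to another process works only when the receiving side can import the same function by name, which is quite different from handing a function object to another thread through a Queue.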


Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500)
> Except in practice unlike Python, many newsreaders don't assume ASCII.

Thorsten Kampe   wrote:
>They assume ASCII - unless you declare your charset (the exception being 
>Outlook Express and a few Windows newsreaders). Everything else is 
>"guessing".

No, it's an assumption like the way Python by default assumes ASCII.

>> The original article displayed fine for me. Google Groups displays it
>> correctly too:
>> 
>>  http://groups.google.com/group/comp.lang.python/msg/828fefd7040238bc
>
>Your understanding of the principles of Unicode is at least as
>nonexistent as the OP's.
 
The link demonstrates that Google Groups doesn't assume ASCII like
Python does.  Since popular newsreaders like Google Groups and Outlook
Express can display the message correctly without the MIME headers,
but your obscure one can't, there's a much stronger case to be made that
it's your newsreader that's broken.

>> I could just as easily argue that assuming ISO 8859-1 is the defacto
>> standard, and that it's your newsreader that's broken.
>
>There is no "standard" in regard to guessing (this is what you call 
>"assuming"). The need for explicit declaration of an encoding is exactly 
>the same in Python as in any Usenet article.

No, many newsreaders don't assume ASCII by default like Python. 

>> The reality however is that RFC 1036 is the only standard for Usenet
>> messages, defacto or otherwise, and so there's nothing wrong with
>> anyone's newsreader.
>
>The reality is that all non-broken newsreaders use MIME headers to 
>declare and interpret the charset being used. 

Since RFC 1036 doesn't require MIME headers a reader that doesn't generate
them is by definition not broken.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: To unicode or not to unicode

2009-02-21 Thread Carl Banks
On Feb 19, 6:57 pm, Ron Garret  wrote:
> I'm writing a little wiki that I call µWiki.  That's a lowercase Greek
> mu at the beginning (it's pronounced micro-wiki).  It's working, except
> that I can't actually enter the name of the wiki into the wiki itself
> because the default unicode encoding on my Python installation is
> "ascii".  So I'm trying to decide on a course of action.  There seem to
> be three possibilities:
>
> 1.  Change the code to properly support unicode.  Preliminary
> investigations indicate that this is going to be a colossal pain in the
> ass.
>
> 2.  Change the default encoding on my Python installation to be latin-1
> or UTF8.  The disadvantage to this is that no one else will be able to
> run my code without making the same change to their installation, since
> you can't change default encodings once Python has started.
>
> 3.  Punt and spell it 'uwiki' instead.
>
> I'm feeling indecisive so I thought I'd ask other people's opinion.  
> What should I do?
>
> rg
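
As a side note on option 1, a sketch in Python 3 syntax of the two characters that keep being mentioned in this thread, and of encoding explicitly at the boundary instead of relying on a default. UTF-8 is my assumption for the storage encoding; the original post does not say which one the wiki uses:

```python
import unicodedata

micro = "\u00b5"        # what a latin-1 keyboard or file usually produces
mu = "\u03bc"           # the actual Greek letter

names = (unicodedata.name(micro), unicodedata.name(mu))

# Compatibility normalization maps the micro sign onto the Greek letter,
# which matters if wiki page names are ever compared or looked up.
folded = unicodedata.normalize("NFKC", micro)

# Explicit encoding on the way out, explicit decoding on the way in.
title_bytes = (micro + "Wiki").encode("utf-8")
title_text = title_bytes.decode("utf-8")

print(names, folded == mu, title_bytes)
```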



Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread rdmurray
odeits  wrote:
> On Feb 21, 12:47 am, "Gabriel Genellina" 
> wrote:
> > En Sat, 21 Feb 2009 01:14:02 -0200, odeits  escribió:
> >
> > > On Feb 15, 11:31 pm, odeits  wrote:
> > >> It seems what you are actually testing for is if the intersection of
> > >> the two sets is not empty where the first set is the characters in
> > >> your word and the second set is the characters in your defined string.
> >
> > > To expand on what I was saying I thought i should provide a code
> > > snippet:
> >
> > > WORD = 'g' * 100
> > > WORD2 = 'g' * 50 + 'U'
> > > VOWELS = 'aeiouAEIOU'
> > > BIGWORD = 'g' * 1 + 'U'
> >
> > > def set_test(vowels, word):
> >
> > >     vowels = set( iter(vowels))
> > >     letters = set( iter(word) )
> >
> > >     if letters & vowels:
> > >         return True
> > >     else:
> > >         return False
> >
> > > with python 2.5 I got 1.30 usec/pass against the BIGWORD
> >
> > You could make it slightly faster by removing the iter() call:
> > letters = set(word)
> > And (if vowels are really constant) you could pre-build the vowels set.
> 
> set(word) = set{[word]} meaning a set with one element, the string
> the call to iter makes it set of the letters making up the word.

Did you try it?

Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> set('abcd')
set(['a', 'c', 'b', 'd'])
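
A minimal sketch of the same point, with the vowel set built once as suggested earlier in the thread. The names VOWEL_SET and has_vowel are made up for illustration, and nothing here has been timed.

```python
# Hypothetical names; the set is built once, at import time.
VOWEL_SET = frozenset('aeiouAEIOU')

def has_vowel(word):
    """Return True if any character of word is a vowel."""
    # A string is already iterable character by character,
    # so no iter() call is needed.
    return not VOWEL_SET.isdisjoint(word)

print(sorted(set('abcd')))          # four one-character elements
print(has_vowel('g' * 50 + 'U'))
print(has_vowel('ggg'))
```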

--RDM



datetime.time and midnight

2009-02-21 Thread Ethan Furman

Greetings, List!

I was curious if anyone knew the rationale behind making midnight False?

--> import datetime
--> midnight = datetime.time(0,0,0)
--> bool(midnight)
False

To my way of thinking, midnight does actually exist so it should be 
true.  If datetime.time was measuring an *amount* of time, rather than a 
certain point during the day, then a time of 0:0:0 should certainly be 
False as it would mean no time had passed.  However, since midnight does 
indeed exist (as many programmers have observed when deadlines approach 
;) I would think it should be true.

--
~Ethan~


Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 17:07:35 -0500)
> The link demonstrates that Google Groups doesn't assume ASCII like
> Python does.  Since popular newsreaders like Google Groups and Outlook
> Express can display the message correctly without the MIME headers,
> but your obscure one can't, there's a much stronger case to be made that
> it's your newsreader that's broken.

*sigh* I give up on you. You didn't even read the "Joel on Software" 
article. The whole "why" and "what for" of Unicode and MIME will always 
be a complete mystery to you.

T.


Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 17:07:35 -0500) 
> The link demonstrates that Google Groups doesn't assume ASCII like
> Python does.  Since popular newsreaders like Google Groups and Outlook
> Express can display the message correctly without the MIME headers,
> but your obscure one can't, there's a much stronger case to be made that
> it's your newsreader that's broken.

Thorsten Kampe   wrote:
>*sigh* I give up on you. You didn't even read the "Joel on Software" 
>article. The whole "why" and "what for" of Unicode and MIME will always 
>be a complete mystery to you.

I understand what Unicode and MIME are for and why they exist.  Neither
their merits nor your insults change the fact that the only current
standard governing the content of Usenet posts doesn't require their use.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: zlib interface semi-broken

2009-02-21 Thread Aahz
In article ,
Travis   wrote:
>
>So I've submitted a patch to bugs.python.org to add a new member
>called is_finished to the zlib decompression object.
>
>Issue 5210, file 13056, msg 81780

You may also want to bring this up on the python-ideas mailing list for
further discussion.
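
For readers wondering what such a flag is for, here is a rough sketch. It assumes Python 3.3 or later, where decompression objects expose an `eof` attribute; `is_finished` is only the name proposed in the patch above, and `inflate` is a made-up helper.

```python
import zlib

def inflate(data, chunk_size=16):
    """Feed data to a decompression object in chunks.

    Returns (output, finished), where finished is the object's eof flag.
    """
    d = zlib.decompressobj()
    out = []
    for i in range(0, len(data), chunk_size):
        out.append(d.decompress(data[i:i + chunk_size]))
    return b"".join(out), d.eof

packed = zlib.compress(b"spam and eggs " * 50)
print(inflate(packed)[1])        # whole stream was supplied
print(inflate(packed[:-4])[1])   # stream cut short before its end
```

Without a flag like this, a caller has no direct way to tell a complete stream from one that merely stopped arriving.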
-- 
Aahz (a...@pythoncraft.com)   <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote 
programs, then the first woodpecker that came along would destroy civilization.


Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 18:06:35 -0500)
> > The link demonstrates that Google Groups doesn't assume ASCII like
> > Python does.  Since popular newsreaders like Google Groups and Outlook
> > Express can display the message correctly without the MIME headers,
> > but your obscure one can't, there's a much stronger case to be made that
> > it's your newsreader that's broken.
> 
> Thorsten Kampe   wrote:
> >*sigh* I give up on you. You didn't even read the "Joel on Software" 
> >article. The whole "why" and "what for" of Unicode and MIME will always 
> >be a complete mystery to you.
> 
> I understand what Unicode and MIME are for and why they exist. Neither
> their merits nor your insults change the fact that the only current
> standard governing the content of Usenet posts doesn't require their
> use.

That's right. As long as you use pure ASCII you can skip this nasty step 
of informing other people which charset you are using. If you do use non 
ASCII then you have to do that. That's the way virtually all newsreaders 
work. It has nothing to do with some 21+ year old RFC. Even your Google 
Groups "newsreader" does that ('content="text/html; charset=UTF-8"').

Being explicit about your encoding is 99% of the whole Unicode magic in 
Python and in any communication across the Internet (may it be NNTP, 
SMTP or HTTP). Your Google Groups simply uses heuristics to guess the 
encoding the OP probably used. Windows newsreaders simply use the locale 
of the local host. That's guessing. You can call it assuming but it's 
still guessing. There is no way you can be sure without any declaration.

And it's unpythonic. Python "assumes" ASCII and if the decoded/encoded 
text doesn't fit that encoding it refuses to guess.
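
A small sketch of what "refuses to guess" looks like in practice, written for Python 3 bytes objects. The helper name decode_strict_ascii is invented for the example, and the sample bytes are an assumption standing in for an unlabeled message body.

```python
raw = b"caf\xe9"   # 'cafe' with an e-acute, encoded as ISO 8859-1, no label

def decode_strict_ascii(data):
    """Return the text if data is pure ASCII, otherwise None."""
    try:
        return data.decode("ascii")
    except UnicodeDecodeError:
        # Byte 0xE9 is outside ASCII; Python raises instead of guessing.
        return None

print(decode_strict_ascii(b"cafe"))
print(decode_strict_ascii(raw))

# Two different guesses give two different strings for the same bytes.
print(ascii(raw.decode("iso-8859-1")))
print(ascii(raw.decode("cp437")))
```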

T.


Re: datetime.time and midnight

2009-02-21 Thread MRAB

Ethan Furman wrote:

Greetings, List!

I was curious if anyone knew the rationale behind making midnight False?

--> import datetime
--> midnight = datetime.time(0,0,0)
--> bool(midnight)
False

To my way of thinking, midnight does actually exist so it should be 
true.  If datetime.time was measuring an *amount* of time, rather than a 
certain point during the day, then a time of 0:0:0 should certainly be 
False as it would mean no time had passed.  However, since midnight does 
indeed exist (as many programmers have observed when deadlines approach 
;) I would think it should be true.



I think it's because midnight is to the time of day what zero is to
integers, or an empty string is to strings, or an empty container ...
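
Whatever the rationale, code that needs to tell "no time given" apart from midnight can avoid truth-testing altogether. The sketch below uses a made-up function name, describe. Note that from Python 3.5 on, time objects are always true, so the False shown above applies to older versions.

```python
import datetime

def describe(t):
    """Distinguish 'no time given' from any actual time, midnight included."""
    if t is None:   # explicit test; does not depend on bool(t)
        return "no time set"
    return t.strftime("%H:%M:%S")

print(describe(None))
print(describe(datetime.time(0, 0, 0)))
```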


Re: multiprocessing module and os.close(sys.stdin.fileno())

2009-02-21 Thread Graham Dumpleton
On Feb 21, 4:20 pm, Joshua Judson Rosen  wrote:
> Jesse Noller  writes:
>
> > On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
> >  wrote:
> > > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > > _bootstrap() doing:
>
> > >  os.close(sys.stdin.fileno())
>
> > > rather than:
>
> > >  sys.stdin.close()
>
> > > Technically it is feasible that stdin could have been replaced with
> > > something other than a file object, where the replacement doesn't have
> > > a fileno() method.
>
> > > In that sort of situation an AttributeError would be raised, which
> > > isn't going to be caught as either OSError or ValueError, which is all
> > > the code watches out for.
>
> > I don't know why it was implemented that way. File an issue on the
> > tracker and assign it to me (jnoller) please.
>
> My guess would be: because it's also possible for sys.stdin to be a
> file that's open in read+*write* mode, and for that file to have
> pending output buffered (for example, in the case of a socketfile).

If you are going to have a file that is writable as well as readable,
such as a socket, then likely that sys.stdout/sys.stderr are going to
be bound to it at the same time. If that is the case then one should
not be using close() at all as it will then also close the write side
of the pipe and cause errors when code subsequently attempts to write
to sys.stdout/sys.stderr.

In the case of socket you would actually want to use shutdown() to
close just the input side of the socket.

What this all means is that what is the appropriate thing to do is
going to depend on the environment in which the code is used. Thus,
having the behaviour hard wired a certain way is really bad. There
perhaps instead should be a way of a user providing a hook function to
be called to perform any case specific cleanup of stdin, stdout and
stderr, or otherwise reassign them.

That this is currently in the _bootstrap() function, which does other
important stuff, doesn't exactly make it look like it is easily
overridden to work for a specific execution environment which is
different to the norm.
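
To make the concern concrete, here is a sketch of a more defensive descriptor-level close. It is not the multiprocessing code, and close_descriptor_only is a hypothetical helper. It assumes that skipping the close is acceptable when the replacement object has no usable fileno().

```python
import os

def close_descriptor_only(stream):
    """Close the OS-level descriptor behind stream without touching any
    Python-level buffers. Returns True if a descriptor was closed."""
    try:
        fd = stream.fileno()
    except (AttributeError, ValueError, OSError):
        # No fileno() at all, or a file-like object with no real descriptor.
        return False
    try:
        os.close(fd)
    except OSError:
        # Descriptor was already closed or invalid.
        return False
    return True
```

Because it never calls stream.close(), nothing buffered in the Python object is flushed, which is the property the descriptor-level close is meant to preserve.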

> There's a general guideline, inherited from C, that one should ensure
> that the higher-level close() routine is invoked on a given
> file-descriptor in at most *one* process after that descriptor has
> passed through a fork(); in the other (probably child) processes, the
> lower-level close() routine should be called to avoid a
> double-flush--whereby buffered data is flushed out of one process, and
> then the *same* buffered data is flushed out of the (other)
> child-/parent-process' copy of the file-object.
>
> So, if you call sys.stdin.close() in the child-process in
> _bootstrap(), then it could lead to a double-flush corrupting output
> somewhere in the application that uses the multiprocessing module.
>
> You can expect similar issues with just about /any/ `file-like objects'
> that might have `file-like semantics' of buffering data and flushing
> it on close, also--because you end up with multiple copies of the same
> object in `pre-flush' state, and each copy tries to flush at some point.
>
> As such, I'd recommend against just using .close(); you might use
> something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
> `else' clause unconditionally calls sys.stdin.close(), then you still
> have double-flush problems if someone's set sys.stdin to a file-like
> object with output-buffering.
>
> I guess you could try calling that an `edge-case' and seeing if anyone
> screams. It'd be sort-of nice if there was finer granularity in the
> file API--maybe if file.close() took a boolean `flush' argument

Graham


Re: To unicode or not to unicode

2009-02-21 Thread Ross Ridge
Ross Ridge (Sat, 21 Feb 2009 18:06:35 -0500)
> I understand what Unicode and MIME are for and why they exist. Neither
> their merits nor your insults change the fact that the only current
> standard governing the content of Usenet posts doesn't require their
> use.

Thorsten Kampe   wrote:
>That's right. As long as you use pure ASCII you can skip this nasty step 
>of informing other people which charset you are using. If you do use non 
>ASCII then you have to do that. That's the way virtually all newsreaders 
>work. It has nothing to do with some 21+ year old RFC. Even your Google 
>Groups "newsreader" does that ('content="text/html; charset=UTF-8"').

No, the original post demonstrates you don't have to include MIME headers for
ISO 8859-1 text to be properly displayed by many newsreaders.  The fact
that your obscure newsreader didn't display it properly doesn't mean
that original poster's newsreader is broken.

>Being explicit about your encoding is 99% of the whole Unicode magic in 
>Python and in any communication across the Internet (may it be NNTP, 
>SMTP or HTTP).

HTTP requires the assumption of ISO 8859-1 in the absence of any
specified encoding. 

>Your Google Groups simply uses heuristics to guess the 
>encoding the OP probably used. Windows newsreaders simply use the locale 
>of the local host. That's guessing. You can call it assuming but it's 
>still guessing. There is no way you can be sure without any declaration.

Newsreaders assuming ISO 8859-1 instead of ASCII doesn't make it a guess.
It's just a different assumption, and neither assumption, ASCII or
ISO 8859-1, gives you any certainty.

>And it's unpythonic. Python "assumes" ASCII and if the decoded/encoded 
>text doesn't fit that encoding it refuses to guess.

Which is reasonable given that Python is a programming language where it's
better to have a more conservative assumption about encodings so errors
can be more quickly diagnosed.  A newsreader however is a different
beast, where it's better to make a less conservative assumption that's
more likely to display messages correctly to the user.  Assuming ISO
8859-1 in the absence of any specified encoding allows the message to be
correctly displayed if the character set is either ISO 8859-1 or ASCII.
Doing things the "pythonic" way and assuming ASCII only allows such
messages to be displayed if ASCII is used.

Ross Ridge

-- 
 l/  //   Ross Ridge -- The Great HTMU
[oo][oo]  rri...@csclub.uwaterloo.ca
-()-/()/  http://www.csclub.uwaterloo.ca/~rridge/ 
 db  //   


Re: Pythonic way to determine if one char of many in a string

2009-02-21 Thread odeits
On Feb 21, 2:24 pm, rdmur...@bitdance.com wrote:
> odeits  wrote:
> > On Feb 21, 12:47 am, "Gabriel Genellina" 
> > wrote:
> > > En Sat, 21 Feb 2009 01:14:02 -0200, odeits  escribió:
>
> > > > On Feb 15, 11:31 pm, odeits  wrote:
> > > >> It seems what you are actually testing for is if the intersection of
> > > >> the two sets is not empty where the first set is the characters in
> > > >> your word and the second set is the characters in your defined string.
>
> > > > To expand on what I was saying I thought i should provide a code
> > > > snippet:
>
> > > > WORD = 'g' * 100
> > > > WORD2 = 'g' * 50 + 'U'
> > > > VOWELS = 'aeiouAEIOU'
> > > > BIGWORD = 'g' * 1 + 'U'
>
> > > > def set_test(vowels, word):
>
> > > > >     vowels = set( iter(vowels))
> > > > >     letters = set( iter(word) )
>
> > > > >     if letters & vowels:
> > > > >         return True
> > > > >     else:
> > > > >         return False
>
> > > > with python 2.5 I got 1.30 usec/pass against the BIGWORD
>
> > > You could make it slightly faster by removing the iter() call:
> > > letters = set(word)
> > > And (if vowels are really constant) you could pre-build the vowels set.
>
> > set(word) = set{[word]} meaning a set with one element, the string
> > the call to iter makes it set of the letters making up the word.
>
> Did you try it?
>
> Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
> [GCC 4.3.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> set('abcd')
> set(['a', 'c', 'b', 'd'])
>
> --RDM

You are in fact correct. Thank you for pointing that out.


Re: To unicode or not to unicode

2009-02-21 Thread Thorsten Kampe
* Ross Ridge (Sat, 21 Feb 2009 19:39:42 -0500)
> Thorsten Kampe   wrote:
> >That's right. As long as you use pure ASCII you can skip this nasty step 
> >of informing other people which charset you are using. If you do use non 
> >ASCII then you have to do that. That's the way virtually all newsreaders 
> >work. It has nothing to do with some 21+ year old RFC. Even your Google 
> >Groups "newsreader" does that ('content="text/html; charset=UTF-8"').
> 
> No, the original post demonstrates you don't have to include MIME headers for
> ISO 8859-1 text to be properly displayed by many newsreaders.

*sigh* As you still refuse to read the article[1] I'm going to quote it 
now here:

'The Single Most Important Fact About Encodings

If you completely forget everything I just explained, please remember 
one extremely important fact. It does not make sense to have a string 
without knowing what encoding it uses.
[...]
If you have a string [...] in an email message, you have to know what 
encoding it is in or you cannot interpret it or display it to users 
correctly.

Almost every [...] "she can't read my emails when I use accents" problem 
comes down to one naive programmer who didn't understand the simple fact 
that if you don't tell me whether a particular string is encoded using 
UTF-8 or ASCII or ISO 8859-1 (Latin 1) or Windows 1252 (Western 
European), you simply cannot display it correctly [...]. There are over 
a hundred encodings and above code point 127, all bets are off.'

Enough said.

> The fact that your obscure newsreader didn't display it properly
> doesn't mean that original poster's newsreader is broken.

You don't even know if my "obscure newsreader" displayed it properly. 
Non ASCII text without a declared encoding is just a bunch of bytes. 
It's not even text.

T.

[1] http://www.joelonsoftware.com/articles/Unicode.html


Re: Using clock() in threading on Windows

2009-02-21 Thread David Bolen
"Martin v. Löwis"  writes:

> As a consequence, the half-busy loops could go away, at least
> on systems where lock timeouts can be given to the system.

I know that in some cases in the past I've had to bypass a Queue's use
of threading objects for waiting for a queue to unblock because of the
increased overhead (and latency as the timer increases) of the busy
loop.  On windows, replacing it with an implementation using
WaitForObject calls with the same timeouts I would have used with the
Queue performed much better, not unexpectedly, but was non-portable.

The current interface to the lowest level locks in Python are
certainly generic enough to cross lots of platforms, but it would
definitely be useful if they could implement timeouts without busy
loops on those platforms where they were supported.
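
For reference, the portable interface in question is the timeout argument of Queue.get(). The sketch below shows only that interface, with a made-up wrapper name, get_or_none. How the wait is carried out underneath (busy loop or a timed wait in the OS) depends on the Python version and platform and is not shown here.

```python
import queue   # the module is named Queue in Python 2

def get_or_none(q, timeout):
    """Wait up to timeout seconds for an item; return None if none arrives."""
    try:
        return q.get(timeout=timeout)
    except queue.Empty:
        return None

q = queue.Queue()
print(get_or_none(q, 0.05))   # nothing queued yet
q.put("job")
print(get_or_none(q, 0.05))
```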

-- David


Re: multiprocessing module and os.close(sys.stdin.fileno())

2009-02-21 Thread Joshua Judson Rosen
Graham Dumpleton  writes:
>
> On Feb 21, 4:20 pm, Joshua Judson Rosen  wrote:
> > Jesse Noller  writes:
> >
> > > On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
> > >  wrote:
> > > > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > > > _bootstrap() doing:
> >
> > > >  os.close(sys.stdin.fileno())
> >
> > > > rather than:
> >
> > > >  sys.stdin.close()
> >
> > > > Technically it is feasible that stdin could have been replaced with
> > > > something other than a file object, where the replacement doesn't have
> > > > a fileno() method.
> >
> > > > In that sort of situation an AttributeError would be raised, which
> > > > isn't going to be caught as either OSError or ValueError, which is all
> > > > the code watches out for.
> >
> > > I don't know why it was implemented that way. File an issue on the
> > > tracker and assign it to me (jnoller) please.
> >
> > My guess would be: because it's also possible for sys.stdin to be a
> > file that's open in read+*write* mode, and for that file to have
> > pending output buffered (for example, in the case of a socketfile).
> 
> If you are going to have a file that is writable as well as readable,
> such as a socket, then likely that sys.stdout/sys.stderr are going to
> be bound to it at the same time.

Yes.

> If that is the case then one should not be using close() at all

If you mean stdin.close(), then that's what I said :)

> as it will then also close the write side of the pipe and cause
> errors when code subsequently attempts to write to
> sys.stdout/sys.stderr.
>
> 
> In the case of socket you would actually want to use shutdown() to
> close just the input side of the socket.

Sure--but isn't this "you" the /calling/ code that set the whole thing
up? What the /caller/ does with its stdio is up to /him/, and beyond
the scope of the present discourse. I can appreciate a library forking
and then using os.close() on stdio (it protects my files from any I/O
the subprocess might think it wants to do with them), but I think I
might be even more annoyed if it *shutdown my sockets* than if it
caused double-flushes (there's at least a possibility that I could
cope with the double-flushes by just ensuring that *I* flushed before
the fork--not so with socket.shutdown()!)

> What this all means is that what is the appropriate thing to do is
> going to depend on the environment in which the code is used. Thus,
> having the behaviour hard wired a certain way is really bad. There
> perhaps instead should be a way of a user providing a hook function to
> be called to perform any case specific cleanup of stdin, stdout and
> stderr, or otherwise reassign them.

Usually, I'd say that that's what the methods on the passed-in object
are for. Though, as I said--the file-object API is lacking, here :(

> > As such, I'd recommend against just using .close(); you might use
> > something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
> > `else' clause unconditionally calls sys.stdin.close(), then you still
> > have double-flush problems if someone's set sys.stdin to a file-like
> > object with output-buffering.
> >
> > I guess you could try calling that an `edge-case' and seeing if anyone
> > screams. It'd be sort-of nice if there was finer granularity in the
> > file API--maybe if file.close() took a boolean `flush' argument

-- 
Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr.


Re: Python dictionary size/entry limit?

2009-02-21 Thread Sean

Stefan Behnel wrote:

intellimi...@gmail.com wrote:
You may be better served with one of the dbm databases that come with
Python. They live on-disk but do the usual in-memory caching. They'll
likely perform a lot better than your OS level swap file.

Stefan


The bsddb module has the feature that access to the database uses the 
same methods as Python dictionaries.
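
A rough sketch of that dictionary-style access. It assumes Python 3.4 or later and uses the standard dbm package rather than bsddb, which is no longer in the standard library in Python 3. The function name and file name are made up, and keys and values are stored as bytes.

```python
import dbm
import os
import tempfile

def demo(path):
    """Write two entries to an on-disk database, then read them back."""
    with dbm.open(path, "c") as db:   # "c": create the file if needed
        db[b"spam"] = b"1"
        db[b"eggs"] = b"2"
    with dbm.open(path, "r") as db:   # reopen read-only
        return db[b"spam"], b"eggs" in db, b"ham" in db

tmpdir = tempfile.mkdtemp()
print(demo(os.path.join(tmpdir, "example_db")))
```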


Sean


Re: count secton of data in list

2009-02-21 Thread odeits
On Feb 20, 3:45 pm, Emile van Sebille  wrote:
> brianrpsgt1 wrote:
>
> > def step1(val):
>
>        data2_row = []
>
> >     for d1r in data1_row:
> >         if d1r[1] >= val:
> >             switch = 0
> >             data2_row = d1r[0],d1r[1],d1r[2],switch
>
>                data2_row.append([d1r[0],d1r[1],d1r[2],switch])
>
> HTH,
>
> Emile

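def count_consecutive(rows):
    """Yield the length of each run of rows sharing the same switch value."""
    switch = 0
    count = 0
    for r in rows:
        if r[-1] == switch:
            count += 1
        else:
            switch = not switch
            if count != 0:
                yield count
            count = 1  # the row that flipped the switch starts the new run
    if count != 0:
        yield count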



rows = [
['2009-01-09','13:17:30,96',123456,0],
['2009-01-09','13:17:31,95',123456,0],
['2009-01-09','13:17:32,95',123456,0],
['2009-01-09','13:17:33,95',123456,0],
['2009-01-09','13:17:34,94',123456,1],
['2009-01-09','13:17:35,94',123456,1],
['2009-01-09','13:17:36,94',123456,1],
['2009-01-09','13:17:37,94',123456,1],
['2009-01-09','13:17:38,94',123456,1],
['2009-01-09','13:17:39,94',123456,1],
['2009-01-09','13:17:40,94',123456,1],
['2009-01-09','13:17:41,94',123456,1],
['2009-01-09','13:17:42,95',123456,0],
['2009-01-09','13:17:43,95',123456,0],
['2009-01-09','13:17:44,95',123456,0],
['2009-01-09','13:17:45,95',123456,0]
]

for cnt in count_consecutive(rows):
    print(cnt)
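
An alternative sketch using itertools.groupby, which also reports which switch value each run had. The function name run_lengths and the sample rows are invented for illustration.

```python
from itertools import groupby

def run_lengths(rows):
    """Yield (switch_value, run_length) for each run of equal last columns."""
    for value, group in groupby(rows, key=lambda r: r[-1]):
        yield value, sum(1 for _ in group)

sample = [['a', 0], ['b', 0], ['c', 1], ['d', 1], ['e', 1], ['f', 0]]
print(list(run_lengths(sample)))   # [(0, 2), (1, 3), (0, 1)]
```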


Re: Wanted: Online Python Course for Credit

2009-02-21 Thread Steve Holden
Scott David Daniels wrote:
> jsidell wrote:
>> I'm a high school game development teacher and I have recently
>> discovered Python to be a great way to introduce computer
>> programming.  I intend to study Python on my own but I can get
>> professional development credit at my job for taking a Python course.
>> So I'm looking for an online class that I can get a certificate,
>> college credit, or something to that effect.  Any suggestions would be
>> greatly appreciated
> 
O'Reilly have commissioned me to write four on-line classes. We
anticipate that they will become available later this year through the
O'Reilly School of Technology - see http://www.oreillyschool.com/

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: To unicode or not to unicode

2009-02-21 Thread Steve Holden
Thorsten Kampe wrote:
> * Ross Ridge (Sat, 21 Feb 2009 14:52:09 -0500)
>> Thorsten Kampe   wrote:
>>> It's all about declaring your charset. In Python as well as in your
>>> newsreader. If you don't declare your charset it's ASCII for you - in
>>> Python as well as in your newsreader.
>> Except in practice unlike Python, many newsreaders don't assume ASCII.
> 
> They assume ASCII - unless you declare your charset (the exception being 
> Outlook Express and a few Windows newsreaders). Everything else is 
> "guessing".
> 
>> The original article displayed fine for me. Google Groups displays it
>> correctly too:
>>
>>  http://groups.google.com/group/comp.lang.python/msg/828fefd7040238bc
> 
> Your understanding of the principles of Unicode is at least as
> nonexistent as the OP's.
>  
>> I could just as easily argue that assuming ISO 8859-1 is the de facto
>> standard, and that it's your newsreader that's broken.
> 
> There is no "standard" in regard to guessing (this is what you call 
> "assuming"). The need for explicit declaration of an encoding is exactly 
> the same in Python as in any Usenet article.
> 
>> The reality however is that RFC 1036 is the only standard for Usenet
>> messages, defacto or otherwise, and so there's nothing wrong with
>> anyone's newsreader.
> 
> The reality is that all non-broken newsreaders use MIME headers to 
> declare and interpret the charset being used. I suggest you read at 
> least http://www.joelonsoftware.com/articles/Unicode.html to get an idea 
> of Unicode and associated topics.
> 
And I suggest you try to phrase your remarks in a way more respectful of
those you are discussing these matters with. I understand that
exasperation can lead to offensiveness, but if a lack of understanding
does exist then it's better to simply try and remove it without
commenting on its existence.

regards
 Steve
-- 
Steve Holden+1 571 484 6266   +1 800 494 3119
Holden Web LLC  http://www.holdenweb.com/



Re: To unicode or not to unicode

2009-02-21 Thread Martin v. Löwis
>   Since when is "Google Groups" a newsreader? So far as I know, all
> the display/formatting is handled by my web browser and GG merely stuffs
> messages into an HTML wrapper...

It also transmits this HTML wrapper via HTTP, where it claims that the
charset of the HTML is UTF-8. To do that, it must have converted the
original message from Latin-1 to UTF-8, which must have required
interpreting it as Latin-1 in the first place.
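
The conversion step described here can be sketched in a few lines. The function name latin1_to_utf8 and the sample bytes are assumptions for illustration; this is not a claim about how Google Groups is implemented.

```python
def latin1_to_utf8(raw):
    """Re-encode bytes that are assumed to be ISO 8859-1 as UTF-8."""
    text = raw.decode("iso-8859-1")   # the interpretation step: an assumption
    return text.encode("utf-8")

original = b"L\xf6wis"   # o-umlaut is the single byte 0xF6 in Latin-1
converted = latin1_to_utf8(original)
print(converted)         # b'L\xc3\xb6wis'
```

The decode call is where the assumption lives: without it there is nothing to re-encode.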

Regards,
Martin


Python 3 and easygui problem

2009-02-21 Thread Peter Anderson
I have just installed Python 3. I have been using Tkinter and easygui 
(with Python 2.5.4) for any GUI needs. I have just started to port some 
of my existing scripts to Python 3 and discovered problems with easygui.


I was using the following script for testing:

from easygui import *
import sys

while 1:
    msgbox("Hello, world!")

    msg ="What is your favorite flavor?"
    title = "Ice Cream Survey"
    choices = ["Vanilla", "Chocolate", "Strawberry", "Rocky Road"]
    choice = choicebox(msg, title, choices)

    # note that we convert choice to string, in case
    # the user cancelled the choice, and we got None.
    msgbox("You chose: " + str(choice), "Survey Result")

    msg = "Do you want to continue?"
    title = "Please Confirm"
    if ccbox(msg, title): # show a Continue/Cancel dialog
        pass # user chose Continue
    else:
        sys.exit(0) # user chose Cancel

I have changed the easygui source to Python 3 'import' and 'print' 
requirements and the initial message box in the above script displays 
fine. However the subsequent message boxes do not display and after 
the script completes I get the following error message:


-- Python 3 --
Traceback (most recent call last):
  File "easyguidemo.py", line 10, in <module>
    choice = choicebox(msg, title, choices)
  File "C:\Python30\lib\site-packages\easygui.py", line 703, in choicebox
    return __choicebox(msg,title,choices,buttons)
  File "C:\Python30\lib\site-packages\easygui.py", line 824, in __choicebox
    choices.sort( lambda x,y: cmp(x.lower(), y.lower())) # case-insensitive sort
TypeError: must use keyword argument for key function

Output completed (7 sec consumed)
--

Fixing this is a bit beyond my skills and I was wondering whether anyone 
has any thoughts.


I am happy to post a copy of my revised easygui.py script.

Regards,
Peter
--
*Peter Anderson*
There is nothing more difficult to take in hand, more perilous to 
conduct, or more uncertain in its success, than to take the lead in the 
introduction of a new order of things—Niccolo Machiavelli, /The Prince/, 
ch. 6



Re: Python 3 and easygui problem

2009-02-21 Thread Chris Rebert
On Sat, Feb 21, 2009 at 8:46 PM, Peter Anderson
 wrote:
> I have just installed Python 3. I have been using Tkinter and easygui (with
> Python 2.5.4) for any GUI needs. I have just started to port some of my
> existing scripts to Python 3 and discovered problems with easygui.
>
> I was using the following script for testing:


> File "C:\Python30\lib\site-packages\easygui.py", line 824, in __choicebox
> choices.sort( lambda x,y: cmp(x.lower(), y.lower())) # case-insensitive sort
> TypeError: must use keyword argument for key function

Change the line to

choices.sort(key=lambda x: x.lower()) # case-insensitive sort

Note the "key=" and the one-argument lambda. In Python 3.0, .sort() no
longer accepts a comparison function: 'key' is a keyword-only argument
that is called with a single element, and the cmp() builtin is gone.
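
A minimal sketch of a case-insensitive sort that runs on Python 3. The list contents are made up, and compare_lower is a hypothetical stand-in for an old-style comparison function, shown only for the case where one has to be kept.

```python
from functools import cmp_to_key

choices = ["Vanilla", "chocolate", "Strawberry", "rocky road"]

# key is called once per element and returns the value to sort by.
choices.sort(key=str.lower)
print(choices)   # ['chocolate', 'rocky road', 'Strawberry', 'Vanilla']

def compare_lower(x, y):
    """Old-style comparison: negative, zero or positive."""
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)   # replaces the removed cmp() builtin

# cmp_to_key adapts a two-argument comparison to the key= interface.
same_order = sorted(choices, key=cmp_to_key(compare_lower))
```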

Cheers,
Chris

-- 
Follow the path of the Iguana...
http://rebertia.com


Re: To unicode or not to unicode

2009-02-21 Thread Joshua Judson Rosen
Ross Ridge  writes:
>
> > It's all about declaring your charset. In Python as well as in your 
> > newsreader. If you don't declare your charset it's ASCII for you - in 
> > Python as well as in your newsreader.
> 
> Except in practice unlike Python, many newsreaders don't assume ASCII.
> The original article displayed fine for me.

Right. Exactly.

Wasn't that exact issue a driving force behind unicode's creation in
the first place? :)

To avoid horrors like this:

   http://en.wikipedia.org/wiki/File:Letter_to_Russia_with_krokozyabry.jpg

... and people getting into arguments on usenet and having to use
rebuttals like "Well, it looked fine to *me*--there's nothing wrong,
we're just using incompatible encodings!"?

But you're right--specifying in usenet-posts is like
turn-signals

Can we get back to Python programming, now? :)

-- 
Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr.


Re: multiprocessing module and os.close(sys.stdin.fileno())

2009-02-21 Thread Graham Dumpleton
On Feb 22, 12:52 pm, Joshua Judson Rosen  wrote:
> Graham Dumpleton  writes:
>
> > On Feb 21, 4:20 pm, Joshua Judson Rosen  wrote:
> > > Jesse Noller  writes:
>
> > > > On Tue, Feb 17, 2009 at 10:34 PM, Graham Dumpleton
> > > >  wrote:
> > > > > Why is the multiprocessing module, ie., multiprocessing/process.py, in
> > > > > _bootstrap() doing:
>
> > > > >  os.close(sys.stdin.fileno())
>
> > > > > rather than:
>
> > > > >  sys.stdin.close()
>
> > > > > Technically it is feasible that stdin could have been replaced with
> > > > > something other than a file object, where the replacement doesn't have
> > > > > a fileno() method.
>
> > > > > In that sort of situation an AttributeError would be raised, which
> > > > > isn't going to be caught as either OSError or ValueError, which is all
> > > > > the code watches out for.
>
> > > > I don't know why it was implemented that way. File an issue on the
> > > > tracker and assign it to me (jnoller) please.
>
> > > My guess would be: because it's also possible for sys.stdin to be a
> > > file that's open in read+*write* mode, and for that file to have
> > > pending output buffered (for example, in the case of a socketfile).
>
> > If you are going to have a file that is writable as well as readable,
> > such as a socket, then likely that sys.stdout/sys.stderr are going to
> > be bound to it at the same time.
>
> Yes.
>
> > If that is the case then one should not be using close() at all
>
> If you mean stdin.close(), then that's what I said :)

Either. The problem is the same: it closes for both read and write, and
if you were expecting to still be able to write because it is used for
stdout or stderr, then that will not work.

> > as it will then also close the write side of the pipe and cause
> > errors when code subsequently attempts to write to
> > sys.stdout/sys.stderr.
>
> > In the case of socket you would actually want to use shutdown() to
> > close just the input side of the socket.
>
> Sure--but isn't this "you" the /calling/ code that set the whole thing
> up? What the /caller/ does with its stdio is up to /him/, and beyond
> the scope of the present discourse. I can appreciate a library forking
> and then using os.close() on stdio (it protects my files from any I/O
> the subprocess might think it wants to do with them), but I think I
> might be even more annoyed if it *shutdown my sockets*

Ah, yeah, I forgot that shutdown() does an end-to-end shutdown rather than
just closing that file object reference. :-)

Graham

> than if it
> caused double-flushes (there's at least a possibility that I could
> cope with the double-flushes by just ensuring that *I* flushed before
> the fork--not so with socket.shutdown()!)
>
> > What this all means is that what is the appropriate thing to do is
> > going to depend on the environment in which the code is used. Thus,
> > having the behaviour hard wired a certain way is really bad. There
> > perhaps instead should be a way of a user providing a hook function to
> > be called to perform any case specific cleanup of stdin, stdout and
> > stderr, or otherwise reassign them.
>
> Usually, I'd say that that's what the methods on the passed-in object
> are for. Though, as I said--the file-object API is lacking, here :(
>
> > > As such, I'd recommend against just using .close(); you might use
> > > something like `if hasattr(sys.stdin, "fileno"): ...'; but, if your
> > > `else' clause unconditionally calls sys.stdin.close(), then you still
> > > have double-flush problems if someone's set sys.stdin to a file-like
> > > object with output-buffering.
>
> > > I guess you could try calling that an `edge-case' and seeing if anyone
> > > screams. It'd be sort-of nice if there was finer granularity in the
> > > file API--maybe if file.close() took a boolean `flush' argument
>
> --
> Don't be afraid to ask (Lf.((Lx.xx) (Lr.f(rr.
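
The guard discussed in this thread could be sketched roughly as below. This
is only an illustration under assumptions, not the code in
multiprocessing/process.py: the helper name close_stdin_fd is made up, and
it deliberately does nothing when the object has no usable fileno(), so it
never falls back to .close() and cannot cause the double-flush described
above.

```python
import os


def close_stdin_fd(stream):
    """Close the OS-level descriptor behind `stream`, if it has one.

    Hypothetical helper for illustration. Returns True if a descriptor
    was closed, False if the stream has no usable descriptor.
    """
    try:
        fd = stream.fileno()
    except (AttributeError, OSError, ValueError):
        # No fileno() at all, or a file-like object that refuses to
        # provide one (io.StringIO raises io.UnsupportedOperation,
        # which is both an OSError and a ValueError).
        return False
    try:
        # Close only the descriptor; buffered output held by the Python
        # object is not flushed by this call.
        os.close(fd)
    except OSError:
        # The descriptor was already closed or is otherwise invalid.
        return False
    return True
```

A caller would pass sys.stdin. Whether silently skipping objects without a
descriptor is the right policy depends on the environment, which is the
point made in the thread.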



Re: datetime.time and midnight

2009-02-21 Thread Paddy3118
On Feb 21, 10:44 pm, Ethan Furman  wrote:
> Greetings, List!
>
> I was curious if anyone knew the rationale behind making midnight False?
>
> --> import datetime
> --> midnight = datetime.time(0,0,0)
> --> bool(midnight)
> False
>
> To my way of thinking, midnight does actually exist so it should be
> true.  If datetime.time was measuring an *amount* of time, rather than a
> certain point during the day, then a time of 0:0:0 should certainly be
> False as it would mean no time had passed.  However, since midnight does
> indeed exist (as many programmers have observed when deadlines approach
> ;) I would think it should be true.
> --
> ~Ethan~

Ethan,
Knights are true and seek the light. Evil trolls seek the night and so
their hour is false.

;-)

- Paddy.


Problem trying to install ReportLab with easy_install

2009-02-21 Thread Sebastian Bassi
I don't understand what is wrong when I try to install ReportLab. This
is under Ubuntu and all build packages are installed.
Here is what I get when trying to install it: (I could install it with
apt-get, but I am testing virtualenv and easy_install).

(testbio149)vi...@maricurie:~/Public/testbio149$ easy_install ReportLab
Searching for ReportLab
Reading http://pypi.python.org/simple/ReportLab/
Reading http://www.reportlab.com/
Best match: reportLab 2.3
Downloading 
http://pypi.python.org/packages/source/r/reportlab/reportLab-2.3.zip#md5=7d98b26fa287a9e4be4d35d682ce64ac
Processing reportLab-2.3.zip
Running ReportLab_2_3/setup.py -q bdist_egg --dist-dir
/tmp/easy_install-ZZcgFG/ReportLab_2_3/egg-dist-tmp-FqKULE

#Attempting install of _rl_accel, sgmlop & pyHnj
#extensions from '/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel'


#Attempting install of _renderPM
#extensions from '/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/renderPM'
# installing with freetype version 21

/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:
In function ‘hex32’:
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:793:
warning: format ‘%8.8X’ expects type ‘unsigned int’, but argument 3
has type ‘long unsigned int’
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:
In function ‘_instanceStringWidthU’:
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:1200:warning:
pointer targets in assignment differ in signedness
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:1123:warning:
‘f’ may be used uninitialized in this function
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:1123:warning:
‘t’ may be used uninitialized in this function
/tmp/easy_install-ZZcgFG/ReportLab_2_3/src/rl_addons/rl_accel/_rl_accel.c:1123:warning:
‘L’ may be used uninitialized in this function
/usr/bin/ld: cannot find -l_renderPM_libart
collect2: ld returned 1 exit status
error: Setup script exited with error: command 'gcc' failed with exit status 1
(testbio149)vi...@maricurie:~/Public/testbio149$


Re: Find the location of a loaded module

2009-02-21 Thread Gabriel Genellina

En Sat, 21 Feb 2009 14:51:40 -0200,  escribió:


"Gabriel Genellina"  wrote:

En Fri, 20 Feb 2009 20:44:21 -0200, Aaron Scott
 escribió:

> So, the problem lies with how Python cached the modules in memory.
> Yes, the modules were in two different locations and yes, the one that
> I specified using its direct path should be the one loaded. The
> problem is, the module isn't always loaded -- if it's already in
> memory, it'll use that instead. And since the modules had the same
> name, Python wouldn't distinguish between them, even though they
> weren't exactly the same.

Yes, that's how import works. It's barely documented, and you finally
learned it the hard way...


I'd argue a little bit with "barely documented".  In the reference, the
discussion of the import statement starts off saying:

Import statements are executed in two steps: (1) find a module,
and initialize it if necessary; (2) define a name or names in the
local namespace (of the scope where the import statement occurs).

The third paragraph then says:

The system maintains a table of modules that have been or are being
initialized, indexed by module name. This table is accessible as
sys.modules. When a module name is found in this table, step (1)
is finished.

That is pretty up front and unambiguous documentation.


Sure? At a minimum, it doesn't define what "module name" means (it isn't so
easy). It does not describe correctly how packages are handled, nor dotted
names that aren't packages. It does not describe sys.meta_path nor
sys.path_hooks, which can *radically* alter things. It does not describe
absolute vs. relative imports, nor the "mix" of them in 2.x. It doesn't
mention the *caller* module as relevant.



However, the consequences of that statement won't be immediately clear
on first reading.  I think it would have cleared up Aaron's confusion
if he'd happened to think to read it.  But since he knew the syntax
of the import statement already, I'm not surprised he did not read it.

The Tutorial, in the section on modules and import, says:

A module can contain executable statements as well as function
definitions. These statements are intended to initialize the
module. They are executed only the first time the module is imported
somewhere.

This is considerably less precise, if more superficially understandable.
I wonder if it would be worth expanding on that statement to mention
that the module is not even looked for on disk if a module by that
name has already been imported.


If only that were true... Importers can do almost anything they want. The  
behaviour you describe is only what happens when a) there are no meta_path  
hooks installed, b) there are no path_hooks involved, c) __import__ has  
not been replaced, and d) you are importing a module from the filesystem.


This is what I call "barely documented".
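
A small sketch of the sys.modules behaviour quoted from the reference. It
covers only the plain case listed above (no meta_path hooks, no path_hooks,
default __import__). The module name demo_cached_module and the helper
module_location are invented for the example; no such file exists on disk.

```python
import sys
import types

# Hypothetical module built in memory; there is no demo_cached_module.py.
NAME = "demo_cached_module"
fake = types.ModuleType(NAME)
fake.answer = 42
sys.modules[NAME] = fake

# The import statement finds the name in sys.modules and stops there,
# so nothing is searched for on disk and no ImportError is raised.
import demo_cached_module


def module_location(name):
    """Return the file a loaded module came from, or None if it has none."""
    module = sys.modules[name]
    return getattr(module, "__file__", None)
```

Calling module_location("demo_cached_module") gives None because the module
never came from a file, while for an ordinary module imported from the
filesystem it gives the path that was actually loaded. That path is what to
check when two modules with the same name exist in different directories.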

--
Gabriel Genellina



Re: datetime.time and midnight

2009-02-21 Thread Steven D'Aprano
Paddy3118 wrote:

> Ethan,
> Knights are true and seek the light. Evil trolls seek the night and so
> their hour is false.
> 
> ;-)

That's speciesist *and* lightist. There's nothing wrong with avoiding the evil
burning day star; that's practically de rigueur for programmers.


*wink*


-- 
Steven







Re: Python dictionary size/entry limit?

2009-02-21 Thread intelliminer
On Feb 21, 6:47 pm, Stefan Behnel  wrote:
> intellimi...@gmail.com wrote:
> > I wrote a script to process textual data and extract phrases from
> > them, storing these phrases in a dictionary. It encounters a
> > MemoryError when there are about 11.18M keys in the dictionary, and
> > the size is about 1.5GB.
> > [...]
> > I have 1GB of physical memory and 3GB in pagefile. Is there a limit to
> > the size or number of entries that a single dictionary can possess? By
> > searching on the web I can't find a clue why this problem occurs.
>
> Python dicts are only limited by what your OS returns as free memory.
> However, when a dict grows, it needs to resize, which means that it has to
> create a bigger copy of itself and redistribute the keys. For a dict that
> is already 1.5GB big, this can temporarily eat a lot more memory than you
> have, at least more than two times as much as the size of the dict itself.
>
> You may be better served with one of the dbm databases that come with
> Python. They live on-disk but do the usual in-memory caching. They'll
> likely perform a lot better than your OS level swap file.
>
> Stefan

Ummm, I didn't know about the dbm databases. It seems there are many
different modules for this kind of task: gdbm, Berkeley DB, cdb, etc. I
need to implement a constant hashtable with a large number of keys, but
only a small fraction of them will be accessed frequently, and the read
speed is crucial. It would be ideal if the implementation cached all the
frequently used key/value pairs in memory. Which module should I use? And
is there a way to specify the amount of memory it uses for caching?
BTW, the target platform is Linux.

Thank you.
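
One possible shape for such a read-mostly table, sketched with the generic
dbm module of the current standard library (its counterpart at the time of
this thread was anydbm) plus an application-level LRU cache for the hot
keys. The function names and the hot_entries parameter are assumptions made
for the example. Which backend dbm picks depends on what is installed, and
this sketch does not control the backend's own caching.

```python
import dbm
import functools


def build_table(path, items):
    """Write (key, value) string pairs to an on-disk table at `path`."""
    with dbm.open(path, "c") as db:
        for key, value in items:
            # dbm stores bytes, so encode both sides explicitly.
            db[key.encode("utf-8")] = value.encode("utf-8")


def open_lookup(path, hot_entries=1024):
    """Open the table read-only; return (lookup function, db handle).

    `hot_entries` bounds the number of results kept in memory by the
    LRU cache; it is a count of entries, not a number of bytes.
    """
    db = dbm.open(path, "r")

    @functools.lru_cache(maxsize=hot_entries)
    def lookup(key):
        raw_key = key.encode("utf-8")
        if raw_key not in db:
            return None
        return db[raw_key].decode("utf-8")

    return lookup, db
```

Because the table is constant, cached results never go stale, which is what
makes the simple LRU wrapper safe here. The caller closes the returned db
handle when done.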


Re: datetime.time and midnight

2009-02-21 Thread Gabriel Genellina
En Sat, 21 Feb 2009 21:55:23 -0200, MRAB   
escribió:



Ethan Furman wrote:

Greetings, List!
 I was curious if anyone knew the rationale behind making midnight  
False?

 --> import datetime
--> midnight = datetime.time(0,0,0)
--> bool(midnight)
False
 To my way of thinking, midnight does actually exist so it should be  
true.  If datetime.time was measuring an *amount* of time, rather than  
a certain point during the day, then a time of 0:0:0 should certainly  
be False as it would mean no time had passed.  However, since midnight  
does indeed exist (as many programmers have observed when deadlines  
approach ;) I would think it should be true.



I think it's because midnight is to the time of day what zero is to
integers, or an empty string is to strings, or an empty container ...


So chr(0) should be False too...
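
Whatever the rationale, code that receives an optional time value can
sidestep the question by testing against None explicitly instead of relying
on truthiness. A minimal sketch; the function name describe_time is made up
for the example. (In later Python versions, 3.5 and up, midnight is true,
so the explicit test behaves the same on old and new interpreters.)

```python
import datetime


def describe_time(t):
    """Format an optional datetime.time, treating only None as 'missing'."""
    # "if not t" would also reject midnight on interpreters where
    # bool(datetime.time(0, 0, 0)) is False, so compare with None.
    if t is None:
        return "no time given"
    return t.strftime("%H:%M:%S")


midnight = datetime.time(0, 0, 0)
label = describe_time(midnight)

# For comparison: zero and the empty string are false, but a
# one-character string holding chr(0) is not empty, hence true.
falsy_examples = [bool(0), bool("")]
nul_char_truth = bool(chr(0))
```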

--
Gabriel Genellina



Re: Python 3 and easygui problem

2009-02-21 Thread Gabriel Genellina
En Sun, 22 Feb 2009 03:05:51 -0200, Chris Rebert   
escribió:



On Sat, Feb 21, 2009 at 8:46 PM, Peter Anderson
 wrote:
I have just installed Python 3. I have been using Tkinter and easygui  
(with

Python 2.5.4) for any GUI needs. I have just started to port some of my
existing scripts to Python 3 and discovered problems with easygui.

I was using the following script for testing:



File "C:\Python30\lib\site-packages\easygui.py", line 824, in  
__choicebox
choices.sort( lambda x,y: cmp(x.lower(), y.lower())) # case-insensitive  
sort

TypeError: must use keyword argument for key function


Change the line to

choices.sort(key=lambda x,y: cmp(x.lower(), y.lower()))

Note the "key=". The calling sequence for .sort() changed in Python
3.0 to make 'key' a keyword-only argument (I think).


That's very short-lived; cmp is gone in 3.0.1 (should not have existed in  
3.0 in the first place).

Try with:
choices.sort(key=str.lower)
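
To illustrate the difference: a key function takes ONE argument and returns
the value to sort by, so a two-argument lambda passed as key= cannot work
either. The sample list below is made up. For an existing comparison
function that is hard to rewrite, functools.cmp_to_key (added in Python
versions later than 3.0) can wrap it; legacy_cmp here is a hypothetical
stand-in for such a function.

```python
import functools

choices = ["pear", "Apple", "banana", "apple"]

# Case-insensitive sort with a one-argument key function.
by_key = sorted(choices, key=str.lower)


def legacy_cmp(x, y):
    """Old-style comparison: negative, zero or positive, like cmp()."""
    a, b = x.lower(), y.lower()
    return (a > b) - (a < b)


# Same ordering, obtained by wrapping the two-argument comparison.
by_cmp = sorted(choices, key=functools.cmp_to_key(legacy_cmp))
```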

--
Gabriel Genellina
