Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Peter Otten
Steven D'Aprano wrote:

> In the midst of that discussion, Guido van Rossum made a comment about
> subclassing dicts:
> 
> [quote]

> Personally I wouldn't add any words suggesting or referring
> to the option of creation another class for this purpose. You
> wouldn't recommend subclassing dict for constraining the
> types of keys or values, would you?
> [end quote]

> This surprises me, and rather than bother Python-Dev (where it will
> likely be lost in the noise, and certain will be off-topic), I'm hoping
> there may be someone here who is willing to attempt to channel GvR. I
> would have thought that subclassing dict for the purpose of constraining
> the type of keys or values would be precisely an excellent use of
> subclassing.
> 
> 
> class TextOnlyDict(dict):
>     def __setitem__(self, key, value):
>         if not isinstance(key, str):
>             raise TypeError

Personally I feel dirty whenever I write Python code that defeats duck-
typing -- so I would not /recommend/ any isinstance() check.
I realize that this is not an argument...

PS: I tried to read GvR's remark in context, but failed. It's about time to 
revolt and temporarily install the FLUFL as our leader, long enough to 
revoke Guido's top-posting license, but not long enough to reintroduce the 
<> operator...

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Mark Lawrence

On 15/01/2014 01:27, Steven D'Aprano wrote:

Over on the Python-Dev mailing list, there is an ENORMOUS multi-thread
discussion involving at least two PEPs, about bytes/str compatibility.
But I don't want to talk about that. (Oh gods, I *really* don't want to
talk about that...)


+ trillions



In the midst of that discussion, Guido van Rossum made a comment about
subclassing dicts:

 [quote]
 From: Guido van Rossum 
 Date: Tue, 14 Jan 2014 12:06:32 -0800
 Subject: Re: [Python-Dev] PEP 460 reboot

 Personally I wouldn't add any words suggesting or referring
 to the option of creation another class for this purpose. You
 wouldn't recommend subclassing dict for constraining the
 types of keys or values, would you?
 [end quote]

https://mail.python.org/pipermail/python-dev/2014-January/131537.html

This surprises me, and rather than bother Python-Dev (where it will
likely be lost in the noise, and certain will be off-topic), I'm hoping
there may be someone here who is willing to attempt to channel GvR. I
would have thought that subclassing dict for the purpose of constraining
the type of keys or values would be precisely an excellent use of
subclassing.


Exactly what I was thinking.




class TextOnlyDict(dict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError
        super().__setitem__(key, value)
        # need to override more methods too


But reading Guido, I think he's saying that wouldn't be a good idea. I
don't get it -- it's not a violation of the Liskov Substitution
Principle, because it's more restrictive, not less. What am I missing?
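
A minimal sketch of why the "need to override more methods too" comment
matters, assuming CPython's built-in dict: the constructor and update() fill
the dict without going through an overridden __setitem__, so the check above
only guards plain item assignment. The sample keys and values are made up for
illustration.

```python
class TextOnlyDict(dict):
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str, got %s" % type(key).__name__)
        super().__setitem__(key, value)


d = TextOnlyDict()

# Plain item assignment goes through the override and is rejected.
try:
    d[1] = "rejected"
    rejected = False
except TypeError:
    rejected = True

# update() and the constructor do not call the overridden __setitem__,
# so non-str keys get in unless those methods are overridden as well.
d.update({2: "slips through"})
e = TextOnlyDict({3: "also slips through"})

print(rejected, 2 in d, 3 in e)
```

Subclassing collections.UserDict, or overriding __init__, update() and
setdefault() too, are the usual ways to close that gap.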




Couple of replies I noted from Ned Batchelder and Terry Reedy.  Smacked 
bottom for Peter Otten, how dare he? :)


--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence



Re: Trying to wrap my head around futures and coroutines

2014-01-15 Thread Phil Connell
On Mon, Jan 06, 2014 at 06:56:00PM -0600, Skip Montanaro wrote:
> So, I'm looking for a little guidance. It seems to me that futures,
> coroutines, and/or the new Tulip/asyncio package might be my salvation, but
> I'm having a bit of trouble seeing exactly how that would work. Let me
> outline a simple hypothetical calculation. I'm looking for ways in which
> these new facilities might improve the structure of my code.

This instinct is exactly right -- the point of coroutines and tulip futures is
to liberate you from having to daisy chain callbacks together.


> 
> Let's say I have a dead simple GUI with two buttons labeled, "Do A" and "Do
> B". Each corresponds to executing a particular activity, A or B, which take
> some non-zero amount of time to complete (as perceived by the user) or
> cancel (as perceived by the state of the running system - not safe to run A
> until B is complete/canceled, and vice versa). The user, being the fickle
> sort that he is, might change his mind while A is running, and decide to
> execute B instead. (The roles can also be reversed.) If s/he wants to run
> task A, task B must be canceled or allowed to complete before A can be
> started. Logically, the code looks something like (I fear Gmail is going to
> destroy my indentation):
> 
> def do_A():
>     when B is complete, _do_A()
>     cancel_B()
> 
> def do_B():
>     when A is complete, _do_B()
>     cancel_A()
> 
> def _do_A():
>     do the real A work here, we are guaranteed B is no longer running
> 
> def _do_B():
>     do the real B work here, we are guaranteed A is no longer running
> 
> cancel_A and cancel_B might be no-ops, in which case they need to start up
> the other calculation immediately, if one is pending.

It strikes me that what you have is two linear sequences of 'things to do':
- 'Tasks', started in reaction to some event.
- Cancellations, if a particular task happens to be running.

So, a reasonable design is to have two long-running coroutines, one that
executes your 'tasks' sequentially, and another that executes cancellations.
These are both fed 'things to do' via a couple of queues populated in event
callbacks.

Something like (apologies for typos/non-working code):


import asyncio

cancel_queue = asyncio.Queue()
run_queue = asyncio.Queue()

running_task = None
running_task_name = ""

def do_A():
    cancel_queue.put_nowait("B")
    run_queue.put_nowait(("A", _do_A()))

def do_B():
    cancel_queue.put_nowait("A")
    run_queue.put_nowait(("B", _do_B()))

def do_C():
    run_queue.put_nowait(("C", _do_C()))

@asyncio.coroutine
def canceller():
    while True:
        name = yield from cancel_queue.get()
        if running_task_name == name:
            running_task.cancel()

@asyncio.coroutine
def runner():
    # Rebind the module-level state instead of creating local names.
    global running_task, running_task_name
    while True:
        name, coro = yield from run_queue.get()
        running_task_name = name
        # Wrap the coroutine in a Task so that canceller() can cancel it.
        running_task = asyncio.Task(coro)
        try:
            yield from running_task
        except asyncio.CancelledError:
            pass  # the task was cancelled; carry on with the next one

def main():
    ...
    cancel_task = asyncio.Task(canceller())
    run_task = asyncio.Task(runner())
    ...
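
For readers on Python versions that no longer ship generator-based coroutines
(@asyncio.coroutine with yield from), here is a rough, self-contained sketch of
the same two-queue design using async def / await. The job() coroutine, the
request() helper, the delays and the log list are all made up for the demo;
they stand in for the real _do_A / _do_B work and the GUI callbacks.

```python
import asyncio


async def demo():
    cancel_queue = asyncio.Queue()
    run_queue = asyncio.Queue()
    state = {"task": None, "name": ""}
    log = []

    async def job(name, delay):
        # Hypothetical stand-in for the real _do_A / _do_B work.
        log.append(name + " started")
        await asyncio.sleep(delay)
        log.append(name + " finished")

    def request(name, delay, conflicts_with):
        # What a button callback would do: ask for the conflicting task to
        # be cancelled, then queue the new one.
        cancel_queue.put_nowait(conflicts_with)
        run_queue.put_nowait((name, job(name, delay)))

    async def canceller():
        while True:
            name = await cancel_queue.get()
            if state["name"] == name and state["task"] is not None:
                state["task"].cancel()

    async def runner():
        while True:
            name, coro = await run_queue.get()
            state["name"] = name
            state["task"] = asyncio.ensure_future(coro)
            try:
                await state["task"]
            except asyncio.CancelledError:
                # Sketch only: this would also swallow a cancellation of
                # runner() itself if it arrived while a job was running.
                log.append(name + " cancelled")

    workers = [asyncio.ensure_future(canceller()),
               asyncio.ensure_future(runner())]
    request("A", 10, "B")       # user presses "Do A"
    await asyncio.sleep(0.05)   # A is now running
    request("B", 0.01, "A")     # user changes their mind
    await asyncio.sleep(0.2)    # give B time to finish
    for worker in workers:
        worker.cancel()
    await asyncio.gather(*workers, return_exceptions=True)
    return log


log = asyncio.run(demo())
print(log)
```

The event callbacks stay tiny (two put_nowait calls), and the ordering rule
"never run A and B at the same time" lives in one place, the runner loop.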



Cheers,
Phil



Re: Learning python networking

2014-01-15 Thread Paul Pittlerson
I'm sorry if this is a bit late of a response, but here goes.

Big thanks to Chris Angelico for his comprehensive reply, and yes, I do have 
some questions!


> On Thursday, January 9, 2014 1:29:03 AM UTC+2, Chris Angelico wrote:
> Those sorts of frameworks would be helpful if you need to scale to
> infinity, but threads work fine when it's small.
That's what I thought, but I was just asking if it would be like trivially easy 
to set up this stuff in some of those frameworks. I'm sticking to threads for 
now because the learning curve of twisted seems too steep to be worth it at the 
moment.

> Absolutely! The thing to look at is MUDs and chat servers. Ultimately,
> a multiplayer game is really just a chat room with a really fancy
> front end.
If you know of any open source projects or just instructional code of this 
nature in general I'll be interested to take a look. For example, you mentioned 
you had some similar projects of your own..?


> The server shouldn't require interaction at all. It should accept any
> number of clients (rather than getting the exact number that you
> enter), and drop them off the list when they're not there. That's a
> bit of extra effort but it's hugely beneficial.
I get what you are saying, but I should mention that I'm just making a 2 player 
strategy game at this point, which makes sense of the limited number of 
connections.

> One extremely critical point about your protocol. TCP is a stream -
> you don't have message boundaries. You can't depend on one send()
> becoming one recv() at the other end. It might happen to work when you
> do one thing at a time on localhost, but it won't be reliable on the 
> internet or when there's more traffic. So you'll need to delimit
> messages; I recommend you use one of two classic ways: either prefix
> it with a length (so you know how many more bytes to receive), or
> terminate it with a newline (which depends on there not being a
> newline in the text).
I don't understand. Can you show some examples of how to do this?

> Another rather important point, in two halves. You're writing this for
> Python 2, and you're writing with no Unicode handling. I strongly
> recommend that you switch to Python 3 and support full Unicode.
Good point, however the framework I'm using for graphics does not currently 
support python3. I could make the server scripts be in python3, but I don't  
think the potential confusion is worth it until the whole thing can be in the 
same version.


> Note, by the way, that it's helpful to distinguish "data" and "text",
> even in pseudo-code. It's impossible to send text across a socket -
> you have to send bytes of data. If you keep this distinction clearly
> in your head, you'll have no problem knowing when to encode and when
> to decode. For what you're doing here, for instance, I would packetize
> the bytes and then decode into text, and on sending, I'd encode text
> (UTF-8 would be hands-down best here) and then packetize. There are
> other options but that's how I'd do it.
I'm not sure what you are talking about here. Would you care to elaborate on 
this please (it interests and confuses) ?


I'm posting this on google groups, so I hope the formatting turns out ok :P 
thanks.


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Tim Chase
On 2014-01-15 01:27, Steven D'Aprano wrote:
> class TextOnlyDict(dict):
>     def __setitem__(self, key, value):
>         if not isinstance(key, str):
>             raise TypeError
>         super().__setitem__(key, value)
>         # need to override more methods too
> 
> 
> But reading Guido, I think he's saying that wouldn't be a good
> idea. I don't get it -- it's not a violation of the Liskov
> Substitution Principle, because it's more restrictive, not less.
> What am I missing?

Just as an observation, this seems almost exactly what anydbm does,
behaving like a dict (whether it inherits from dict, or just
duck-types like a dict), but with the limitation that keys/values need
to be strings.

-tkc




Re: Learning python networking

2014-01-15 Thread Denis McMahon
On Wed, 15 Jan 2014 02:37:05 -0800, Paul Pittlerson wrote:

>> One extremely critical point about your protocol. TCP is a stream - you
>> don't have message boundaries. You can't depend on one send() becoming
>> one recv() at the other end. It might happen to work when you do one
>> thing at a time on localhost, but it won't be reliable on the internet
>> or when there's more traffic. So you'll need to delimit messages; I
>> recommend you use one of two classic ways: either prefix it with a
>> length (so you know how many more bytes to receive), or terminate it
>> with a newline (which depends on there not being a newline in the
>> text).

> I don't understand. Can you show some examples of how to do this?

How much do you understand about tcp/ip networking? When trying to build 
something on top of tcp/ip, it's a good idea to understand the basics of 
tcp/ip first.

A tcp/ip connection is just a pipe that you pour data (octets, more or 
less analogous to bytes or characters) into at one end, and it comes out 
at the other end.

For your stream of octets (bytes / characters) to have any meaning to a 
higher level program, then the applications using the pipe at both ends 
have to understand that a message has some structure.

The message structure might be to send an n character message length 
count (where in a simple protocol n would have to be a fixed number) 
followed by the specified number of characters.

Assuming your maximum message length fits in a 4 character count:

You could send the length as 4 characters, followed by that many characters 
of message content.

The receiving end would receive 4 characters, convert them to a number, 
and assume that the next that many characters will be the message. Then it 
expects another 4 character number.

You could send json encoded strings, in which case each message might 
start with the "{" character and end with the "}" character, but you 
would have to allow for the fact that "}" can also occur within a json 
encoded string.

You might decide that each message is simply going to end with a specific 
character or character sequence.

Whatever you choose, you need some way for the receiving application to 
distinguish between individual messages in the stream of octets / bytes / 
characters that is coming out of the pipe.
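
A rough sketch of the fixed-width length prefix described above, under a few
assumptions that are mine rather than part of the description: the count is 4
ASCII decimal digits, it counts bytes of the UTF-8 encoded text (not
characters), and the helper names send_msg / recv_exactly / recv_msg are made
up. socket.socketpair() just provides two connected sockets for the demo.

```python
import socket

HEADER_LEN = 4  # fixed-width decimal length prefix (an assumption of this sketch)


def send_msg(sock, text):
    payload = text.encode("utf-8")
    if len(payload) > 10 ** HEADER_LEN - 1:
        raise ValueError("message too long for a %d digit length" % HEADER_LEN)
    # Zero-padded byte count, e.g. b"0005" for a 5 byte payload.
    header = ("%0*d" % (HEADER_LEN, len(payload))).encode("ascii")
    sock.sendall(header + payload)


def recv_exactly(sock, count):
    # recv() may return fewer bytes than asked for, so loop until done.
    chunks = []
    while count > 0:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed the connection mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    length = int(recv_exactly(sock, HEADER_LEN).decode("ascii"))
    return recv_exactly(sock, length).decode("utf-8")


a, b = socket.socketpair()
send_msg(a, "hello")
send_msg(a, "Straße")   # 6 characters, but 7 bytes in UTF-8
first = recv_msg(b)
second = recv_msg(b)
a.close()
b.close()
print(first, second)
```

Both messages arrive intact even though they may have been delivered to the
receiver as one lump of bytes, because the receiver never trusts recv()
boundaries, only the count.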

-- 
Denis McMahon, denismfmcma...@gmail.com


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker

On 12/01/2014 07:50, wxjmfa...@gmail.com wrote:

sys.version

2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]

s = 'Straße'
assert len(s) == 6
assert s[5] == 'e'



jmf



On my utf8 based system



robin@everest ~:
$ cat ooo.py
if __name__=='__main__':
    import sys
    s='A̅B'
    print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
robin@everest ~:
$ python ooo.py
version_info=sys.version_info(major=3, minor=3, micro=3, releaselevel='final', 
serial=0)
len(A̅B)=3
robin@everest ~:
$



So two 'characters' are 3 (or 2 or more) codepoints. If I want to isolate 
so-called graphemes I need an algorithm even for Python's unicode, i.e. when 
it really matters, Python 3 str is just another encoding.

--
Robin Becker



Re: proposal: bring nonlocal to py2.x

2014-01-15 Thread Robin Becker

On 13/01/2014 15:28, Chris Angelico wrote:
..


It's even worse than that, because adding 'nonlocal' is not a bugfix.
So to be committed to the repo, it has to be approved for either 2.7
branch (which is in bugfix-only maintenance mode) or 2.8 branch (which
does not exist). Good luck. :)

...
Fixing badly named variables is not a bug fix either, but that has happened in 
Python 2.7. A micro release changed


compiler.consts.SC_GLOBAL_EXPLICT

to

compiler.consts.SC_GLOBAL_EXPLICIT

This is a change of API for the consts module (if you regard exported variables 
as part of its API), but that didn't count for the developers.

--
Robin Becker



Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Ned Batchelder

On 1/15/14 7:00 AM, Robin Becker wrote:

On 12/01/2014 07:50, wxjmfa...@gmail.com wrote:

sys.version

2.7.6 (default, Nov 10 2013, 19:24:18) [MSC v.1500 32 bit (Intel)]

s = 'Straße'
assert len(s) == 6
assert s[5] == 'e'



jmf



On my utf8 based system



robin@everest ~:
$ cat ooo.py
if __name__=='__main__':
    import sys
    s='A̅B'
    print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
robin@everest ~:
$ python ooo.py
version_info=sys.version_info(major=3, minor=3, micro=3,
releaselevel='final', serial=0)
len(A̅B)=3
robin@everest ~:
$



So two 'characters' are 3 (or 2 or more) codepoints. If I want to
isolate so-called graphemes I need an algorithm even for Python's
unicode, i.e. when it really matters, Python 3 str is just another encoding.


You are right that more than one codepoint makes up a grapheme, and that 
you'll need code to deal with the correspondence between them. But let's 
not muddy these already confusing waters by referring to that mapping as 
an encoding.


In Unicode terms, an encoding is a mapping between codepoints and bytes. 
 Python 3's str is a sequence of codepoints.
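
To make the three levels concrete, a small sketch. It assumes the string in
the example above is 'A', U+0305 COMBINING OVERLINE, 'B', and it uses a
deliberately crude stand-in for grapheme counting (count the codepoints that
are not combining marks); real grapheme segmentation follows the rules in
Unicode Standard Annex #29 and needs more than this.

```python
import unicodedata

s = "A\u0305B"   # 'A', COMBINING OVERLINE, 'B' (assumed to match the example)

codepoints = len(s)                    # what Python 3's len() counts
utf8_bytes = len(s.encode("utf-8"))    # what an encoding produces

# Crude approximation of "user-perceived characters": a new cluster starts
# at every codepoint whose canonical combining class is 0.
clusters = sum(1 for ch in s if unicodedata.combining(ch) == 0)

print(codepoints, utf8_bytes, clusters)
```

So the same text is 2 clusters, 3 codepoints and 4 UTF-8 bytes; only the last
step (codepoints to bytes) is an encoding in the Unicode sense.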


--
Ned Batchelder, http://nedbatchelder.com



Re: Python 3 __bytes__ method

2014-01-15 Thread Thomas Rachel

Am 12.01.2014 01:24 schrieb Ethan Furman:


I must admit I'm not entirely clear how this should be used.  Is anyone
using this now?  If so, how?


I am not, as I am currently using Py2, but if I were, I would use it 
e.g. for serialization of objects in order to send them over the line or 
to save them into a file. IOW, the same purpose as we had for __str__ in 
Py2.
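
A small sketch of that serialization use, under assumptions of mine: the Point
class, its two-int layout and the from_bytes() helper are invented for
illustration; only the __bytes__ hook itself (called by bytes(obj)) is the
language feature under discussion.

```python
import struct


class Point:
    """Hypothetical value object with a fixed binary layout."""

    _FORMAT = "<ii"   # two little-endian signed 32-bit integers

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __bytes__(self):
        # Called by bytes(point); must return a bytes object.
        return struct.pack(self._FORMAT, self.x, self.y)

    @classmethod
    def from_bytes(cls, data):
        return cls(*struct.unpack(cls._FORMAT, data))


raw = bytes(Point(1, -2))
restored = Point.from_bytes(raw)
print(raw, restored.x, restored.y)
```

The result of bytes(obj) can be written to a binary file or passed to
socket.sendall() as it is.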



Thomas



Question about object lifetime and access

2014-01-15 Thread Asaf Las
Hi community,

I am a beginner in Python and have some possibly silly questions that I could 
not find answers to. 

Below is the test application working with uwsgi to test json-rpc.

from multiprocessing import Process
from werkzeug.wrappers import Request, Response
from werkzeug.serving import run_simple

from jsonrpc import JSONRPCResponseManager, dispatcher

p = "module is loaded"   # <-- (3)
print(p)
print(id(p))

@Request.application
def application(request):
    print("server started")

    print(id(p))

    # Dispatcher is dictionary {<method_name>: callable}
    dispatcher["echo"] = lambda s: s   # <-- (1)
    dispatcher["add"] = lambda a, b: a + b   # <-- (2)

    print("request data ==> ", request.data)
    response = JSONRPCResponseManager.handle(request.data, dispatcher)
    return Response(response.json, mimetype='application/json')


As the program grows, new RPC method dispatchers will be added, so the idea is 
to reduce the initialization code at steps 1 and 2 by making them global 
objects created at module load time, like the string p at step 3.

Multithreading will be enabled in uwsgi, and 'p' will only be read.

Questions are:
- What is the lifetime of a global object (p in this example)?
- Will p always have the value it got during module loading?
- If a new thread is created, will p be accessible to it?
- If p is accessible to the new thread, will the new thread initialize p's 
value again?
- Is p guaranteed to have valid content (set to "module is loaded") whenever 
the application() function is called?
- Under what conditions is p cleaned up by the gc?

The rationale behind these questions is to avoid creating objects within 
application() whose content is the same and does not change between requests, 
and thus to reduce the script's response time. 

Thanks in advance!






ANN: Python Meeting Düsseldorf - 21.01.2014

2014-01-15 Thread eGenix Team: M.-A. Lemburg
[This announcement was originally written in German since it targets a
 local user group meeting in Düsseldorf, Germany]


ANNOUNCEMENT

 Python Meeting Düsseldorf

 http://pyddf.de/

   A meeting of Python enthusiasts and interested people
in an informal atmosphere.

  Tuesday, 21.01.2014, 18:00
  Room 1, 2nd floor (2. OG), Bürgerhaus Stadtteilzentrum Bilk
Düsseldorfer Arcaden, Bachstr. 145, 40217 Düsseldorf

This message is also available online:
http://www.egenix.com/company/news/Python-Meeting-Duesseldorf-2014-01-21


NEWS

 * Talks already registered:

   Charlie Clark
   "Properties & Descriptors"

   Marc-Andre Lemburg
   "Webseiten Screenshots mit Python automatisieren"
   (Automating website screenshots with Python)

   Charlie Clark
   "Einfache Test-Automatisierung mit tox"
   (Simple test automation with tox)

 * New videos

   Over the last few weeks we have produced a whole series of new
   videos and uploaded them to our YouTube channel:

   PyDDF YouTube channel: http://www.youtube.com/pyddf/

 * New venue:

   We meet in the Bürgerhaus in the Düsseldorfer Arcaden.
   Since some participants had trouble finding the room last time,
   here is a short description:

   The Bürgerhaus shares its entrance with the swimming pool
   and is located on the side of the Düsseldorfer Arcaden where
   the entrance to the underground car park is.

   Above the entrance there is a large "Schwimm'in Bilk"
   logo. Behind the door, turn directly left to the two lifts,
   then go up to the 2nd floor. The entrance to Room 1
   is directly on the left as you come out of the lift.

   Google Street View: http://bit.ly/11sCfiw


INTRODUCTION

The Python Meeting Düsseldorf is a regular event in Düsseldorf
aimed at Python enthusiasts from the region:

 * http://pyddf.de/

Our YouTube channel, where we publish the talks after the meetings,
gives a good overview of the talks:

 * http://www.youtube.com/pyddf/

The meeting is organized by eGenix.com GmbH, Langenfeld,
in cooperation with Clark Consulting & Research, Düsseldorf:

 * http://www.egenix.com/
 * http://www.clark-consulting.eu/


PROGRAM

The Python Meeting Düsseldorf uses a mix of open space
and lightning talks, although our "lightning" can sometimes
last 20 minutes ;-).

Lightning talks can be registered in advance or brought in
spontaneously during the meeting. A projector with XGA
resolution is available. Please bring slides as PDF on a USB
stick.

To register a lightning talk, please send an informal email to i...@pyddf.de


COST SHARING

The Python Meeting Düsseldorf is organized by Python users for Python
users. To cover at least part of the costs, we ask participants for
a contribution of EUR 10.00 incl. 19% VAT; pupils and students
pay EUR 5.00 incl. 19% VAT.

We would like to ask all participants to bring the amount in cash.


REGISTRATION

Since we only have seating for about 20 people, we ask you
to register by email. This does not commit you to anything,
but it makes planning easier for us.

To register for the meeting, please send an informal email to i...@pyddf.de


FURTHER INFORMATION

Further information can be found on the meeting's website:

http://pyddf.de/

Kind regards,
-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jan 15 2014)
>>> Python Projects, Consulting and Support ...   http://www.egenix.com/
>>> mxODBC.Zope/Plone.Database.Adapter ...   http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...http://python.egenix.com/


: Try our mxODBC.Connect Python Database Interface for free ! ::

   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
   Registered at Amtsgericht Duesseldorf: HRB 46611
   http://www.egenix.com/company/contact/


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker

On 15/01/2014 12:13, Ned Batchelder wrote:


On my utf8 based system



robin@everest ~:
$ cat ooo.py
if __name__=='__main__':
    import sys
    s='A̅B'
    print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
robin@everest ~:
$ python ooo.py
version_info=sys.version_info(major=3, minor=3, micro=3,
releaselevel='final', serial=0)
len(A̅B)=3
robin@everest ~:
$






You are right that more than one codepoint makes up a grapheme, and that you'll
need code to deal with the correspondence between them. But let's not muddy
these already confusing waters by referring to that mapping as an encoding.

In Unicode terms, an encoding is a mapping between codepoints and bytes.  Python
3's str is a sequence of codepoints.

Semantics is everything. For me graphemes are the endpoint (or should be); to 
get a proper rendering of a sequence of graphemes I can use either a sequence of 
bytes or a sequence of codepoints. They are both encodings of the graphemes; 
what Unicode calls an encoding doesn't define what encodings are in general, 
i.e. mappings from some source alphabet to a target alphabet.

--
Robin Becker



Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Wed, Jan 15, 2014 at 9:37 PM, Paul Pittlerson  wrote:
> I'm sorry if this is a bit late of a response, but here goes.
>
> Big thanks to Chris Angelico for his comprehensive reply, and yes, I do have 
> some questions!

Best way to learn! And the thread's not even a week old, this isn't
late. Sometimes there've been responses posted to something from
2002... now THAT is thread necromancy!!

>> On Thursday, January 9, 2014 1:29:03 AM UTC+2, Chris Angelico wrote:
>> Those sorts of frameworks would be helpful if you need to scale to
>> infinity, but threads work fine when it's small.
> That's what I thought, but I was just asking if it would be like trivially 
> easy to set up this stuff in some of those frameworks. I'm sticking to 
> threads for now because the learning curve of twisted seems too steep to be 
> worth it at the moment.

I really don't know, never used the frameworks. But threads are fairly
easy to get your head around. Let's stick with them.

>> Absolutely! The thing to look at is MUDs and chat servers. Ultimately,
>> a multiplayer game is really just a chat room with a really fancy
>> front end.
> If you know of any open source projects or just instructional code of this 
> nature in general I'll be interested to take a look. For example, you 
> mentioned you had some similar projects of your own..?
>

Here's something that I did up as a MUD-writing tutorial for Pike:

http://rosuav.com/piketut.zip

I may need to port that tutorial to Python at some point. In any case,
it walks you through the basics. (Up to section 4, everything's the
same, just different syntax for the different languages. Section 5 is
Pike-specific.)

>> The server shouldn't require interaction at all. It should accept any
>> number of clients (rather than getting the exact number that you
>> enter), and drop them off the list when they're not there. That's a
>> bit of extra effort but it's hugely beneficial.
> I get what you are saying, but I should mention that I'm just making a 2 
> player strategy game at this point, which makes sense of the limited number 
> of connections.

One of the fundamentals of the internet is that connections *will*
break. A friend of mine introduced me to Magic: The Gathering via a
program that couldn't handle drop-outs, and it got extremely
frustrating - we couldn't get a game going. Build your server such
that your clients can disconnect and reconnect, and you protect
yourself against half the problem; allow them to connect and kick the
other connection off, and you solve the other half. (Sometimes, the
server won't know that the client has gone, so it helps to be able to
kick like that.) It might not be an issue when you're playing around
with localhost, and you could even get away with it on a LAN, but on
the internet, it's so much more friendly to your users to let them
connect multiple times like that.

>> One extremely critical point about your protocol. TCP is a stream -
>> you don't have message boundaries. You can't depend on one send()
>> becoming one recv() at the other end. It might happen to work when you
>> do one thing at a time on localhost, but it won't be reliable on the
>> internet or when there's more traffic. So you'll need to delimit
>> messages; I recommend you use one of two classic ways: either prefix
>> it with a length (so you know how many more bytes to receive), or
>> terminate it with a newline (which depends on there not being a
>> newline in the text).
> I don't understand. Can you show some examples of how to do this?

Denis gave a decent explanation of the problem, with a few
suggestions. One of the easiest to work with (and trust me, you will
LOVE the ease of debugging this kind of system) is the line-based
connection. You just run a loop like this:

buffer = b''

def gets():
    global buffer
    while b'\n' not in buffer:
        data = sock.recv(1024)
        if not data:
            # Client is disconnected, handle it gracefully
            return None # or some other sentinel
        buffer += data
    line, buffer = buffer.split(b'\n', 1)
    return line.decode().replace('\r', '')

You could put this into a class definition that wraps up all the
details. The key here is that you read as much as you can, buffering
it, and as soon as you have a newline, you return that. This works
beautifully with the basic TELNET client, so it's easy to see what's
going on. Its only requirement is that there be no newlines *inside*
commands. The classic MUD structure guarantees that (if you want a
paragraph of text, you have some marker that says "end of paragraph" -
commonly a dot on a line of its own, which is borrowed from SMTP), and
if you use json.dumps() then it'll use two characters "\\" and "n" to
represent a newline, so that's safe too.
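
As a rough illustration of wrapping this up in a class and of the json.dumps()
point, here is a self-contained sketch. The LineReader class and the puts()
helper are names I made up, and socket.socketpair() is only there so the demo
can run without a real server.

```python
import json
import socket


class LineReader:
    """Buffers bytes from a socket and hands back one decoded line at a time."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""

    def gets(self):
        while b"\n" not in self.buffer:
            data = self.sock.recv(1024)
            if not data:
                return None   # peer disconnected
            self.buffer += data
        line, self.buffer = self.buffer.split(b"\n", 1)
        return line.decode("utf-8").replace("\r", "")


def puts(sock, obj):
    # json.dumps() escapes newlines inside strings as backslash + "n", so the
    # only raw newline on the wire is the terminator added here.
    sock.sendall(json.dumps(obj).encode("utf-8") + b"\n")


a, b = socket.socketpair()
puts(a, {"say": "two\nlines"})
puts(a, {"move": [3, 4]})
a.close()

reader = LineReader(b)
first = json.loads(reader.gets())
second = json.loads(reader.gets())
end = reader.gets()   # None once the sender has gone away
b.close()
print(first, second, end)
```

One reader object per connection keeps each client's partial line separate,
which a single module-level buffer cannot do once there are two clients.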

The next easiest structure to work with is length-preceded, which
Denis explained. Again, you read until you have a full packet, but
instead of "while '\n' not in buffer", it would be "while
len(buffer)> Another rather important point, in two halves. You're writing 

Re: Question about object lifetime and access

2014-01-15 Thread Ned Batchelder

On 1/15/14 7:13 AM, Asaf Las wrote:

Hi community,

I am a beginner in Python and have some possibly silly questions that I could 
not find answers to.

Below is the test application working with uwsgi to test json-rpc.

from multiprocessing import Process
from werkzeug.wrappers import Request, Response
from werkzeug.serving import run_simple

from jsonrpc import JSONRPCResponseManager, dispatcher

p = "module is loaded"   # <-- (3)
print(p)
print(id(p))

@Request.application
def application(request):
    print("server started")

    print(id(p))

    # Dispatcher is dictionary {<method_name>: callable}
    dispatcher["echo"] = lambda s: s   # <-- (1)
    dispatcher["add"] = lambda a, b: a + b   # <-- (2)

    print("request data ==> ", request.data)
    response = JSONRPCResponseManager.handle(request.data, dispatcher)
    return Response(response.json, mimetype='application/json')


As the program grows, new RPC method dispatchers will be added, so the idea is 
to reduce the initialization code at steps 1 and 2 by making them global 
objects created at module load time, like the string p at step 3.

Multithreading will be enabled in uwsgi, and 'p' will only be read.


The important concepts to understand are names and values. All values in 
Python work the same way: they live until no name refers to them.  Also, 
any name can be assigned to (rebound) after it has been defined.


This covers the details in more depth: 
http://nedbatchelder.com/text/names.html
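
A small sketch of "values live until nothing refers to them". The Payload
class is a made-up stand-in (a plain str like p above cannot be weakly
referenced), and the weak reference is only there to observe the value
without keeping it alive. In CPython the value is reclaimed as soon as its
last reference goes; the gc.collect() call is there for other implementations.

```python
import gc
import weakref


class Payload:
    """Hypothetical stand-in for any value."""


p = Payload()
watcher = weakref.ref(p)   # observes the value without keeping it alive
q = p                      # a second name for the same value

p = None                   # rebinding p: the value is still reachable via q
alive_after_rebinding_p = watcher() is not None

q = None                   # the last name referring to the value is gone
gc.collect()
reclaimed = watcher() is None

print(alive_after_rebinding_p, reclaimed)
```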




Questions are:
- What is the lifetime of a global object (p in this example)?


The name p is visible in this module for as long as the program is 
running.  The object you've assigned to p can be shorter-lived if p is 
reassigned.



- Will p always have the value it got during module loading?


Depends if you reassign it.


- If a new thread is created, will p be accessible to it?


If the thread is running code defined in this module, yes, that code 
will be able to access p in that thread.



- If p is accessible to the new thread, will the new thread initialize p's value again?


No, the module is only imported once, so the statements at the top level 
of the module are only executed once.



- Is p guaranteed to have valid content (set to "module is loaded") whenever 
the application() function is called?


Yes, unless you reassign p.


- Under what conditions is p cleaned up by the gc?


Names are not reclaimed by the garbage collector, values are.  The value 
assigned to p can be reclaimed if you reassign the name p, and nothing 
else is referring to the value.
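
Putting the answers above together, a minimal sketch of building the table
once at module level and only reading it from request threads. The DISPATCH
dict, the handler() function and the thread count are invented for the demo;
plain threads stand in for whatever uwsgi does, which is an assumption.

```python
import threading

# Built once, when the module is first imported.
DISPATCH = {
    "echo": lambda s: s,
    "add": lambda a, b: a + b,
}

seen_ids = []


def handler():
    # Every thread sees the same module-level object; nothing is re-created.
    seen_ids.append(id(DISPATCH))


threads = [threading.Thread(target=handler) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

same_object = all(i == id(DISPATCH) for i in seen_ids)
result = DISPATCH["add"](2, 3)
print(same_object, result)
```

Read-only sharing like this needs no locking; it is rebinding or mutating the
shared object from several threads that would need care.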




The rationale behind these questions is to avoid creating objects within 
application() whose content is the same and does not change between requests, 
and thus to reduce the script's response time.

Thanks in advance!


Welcome.


--
Ned Batchelder, http://nedbatchelder.com



Re: Trying to wrap my head around futures and coroutines

2014-01-15 Thread Oscar Benjamin
On Mon, Jan 06, 2014 at 09:15:56PM -0600, Skip Montanaro wrote:
> From the couple responses I've seen, I must have not made myself
> clear. Let's skip specific hypothetical tasks. Using coroutines,
> futures, or other programming paradigms that have been introduced in
> recent versions of Python 3.x, can traditionally event-driven code be
> written in a more linear manner so that the overall algorithms
> implemented in the code are easier to follow? My code is not
> multi-threaded, so using threads and locking is not really part of the
> picture. In fact, I'm thinking about this now precisely because the
> first sentence of the asyncio documentation mentions single-threaded
> concurrent code: "This module provides infrastructure for writing
> single-threaded concurrent code using coroutines, multiplexing I/O
> access over sockets and other resources, running network clients and
> servers, and other related primitives."
> 
> I'm trying to understand if it's possible to use coroutines or objects
> like asyncio.Future to write more readable code, that today would be
> implemented using callbacks, GTK signals, etc.

Hi Skip,

I don't yet understand how asyncio works in complete examples (I'm not sure
that many people do yet) but I have a loose idea of it so take the following
with a pinch of salt and expect someone else to correct me later. :)

With asyncio the idea is that you can run IO operations concurrently in
a single thread. Execution can switch between different tasks while each task
can be written as a linear-looking generator function without the need for
callbacks and locks. Execution switching is based on which task has available
IO data. So the core switcher keeps track of a list of objects (open files,
sockets etc.) and executes the task when something is available.

From the perspective of the task generator code what happens is that you yield
to allow other code to execute while you wait for some IO e.g.:

    @asyncio.coroutine
    def task_A():
        a1 = yield from futureA1()
        a2 = yield from coroutineA2(a1) # Indirectly yields from futures
        a3 = yield from futureA3(a2)
        return a3

At each yield statement you are specifying some operation that takes time.
During that time other coroutine code is allowed to execute in this thread.

If task_B has a reference to the future that task_A is waiting on then it can
be cancelled with the Future.cancel() method. I think that you can also cancel
with a reference to the task. So I think you can do something like

    @asyncio.coroutine
    def task_A():
        # Cancel the other task and wait for it to finish
        if ref_taskB is not None:
            ref_taskB.cancel()
            yield from asyncio.wait([ref_taskB])
        try:
            # Linear looking code with no callbacks
            a1 = yield from futureA1()
            a2 = yield from coroutineA2(a1) # Indirectly yields from futures
            a3 = yield from futureA3(a2)
        except asyncio.CancelledError:
            stop_A()
            raise # Or return something...
        return a3

Then task_B() would have the reciprocal structure. The general form with more
than just A or B would be to have a reference to the current task then you
could factor out the cancellation code with context managers, decorators or
something else.
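
For what it's worth, here is a minimal runnable sketch of the same shape. It assumes a newer Python (3.7 or later), where async def / await replaced the @asyncio.coroutine / yield from spelling used above. The fetch() helper and its delays are invented stand-ins for real IO, not part of any library:

```python
import asyncio


async def fetch(label, delay):
    # Stand-in for an IO operation; awaiting sleep hands control to the loop.
    await asyncio.sleep(delay)
    return label


async def task_a(log):
    try:
        # Linear-looking code with no callbacks
        a1 = await fetch("a1", 0.01)
        a2 = await fetch(a1 + "->a2", 0.01)
        a3 = await fetch(a2 + "->a3", 10)   # slow step, cancelled below
    except asyncio.CancelledError:
        log.append("A cancelled")           # cleanup would go here
        raise
    return a3


async def main():
    log = []
    task = asyncio.ensure_future(task_a(log))
    await asyncio.sleep(0.1)                # let task_a reach its slow step
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        log.append("main saw cancellation")
    return log


result = asyncio.run(main())
print(result)
```

The cancellation arrives inside task_a at whichever await it is currently suspended on, which is why the try/except wraps the whole linear sequence.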


Oscar
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about object lifetime and access

2014-01-15 Thread Chris Angelico
On Wed, Jan 15, 2014 at 11:13 PM, Asaf Las  wrote:
> Questions are:
> - what is the lifetime for global object (p in this example).
> - will the p always have value it got during module loading
> - if new thread will be created will p be accessible to it
> - if p is accessible to new thread will new thread initialize p value again?
> - is it guaranteed to have valid p content (set to "module is loaded") 
> whenever application() function is called.
> - under what condition p is cleaned by gc.

Your global p is actually exactly the same as the things you imported.
In both cases, you have a module-level name bound to some object. So
long as that name references that object, the object won't be garbage
collected, and from anywhere in the module, you can reference that
name and you'll get that object. (Unless you have a local that shadows
it. I'll assume you're not doing that.)

How do you go about creating threads? Is it after initializing the
module? If so, they'll share the same p and the same object that it's
pointing to - nothing will be reinitialized.

As long as you don't change what's in p, it'll have the same value
([1] - handwave) whenever application() is called. That's a guarantee.
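
As a rough illustration of the sharing, here is a self-contained sketch. GREETING stands in for your p, and the helper names are hypothetical:

```python
import threading

GREETING = "module is loaded"     # bound once, when the module is imported


def application():
    # Reads the module-level name; nothing is re-created per call.
    return id(GREETING), GREETING


def call_in_new_thread():
    results = []
    worker = threading.Thread(target=lambda: results.append(application()))
    worker.start()
    worker.join()
    return results[0]


main_view = application()
thread_view = call_in_new_thread()
print(main_view == thread_view)
```

Both calls see the same object identity, because the thread runs in the same process and looks the name up in the same module namespace.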

For your lambda functions, you could simply make them module-level
functions. You could then give them useful names, too. But decide
based on code readability rather than questions of performance. At
this stage, you have no idea what's going to be fast or slow - wait
till you have a program that's not fast enough, and then *profile it*
to find the slow bits. Unless you're doing that, you're completely
wasting your time trying to make something faster. Start with
readable, idiomatic code, code that you could come back to in six
months and be confident of understanding. Do whatever it takes to
ensure that, and let performance take care of itself. Nine times out
of ten, you won't even have a problem. In the past twelve months, I
can think of exactly *one* time when I needed to improve an app's
performance after I'd coded it the readable way, and there was just
one part of the code that needed to be tweaked. (And it was more of an
algorithmic change than anything else, so it didn't much hurt
readability.) Remember the two rules of code optimization:

1. Don't.
2. (For experts only) Don't yet.

Follow those and you'll save more time than you would gain by
micro-optimizing. And your time is worth more than the computer's.
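
A sketch of what I mean by module-level functions, plus the "measure, don't guess" part. A plain dict stands in for the jsonrpc dispatcher here, and handle() is a made-up helper, so treat the names as assumptions rather than that library's API:

```python
import cProfile
import io
import pstats


def echo(s):
    return s


def add(a, b):
    return a + b


# Built once at import time instead of on every request.
DISPATCHER = {"echo": echo, "add": add}


def handle(method, *params):
    return DISPATCHER[method](*params)


# Profile a batch of calls and keep the report in a string.
profiler = cProfile.Profile()
total = profiler.runcall(lambda: sum(handle("add", i, 1) for i in range(1000)))
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Named functions also show up by name in the profiler output, which anonymous lambdas do not.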

ChrisA

[1] Technically p doesn't "have a value" at all. It's a name that's
bound to some object. You can rebind it to another object, you can
mutate the object it's bound to (except that you've bound it to a
string, which is immutable), or you can sever the connection (with
'del p'), but in simple terms, it's generally "near enough" to say
that p has a value.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: proposal: bring nonlocal to py2.x

2014-01-15 Thread Chris Angelico
On Wed, Jan 15, 2014 at 11:07 PM, Robin Becker  wrote:
> On 13/01/2014 15:28, Chris Angelico wrote:
> ..
>
>>
>> It's even worse than that, because adding 'nonlocal' is not a bugfix.
>> So to be committed to the repo, it has to be approved for either 2.7
>> branch (which is in bugfix-only maintenance mode) or 2.8 branch (which
>> does not exist). Good luck. :)
>
> ...
> fixing badly named variables is not a bug fix either, but that has happened
> in python 2.7. A micro change release changed
>
> compiler.consts.SC_GLOBAL_EXPLICT
>
> to
>
> compiler.consts.SC_GLOBAL_EXPLICIT
>
> this is a change of api for the consts module (if you regard exported
> variables as part of its api), but that didn't count for the developers.

Hmm. I'd say that one's arguable; that's clearly a misspelled name. It
comes down to the release manager's decision on points like that. I
would say that adding a new keyword and a whole pile of new semantics
is a bit bigger than renaming one constant :) But yes, this could
break code in a point release, and that's a potential issue.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about object lifetime and access

2014-01-15 Thread Asaf Las
Thanks a lot for the detailed answer! 

I plan to assign the object to a name only when the module loads, that is, 
outside of any function or class method. The object will then be accessed from 
functions for reading only. 

I have read somewhere that global objects referenced from the module namespace 
will never have their reference count drop to 0, even if they are not referenced 
from functions or class methods. Is this true? Does it mean that global objects 
are destroyed when the interpreter exits or the thread where it runs is terminated?


On Wednesday, January 15, 2014 2:13:56 PM UTC+2, Asaf Las wrote:
> Hi community 
>
> i am beginner in Python and have possibly silly questions i could not figure 
> out answers for. 
>
> Below is the test application working with uwsgi to test json-rpc.
>
> from multiprocessing import Process
> from werkzeug.wrappers import Request, Response
> from werkzeug.serving import run_simple
>
> from jsonrpc import JSONRPCResponseManager, dispatcher
>
> p = "module is loaded"   <-- (3)
> print(p)
> print(id(p))
>
> @Request.application
> def application(request):
>     print("server started")
>
>     print(id(p))
>
>     # Dispatcher is dictionary {: callable}
>     dispatcher["echo"] = lambda s: s   < (1)
>     dispatcher["add"] = lambda a, b: a + b < (2)
>
>     print("request data ==> ", request.data)
>     response = JSONRPCResponseManager.handle(request.data, dispatcher)
>     return Response(response.json, mimetype='application/json')
>
> As program will grow new rpc method dispatchers will be added so there is 
> idea to reduce initialization code at steps 1 and 2 by making them global 
> objects created at module loading, like string p at step 3.
>
> Multithreading will be enabled in uwsgi and 'p' will be used for read only.
>
> Questions are:
> - what is the lifetime for global object (p in this example). 
> - will the p always have value it got during module loading
> - if new thread will be created will p be accessible to it
> - if p is accessible to new thread will new thread initialize p value again?
> - is it guaranteed to have valid p content (set to "module is loaded") 
> whenever application() function is called.
> - under what condition p is cleaned by gc.
>
> The rationale behind these question is to avoid object creation within 
> application() whose content is same and do not change between requests 
> calling application() function and thus to reduce script response time. 
>
> Thanks in advance!

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about object lifetime and access

2014-01-15 Thread Asaf Las
Thanks!

On Wednesday, January 15, 2014 3:05:43 PM UTC+2, Chris Angelico wrote:
> > Questions are:
> > - what is the lifetime for global object (p in this example).
> > - will the p always have value it got during module loading
> > - if new thread will be created will p be accessible to it
> > - if p is accessible to new thread will new thread initialize p value again?
> > - is it guaranteed to have valid p content (set to "module is loaded") 
> > whenever application() function is called.
> > - under what condition p is cleaned by gc.
>
> Your global p is actually exactly the same as the things you imported.
> In both cases, you have a module-level name bound to some object. So
> long as that name references that object, the object won't be garbage
> collected, and from anywhere in the module, you can reference that
> name and you'll get that object. (Unless you have a local that shadows
> it. I'll assume you're not doing that.)
>
> How do you go about creating threads? Is it after initializing the
> module? If so, they'll share the same p and the same object that it's
> pointing to - nothing will be reinitialized.
>
> As long as you don't change what's in p, it'll have the same value
> ([1] - handwave) whenever application() is called. That's a guarantee.
>
> For your lambda functions, you could simply make them module-level
> functions. You could then give them useful names, too. But decide
> based on code readability rather than questions of performance. At
> this stage, you have no idea what's going to be fast or slow - wait
> till you have a program that's not fast enough, and then *profile it*
> to find the slow bits. Unless you're doing that, you're completely
> wasting your time trying to make something faster. Start with
> readable, idiomatic code, code that you could come back to in six
> months and be confident of understanding. Do whatever it takes to
> ensure that, and let performance take care of itself. Nine times out
> of ten, you won't even have a problem. In the past twelve months, I
> can think of exactly *one* time when I needed to improve an app's
> performance after I'd coded it the readable way, and there was just
> one part of the code that needed to be tweaked. (And it was more of an
> algorithmic change than anything else, so it didn't much hurt
> readability.) Remember the two rules of code optimization:
>
> 1. Don't.
> 2. (For experts only) Don't yet.
>
> Follow those and you'll save more time than you would gain by
> micro-optimizing. And your time is worth more than the computer's.
>
> ChrisA
>
> [1] Technically p doesn't "have a value" at all. It's a name that's
> bound to some object. You can rebind it to another object, you can
> mutate the object it's bound to (except that you've bound it to a
> string, which is immutable), or you can sever the connection (with
> 'del p'), but in simple terms, it's generally "near enough" to say
> that p has a value.

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Frank Millman

"Chris Angelico"  wrote in message 
news:CAPTjJmpb6yr-VpWypbJQn0a=pnjvnv2cchvbzak+v_5josq...@mail.gmail.com...
> You just run a loop like this:
>
> buffer = b''
>
> def gets():
>     while '\n' not in buffer:
>         data = sock.recv(1024)
>         if not data:
>             # Client is disconnected, handle it gracefully
>             return None # or some other sentinel
>     line, buffer = buffer.split(b'\n',1)
>     return line.decode().replace('\r', '')
>

I think you may have omitted a line there -

def gets():
    while '\n' not in buffer:
        data = sock.recv(1024)
        if not data:
            # Client is disconnected, handle it gracefully
            return None # or some other sentinel
        #-->
        buffer = buffer + data
        #-->
    line, buffer = buffer.split(b'\n',1)
    return line.decode().replace('\r', '')

Also, as I am looking at it, I notice that the second line should say -

while b'\n' not in buffer:

I feel a bit guilty nitpicking, as you have provided a wonderfully 
comprehensive answer, but I wanted to make sure the OP did not get confused.
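
For the OP's benefit, here is one way both corrections might look in a form that can actually be run. This is only a sketch: the LineReader class is my own invention (it keeps the buffer on an instance so no global statement is needed), and socket.socketpair() is used purely to have something to read from without a real server:

```python
import socket


class LineReader:
    """Hypothetical wrapper that reads newline-terminated lines from a socket."""

    def __init__(self, sock):
        self.sock = sock
        self.buffer = b""

    def gets(self):
        while b"\n" not in self.buffer:
            data = self.sock.recv(1024)
            if not data:
                # Peer has disconnected, handle it gracefully
                return None  # or some other sentinel
            self.buffer += data
        line, self.buffer = self.buffer.split(b"\n", 1)
        return line.decode().replace("\r", "")


# Demonstration over a connected pair of local sockets.
writer, reader_sock = socket.socketpair()
writer.sendall(b"hello\r\nwor")   # second line arrives in two pieces
writer.sendall(b"ld\n")
writer.close()

reader = LineReader(reader_sock)
lines = [reader.gets(), reader.gets(), reader.gets()]
reader_sock.close()
print(lines)
```

The third call returns None because the writer has closed and nothing is left in the buffer.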

Frank Millman



-- 
https://mail.python.org/mailman/listinfo/python-list


ANN: Wing IDE 5.0.2 released

2014-01-15 Thread Wingware

Hi,

Wingware has released version 5.0.2 of Wing IDE, our integrated development
environment designed specifically for the Python programming language.

Wing IDE includes a professional quality code editor with vi, emacs, and other
key bindings, auto-completion, call tips, refactoring, context-aware auto-editing,
a powerful graphical debugger, version control, unit testing, search, and many
other features.  For details see http://wingware.com/

Changes in this minor release include:

* Support for matplotlib with Anaconda
* Support for Django 1.6
* Preference to auto-add EOL at end of files
* Preference to disable mouse wheel font zoom
* Fix code analysis in files containing \r EOLs
* Fix typing in middle of toolbar search
* Improve look of tabbed areas in dark color palettes
* Fix problems with backspace at start of line
* Fix VI mode : commands
* Fix dragging tools back into main window
* 30 other bug fixes

For details see http://wingware.com/pub/wingide/5.0.2/CHANGELOG.txt

New features in Wing 5 include:

* Now runs native on OS X
* Draggable tools and editors
* Configurable toolbar and editor & project context menus
* Optionally opens different sets of files in each editor split
* Lockable editor splits
* Optional Python Turbo completion (context-appropriate completion on all non-symbol keys)
* Sharable color palettes and syntax highlighting configurations
* Auto-editing is on by default (except some operations that have a learning curve)
* Named file sets
* Sharable launch configurations
* Asynchronous I/O in Debug Probe and Python Shell
* Expanded and rewritten tutorial
* Support for Python 3.4 and Django 1.6

For more information on what's new in Wing 5, see 
http://wingware.com/wingide/whatsnew


Free trial: http://wingware.com/wingide/trial
Downloads: http://wingware.com/downloads
Feature matrix: http://wingware.com/wingide/features
Sales: http://wingware.com/store/purchase
Upgrades: https://wingware.com/store/upgrade

Questions?  Don't hesitate to email us at supp...@wingware.com.

Thanks,

--

Stephan Deibel
Wingware | Python IDE
Advancing Software Development

www.wingware.com
--
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 12:31 AM, Frank Millman  wrote:
> I think you may have omitted a line there -
>
> def gets():
>     while '\n' not in buffer:
>         data = sock.recv(1024)
>         if not data:
>             # Client is disconnected, handle it gracefully
>             return None # or some other sentinel
>         #-->
>         buffer = buffer + data
>         #-->
>     line, buffer = buffer.split(b'\n',1)
>     return line.decode().replace('\r', '')

Yes, indeed I did, thanks. Apart from using augmented assignment,
that's exactly what I would have put there, if I'd actually taken a
moment to test the code.

> Also, as I am looking at it, I notice that the second line should say -
>
> while b'\n' not in buffer:

Right again. Fortunately, Py3 would catch that one with a TypeError.
See? This is why you should use Py3. :)
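
A tiny sketch of that check, assuming Python 3 (the variable names are just for illustration):

```python
buffer = b"partial line"

try:
    "\n" in buffer            # str needle, bytes haystack
except TypeError as exc:
    caught = type(exc).__name__
else:
    caught = None

found = b"\n" in buffer       # bytes needle works as intended
print(caught, found)
```

Under Python 2 the str version would have run silently, since str and bytes were the same type there.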

> I feel a bit guilty nitpicking, as you have provided a wonderfully
> comprehensive answer, but I wanted to make sure the OP did not get confused.

No no, nitpicking is exactly what ensures that the end result is
correct. If I got offended at you correcting my code, it would imply
that I think myself perfect (or at least, that I consider you to be
utterly incapable of noticing my errors), which is provably false :)
One of the mind-set changes that I had to introduce at work was that
people don't own code, the repository does - if you see an improvement
to something I wrote, or I see an improvement to something you wrote,
they're improvements to be committed, not turf wars to be battled
over.

Especially on something like this, please *do* catch other people's mistakes :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread wxjmfauth
Le mercredi 15 janvier 2014 13:13:36 UTC+1, Ned Batchelder a écrit :

> 
> ... more than one codepoint makes up a grapheme ...

No

> In Unicode terms, an encoding is a mapping between codepoints and bytes. 

No

jmf

-- 
https://mail.python.org/mailman/listinfo/python-list


Re:Question about object lifetime and access

2014-01-15 Thread Dave Angel
Asaf Las wrote in message:
> Hi community 
> 
Welcome.

> 
> Multithreading will be enabled in uwsgi and 'p' will be used for read only.
> 
> Questions are:
> - what is the lifetime for global object (p in this example). 

The name will be visible in this module until the application
shuts down or till you use del.

> - will the p always have value it got during module loading

The (str) object that you bind to it will survive till you del or
reassign p.  Reassignment can happen with a simple assignment
statement or via an 'as' clause. The value of such a str object
will never change because it's an immutable type.

Convention is to use all-uppercase names for module-level constants
like this one.  And long, descriptive names are preferred over
one-letter names, especially for long-lived ones. Don't worry, long
names do not take longer.

> - if new thread will be created will p be accessible to it

It is accessible to all threads in the same process.

It is also available to other modules via the import mechanism. 
But watch out for circular imports, which frequently cause bugs.

> - if p is accessible to new thread will new thread initialize p value again?

Module level code runs only once per process. 

> - is it guaranteed to have valid p content (set to "module is loaded") 
> whenever application() function is called.

Except with circular imports.

> - under what condition p is cleaned by gc.

Names are never garbage collected.  See above for objects. 
> 
> The rationale behind these question is to avoid object creation within 
> application() whose content is same and do not change between requests 
> calling application() function and thus to reduce script response time. 
> 

Highly unlikely to matter, and it might slow down a program
slightly rather than speed it up slightly. Get your program
readable so you have a chance of catching bugs, pick your
algorithms reasonably, and if it's not fast enough, measure,
don't guess.


> Thanks in advance!
> 
> 
> 
> 
> 


-- 
DaveA



-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 1:55 AM,   wrote:
> Le mercredi 15 janvier 2014 13:13:36 UTC+1, Ned Batchelder a écrit :
>
>>
>> ... more than one codepoint makes up a grapheme ...
>
> No

Yes.
http://www.unicode.org/faq/char_combmark.html

>> In Unicode terms, an encoding is a mapping between codepoints and bytes.
>
> No

Yes.
http://www.unicode.org/reports/tr17/
Specifically:
"Character Encoding Form: a mapping from a set of nonnegative integers
that are elements of a CCS to a set of sequences of particular code
units of some specified width, such as 32-bit integers"

Or are you saying that www.unicode.org is wrong about the definitions
of Unicode terms?
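
To make the two layers concrete, here is a short sketch using Robin's 'A̅B' string. The approx_grapheme_count() helper is my own rough approximation (it just skips combining marks) and is not the full segmentation algorithm from the Unicode standard:

```python
import unicodedata


def approx_grapheme_count(text):
    # Count only code points that are not combining marks. This is a rough
    # approximation, good enough for simple base-plus-accent sequences.
    return sum(1 for ch in text if unicodedata.combining(ch) == 0)


s = "A\u0305B"                      # A, COMBINING OVERLINE, B

code_points = len(s)                # what Python 3's str counts
graphemes = approx_grapheme_count(s)
utf8_bytes = len(s.encode("utf-8")) # what the encoding produces

print(code_points, graphemes, utf8_bytes)
```

Three different counts for one string: graphemes to code points is one mapping, code points to bytes is the encoding.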

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: setup.py issue - some files are included as intended, but one is not

2014-01-15 Thread Piet van Oostrum
Dan Stromberg  writes:

> On Sat, Jan 11, 2014 at 2:04 PM, Dan Stromberg  wrote:
>> Hi folks.
>>
>> I have a setup.py problem that's driving me nuts.
>
> Anyone?  I've received 0 responses.

I can't even install your code because there's a bug in it.

m4_treap.m4 contains this instruction twice:

ifdef(/*pyx*/,cp)if current is None:
ifdef(/*pyx*/,cp)raise KeyError

Which when generating pyx_treap.pyx (with *pyx* defined) expands to the 
syntactically incorrect

cpif current is None:
cpraise KeyError

-- 
Piet van Oostrum 
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Wed, Jan 15, 2014 at 11:52 PM, Chris Angelico  wrote:
> One of the fundamentals of the internet is that connections *will*
> break. A friend of mine introduced me to Magic: The Gathering via a
> program that couldn't handle drop-outs, and it got extremely
> frustrating - we couldn't get a game going. Build your server such
> that your clients can disconnect and reconnect, and you protect
> yourself against half the problem; allow them to connect and kick the
> other connection off, and you solve the other half.

Case in point, and a very annoying one: Phone queues do NOT handle
drop-outs. There's no way to reconnect to the queue and resume your
place, you have to start over from the back of the queue. I'm
currently on hold to my ISP because of an outage, and the cordless
phone ran out of battery 27 minutes into an estimated 30-minute wait
time. (Though I suspect it'd be a lot longer than 30 minutes. Those
wait times are notoriously inaccurate.) So now I'm waiting, AGAIN, and
those previous 27 minutes of sitting around with their on-hold music
playing through speakerphone were of no value whatsoever. I can't
transfer to a different handset or connection, I have to just hope
that this one will get through.

With TCP-based servers, it's easy to do better than that - all you
have to do is separate the connection state from the actual socket,
and hang onto a "connection" for some period of time after its socket
disconnects (say, 10-15 minutes). Your users will thank you!
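
Here's a rough sketch of that separation, with no real sockets involved. SessionRegistry and its method names are invented for illustration, and the clock is injectable only so the grace period can be exercised without waiting:

```python
import time


class SessionRegistry:
    """Hypothetical store that keeps per-user state alive across reconnects."""

    def __init__(self, grace_seconds=600, clock=time.monotonic):
        self.grace = grace_seconds
        self.clock = clock
        self.sessions = {}   # user -> {"state": dict, "sock": obj, "lost_at": float}

    def attach(self, user, sock):
        """Bind sock to the user's session; return (session, replaced_sock)."""
        session = self.sessions.get(user)
        if session is not None and session["sock"] is None:
            if self.clock() - session["lost_at"] > self.grace:
                session = None                      # grace period expired
        if session is None:
            session = {"state": {}, "sock": None, "lost_at": None}
            self.sessions[user] = session
        replaced = session["sock"]                  # caller should close this one
        session["sock"] = sock
        session["lost_at"] = None
        return session, replaced

    def detach(self, user):
        """Record that the user's socket dropped, but keep the state around."""
        session = self.sessions[user]
        session["sock"] = None
        session["lost_at"] = self.clock()
```

A new connection from the same user either resumes the old state or, if the old socket still looks alive, hands it back so the server can kick it off.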

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-15 Thread Travis Griggs
Here we go again…

On Jan 14, 2014, at 11:33 AM, Staszek  wrote:

> Hi
> 
> What's the problem with Python 3.x? It was first released in 2008, but
> web hosting companies still seem to offer Python 2.x rather.
> 
> For example, Google App Engine only offers Python 2.7.
> 
> What's wrong?...

Maybe what it means is that Python3 is just fine, but Google App Engine isn’t 
seeing a lot of development/improvement lately, that it’s just in maintenance 
mode. Imagine that, Google not finishing/maintaining something.

I wish amongst the periodic maelstroms of Python2 vs Python3 handwringing, 
people would look at the new project starts. When I work with someone’s old 
library that they’ve moved on from, I use python2 if I have to, but anytime I 
can, I use python3.

Personally, I wish they’d start python4, sure would take the heat out of the 3 
vs 2 debates. And maybe there’d be a program called twentyfour as a result.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 2:43 AM, Travis Griggs  wrote:
> Personally, I wish they’d start python4, sure would take the heat out of the 
> 3 vs 2 debates. And maybe there’d be a program called twentyfour as a result.

Learn All Current Versions of Python in Twenty-Four Hours?

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Travis Griggs

On Jan 15, 2014, at 4:50 AM, Robin Becker  wrote:

> On 15/01/2014 12:13, Ned Batchelder wrote:
> 
>>> On my utf8 based system
>>> 
>>> 
>>>> robin@everest ~:
>>>> $ cat ooo.py
>>>> if __name__=='__main__':
>>>>     import sys
>>>>     s='A̅B'
>>>>     print('version_info=%s\nlen(%s)=%d' % (sys.version_info,s,len(s)))
>>>> robin@everest ~:
>>>> $ python ooo.py
>>>> version_info=sys.version_info(major=3, minor=3, micro=3,
>>>> releaselevel='final', serial=0)
>>>> len(A̅B)=3
>>>> robin@everest ~:
>>>> $
>>> 
>>> 
> 
>> You are right that more than one codepoint makes up a grapheme, and that you'll
>> need code to deal with the correspondence between them. But let's not muddy
>> these already confusing waters by referring to that mapping as an encoding.
>> 
>> In Unicode terms, an encoding is a mapping between codepoints and bytes.  Python
>> 3's str is a sequence of codepoints.
>> 
> Semantics is everything. For me graphemes are the endpoint (or should be); to 
> get a proper rendering of a sequence of graphemes I can use either a sequence 
> of bytes or a sequence of codepoints. They are both encodings of the 
> graphemes; what unicode says is an encoding doesn't define what encodings are 
> ie mappings from some source alphabet to a target alphabet.

But you’re talking about two levels of encoding. One runs on top of the other. 
So insisting that you be able to call them all encodings, makes the term 
pointless, because now it’s ambiguous as to what you’re referring to. Are you 
referring to encoding in the sense of representing code points with bytes? Or 
are you referring to what the unicode guys call “forms”?

For example, the NFC form of ‘ñ’ is ‘\u00F1’. The NFD form represents the 
exact same grapheme, but is ‘\u006e\u0303’. You can call them encodings if you 
want, but I echo Ned’s sentiment that you keep that to yourself. 
Conventionally, they’re different forms, not different encodings. You can 
encode either form with an encoding, e.g.

'\u00F1'.encode('utf8')
'\u00F1'.encode('utf16')

'\u006e\u0303'.encode('utf8')
'\u006e\u0303'.encode('utf16')
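
If you want to see forms and encodings vary independently, a sketch along these lines works (variable names are mine; 'utf-16-le' is used instead of 'utf16' only to leave the byte order mark out of the counts):

```python
import unicodedata

nfc = unicodedata.normalize("NFC", "n\u0303")   # composed: one code point
nfd = unicodedata.normalize("NFD", "\u00f1")    # decomposed: two code points

form_lengths = (len(nfc), len(nfd))

# Byte lengths of each form under two different encodings.
encoded_sizes = {
    enc: (len(nfc.encode(enc)), len(nfd.encode(enc)))
    for enc in ("utf-8", "utf-16-le")
}

print(form_lengths, encoded_sizes)
```

Same grapheme throughout; the form decides the code points, and the encoding decides the bytes.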

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:25 AM, William Ray Wing  wrote:
> On Jan 15, 2014, at 7:52 AM, Chris Angelico  wrote:
>> One of the fundamentals of the internet is that connections *will*
>> break. A friend of mine introduced me to Magic: The Gathering via a
>> program that couldn't handle drop-outs, and it got extremely
>> frustrating - we couldn't get a game going. Build your server such
>> that your clients can disconnect and reconnect, and you protect
>> yourself against half the problem; allow them to connect and kick the
>> other connection off, and you solve the other half.
>
> But note VERY carefully that this can open HUGE security holes if not done 
> with extreme care.
>
> Leaving a dangling connection (not session, TCP closes sessions) open is an 
> invitation to bad things happening.

Not sure what you mean here. I'm assuming an authentication system
that stipulates one single active connection per authenticated user
(if you reauthenticate with the same credentials, it'll disconnect the
other one on the presumption that the connection's been lost). In
terms of resource wastage, there's no difference between disconnecting
now and letting it time out, and waiting the ten minutes (or whatever)
and then terminating cleanly. Or do you mean another user gaining
access? It's still governed by the same authentication.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:31 AM, Chris Angelico  wrote:
> I'm assuming an authentication system
> that stipulates one single active connection per authenticated user

Incidentally, in an environment where everything's trusted (LAN or
localhost), the "authentication system" can be as simple as "type a
user name". I've done systems like that; first line entered becomes
the handle or key, and it does the same kick-off system on duplicate.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread William Ray Wing
On Jan 15, 2014, at 11:31 AM, Chris Angelico  wrote:

> On Thu, Jan 16, 2014 at 3:25 AM, William Ray Wing  wrote:
>> On Jan 15, 2014, at 7:52 AM, Chris Angelico  wrote:
>>> One of the fundamentals of the internet is that connections *will*
>>> break. A friend of mine introduced me to Magic: The Gathering via a
>>> program that couldn't handle drop-outs, and it got extremely
>>> frustrating - we couldn't get a game going. Build your server such
>>> that your clients can disconnect and reconnect, and you protect
>>> yourself against half the problem; allow them to connect and kick the
>>> other connection off, and you solve the other half.
>> 
>> But note VERY carefully that this can open HUGE security holes if not done 
>> with extreme care.
>> 
>> Leaving a dangling connection (not session, TCP closes sessions) open is an 
>> invitation to bad things happening.
> 
> Not sure what you mean here. I'm assuming an authentication system
> that stipulates one single active connection per authenticated user
> (if you reauthenticate with the same credentials, it'll disconnect the
> other one on the presumption that the connection's been lost). In
> terms of resource wastage, there's no difference between disconnecting
> now and letting it time out, and waiting the ten minutes (or whatever)
> and then terminating cleanly. Or do you mean another user gaining
> access? It's still governed by the same authentication.
> 

I was assuming another user picking up the connection using sniffed credentials 
(and yes, despite all the work on ssh, not all man-in-the-middle attacks have 
been killed).

-Bill

> ChrisA
> -- 
> https://mail.python.org/mailman/listinfo/python-list

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-15 Thread Mark Lawrence

On 15/01/2014 16:14, Chris Angelico wrote:
> On Thu, Jan 16, 2014 at 2:43 AM, Travis Griggs  wrote:
>> Personally, I wish they’d start python4, sure would take the heat out of the 3 
>> vs 2 debates. And maybe there’d be a program called twentyfour as a result.
>
> Learn All Current Versions of Python in Twenty-Four Hours?
>
> ChrisA

Totally unfair, Steven D'Aprano amongst others would have a head start :)

--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread John Ladasky
On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
> Personally I feel dirty whenever I write Python code that defeats duck-
> typing -- so I would not /recommend/ any isinstance() check.

While I am inclined to agree, I have yet to see a solution to the problem of 
flattening nested lists/tuples which avoids isinstance().  If anyone has 
written one, I would like to see it, and consider its merits.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker

On 15/01/2014 16:28, Travis Griggs wrote:
 of a sequence of graphemes I can use either a sequence of bytes or a 
sequence of codepoints. They are both encodings of the graphemes; what unicode 
says is an encoding doesn't define what encodings are ie mappings from some 
source alphabet to a target alphabet.


But you’re talking about two levels of encoding. One runs on top of the other. 
So insisting that you be able to call them all encodings, makes the term 
pointless, because now it’s ambiguous as to what you’re referring to. Are you 
referring to encoding in the sense of representing code points with bytes? Or 
are you referring to what the unicode guys call “forms”?

For example, the NFC form of ‘ñ’ is ‘\u00F1’. The NFD form represents the
exact same grapheme, but is ‘\u006e\u0303’. You can call them encodings if you
want, but I echo Ned’s sentiment that you keep that to yourself.
Conventionally, they’re different forms, not different encodings. You can
encode either form with an encoding, e.g.

'\u00F1'.encode('utf8')
'\u00F1'.encode('utf16')

'\u006e\u0303'.encode('utf8')
'\u006e\u0303'.encode('utf16')
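
To make the form/encoding distinction concrete, here is a minimal sketch using the standard library's unicodedata module. The variable names are only illustrative, and the explicit "utf-16-le" codec is chosen here simply to avoid a byte-order mark in the output.

```python
import unicodedata

# Normalization converts between forms; it works on code points, not bytes.
nfc = unicodedata.normalize("NFC", "n\u0303")   # composed: one code point
nfd = unicodedata.normalize("NFD", "\u00f1")    # decomposed: two code points

# Same grapheme, different code point sequences.
print(len(nfc), len(nfd))

# Either form can then be encoded to bytes with any encoding.
for form in (nfc, nfd):
    print(form.encode("utf-8"), form.encode("utf-16-le"))
```

The two steps are independent: normalization picks the code point sequence, and encoding picks the byte representation of whichever sequence you have.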



I think about these as encodings, because that's what they are mathematically, 
logically & practically. I can encode the target grapheme sequence as a sequence 
of bytes using a particular 'unicode encoding' eg utf8 or a sequence of code points.


The fact that unicoders want to take over the meaning of encoding is not 
relevant.

In my utf8 bash shell the python print() takes one encoding (python3 str) and 
translates that to the stdout encoding which happens to be utf8 and passes that 
to the shell which probably does a lot of work to render the result as graphical 
symbols (or graphemes).


I'm not anti unicode, that's just an assignment of identity to some symbols. 
Coding the values of the ids is a separate issue. It's my belief that we don't 
need more than the byte level encoding to represent unicode. One of the claims 
made for python3 unicode is that it somehow eliminates the problems associated 
with other encodings eg utf8, but in fact they will remain until we force 
printers/designers to stop using complicated multi-codepoint graphemes. I 
suspect that won't happen.

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:46 AM, Mark Lawrence  wrote:
> On 15/01/2014 16:14, Chris Angelico wrote:
>>
>> On Thu, Jan 16, 2014 at 2:43 AM, Travis Griggs 
>> wrote:
>>>
>>> Personally, I wish they’d start python4, sure would take the heat out of
>>> the 3 vs 2 debates. And maybe there’d be a program called twentyfour as a
>>> result.
>>
>>
>> Learn All Current Versions of Python in Twenty-Four Hours?
>>
>> ChrisA
>>
>
> Totally unfair, Steven D'Aprano amongst others would have a head start :)

Heh. I said "Current" specifically to cut out 1.5.2, and also to
eliminate the need to worry about string exceptions and so on. But
mainly just for the pun.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:43 AM, William Ray Wing  wrote:
> I was assuming another user picking up the connection using sniffed 
> credentials (and yes, despite all the work on ssh, not all man-in-the-middle 
> attacks have been killed).

If that can happen, then I would much prefer that it kick my
connection off - at least that way, I have some chance of knowing it's
happened. But I suspect that this sort of thing is way WAY out of the
league of the OP's stated problem :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Python declarative

2014-01-15 Thread Sergio Tortosa Benedito
Hi, I'm developing a sort of language extension for writing GUI programs
called guilang. Right now it's written in Lua, but I'm considering Python
instead (because it's more tailored to standalone applications). My question
is whether I can achieve this declarative thing in Python. Here's an
example:

Window "myWindow" {
    title="Hello world";
    Button "myButton" {
        label="I'm a button";
        onClick=exit
    }
}
print(myWindow.myButton.label)

Of course it doesn't need to be 100% equal. Thanks in advance

Sergio


-- 
https://mail.python.org/mailman/listinfo/python-list


Communicate between Python and Node.js

2014-01-15 Thread Manish
I've been tasked to write a module that sends data from Django to a Node.js 
server running on the same machine. Some magic happens in node and I recv the 
results back, which are then rendered using Django templates. 

At first I thought to use the requests library to GET/POST data to node, but I 
googled around and it seems lots of people think TCP sockets are the way to go. 
I tried implementing my own using several examples I have found online. It 
*kind of* works. It seems like I get blocked while trying to receive data back 
in the recv() loop. I never reach the end. I'm not an expert in 
sockets/networking, but maybe I'm not wrong in guessing it is because of the 
non-blocking nature of Node.js ?

A Stackoverflow post helped a little more in figuring things out (though I'm 
not sure if I'm correct here). Right now, I'm failing during connect() - I get 
"Operation now in progress". 

So my question is, how can I get recv() to work properly so that data is 
seamlessly passed back and forth between my Python script and the node server. 
Am I taking the right approach? Is there any better way? 

Relevant scripts: 
1) http://bpaste.net/show/NI2z9RhbT3HVtLVWUKuq/ 
2) http://bpaste.net/show/YlulEZBTDE5KS5ZvSyET/

Thanks! 

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 3:55 AM, Robin Becker  wrote:
> I think about these as encodings, because that's what they are
> mathematically, logically & practically. I can encode the target grapheme
> sequence as a sequence of bytes using a particular 'unicode encoding' eg
> utf8 or a sequence of code points.

By that definition, you can equally encode it as a bitmapped image, or
as a series of lines and arcs, and those are equally well "encodings"
of the character. This is not the normal use of that word.

http://en.wikipedia.org/wiki/Character_encoding

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Robin Becker

On 15/01/2014 17:14, Chris Angelico wrote:

On Thu, Jan 16, 2014 at 3:55 AM, Robin Becker  wrote:

I think about these as encodings, because that's what they are
mathematically, logically & practically. I can encode the target grapheme
sequence as a sequence of bytes using a particular 'unicode encoding' eg
utf8 or a sequence of code points.


By that definition, you can equally encode it as a bitmapped image, or
as a series of lines and arcs, and those are equally well "encodings"
of the character. This is not the normal use of that word.

http://en.wikipedia.org/wiki/Character_encoding

ChrisA

Actually I didn't use the term 'character encoding', but that doesn't alter the 
argument. If I chose to embed the final graphemes as images encoded as bytes or 
lists of numbers that would still be an encoding; it just wouldn't be 
very easily usable (lots of typing).

--
Robin Becker

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 4:02 AM, Sergio Tortosa Benedito
 wrote:
> Hi I'm developing a sort of language extension for writing GUI programs
> called guilang, right now it's written in Lua but I'm considreing Python
> instead (because it's more tailored to alone applications). My question
> it's if I can achieve this declarative-thing in python. Here's an
> example:
>
> Window "myWindow" {
> title="Hello world";
> Button "myButton" {
> label="I'm a button";
> onClick=exit
> }
> }
> print(myWindow.myButton.label)

Probably the easiest way to do that would be with dictionaries or
function named arguments. It'd be something like this:

myWindow = Window(
    title="Hello World",
    myButton=Button(
        label="I'm a button",
        onClick=exit
    )
)
print(myWindow.myButton.label)

For this to work, you'd need a Window class that recognizes a number
of keyword arguments (eg for title and other attributes), and then
takes all other keyword arguments and turns them into its children.
Possible, but potentially messy; if you happen to name your button
"icon", it might be misinterpreted as an attempt to set the window's
icon, and cause a very strange and incomprehensible error.

The syntax you describe there doesn't allow any flexibility in
placement. I don't know how you'd organize that; but what you could do
is something like GTK uses: a Window contains exactly one child, which
will usually be a layout manager. Unfortunately the Python GTK
bindings don't allow such convenient syntax as you use here, so you'd
need some sort of wrapper. But it wouldn't be hard to set up something
like this:


def clickme(obj):
    print("You clicked me!")
    obj.set_label("Click me again!")

myWindow = Window(
    title="Hello World",
    signal_delete=exit
).add(Vbox(spacing=10)
    .add(Button(label="I'm a button!", signal_clicked=exit))
    .add(Button(label="Click me", signal_clicked=clickme))
)

This is broadly similar to how I create a window using GTK with Pike,
and it's easy to structure the code the same way the window is
structured.

If the number of helper classes you'd have to make gets daunting, you
could possibly do this:

myWindow = GTK(Gtk.Window,
    title="Hello World",
    signal_delete=exit
).add(GTK(Gtk.Vbox, spacing=10)
    .add(GTK(Gtk.Button, label="I'm a button!", signal_clicked=exit))
    .add(GTK(Gtk.Button, label="Click me", signal_clicked=clickme))
)

so there's a single "GTK" class, which takes as its first positional
argument a GTK object type to clone. This is all perfectly valid
Python syntax, but I don't know of a convenient way to do this with
the current modules, so it may require writing a small amount of
wrapper code.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python 3.x adoption

2014-01-15 Thread Christopher Welborn

On 01/14/2014 01:33 PM, Staszek wrote:

Hi

What's the problem with Python 3.x? It was first released in 2008, but
web hosting companies still seem to offer Python 2.x rather.

For example, Google App Engine only offers Python 2.7.

What's wrong?...



My last two hosts have offered multiple versions of python.
I upgraded to python 3.3 recently on my site. I guess its easier for
some folks to offer such a thing, it just depends on their setup.

My host also offered Django 1.5 almost immediately after its release,
and the same with 1.6. They give the user options, and if they
break their site by upgrading too early (without migrating code) it's
the user's fault.
--

- Christopher Welborn 
  http://welbornprod.com

--
https://mail.python.org/mailman/listinfo/python-list


Re: Communicate between Python and Node.js

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 4:12 AM, Manish  wrote:
> At first I thought to use the requests library to GET/POST data to node, but 
> I googled around and it seems lots of people think TCP sockets are the way to 
> go. I tried implementing my own using several examples I have found online. 
> It *kind of* works. It seems like I get blocked while trying to receive data 
> back in the recv() loop. I never reach the end. I'm not an expert in 
> sockets/networking, but maybe I'm not wrong in guessing it is because of the 
> non-blocking nature of Node.js ?

Do you need to use non-blocking sockets here? I think, from a quick
skim of your code, that you'd do better with a blocking socket. Tip:
Any time you have a sleep(1) call in your code, look to see if it's
doing the wrong thing. In this case, I'm pretty sure it is. Sleeping
for a second and then trying to read from a nonblocking socket seems
like a messy way to just read until you have what you want.
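
A minimal sketch of the blocking approach, assuming the server closes its side of the connection once the whole reply has been sent (if it does not, some framing such as a length prefix or a delimiter is needed instead). The helper name recv_all is hypothetical, and socket.socketpair() merely stands in for the real Python-to-Node connection so that the example runs on its own.

```python
import socket

def recv_all(sock, bufsize=4096):
    """Read from a blocking socket until the peer closes its side."""
    chunks = []
    while True:
        chunk = sock.recv(bufsize)   # blocks until data arrives or the peer closes
        if not chunk:                # b"" means the peer has finished sending
            break
        chunks.append(chunk)
    return b"".join(chunks)

# A connected pair of local sockets stands in for client and server.
client, server = socket.socketpair()
server.sendall(b'{"result": 42}')
server.close()                       # closing signals the end of the reply
reply = recv_all(client)
client.close()
print(reply)
```

No sleep() and no non-blocking mode are involved: recv() simply waits until there is something to read.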

There's another thread happening at the moment about networking in
Python. You may find it of interest.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Ian Kelly
On Wed, Jan 15, 2014 at 9:55 AM, Robin Becker  wrote:
> The fact that unicoders want to take over the meaning of encoding is not
> relevant.

A virus is a small infectious agent that replicates only inside the
living cells of other organisms.  In the context of computing however,
that definition is completely false, and if you insist upon it when
trying to talk about computers, you're only going to confuse people as
to what you mean.  Somehow, I haven't seen any biologists complaining
that computer users want to take over the meaning of virus.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Peter Otten
John Ladasky wrote:

> On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
>> Personally I feel dirty whenever I write Python code that defeats duck-
>> typing -- so I would not /recommend/ any isinstance() check.
> 
> While I am inclined to agree, I have yet to see a solution to the problem
> of flattening nested lists/tuples which avoids isinstance().  If anyone
> has written one, I would like to see it, and consider its merits.

Well, you should always be able to find some property that discriminates 
what you want to treat as sequences from what you want to treat as atoms.

(flatten() adapted from a nine-year-old post by Nick Craig-Wood)

>>> def flatten(items, check):
...     if check(items):
...         for item in items:
...             yield from flatten(item, check)
...     else:
...         yield items
... 
>>> items = [1, 2, (3, 4), [5, [6, (7,)]]]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "sort"))))
[1, 2, (3, 4), 5, 6, (7,)]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "count"))))
[1, 2, 3, 4, 5, 6, 7]

The approach can of course break

>>> items = ["foo", 1, 2, (3, 4), [5, [6, (7,)]]]
>>> print(list(flatten(items, check=lambda o: hasattr(o, "count"))))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 4, in flatten
  File "<stdin>", line 2, in flatten
RuntimeError: maximum recursion depth exceeded

and I'm the first to admit that the fix below looks really odd:

>>> print(list(flatten(items, check=lambda o: hasattr(o, "count") and not hasattr(o, "split"))))
['foo', 1, 2, 3, 4, 5, 6, 7]

In fact all of the following examples look more natural...

>>> print(list(flatten(items, check=lambda o: isinstance(o, list))))
['foo', 1, 2, (3, 4), 5, 6, (7,)]
>>> print(list(flatten(items, check=lambda o: isinstance(o, (list, tuple)))))
['foo', 1, 2, 3, 4, 5, 6, 7]
>>> print(list(flatten(items, check=lambda o: isinstance(o, (list, tuple)) or (isinstance(o, str) and len(o) > 1))))
['f', 'o', 'o', 1, 2, 3, 4, 5, 6, 7]

... than the duck-typed variants because it doesn't matter for the problem 
of flattening whether an object can be sorted or not. But in a real-world 
application the "atoms" are more likely to have something in common that is 
required for the problem at hand, and the check for it with 

def check(obj):
    return not (obj is an atom) # pseudo-code

may look more plausible.

-- 
https://mail.python.org/mailman/listinfo/python-list


Python Scalability TCP Server + Background Game

2014-01-15 Thread phiwer
My problem is as follows: 

I'm developing an online game with the requirement of being able to handle 
thousands of requests every second.

The frontend consists of web server(s) exposing a rest api. These web servers 
in turn communicate with a game server over TCP. When a message arrives at the 
game server, each client handler inserts the client message into a shared 
message queue, and then waits for the result from the game loop. When the game 
loop has informed the waiting handler that a result is ready, the handler 
returns the result to the client.

Things to take note of:

1) The main game loop runs in a separate process, and the intention is to use a 
Queue from the multiprocessing library.

2) The network layer of the game server runs in a separate process as well, and 
my intention was to use gevent or tornado 
(http://nichol.as/asynchronous-servers-in-python).

3) The game server has a player limit of 5. My requirement/desire is to be 
able to serve 50k requests per second (without any caching layer, although the 
game server will cache data), so people don't get a poor user experience during 
high peaks.

4) The game is not a real-time based game, but is catered towards the web.

And now to my little problem. All high-performance async TCP servers use 
greenlets (or similar light threads), but these do not seem to be compatible 
with the multiprocessing library. From what I've read, Python (largely due to 
the GIL) does not seem suited for this type of task, compared to other 
languages where threading and IPC are not an issue.

Due to this information, I have developed the initial server using netty in 
java. I would, however, rather develop the server using python, if possible. 
But if these limitations truly exist, then I'll go with java for the game 
server, and then use python for the frontend.

Has anyone developed something similar, and if so, could you point me in the 
right direction, or perhaps I've missed something along the way?

Even if one can solve the issue with IPC, it seems the high-performance servers 
tested on this page, http://nichol.as/asynchronous-servers-in-python, can only 
handle roughly 8k requests per second before performance degrades. Does anyone 
know how Python high-performance TCP servers compare to other languages' TCP 
servers?

Thanks for all replies!
-- 
https://mail.python.org/mailman/listinfo/python-list


Bind event is giving me a bug.

2014-01-15 Thread eneskristo
While working with tkinter in python 3.3, I had the following problem.
def get_text(event):
    self.number_of_competitors = entered_text.get()
    try:
        self.number_of_competitors = int(self.number_of_competitors)
    except:
        pass
    if type(self.number_of_competitors) == int:
        root.destroy()
    else:
        label.config(text = "Enter the number of competitors. Please enter a number.")
root = Tk()
label = Label(root, text = "Enter the number of competitors.")
label.pack(side = TOP)
entered_text = Entry(root)
entered_text.pack()
Button(root, text = "Submit", command = get_text).pack()
root.bind('<Enter>', get_text)
root.mainloop()

This is the buggy part of the code. When I run it, instead of doing what it 
should do, it responds to all events BUT Enter. I'm not sure if this error is 
on tkinter's side or mine. Please help!
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Learning python networking

2014-01-15 Thread William Ray Wing
On Jan 15, 2014, at 7:52 AM, Chris Angelico  wrote:

[megabyte]

> One of the fundamentals of the internet is that connections *will*
> break. A friend of mine introduced me to Magic: The Gathering via a
> program that couldn't handle drop-outs, and it got extremely
> frustrating - we couldn't get a game going. Build your server such
> that your clients can disconnect and reconnect, and you protect
> yourself against half the problem; allow them to connect and kick the
> other connection off, and you solve the other half. (Sometimes, the
> server won't know that the client has gone, so it helps to be able to
> kick like that.) It might not be an issue when you're playing around
> with localhost, and you could even get away with it on a LAN, but on
> the internet, it's so much more friendly to your users to let them
> connect multiple times like that.

But note VERY carefully that this can open HUGE security holes if not done with 
extreme care.

Leaving a dangling connection (not session, TCP closes sessions) open is an 
invitation to bad things happening.

-Bill
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bind event is giving me a bug.

2014-01-15 Thread Peter Otten
eneskri...@gmail.com wrote:

> While working with tkinter in python 3.3, I had the following problem.

> root = Tk()
> label = Label(root, text = "Enter the number of competitors.")
> label.pack(side = TOP)
> entered_text = Entry(root)
> entered_text.pack()
> Button(root, text = "Submit", command = get_text).pack()
> root.bind('<Enter>', get_text)

Quoting http://infohost.nmt.edu/tcc/help/pubs/tkinter/web/event-types.html
"""
Enter
The user moved the mouse pointer into a visible part of a widget. (This is 
different than the enter key, which is a KeyPress event for a key whose name 
is actually 'return'.)
"""

So I think you want "<Return>", not "<Enter>".

> root.mainloop()
> 
> This is a buggy part of the code. When I run it, instead of doing what it
> should do, it responds to all events BUT enter. I'm not sure if this error
> is on tkinters or my side. Please help!


-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python Scalability TCP Server + Background Game

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 5:37 AM,   wrote:
> 3) The game server has a player limit of 5. My requirement/desire is to 
> be able to serve 50k requests per second (without any caching layer, although 
> the game server will cache data), so people don't get a poor user experience 
> during high peaks.

Quick smoke test. How big are your requests/responses? You mention
REST, which implies they're going to be based on HTTP. I would expect
you would have some idea of the rough size. Multiply that by 50,000,
and see whether your connection can handle it. For instance, if you
have a 100Mbit/s uplink, supporting 50K requests/sec means your
requests and responses have to fit within about 256 bytes each,
including all overhead. You'll need a gigabit uplink to be able to
handle a 2KB request or response, and that's assuming perfect
throughput. And is 2KB enough for you?
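
The arithmetic behind those figures can be checked with a short sketch. It assumes decimal units (1 Mbit = 10^6 bits), one direction of traffic only, and zero protocol overhead, so the results are upper bounds; the function name is hypothetical.

```python
def max_bytes_per_request(link_bits_per_sec, requests_per_sec):
    """Upper bound on bytes per request in one direction, ignoring overhead."""
    return link_bits_per_sec / 8 / requests_per_sec

print(max_bytes_per_request(100e6, 50_000))   # 100 Mbit/s uplink
print(max_bytes_per_request(1e9, 50_000))     # 1 Gbit/s uplink
```

With these assumptions a 100 Mbit/s link allows 250 bytes per request and a gigabit link allows 2500 bytes, which matches the rough "256 bytes" and "2KB" figures above.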

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bind event is giving me a bug.

2014-01-15 Thread MRAB

On 2014-01-15 20:16, eneskri...@gmail.com wrote:

While working with tkinter in python 3.3, I had the following problem.
def get_text(event):
 self.number_of_competitors = entered_text.get()
 try:
 self.number_of_competitors = int(self.number_of_competitors)


A bare except like this is virtually never a good idea:


 except:
 pass
 if type(self.number_of_competitors) == int:
 root.destroy()
 else:
 label.config(text = "Enter the number of competitors. Please enter 
a number.")


Something like this would be better:

try:
    self.number_of_competitors = int(entered_text.get())
except ValueError:
    label.config(text="Enter the number of competitors. Please enter a number.")
else:
    root.destroy()


root = Tk()
label = Label(root, text = "Enter the number of competitors.")
label.pack(side = TOP)
entered_text = Entry(root)
entered_text.pack()


This will make it call 'get_text' when the button is clicked:


Button(root, text = "Submit", command = get_text).pack()


This will make it call 'get_text' when the pointer enters the frame:


root.bind('<Enter>', get_text)


Did you mean '<Return>', i.e. the Return key?


root.mainloop()

This is a buggy part of the code. When I run it, instead of doing what it 
should do, it responds to all events BUT enter. I'm not sure if this error is 
on tkinters or my side. Please help!



--
https://mail.python.org/mailman/listinfo/python-list


Re: Bind event is giving me a bug.

2014-01-15 Thread eneskristo
Thank you, I thought Enter was Enter, but I still have this problem, when I 
press the Button, this appears:
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python33\lib\tkinter\__init__.py", line 1475, in __call__
return self.func(*args)
TypeError: get_text() missing 1 required positional argument: 'event'
Exception in Tkinter callback
Traceback (most recent call last):
  File "C:\Python33\lib\tkinter\__init__.py", line 1475, in __call__
return self.func(*args)
TypeError: get_text() missing 1 required positional argument: 'event'

Should I make 2 functions, or is there a simpler solution?
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bind event is giving me a bug.

2014-01-15 Thread Peter Otten
MRAB wrote:

> This will make it call 'get_text' when the button is clicked:
> 
>> Button(root, text = "Submit", command = get_text).pack()

...and then produce a TypeError because of the missing `event` argument. To 
avoid that you can provide a default with

def get_text(event=None):
    ...

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Cameron Simpson
On 15Jan2014 05:03, Tim Chase  wrote:
> On 2014-01-15 01:27, Steven D'Aprano wrote:
> > class TextOnlyDict(dict):
> >     def __setitem__(self, key, value):
> >         if not isinstance(key, str):
> >             raise TypeError
> >         super().__setitem__(key, value)
> >     # need to override more methods too
> > 
> > 
> > But reading Guido, I think he's saying that wouldn't be a good
> > idea. I don't get it -- it's not a violation of the Liskov
> > Substitution Principle, because it's more restrictive, not less.
> > What am I missing?
> 
> Just as an observation, this seems almost exactly what anydbm does,
> behaving like a dict (whether it inherits from dict, or just
> duck-types like a dict), but with the limitation that keys/values need
> to be strings.

I would expect anydbm to be duck typing: just implementing the
mapping interface and directing the various methods directly to the
DBM libraries.

The comment in question was specifically about subclassing dict.

There is a rule of thumb amongst the core devs and elder Python
programmers that it is a bad idea to subclass the basic types which
I have seen stated many times, but not explained in depth.

Naively, I would have thought subclassing dict to constrain the
key types for some special purpose seems like a fine idea. You'd need
to override .update() as well and also the initialiser. Maybe it
is harder than it seems.
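
As a rough illustration of how many entry points need covering, here is a sketch that extends the TextOnlyDict idea quoted above. It is not exhaustive (for example, copy() still returns a plain dict, and newer Python versions add further operators), and the _check helper is a name introduced only for this example.

```python
class TextOnlyDict(dict):
    """dict subclass that only accepts str keys (illustrative sketch)."""

    def __init__(self, *args, **kwargs):
        super().__init__()
        self.update(*args, **kwargs)   # funnel initial data through the check

    @staticmethod
    def _check(key):
        if not isinstance(key, str):
            raise TypeError("keys must be str, not %s" % type(key).__name__)

    def __setitem__(self, key, value):
        self._check(key)
        super().__setitem__(key, value)

    def update(self, *args, **kwargs):
        # dict.update() does not call __setitem__, so route items through it.
        for key, value in dict(*args, **kwargs).items():
            self[key] = value

    def setdefault(self, key, default=None):
        # dict.setdefault() also bypasses __setitem__.
        self._check(key)
        return super().setdefault(key, default)
```

The built-in dict methods do not call each other through overridable hooks, which is why each mutating method has to be wrapped individually.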

The other pitfall I can see is code that does an isinstance(..,
dict) check for some reason; having discovered that it has a dict
it may behave specially. Pickle? Who knows? Personally, if it is
using isinstance instead of a direct type() check then I think it
should expect to cope with subclasses.

I've subclassed str() a number of times, most extensively as a URL
object that is a str with a bunch of utility methods, and it seems
to work well.

I've subclassed dict a few times, most extensively as the in-memory
representation of a record in a multibackend data store which I use
for a bunch of things. That is also working quite well.

The benefit of subclassing dict is getting a heap of methods like
iterkeys et al for free. A from-scratch mapping has a surprising
number of methods involved.
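
For comparison, the standard library's collections.abc.MutableMapping supplies most of those methods (update, setdefault, pop, keys, items and so on) once five core methods are defined. The sketch below is one possible duck-typed alternative, not something proposed in this thread; the class name TextKeyMapping is made up for the example.

```python
from collections.abc import MutableMapping

class TextKeyMapping(MutableMapping):
    """Mapping with str-only keys that wraps a dict instead of subclassing it."""

    def __init__(self, *args, **kwargs):
        self._data = {}
        self.update(*args, **kwargs)   # inherited; goes through __setitem__

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str")
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)
```

Here the inherited mixin methods are written in terms of the five core ones, so the key check in __setitem__ covers update() and setdefault() as well. The trade-off is that isinstance(obj, dict) is False for such objects.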

Cheers,
-- 
Cameron Simpson 

BTW, don't bother flaming me. I can't read.
- afde...@lims03.lerc.nasa.gov (Stephen Dennison)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: proposal: bring nonlocal to py2.x

2014-01-15 Thread Terry Reedy

On 1/15/2014 7:07 AM, Robin Becker wrote:

On 13/01/2014 15:28, Chris Angelico wrote:
..


It's even worse than that, because adding 'nonlocal' is not a bugfix.
So to be committed to the repo, it has to be approved for either 2.7
branch (which is in bugfix-only maintenance mode) or 2.8 branch (which
does not exist). Good luck. :)

...
fixing badly named variables is not a bug fix either, but that has
happened in python 2.7. A micro change release changed

compiler.consts.SC_GLOBAL_EXPLICT
to
compiler.consts.SC_GLOBAL_EXPLICIT

this is a change of api for the consts module (if you regard exported
variables as part of its api),


A bug is generally a discrepancy between the doc that defines a version 
of the language and the code that implements that version. Yes, code 
fixes break code that depends on the bug, which is why tests should be 
run with bug-fix releases, and why some bug fixes are treated as 
enhancements and not back-ported. They also fix current and future code 
written to the specification.


Since the compiler.consts submodule is not documented, I believe it was 
regarded as an internal module for use only by the pycodegen and symbols 
modules. The misspelling was introduced in the patch for

  http://bugs.python.org/issue999042
which also introduced SC_GLOBAL_IMPLICIT, correctly spelled. EXPLICT was 
fixed in all three modules by Antoine Pitrou in

  http://bugs.python.org/issue15212

In any case, I estimate the impact of backporting a major new feature 
like a new keyword to be at least 10 times that of this spelling fix.


> but that didn't count for the developers.

If you are suggesting that developers casually violate our policy of 
only bug fixes in microreleases, that is unfair and false. It is mostly 
users who push at us to backport their favorite new feature. Antoine 
strongly supports and enforces the policy, as do I.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Terry Reedy

On 1/15/2014 12:33 PM, Chris Angelico wrote:

On Thu, Jan 16, 2014 at 4:02 AM, Sergio Tortosa Benedito
 wrote:

Hi I'm developing a sort of language extension for writing GUI programs
called guilang, right now it's written in Lua but I'm considreing Python
instead (because it's more tailored to alone applications). My question
it's if I can achieve this declarative-thing in python. Here's an
example:

Window "myWindow" {
 title="Hello world";
 Button "myButton" {
 label="I'm a button";
 onClick=exit
 }
}
print(myWindow.myButton.label)


Probably the easiest way to do that would be with dictionaries or
function named arguments. It'd be something like this:

myWindow = Window(
 title="Hello World",
 myButton=Button(
 label="I'm a button",
 onClick=exit
 )
)
print(myWindow.myButton.label)


This is exactly what I was going to suggest.


For this to work, you'd need a Window class that recognizes a number
of keyword arguments (eg for title and other attributes), and then
takes all other keyword arguments and turns them into its children.


I would make the required args positional-or-keyword, with or without a 
default. Something like (tested)


class Window:
    def __init__(self, title, *kwds)  # or title='Window title'
        self.title = title
        self.__dict__.update(kwds)

class Button:
    def __init__(self, label, **kwds):
        self.label = label
        self.__dict__.update(kwds)


>>>
I'm a button


Possible, but potentially messy; if you happen to name your button
"icon", it might be misinterpreted as an attempt to set the window's
icon, and cause a very strange and incomprehensible error.


Puns are always a problem with such interfaces. Validate the args as 
much as possible. An icon should be a bitmap of appropriate size. 
Optional args should perhaps all be widgets (instances of a Widget 
baseclass).


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 9:58 AM, Terry Reedy  wrote:
> class Window:
> def __init__(self, title, *kwds)  # or title='Window title'
> self.title = title
> self.__dict__.update(kwds)

Does that want a second asterisk, matching the Button definition?

>> Possible, but potentially messy; if you happen to name your button
>> "icon", it might be misinterpreted as an attempt to set the window's
>> icon, and cause a very strange and incomprehensible error.
>
> Puns are always a problem with such interfaces. Validate the args as much as
> possible. An icon should be a bitmap of appropriate size. Optional args
> should perhaps all be widgets (instances of a Widget baseclass).

Yeah, but you'd still get back an error saying "icon should be a
bitmap" where the real problem is "icon should be called something
else". It might be worth explicitly adorning properties, or separating
them into two categories. Since the keyword-named-children system has
the other problem of being hard to lay out (how do you specify the
order?), I'd look at keyword args for properties and something
separate for children - either the layout I used above with .add(),
which allows extra args as necessary, or something like this:

myWindow = Window(
 title="Hello World",
 children=[Button(
 label="I'm a button",
 onClick=exit
 )]
)

Or maybe allow "child=" as a shortcut, since a lot of widgets will
have exactly one child.
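
A rough sketch of that split, assuming a hypothetical Widget base class (none of these names come from a real toolkit): keyword arguments are always properties, and children arrive through "children=" or the "child=" shortcut, so a button that happens to be labelled "icon" can no longer collide with the window's icon property, and the order of children is explicit.

```python
class Widget:
    """Hypothetical base class: keyword args are properties, children are a list."""

    def __init__(self, children=(), child=None, **props):
        self.props = props
        self.children = list(children)   # a list keeps the layout order
        if child is not None:            # shortcut for the one-child case
            self.children.append(child)


class Window(Widget):
    pass


class Button(Widget):
    pass


myWindow = Window(
    title="Hello World",
    icon="app.png",
    children=[
        Button(label="icon", onClick=print),
        Button(label="I'm a button", onClick=print),
    ],
)
print([b.props["label"] for b in myWindow.children])
```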

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about object lifetime and access

2014-01-15 Thread Steven D'Aprano
On Wed, 15 Jan 2014 05:14:59 -0800, Asaf Las wrote:

> I have read somewhere that global objects are referenced from module
> namespace will never have reference count down to 0 even if they are not
> referenced from functions or class methods. Is this true? 

Correct. The global name is a reference, so the reference count will be 
at least 1. In fact, referencing the name from a function or method 
doesn't increase the ref count:

instance = 123.456789  # ref count of float is 1

def test():
    print(instance)  # This refers to the *name* "instance", not the float

So the test() function cannot keep the float alive. If you reassign 
global instance, test() will see the new value, not the old, and 
123.456789 is free to be garbage collected.

This sounds more complicated than it actually is. In practice it works 
exactly as you expect global variables to work:

py> test()
123.456789
py> instance = 98765.4321
py> test()
98765.4321


> Does it mean
> that global objects are destroyed when interpreter exits or thread where
> it runs is terminated?

Certainly not! Global objects are no different from any other object. 
They are destroyed when their reference count falls to zero. In the case 
of global objects, that is *often* not until the interpreter exits, but 
it can be beforehand.

So long as the object is in use, it will be kept. When it is no longer in 
use, the garbage collector is free to destroy it. So long as *some* 
object or name holds a reference to it, it is considered to be in use.

value = instance = 1.23456      # ref count 2
alist = [1, 2, 3, 4, 5, value]  # ref count now 3
mydict = {"Key": alist}         # references the list, not the float: still 3
value = 42                      # rebind a name, ref count of float now 2
mydict.clear()                  # drops a reference to the list; float still 2
del instance                    # delete the name, ref count now 1
assert alist[5] == 1.23456
alist[5] = 0                    # final reference gone, ref count is now 0

At this point the global object 1.23456 is free to be destroyed.
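
One way to watch this happen is with a weak reference finalizer. This is only a sketch: the Payload class is a made-up stand-in (floats cannot be weakly referenced), and the timing shown is what I would expect on CPython, where destruction is immediate.

```python
import gc
import weakref


class Payload:
    """Stand-in for a global object (floats cannot be weakly referenced)."""


log = []
instance = Payload()
# The finalizer holds no strong reference; it runs when the object is destroyed.
weakref.finalize(instance, log.append, "destroyed")

alist = [instance]            # a second reference
del instance                  # delete the global name; the list keeps it alive
still_alive = (log == [])

alist.clear()                 # final reference gone
gc.collect()                  # not needed on CPython; helps other implementations
print(still_alive, log)
```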

(Note: some Python implementations don't do reference counting, e.g. 
Jython and IronPython use the Java and .Net garbage collectors 
respectively. In their case, the same rule applies: where there are no 
longer any references to an object, it will be garbage collected. The 
only difference is in how soon that occurs: in CPython, it will be 
immediate, in Jython or IronPython it will occur when the garbage 
collector runs.)



-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Terry Reedy

On 1/15/2014 11:55 AM, Robin Becker wrote:


The fact that unicoders want to take over the meaning of encoding is not
relevant.


I agree with you that 'encoding' should not be limited to 'byte encoding 
of a (subset of) unicode characters'. For instance, .jpg and .png are 
byte encodings of images. On the other hand, it is common in human 
discourse to omit qualifiers in particular contexts. 'Computer virus' 
gets condensed to 'virus' in computer contexts.


The problem with graphemes is that there is no fixed set of unicode 
graphemes. Which is to say, the effective set of graphemes is 
context-specific. Just limiting ourselves to English, 'fi' is usually 2 
graphemes when printing to screen, but often just one when printing to 
paper. This is why the Unicode consortium punted 'graphemes' to 
'application' code.



I'm not anti unicode, that's just an assignment of identity to some
symbols. Coding the values of the ids is a separate issue. It's my
belief that we don't need more than the byte level encoding to represent
unicode. One of the claims made for python3 unicode is that it somehow
eliminates the problems associated with other encodings eg utf8,


The claim is true for the following problems of the way-too-numerous 
unicode byte encodings.


Subsetting: only a subset of characters can be encoded.

Shifting: the meaning of a byte depends on a preceding shift character, 
which might be back at the beginning of the sequence.


Varying size: the number of bytes to encode a character depends on the 
character.


Both of the last two problems can turn O(1) operations into O(n) 
operations. 3.3+ eliminates all these problems.


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Steven D'Aprano
On Thu, 16 Jan 2014 02:14:38 +1100, Chris Angelico wrote:

> On Thu, Jan 16, 2014 at 1:55 AM,   wrote:
>> Le mercredi 15 janvier 2014 13:13:36 UTC+1, Ned Batchelder a écrit :
>>
>>
>>> ... more than one codepoint makes up a grapheme ...
>>
>> No
> 
> Yes.
> http://www.unicode.org/faq/char_combmark.html
> 
>>> In Unicode terms, an encoding is a mapping between codepoints and
>>> bytes.
>>
>> No
> 
> Yes.
> http://www.unicode.org/reports/tr17/
> Specifically:
> "Character Encoding Form: a mapping from a set of nonnegative integers
> that are elements of a CCS to a set of sequences of particular code
> units of some specified width, such as 32-bit integers"

Technically Unicode talks about mapping code points to code *units*, but 
since code units are defined in terms of bytes, I think it is fair to cut 
out one layer of indirection and talk about mapping code points to bytes. 
For instance, UTF-32 uses 4-byte code units, and every code point U+0000 
through U+10FFFF is mapped to a single code unit, which is always a 
four-byte quantity. UTF-8, on the other hand, uses single-byte code units, 
and maps code points to a variable number of code units, so UTF-8 maps 
code points to either 1, 2, 3 or 4 bytes.
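
A quick way to see the difference, as a small sketch: the four sample characters are my own choice, one from each UTF-8 length class, and "utf-32-le" is used so that no byte order mark is counted.

```python
samples = ["A", "\u00df", "\u20ac", "\U0001F40D"]

for ch in samples:
    utf8_len = len(ch.encode("utf-8"))
    utf32_len = len(ch.encode("utf-32-le"))  # explicit byte order: no BOM added
    print("U+%06X  utf-8: %d byte(s)  utf-32: %d bytes"
          % (ord(ch), utf8_len, utf32_len))

utf8_sizes = [len(ch.encode("utf-8")) for ch in samples]
utf32_sizes = [len(ch.encode("utf-32-le")) for ch in samples]
```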


> Or are you saying that www.unicode.org is wrong about the definitions of
> Unicode terms?

No, I think he is saying that he doesn't know Unicode anywhere near as 
well as he thinks he does. The question is, will he cherish his 
ignorance, or learn from this thread?




-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python-list@python.org

2014-01-15 Thread Steven D'Aprano
On Wed, 15 Jan 2014 02:25:34 +0100, Florian Lindner wrote:

> Am Dienstag, 14. Januar 2014, 17:00:48 schrieb MRAB:
>> On 2014-01-14 16:37, Florian Lindner wrote:
>> > Hello!
>> >
>> > I'm using python 3.2.3 on debian wheezy. My script is called from my
>> > mail delivery agent (MDA) maildrop (like procmail) through it's
>> > xfilter directive.
>> >
>> > Script works fine when used interactively, e.g. ./script.py <
>> > testmail but when called from maildrop it's producing an infamous
>> > UnicodeDecodeError:

What's maildrop? When using third party libraries, it's often helpful to 
give some detail on what they are and where they are from.


>> > File "/home/flindner/flofify.py", line 171, in main
>> >   mail = sys.stdin.read()

What's the value of sys.stdin? If you call this from your script:
 
print(sys.stdin)

what do you get? Is it possible that the mysterious maildrop is messing 
stdin up?


>> > File "/usr/lib/python3.2/encodings/ascii.py", line 26, in decode
>> >   return codecs.ascii_decode(input, self.errors)[0]
>> >
>> > Exception for example is always like
>> >
>> > UnicodeDecodeError: 'ascii' codec can't decode byte 0x82 in position
>> > 869: ordinal not in range(128) 

That makes perfect sense: byte 0x82 is not in the ASCII range. ASCII is 
limited to byte values 0 through 127, and 0x82 is hex for 130. So the 
error message is telling you *exactly* what the problem is: your email 
contains a non-ASCII character, with byte value 0x82.

How can you deal with this?

(1) "Oh gods, I can't deal with this, I wish the whole world was America 
in 1965 (except even back then, there were English characters in common 
use that can't be represented in ASCII)! I'm going to just drop anything 
that isn't ASCII and hope it doesn't mangle the message *too* badly!"

You need to set the error handler to 'ignore'. How you do that may depend 
on whether or not maildrop is monkeypatching stdin.


(2) "Likewise, but instead of dropping the offending bytes, I'll replace 
them with something that makes it obvious that an error has occurred."

Set the error handler to "replace". You'll still mangle the email, but it 
will be more obvious that you mangled it.


(3) "ASCII? Why am I trying to read email as ASCII? That's not right. 
Email can contain arbitrary bytes, and is not limited to pure ASCII. I 
need to work out which encoding the email is using, but even that is not 
enough, since emails sometimes contain the wrong encoding information or 
invalid bytes. Especially spam, that's particularly poor. (What a 
surprise, that spammers don't bother to spend the time to get their code 
right?) Hmmm... maybe I ought to use an email library that actually gets 
these issues *right*?"

What does the maildrop documentation say about encodings and/or malformed 
email?
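
To make the three options concrete, here is a sketch that sidesteps the text layer of stdin altogether and works on bytes. The helper names are made up for illustration; email.message_from_bytes is the standard library parser, and whether maildrop hands over the raw message unchanged is an assumption.

```python
import email


def decode_lossy(raw_bytes, errors="replace"):
    """Options (1) and (2): force ASCII, dropping or marking the bad bytes."""
    return raw_bytes.decode("ascii", errors=errors)


def parse_mail(raw_bytes):
    """Option (3): hand the undecoded bytes to the stdlib email parser."""
    return email.message_from_bytes(raw_bytes)


# In the filter script itself, the bytes would come from the binary stream
# underneath sys.stdin, for example:
#     msg = parse_mail(sys.stdin.buffer.read())
sample = b"Subject: test\n\nbody with a stray byte \x82\n"
print(parse_mail(sample)["Subject"])
print(ascii(decode_lossy(b"caf\x82")))
```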


>> > I read mail from stdin "mail = sys.stdin.read()"
>> >
>> > Environment when called is:
>> >
>> > locale.getpreferredencoding(): ANSI_X3.4-1968 environ["LANG"]: C

For a modern Linux system to be using the C encoding is not a good sign. 
It's not 1970 anymore. I would expect it should be using UTF-8. But I 
don't think that's relevant to your problem (although a mis-configured 
system may make it worse).


>> > System environment when using shell is:
>> >
>> > ~ % echo $LANG
>> > en_US.UTF-8

That's looking more promising.


>> > As far as I know when reading from stdin I don't need an decode(...)
>> > call, since stdin has a decoding. 

That depends on what stdin actually is. Please print it and show us.

Also, can you do a visual inspection of the email that is failing? If 
it's spam, perhaps you can just drop it from the queue and deal with this 
issue later.


>> > I also tried some decoding/encoding
>> > stuff but changed nothing.

Ah, but did you try the right stuff? (Randomly perturbing your code in 
the hope that the error will go away is not a winning strategy.)


>> > Any ideas to help me?
>> >
>> When run from maildrop it thinks that the encoding of stdin is ASCII.
> 
> Well, true. But what encoding does maildrop actually gives me? It
> obviously does not inherit LANG or is called from the MTA that way.

Who knows? What's maildrop? What does its documentation say about 
encodings? The fact that it is using ASCII apparently by default does not 
give me confidence that it knows how to deal with 8-bit emails, but I 
might be completely wrong.


> I also tried:
> 
> inData = codecs.getreader('utf-8')(sys.stdin) 
> mail = inData.read()
> 
> Failed also. But I'm not exactly an encoding expert.

Failed how? Please copy and paste your exact exception traceback, in full.

Ultimately, dealing with email is a hard problem. So long as you only 
receive 7-bit ASCII mail, you don't realise how hard it is. But the 
people who write the mail libraries -- at least the good ones -- know 
just how hard it really is. You can have 8-bit emails with no encoding 
set, or the wrong encoding, or the right encoding but the contents then 

Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Steven D'Aprano
On Wed, 15 Jan 2014 12:00:51 +, Robin Becker wrote:

> so two 'characters' are 3 (or 2 or more) codepoints.

Yes.


> If I want to isolate so called graphemes I need an algorithm even 
> for python's unicode

Correct. Graphemes are language dependent, e.g. in Dutch "ij" is usually 
a single grapheme, in English it would be counted as two. Likewise, in 
Czech, "ch" is a single grapheme. The Latin form of Serbo-Croatian has 
two two-letter graphemes, Dž and Nj (it used to have three, but Dj is now 
written as Đ).

Worse, linguists sometimes disagree as to what counts as a grapheme. For 
instance, some authorities consider the English "sh" to be a separate 
grapheme. As a native English speaker, I'm not sure about that. Certainly 
it isn't a separate letter of the alphabet, but on the other hand I can't 
think of any words containing "sh" that should be considered as two 
graphemes "s" followed by "h". Wait, no, that's not true... compound 
words such as "glasshouse" or "disheartened" are counter examples.
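
Even before any language-specific rules come into play, the gap between code points and what a reader sees as one character shows up with combining marks. A small sketch using the standard unicodedata module:

```python
import unicodedata

decomposed = "e\u0301"   # 'e' followed by COMBINING ACUTE ACCENT
composed = unicodedata.normalize("NFC", decomposed)

# One visible character either way, but a different number of code points.
print(len(decomposed), len(composed))
print([unicodedata.name(c) for c in decomposed])
```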


> ie when it really matters, python3 str is just another encoding.

I'm not entirely sure how a programming language data type (str) can be 
considered a transformation.



-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Tim Chase
On 2014-01-16 10:09, Chris Angelico wrote:
> myWindow = Window(
>  title="Hello World",
>  children=[Button(
>  label="I'm a button",
>  onClick=exit
>  )]
> )

This also solves the problem that **kwargs are just a dict, which is
inherently unordered.  So with the previous scheme, you'd just get an
unordered bag of controls that Python could then dump into your
containing window as its dict-traversal algorithms saw fit. :-)

-tkc


 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: python-list@python.org

2014-01-15 Thread Ben Finney
Steven D'Aprano  writes:

> On Wed, 15 Jan 2014 02:25:34 +0100, Florian Lindner wrote:
> >> On 2014-01-14 16:37, Florian Lindner wrote:
> >> > I'm using python 3.2.3 on debian wheezy. My script is called from
> >> > my mail delivery agent (MDA) maildrop (like procmail) through
> >> > it's xfilter directive.
> >> >
> >> > Script works fine when used interactively, e.g. ./script.py <
> >> > testmail but when called from maildrop it's producing an infamous
> >> > UnicodeDecodeError:
>
> What's maildrop? When using third party libraries, it's often helpful to 
> give some detail on what they are and where they are from.

It's not a library; as he says, it's an MDA program. It is from the
Courier mail application <http://www.courier-mta.org/maildrop/>.

From that, I understand Florian to be saying his Python program is
invoked via command-line from some configuration directive for Maildrop.

> What does the maildrop documentation say about encodings and/or
> malformed email?

I think this is the more likely line of enquiry to diagnose the problem.

> For a modern Linux system to be using the C encoding is not a good
> sign.

That's true, but it's likely a configuration problem: the encoding needs
to be set *and* obeyed at an administrative and user-profile level.

> It's not 1970 anymore. I would expect it should be using UTF-8. But I 
> don't think that's relevant to your problem (although a mis-configured 
> system may make it worse).

Since the MDA usually runs not as a system service, but rather at a
user-specific level, I would expect some interaction of the host locale
and the user-specific locale is the problem.

> Who knows? What's maildrop? What does its documentation say about 
> encodings?

I hope the original poster enjoys manpages, since that's how the program
is documented <http://www.courier-mta.org/maildrop/documentation.html>.

> The fact that it is using ASCII apparently by default does not give me
> confidence that it knows how to deal with 8-bit emails, but I might be
> completely wrong.

I've found that the problem is often that Python is the party assuming
that stdin and stdout are ASCII, largely because it hasn't been told
otherwise.

-- 
 \“The greatest tragedy in mankind's entire history may be the |
  `\   hijacking of morality by religion.” —Arthur C. Clarke, 1991 |
_o__)  |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


data validation when creating an object

2014-01-15 Thread Rita
I would like to do some data validation when its going to a class.

class Foo(object):
  def __init__(self):
    pass

I know its frowned upon to do work in the __init__() method and only
declarations should be there.

So, should i create a function called validateData(self) inside foo?

I would call the object like this

x=Foo()
x.validateData()

Is this the preferred way? Is there a way I can run validateData()
automatically, maybe put it in __init__? Or are there other techniques
people use for this sort of thing?


-- 
--- Get your facts first, then you can distort them as you please.--
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Ben Finney
Rita  writes:

> I would like to do some data validation when its going to a class.
>
> class Foo(object):
>   def __init__(self):
>     pass
>
> I know its frowned upon to do work in the __init__() method and only
> declarations should be there.

Who says it's frowned on to do work in the initialiser? Where are they
saying it? That seems over-broad, I'd like to read the context of that
advice.

> So, should i create a function called validateData(self) inside foo?

If you're going to create it, ‘validate_data’ would be a better name
(because it's PEP 8 conformant).

> I would call the object like this
>
> x=Foo()
> x.validateData()

You should also be surrounding the “=” operator with spaces (PEP 8
again) for readability.

> Is this the preferred way? Is there a way I can run validateData()
> automatically, maybe put it in __init__?

It depends entirely on what is being done in those functions.

But in general, we tend not to write our functions small enough or
focussed enough. So general advice would be that, if you think the
function is going to be too long and/or doing too much, you're probably
right :-)

-- 
 \ “Nature hath given men one tongue but two ears, that we may |
  `\  hear from others twice as much as we speak.” —Epictetus, |
_o__)  _Fragments_ |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Cameron Simpson
On 15Jan2014 20:09, Rita  wrote:
> I would like to do some data validation when its going to a class.
> 
> class Foo(object):
>   def __init__(self):
>     pass
> 
> I know its frowned upon to do work in the __init__() method and only
> declarations should be there.

This rule of thumb does not mean "do nothing". It may be perfectly
valid to open file or database connections in __init__, etc, depending
on what the object is for.

The post condition of __init__ is that the object is appropriately
initialised. The definition of appropriate depends on you.

Data validation is very much an appropriate thing to do in __init__.

It is highly recommended to validate your inputs if that is feasible
and easy at __init__ time. It is far better to get a ValueError
exception (the usual exception for invalid values, which an invalid
initialiser certainly is) at object creation time than at some less
convenient time later during use.

> So, should i create a function called validateData(self) inside foo?

It might be a good idea to make a (probably private) method to check
for validity and/or integrity, eg:

  def _is_valid(self):
    ... checks here, return True if ok ...

because that would allow you to call this at arbitrary other times
if you need to debug.

However, I would also have obvious validity checks in __init__
itself on the supplied values. Eg:

  def __init__(self, size, lifetime):
    if size < 1:
      raise ValueError("size must be >= 1, received: %r" % (size,))
    if lifetime <= 0:
      raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))

Trivial, fast. Fails early. Note that the exception reports the
received value; very handy for simple errors like passing utterly
the wrong thing (eg a filename when you wanted a counter, or something
like that).

Certainly also put a test of self._is_valid() at the bottom of
__init__, at least during the debug phase. Provided _is_valid() is
cheap and fast. For example, it would be appropriate to check that
a filename was a string. It would probably (your call) be inappropriate
to open the file or checksum its contents etc etc.
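
Put together, a sketch of the whole pattern might look like this. The class name Cache is made up for illustration; size and lifetime are the parameters from the snippet above, and the assert at the end of __init__ is the debug-phase check (it disappears when Python runs with -O).

```python
class Cache:
    """Hypothetical example class with early validation in __init__."""

    def __init__(self, size, lifetime):
        # Cheap checks on the supplied values: fail early, report what we got.
        if size < 1:
            raise ValueError("size must be >= 1, received: %r" % (size,))
        if lifetime <= 0:
            raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))
        self.size = size
        self.lifetime = lifetime
        # Debug-phase integrity check at the bottom of __init__.
        assert self._is_valid()

    def _is_valid(self):
        """Cheap consistency check, callable at any later time while debugging."""
        return isinstance(self.size, int) and self.size >= 1 and self.lifetime > 0


cache = Cache(10, 2.5)
try:
    Cache(0, 2.5)
except ValueError as exc:
    message = str(exc)
    print(message)
```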

Cheers,
-- 
Cameron Simpson 

I distrust a research person who is always obviously busy on a task.
- Robert Frosch, VP, GM Research
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: 'Straße' ('Strasse') and Python 2

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 11:43 AM, Steven D'Aprano
 wrote:
> Worse, linguists sometimes disagree as to what counts as a grapheme. For
> instance, some authorities consider the English "sh" to be a separate
> grapheme. As a native English speaker, I'm not sure about that. Certainly
> it isn't a separate letter of the alphabet, but on the other hand I can't
> think of any words containing "sh" that should be considered as two
> graphemes "s" followed by "h". Wait, no, that's not true... compound
> words such as "glasshouse" or "disheartened" are counter examples.

Digression: When I was taught basic English during my school days, my
mum used Spalding's book and the 70 phonograms. 25 of them are single
letters (Q is not a phonogram - QU is), and the others are mostly
pairs (there are a handful of 3- and 4-letter phonograms). Not every
instance of "s" followed by "h" is the phonogram "sh" - only the times
when it makes the single sound "sh" (which it doesn't in "glasshouse"
or "disheartened").

Thing is, you can't define spelling and pronunciation in terms of each
other, because you'll always be bitten by corner cases. Everyone knows
how "Thames" is pronounced... right? Well, no. There are (at least)
two rivers of that name, the famous one in London [1] and another one
further north [2]. The obscure one is pronounced the way the word
looks, the famous one isn't. And don't even get started on English
family names... Marjoribanks, Meux and Cholmondeley, as lampshaded [3]
in this song [4]! Even without names, though, there are the tricky
cases and the ones where different localities pronounce the same word
very differently; Unicode shouldn't have to deal with that by changing
whether something's a single character or two. Considering that
phonograms aren't even ligatures (though there is overlap, eg "Th"),
it's much cleaner to leave them as multiple characters.

ChrisA

[1] https://en.wikipedia.org/wiki/River_Thames
[2] Though it's better known as the Isis. https://en.wikipedia.org/wiki/The_Isis
[3] http://tvtropes.org/pmwiki/pmwiki.php/Main/LampshadeHanging
[4] http://www.stagebeauty.net/plays/th-arca2.html - "Mosh-banks",
"Mow", and "Chumley" are the pronunciations used
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Mark Lawrence

On 16/01/2014 01:09, Rita wrote:

I would like to do some data validation when its going to a class.

class Foo(object):
  def __init__(self):
    pass

I know its frowned upon to do work in the __init__() method and only
declarations should be there.


In the 10+ years that I've been using Python I don't ever recall seeing 
this, could we have a reference please.




So, should i create a function called validateData(self) inside foo?

I would call the object like this

x=Foo()
x.validateData()

Is this the preferred way? Is there a way I can run validateData()
automatically, maybe put it in __init__? Or are there other techniques
people use for this sort of thing?


--
--- Get your facts first, then you can distort them as you please.--





--
My fellow Pythonistas, ask not what our language can do for you, ask 
what you can do for our language.


Mark Lawrence

--
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 11:48 AM, Tim Chase
 wrote:
> On 2014-01-16 10:09, Chris Angelico wrote:
>> myWindow = Window(
>>  title="Hello World",
>>  children=[Button(
>>  label="I'm a button",
>>  onClick=exit
>>  )]
>> )
>
> This also solves the problem that **kwargs are just a dict, which is
> inherently unordered.  So with the previous scheme, you'd just get an
> unordered bag of controls that Python could then dump into your
> containing window as its dict-traversal algorithms saw fit. :-)

Yeah, I don't really want my window layout to randomize every Python startup :)

Actually... I'm really REALLY glad code like the previous version
wasn't prevalent. It would have made for intense opposition to hash
randomization - as I recall, the strongest voice against randomization
was that tests would start to fail (IMO, a simple way to print a
dictionary with its keys sorted would both solve that and provide an
aesthetically-pleasing display). Imagine if someone spent hours
crafting child object names in order to force the children to be added
in the right order, and then along comes the new Python and it's all
broken...

But I still think it should be a method (as it is in GTK, not familiar
enough with the others), as there's no way with the parameter system
to (a) add children after object creation, or (b) specify parameters
(GTK's boxes let you choose, per-child, whether they'll be expanded to
fill any spare space - by default they all will, ie spare room is
split between them, but often you want to choose one to be expanded
and another not). PyGTK is mostly there, but since its .add() method
returns None, chaining isn't possible. Method chaining like that is
somewhat controversial... I would love to have some kind of syntax
that says "and the value of this expression is the bit before the
dot", but I have no idea what would be clean.
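
For what it's worth, the chaining style I mean would look something like this sketch. Box here is a hypothetical container, not PyGTK's API; the only point is that add() returns self rather than None, and that it can take a per-child option.

```python
class Box:
    """Hypothetical container whose add() returns self to allow chaining."""

    def __init__(self):
        self.children = []

    def add(self, child, expand=True):
        # Per-child options travel with the child; order of calls is the layout order.
        self.children.append((child, expand))
        return self


box = Box().add("label", expand=False).add("text area")
print(box.children)
```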

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Bind event is giving me a bug.

2014-01-15 Thread Terry Reedy

On 1/15/2014 3:16 PM, eneskri...@gmail.com wrote:

While working with tkinter in python 3.3, I had the following problem.


Please paste working code that people can experiment with.

from tkinter import *


def get_text(event):


If this were a method, (which the indent of the body suggests it once 
was) it would have to have a 'self' parameter, and you would have to 
bind a bound method.



 self.number_of_competitors = entered_text.get()


Since it is just a function, and has no 'self' parameter, this raises 
NameError. I condensed the function to


    try:
        int(entered_text.get())
        root.destroy()
    except ValueError:
        label.config(text = "Enter the number of competitors. Please enter a number.")



 try:
 self.number_of_competitors = int(self.number_of_competitors)
 except:


Bare excepts are bad.


 pass
 if type(self.number_of_competitors) == int:
 root.destroy()
 else:
 label.config(text = "Enter the number of competitors. Please enter a number.")
root = Tk()
label = Label(root, text = "Enter the number of competitors.")
label.pack(side = TOP)
entered_text = Entry(root)


Since Entry only allows one line, I would have thought that it should 
take a command=func option invoked by \n. Instead, it seems to swallow 
newlines.



entered_text.pack()
Button(root, text = "Submit", command = get_text).pack()


As near as I can tell, the Button button-press event is *not* bound to 
get_text but to a fixed event handler that calls get_text *without* an 
argument.



root.bind('', get_text)


This does bind to an event so that it does call with an event arg. I 
just removed this and the window acts as it should.


Since get_text ignores event, event=None should make it work either 
way. However, when I try that, the window disappears without being 
touched, as if \n is randomly generated internally. So I would say to 
skip this until you know more than I do.



root.mainloop()

This is a buggy part of the code. When I run it, instead of doing what it 
should do, it responds to all events BUT enter. I'm not sure if this error is 
on tkinters or my side. Please help!


--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 12:25 PM, Cameron Simpson  wrote:
> However, I would also have obvious validity checks in __init__
> itself on the supplied values. Eg:
>
>   def __init__(self, size, lifetime):
> if size < 1:
>   raise ValueError("size must be >= 1, received: %r" % (size,))
> if lifetime <= 0:
>   raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))
>
> Trivial, fast. Fails early. Note that the exception reports the
> receive value; very handy for simple errors like passing utterly
> the wrong thing (eg a filename when you wanted a counter, or something
> like that).

With code like this, passing a filename as the size will raise TypeError on Py3:

>>> size = "test.txt"
>>> size < 1
Traceback (most recent call last):
  File "", line 1, in 
TypeError: unorderable types: str() < int()

Yet another advantage of Py3 :)

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Guessing the encoding from a BOM

2014-01-15 Thread Steven D'Aprano
I have a function which guesses the likely encoding used by text files by 
reading the BOM (byte order mark) at the beginning of the file. A 
simplified version:


def guess_encoding_from_bom(filename, default):
    with open(filename, 'rb') as f:
        sig = f.read(4)
    if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
        return 'utf_16'
    elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
        return 'utf_32'
    else:
        return default


The idea is that you can call the function with a file name and a default 
encoding to return if one can't be guessed. I want to provide a default 
value for the default argument (a default default), but one which will 
unconditionally fail if you blindly go ahead and use it.

E.g. I want to either provide a default:

enc = guess_encoding_from_bom("filename", 'latin1')
f = open("filename", encoding=enc)


or I want to write:

enc = guess_encoding_from_bom("filename")
if enc == something:
 # Can't guess, fall back on an alternative strategy
 ...
else:
 f = open("filename", encoding=enc)


If I forget to check the returned result, I should get an explicit 
failure as soon as I try to use it, rather than silently returning the 
wrong results.

What should I return as the default default? I have four possibilities:

(1) 'undefined', which is a standard encoding guaranteed to 
raise an exception when used;

(2) 'unknown', which best describes the result, and currently 
there is no encoding with that name;

(3) None, which is not the name of an encoding; or

(4) Don't return anything, but raise an exception. (But 
which exception?)


Apart from option (4), here are the exceptions you get from blindly using 
options (1) through (3):

py> 'abc'.encode('undefined')
Traceback (most recent call last):
  File "", line 1, in 
  File "/usr/local/lib/python3.3/encodings/undefined.py", line 19, in 
encode
raise UnicodeError("undefined encoding")
UnicodeError: undefined encoding

py> 'abc'.encode('unknown')
Traceback (most recent call last):
  File "", line 1, in 
LookupError: unknown encoding: unknown

py> 'abc'.encode(None)
Traceback (most recent call last):
  File "", line 1, in 
TypeError: encode() argument 1 must be str, not None


At the moment, I'm leaning towards option (1). Thoughts?
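
For illustration, here is a sketch of how option (1) might look as a default default. Two things in it are my own assumptions rather than part of the simplified version: it uses the BOM constants from the codecs module, and it tests for UTF-32 before UTF-16, because the little-endian UTF-32 BOM begins with the same two bytes as the little-endian UTF-16 BOM.

```python
import codecs


def guess_encoding_from_bom(filename, default="undefined"):
    """Guess an encoding from the BOM; fall back on `default` if there is none."""
    with open(filename, "rb") as f:
        sig = f.read(4)
    # UTF-32 first: BOM_UTF32_LE (FF FE 00 00) starts with BOM_UTF16_LE (FF FE).
    if sig.startswith((codecs.BOM_UTF32_BE, codecs.BOM_UTF32_LE)):
        return "utf_32"
    if sig.startswith((codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE)):
        return "utf_16"
    # 'undefined' is a real codec that raises UnicodeError whenever it is used.
    return default
```

With this default, forgetting to check the result and passing it straight to open(..., encoding=enc) fails with UnicodeError on first use rather than silently misreading the file.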



-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Daniel da Silva
On Tue, Jan 14, 2014 at 8:27 PM, Steven D'Aprano <
steve+comp.lang.pyt...@pearwood.info> wrote:
>
> But reading Guido, I think he's saying that wouldn't be a good idea. I
> don't get it -- it's not a violation of the Liskov Substitution
> Principle, because it's more restrictive, not less. What am I missing?
>

Just to be pedantic, this *is* a violation of the Liskov Substitution
Principle. According to Wikipedia, the principle states:

> if S is a subtype of T, then objects of type T may be replaced
> with objects of type S (i.e., objects of type S may be *substituted* for
> objects of type T) without altering any of the desirable properties of that
> program (correctness, task performed, etc.) [0]


Since S (TextOnlyDict) is more restrictive, it cannot be substituted for T
(dict) because the program may be using non-string keys.
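
A minimal sketch of that point. It assumes TextOnlyDict is completed by delegating to dict.__setitem__ (the class quoted in the thread stops at the raise), and count_by_length is a made-up caller written against plain dict:

```python
class TextOnlyDict(dict):
    """Assumed completion of the class from the thread: only str keys allowed."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str, got %r" % (key,))
        super().__setitem__(key, value)


def count_by_length(words, mapping):
    """Hypothetical caller that was written for plain dict and uses int keys."""
    for word in words:
        mapping[len(word)] = mapping.get(len(word), 0) + 1
    return mapping


# Works with the base type...
print(count_by_length(["a", "bb", "cc"], {}))

# ...but substituting the subclass changes the program's behaviour.
try:
    count_by_length(["a", "bb", "cc"], TextOnlyDict())
except TypeError as exc:
    print("TextOnlyDict could not be substituted:", exc)
```

Whether this matters depends on whether any caller ever stores non-string keys.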


Daniel
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Rita
Unfortunately, I couldn't find the reference but I know I read it
somewhere. Even with a selective search I wasn't able to find it. I think I
read it in context of module/class test case writing.



I will keep your responses in mind therefore I will put logic in __init__
for data validation.

thanks again for the responses.





On Wed, Jan 15, 2014 at 8:46 PM, Chris Angelico  wrote:

> On Thu, Jan 16, 2014 at 12:25 PM, Cameron Simpson  wrote:
> > However, I would also have obvious validity checks in __init__
> > itself on the supplied values. Eg:
> >
> >   def __init__(self, size, lifetime):
> >     if size < 1:
> >       raise ValueError("size must be >= 1, received: %r" % (size,))
> >     if lifetime <= 0:
> >       raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))
> >
> > Trivial, fast. Fails early. Note that the exception reports the
> > received value; very handy for simple errors like passing utterly
> > the wrong thing (eg a filename when you wanted a counter, or something
> > like that).
>
> With code like this, passing a filename as the size will raise TypeError
> on Py3:
>
> >>> size = "test.txt"
> >>> size < 1
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: unorderable types: str() < int()
>
> Yet another advantage of Py3 :)
>
> ChrisA
> --
> https://mail.python.org/mailman/listinfo/python-list
>



-- 
--- Get your facts first, then you can distort them as you please.--
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Python declarative

2014-01-15 Thread Terry Reedy

On 1/15/2014 6:09 PM, Chris Angelico wrote:

On Thu, Jan 16, 2014 at 9:58 AM, Terry Reedy  wrote:

class Window:
    def __init__(self, title, *kwds):  # or title='Window title'
        self.title = title
        self.__dict__.update(kwds)


Does that want a second asterisk, matching the Button definition?


I must have changed to **kwds after copying.



Possible, but potentially messy; if you happen to name your button
"icon", it might be misinterpreted as an attempt to set the window's
icon, and cause a very strange and incomprehensible error.


Puns are always a problem with such interfaces. Validate the args as much as
possible. An icon should be a bitmap of appropriate size. Optional args
should perhaps all be widgets (instances of a Widget baseclass).


Yeah, but you'd still get back an error saying "icon should be a
bitmap" where the real problem is "icon should be called something
else".


One could say so in the message

InterfaceError("The icon object must be a bitmap or else the non-bitmap "
               "object should be called something else.")



It might be worth explicitly adorning properties, or separating
them into two categories. Since the keyword-named-children system has
the other problem of being hard to lay out (how do you specify the
order?), I'd look at keyword args for properties and something
separate for children - either the layout I used above with .add(),
which allows extra args as necessary, or something like this:

myWindow = Window(
  title="Hello World",
  children=[Button(
  label="I'm a button",
  onClick=exit
  )]
)
Or maybe allow "child=" as a shortcut, since a lot of widgets will
have exactly one child.



--
Terry Jan Reedy

--
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Terry Reedy

On 1/15/2014 8:09 PM, Rita wrote:


I know it's frowned upon to do work in the __init__() method and only
declarations should be there.


Dear Python beginners:

Don't believe the Python rules people write unless it is by one of the 
core developers or one of the other experts posting here. Even then, be 
skeptical. Even these people disagree on some guidelines.


PS. The basic guideline is that your program should work correctly, and 
some people have disputed even that ;-)


--
Terry Jan Reedy


--
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-15 Thread Ben Finney
Steven D'Aprano  writes:

> enc = guess_encoding_from_bom("filename")
> if enc == something:
>  # Can't guess, fall back on an alternative strategy
>  ...
> else:
>  f = open("filename", encoding=enc)
>
>
> If I forget to check the returned result, I should get an explicit
> failure as soon as I try to use it, rather than silently returning the
> wrong results.

Yes, agreed.

> What should I return as the default default? I have four possibilities:
>
> (1) 'undefined', which is a standard encoding guaranteed to 
> raise an exception when used;

+0.5. This describes the outcome of the guess.

> (2) 'unknown', which best describes the result, and currently 
> there is no encoding with that name;

+0. This *better* describes the outcome, but I don't think adding a new
name is needed nor very helpful.

> (3) None, which is not the name of an encoding; or

−1. This is too much like a real result and doesn't adequately indicate
the failure.

> (4) Don't return anything, but raise an exception. (But 
> which exception?)

+1. I'd like a custom exception class, sub-classed from ValueError.
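
A rough sketch of option (4) with such an exception, for illustration only. The names EncodingGuessError and _NO_DEFAULT are invented here, and the BOM table is an assumption about what the function checks; the BOM constants themselves come from the standard codecs module:

```python
import codecs


class EncodingGuessError(ValueError):
    """Hypothetical name: no BOM was recognised and no default was supplied."""


_NO_DEFAULT = object()  # sentinel, so that None remains usable as a real default

# Longer BOMs are listed first so that UTF-32 LE (FF FE 00 00) is not
# mistaken for UTF-16 LE (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_BE, "utf_32"),
    (codecs.BOM_UTF32_LE, "utf_32"),
    (codecs.BOM_UTF8, "utf_8_sig"),
    (codecs.BOM_UTF16_BE, "utf_16"),
    (codecs.BOM_UTF16_LE, "utf_16"),
]


def guess_encoding_from_bom(filename, default=_NO_DEFAULT):
    """Return an encoding name based on the file's BOM.

    If no BOM matches, return `default` when one was given,
    otherwise raise EncodingGuessError.
    """
    with open(filename, "rb") as f:
        sig = f.read(4)
    for bom, name in _BOMS:
        if sig.startswith(bom):
            return name
    if default is _NO_DEFAULT:
        raise EncodingGuessError("no recognised BOM in %r" % (filename,))
    return default
```

With this shape, a caller who forgets to handle the "can't guess" case gets an exception at the call site instead of a bad value later on.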

-- 
 \   “I love to go down to the schoolyard and watch all the little |
  `\   children jump up and down and run around yelling and screaming. |
_o__) They don't know I'm only using blanks.” —Emo Philips |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Is it possible to get string from function?

2014-01-15 Thread Roy Smith
I realize the subject line is kind of meaningless, so let me explain :-)

I've got some unit tests that look like:

class Foo(TestCase):
  def test_t1(self):
    RECEIPT = "some string"

  def test_t2(self):
    RECEIPT = "some other string"

  def test_t3(self):
    RECEIPT = "yet a third string"

and so on.  It's important that the strings be mutually unique.  In the 
example above, it's trivial to look at them and observe that they're all 
different, but in real life, the strings are about 2500 characters long, 
hex-encoded.  It even turns out that a couple of the strings are 
identical in the first 1000 or so characters, so it's not trivial to do 
by visual inspection.

So, I figured I would write a meta-test, which used introspection to 
find all the methods in the class, extract the strings from them (they 
are all assigned to a variable named RECEIPT), and check to make sure 
they're all different.

Is it possible to do that?  It is straight-forward using the inspect 
module to discover the methods, but I don't see any way to find what 
strings are assigned to a variable with a given name.  Of course, that 
assignment doesn't even happen until the function is executed, so 
perhaps what I want just isn't possible?

It turns out, I solved the problem with more mundane tools:

grep 'RECEIPT = ' test.py | sort | uniq -c

and I could have also solved the problem by putting all the strings in a 
dict and having the functions pull them out of there.  But, I'm still 
interested in exploring if there is any way to do this with 
introspection, as an academic exercise.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Question about object lifetime and access

2014-01-15 Thread Asaf Las
First of all many thanks to all for their detailed answers on subject. 
I really appreciate it!

> Correct. The global name is a reference, so the reference count will be
> at least 1. In fact, referencing the name from a function or method
> doesn't increase the ref count:
> --
> Steven

I have run some tests. Accessing the object from a function does increase the
refcount, but only temporarily, and only if the object is bound to a local name
within the function. I guess that as soon as k is bound to the string object, the
latter's reference count increases, and when the function returns, k is unbound
from the object again (according to the output below).

What is interesting: if the module namespace accounts for 1 reference, what are
the remaining 3 references counted on the string object referenced by 'p'?
(CPython 3.3.2 on Windows 7, run from within Eclipse/PyDev; same output on
CentOS 6.5 with Python 3.3.3.)

from sys import getrefcount

p = "test script"
print("refcnt before func() ", getrefcount(p))

def access_p1():
    global p
    print("refcnt inside func1()", getrefcount(p))

def access_p2():
    global p
    k = p
    print("refcnt inside func2()", getrefcount(p))

access_p1()
access_p2()

print("refcnt after  func() ", getrefcount(p))

--
Output:

refcnt before func()  4
refcnt inside func1() 4
refcnt inside func2() 5
refcnt after  func()  4
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Roy Smith
Rita  writes:
>> I know it's frowned upon to do work in the __init__() method and only
>> declarations should be there.


In article ,
 Ben Finney  wrote:

> Who says it's frowned on to do work in the initialiser? Where are they
> saying it? That seems over-broad, I'd like to read the context of that
> advice.

Weird, I was just having this conversation at work earlier this week.

There are some people who advocate that C++ constructors should not do a 
lot of work and/or should be incapable of throwing exceptions.  The pros 
and cons of that argument are largely C++ specific.  Here's a Stack 
Overflow thread which covers most of the usual arguments on both sides:

http://stackoverflow.com/questions/293967/how-much-work-should-be-done-in-a-constructor

But, Python is not C++.  I suspect the people who argue for __init__() 
not doing much are extrapolating a C++ pattern to other languages 
without fully understanding the reason why.

That being said, I've been on a tear lately, trying to get our unit test 
suite to run faster.  I came across one slow test which had an 
interesting twist.  The class being tested had an __init__() method 
which read over 900,000 records from a database and took something like 
5-10 seconds to run.  Man, talk about heavy-weight constructors :-)
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Gregory Ewing

Daniel da Silva wrote:

Just to be pedantic, this /is/ a violation of the Liskov Substitution 
Principle. According to Wikipedia, the principle states:


 if S is a subtype of T, then objects of type T may be
replaced with objects of type S (i.e., objects of type S may
be /substituted/ for objects of type T) without altering any of the
desirable properties of that program


Something everyone seems to miss when they quote the LSP
is that what the "desirable properties of the program" are
*depends on the program*.

Whenever you create a subclass, there is always *some*
difference between the behaviour of the subclass and
the base class, otherwise there would be no point in
having the subclass. Whether that difference has any
bad consequences for the program depends on what the
program does with the objects.

So you can't just look at S and T in isolation and
decide whether they satisfy the LSP or not. You need
to consider them in context.

In Python, there's a special problem with subclassing
dicts in particular: some of the core interpreter code
assumes a plain dict and bypasses the lookup of
__getitem__ and __setitem__, going straight to the
C-level implementations. If you tried to use a dict
subclass in that context that overrode those methods,
your overridden versions wouldn't get called.

But if you never use your dict subclass in that way,
there is no problem. Or if you don't override those
particular methods, there's no problem either.
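
A small illustration of the same kind of bypass, observable from pure Python on CPython. It shows dict's own C-implemented methods skipping the override, which is a neighbouring case rather than the interpreter internals themselves; the completed TextOnlyDict body is an assumption:

```python
class TextOnlyDict(dict):
    """Assumed completion of the class from the thread."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str, got %r" % (key,))
        super().__setitem__(key, value)


d = TextOnlyDict()

# Item assignment goes through the override and is rejected.
try:
    d[1] = "x"
except TypeError as exc:
    print("blocked:", exc)

# On CPython, update() and the constructor store items at the C level
# and never look up the overridden __setitem__.
d.update({1: "x"})
e = TextOnlyDict({2: "y"})

print(d, e)  # both now hold non-string keys
```

So a subclass that wants to constrain keys has to override every entry point, not only __setitem__.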

If you're giving advice to someone who isn't aware
of all the fine details, "don't subclass dict" is
probably the safest thing to say. But there are
legitimate use cases for it if you know what you're
doing.

The other issue is that people are often tempted to
subclass dict in order to implement what isn't really
a dict at all, but just a custom mapping type. The
downside to that is that you end up inheriting a
bunch of dict-specific methods that don't really
make sense for your type. In that case it's usually
better to start with a fresh class that *uses* a
dict as part of its implementation, and only
exposes the methods that are really needed.
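
One way to sketch that approach is with collections.abc.MutableMapping, which builds update(), setdefault() and friends on top of the five methods defined below. The class name TextKeyMapping is invented for this example:

```python
from collections.abc import MutableMapping


class TextKeyMapping(MutableMapping):
    """Mapping restricted to str keys; wraps a dict instead of subclassing it."""

    def __init__(self, other=(), **kwargs):
        self._data = {}
        # MutableMapping.update() is written in terms of __setitem__,
        # so the key check below applies here too.
        self.update(other, **kwargs)

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("keys must be str, got %r" % (key,))
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __iter__(self):
        return iter(self._data)

    def __len__(self):
        return len(self._data)


m = TextKeyMapping({"a": 1}, b=2)
try:
    m.update({3: "c"})
except TypeError as exc:
    print("rejected:", exc)
print(sorted(m.items()))
```

The trade-off is that the result is not an instance of dict, so code that insists on a real dict will not accept it.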

--
Greg
--
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Ben Finney
Roy Smith  writes:

>  Ben Finney  wrote:
>
> > Who says it's frowned on to do work in the initialiser? Where are they
> > saying it? That seems over-broad, I'd like to read the context of that
> > advice.
>
> There are some people who advocate that C++ constructors should not do
> a lot of work and/or should be incapable of throwing exceptions. The
> pros and cons of that argument are largely C++ specific. […]
>
> But, Python is not C++. I suspect the people who argue for __init__()
> not doing much are extrapolating a C++ pattern to other languages
> without fully understanding the reason why.

Even simpler: They are mistaken in what the constructor is named, in
Python.

Python classes have the constructor, ‘__new__’. I would agree with
advice not to do anything but allocate the resources for a new instance
in the constructor.

Indeed, the constructor from ‘object’ does a good enough job that the
vast majority of Python classes never need a custom constructor at all.

(This is probably why many beginning programmers are confused about what
the constructor is called: They've never seen a class with its own
constructor!)

Python instances have an initialiser, ‘__init__’. That function is for
setting up the specific instance for later use. This is commonly
over-ridden and many classes define a custom initialiser, which normally
does some amount of work.

I don't think ‘__init__’ is subject to the conventions of a constructor,
because *‘__init__’ is not a constructor*.
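
The split can be made visible with a toy class (Tracked and the calls list are invented for this example): calling the class runs __new__ first to create the instance, then __init__ to set it up.

```python
calls = []


class Tracked:
    def __new__(cls, *args, **kwargs):
        # Creates the bare instance; object.__new__ takes only the class here.
        calls.append("__new__")
        return super().__new__(cls)

    def __init__(self, size):
        # Receives the already-created instance and does the set-up work.
        calls.append("__init__")
        if size < 1:
            raise ValueError("size must be >= 1, received: %r" % (size,))
        self.size = size


obj = Tracked(3)
print(calls)  # ['__new__', '__init__']
```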

-- 
 \“Absurdity, n. A statement or belief manifestly inconsistent |
  `\with one's own opinion.” —Ambrose Bierce, _The Devil's |
_o__)Dictionary_, 1906 |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 1:13 PM, Steven D'Aprano
 wrote:
> if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>     return 'utf_16'
> elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>     return 'utf_32'

I'd swap the order of these two checks. If the file starts FF FE 00
00, your code will guess that it's UTF-16 and begins with a U+0000.
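
A quick check of the two orderings, as a sketch. The function names are made up, and the sample bytes are built explicitly as little-endian so the result does not depend on the machine's byte order:

```python
import codecs


def guess_utf16_first(sig):
    """Checks in the order of the quoted snippet."""
    if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
        return 'utf_16'
    elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
        return 'utf_32'
    return None


def guess_utf32_first(sig):
    """Same checks with the longer (UTF-32) BOMs tested first."""
    if sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
        return 'utf_32'
    elif sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
        return 'utf_16'
    return None


# UTF-32 little-endian BOM followed by little-endian UTF-32 text.
sig = codecs.BOM_UTF32_LE + "hi".encode("utf_32_le")

print(guess_utf16_first(sig))  # the UTF-16 prefix matches first
print(guess_utf32_first(sig))
```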

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)

2014-01-15 Thread Ben Finney
Roy Smith  writes:

> I've got some unit tests that look like:
>
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
>
>   def test_t2(self):
>     RECEIPT = "some other string"
>
>   def test_t3(self):
>     RECEIPT = "yet a third string"
>
> and so on.

That looks like a poorly defined class.

Are the test cases pretty much identical other than the data in those
strings? If so, use a collection of strings and generate separate tests
for each one dynamically.

In Python 2 and 3, you can use the ‘testscenarios’ library for that
purpose <https://pypi.python.org/pypi/testscenarios>.

In Python 3.4 and later, the ‘unittest’ module has “subtests” for the same purpose
<http://docs.python.org/3.4/library/unittest.html#distinguishing-test-iterations-using-subtests>.
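
A minimal subtests sketch, assuming the tests really do differ only in their data. The RECEIPTS values are short placeholders and the check performed is invented for the example:

```python
import unittest

RECEIPTS = ["0A1B", "0A1C", "FFEE"]  # placeholder data, not real receipts


class ReceiptTests(unittest.TestCase):
    def test_receipts_are_hex(self):
        for receipt in RECEIPTS:
            # Each iteration is reported separately if it fails.
            with self.subTest(receipt=receipt):
                int(receipt, 16)  # raises ValueError for non-hex input


# Run the case explicitly rather than via unittest.main(), so the
# snippet can be pasted into a larger script without exiting it.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ReceiptTests)
result = unittest.TextTestRunner().run(suite)
```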

> and I could have also solved the problem by putting all the strings in
> a dict and having the functions pull them out of there. But, I'm still
> interested in exploring if there is any way to do this with
> introspection, as an academic exercise.

Since I don't think your use case is best solved this way, I'll leave
the academic exercise to someone else.

-- 
 \“… Nature … is seen to do all things Herself and through |
  `\ herself of own accord, rid of all gods.” —Titus Lucretius |
_o__) Carus, c. 40 BCE |
Ben Finney

-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Roy Smith
In article ,
 Ben Finney  wrote:

> Roy Smith  writes:
> > But, Python is not C++. I suspect the people who argue for __init__()
> > not doing much are extrapolating a C++ pattern to other languages
> > without fully understanding the reason why.
> 
> Even simpler: They are mistaken in what the constructor is named, in
> Python.
> 
> Python classes have the constructor, ‘__new__’. I would agree with
> advice not to do anything but allocate the resources for a new instance
> in the constructor.

I've always found this distinction to be somewhat silly.

C++ constructors are also really just initializers.  Before your 
constructor is called, something else (operator new, at least for 
objects in the heap) has already allocated memory for the object.  It's 
the constructor's job to initialize the data.  That's really very much 
the same distinction as between __new__() and __init__().
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Dynamic generation of test cases for each input datum (was: Is it possible to get string from function?)

2014-01-15 Thread Roy Smith
In article ,
 Ben Finney  wrote:

> Roy Smith  writes:
> 
> > I've got some unit tests that look like:
> >
> > class Foo(TestCase):
> >   def test_t1(self):
> >     RECEIPT = "some string"
> >
> >   def test_t2(self):
> >     RECEIPT = "some other string"
> >
> >   def test_t3(self):
> >     RECEIPT = "yet a third string"
> >
> > and so on.
> 
> That looks like a poorly defined class.
> 
> Are the test cases pretty much identical other than the data in those
> strings?

No, each test is quite different.  The only thing they have in common is 
they all involve a string representation of a transaction receipt.  I 
elided the actual test code in my example above because it wasn't 
relevant to my question.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 2:46 PM, Roy Smith  wrote:
> So, I figured I would write a meta-test, which used introspection to
> find all the methods in the class, extract the strings from them (they
> are all assigned to a variable named RECEIPT), and check to make sure
> they're all different.

In theory, it should be. You can disassemble the function and find the
assignment. Check out Lib/dis.py - or just call it and process its
output. Names of local variables are found in
test_t1.__code__.co_varnames, the constants themselves are in
test_t1.__code__.co_consts, and then it's just a matter of matching up
which constant got assigned to the slot represented by the name
RECEIPT.

But you might be able to shortcut it enormously. You say the strings
are "about 2500 characters long, hex-encoded". What are the chances of
having another constant, somewhere in the test function, that also
happens to be roughly that long and hex-encoded? If the answer is
"practically zero", then skip the code, skip co_varnames, and just look
through co_consts.



class TestCase:
  pass # not running this in the full environment

class Foo(TestCase):
  def test_t1(self):
    RECEIPT = "some string"

  def test_t2(self):
    RECEIPT = "some other string"

  def test_t3(self):
    RECEIPT = "yet a third string"

  def test_oops(self):
    RECEIPT = "some other string"

unique = {}
for funcname in dir(Foo):
    if funcname.startswith("test_"):
        for const in getattr(Foo,funcname).__code__.co_consts:
            if isinstance(const, str) and const.endswith("string"):
                if const in unique:
                    print("Collision!", unique[const], "and", funcname)
                unique[const] = funcname



This depends on your RECEIPT strings ending with the word "string" -
change the .endswith() check to be whatever it takes to distinguish
your critical constants from everything else you might have. Maybe:

# or use lower-case letters, or both, according to your hex encoding
CHARSET = set("0123456789ABCDEF")

if isinstance(const, str) and len(const)>2048 and set(const)<=CHARSET:

Anything over 2KB with no characters outside of that set is highly
likely to be what you want. Of course, this whole theory goes out the
window if your test functions can reference another test's RECEIPT;
though if you can guarantee that this is the *first* such literal (if
RECEIPT="..." is the first thing the function does), then you could
just add a 'break' after the unique[const]=funcname assignment and
it'll check only the first - co_consts is ordered.
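
For the full "match the constant to the name" route, here is a sketch using dis.get_instructions (Python 3.4+). It assumes the assignment compiles to a LOAD_CONST immediately followed by a STORE_FAST, which is a CPython bytecode detail and may not hold on every version or implementation; receipt_constant is a made-up helper name:

```python
import dis


def receipt_constant(func, name="RECEIPT"):
    """Return the constant stored into local `name`, or None if not found."""
    previous = None
    for instr in dis.get_instructions(func):
        if (instr.opname == "STORE_FAST" and instr.argval == name
                and previous is not None
                and previous.opname.startswith("LOAD_CONST")):
            return previous.argval
        previous = instr
    return None


class Foo:  # stands in for the real TestCase subclass
    def test_t1(self):
        RECEIPT = "some string"

    def test_t2(self):
        RECEIPT = "some other string"

    def test_t3(self):
        RECEIPT = "yet a third string"

    def test_oops(self):
        RECEIPT = "some other string"


seen = {}
collisions = []
for funcname in sorted(vars(Foo)):
    if funcname.startswith("test_"):
        const = receipt_constant(getattr(Foo, funcname))
        if const is None:
            continue  # no recognisable RECEIPT assignment
        if const in seen:
            collisions.append((seen[const], funcname))
        else:
            seen[const] = funcname

print(collisions)
```

Unlike the co_consts scan, this does not need a heuristic to tell receipts from other string constants, at the cost of depending on bytecode layout.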

An interesting little problem!

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-15 Thread Ethan Furman

On 01/15/2014 07:47 PM, Ben Finney wrote:

Steven D'Aprano writes:


 (4) Don't return anything, but raise an exception. (But
 which exception?)


+1. I'd like a custom exception class, sub-classed from ValueError.


+1

--
~Ethan~
--
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-15 Thread Roy Smith
In article ,
 Chris Angelico  wrote:

> On Thu, Jan 16, 2014 at 2:46 PM, Roy Smith  wrote:
> > So, I figured I would write a meta-test, which used introspection to
> > find all the methods in the class, extract the strings from them (they
> > are all assigned to a variable named RECEIPT), and check to make sure
> > they're all different.
>> [...]
> But you might be able to shortcut it enormously. You say the strings
> are "about 2500 characters long, hex-encoded". What are the chances of
> having another constant, somewhere in the test function, that also
> happens to be roughly that long and hex-encoded?

The chances are exactly zero.

> If the answer is "practically zero", then skip the code, skip 
> co_varnames, and just look through co_consts.

That sounds like it should work, thanks!

> Of course, this whole theory goes out the
> window if your test functions can reference another test's RECEIPT;

No, they don't do that.
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-15 Thread Chris Angelico
On Thu, Jan 16, 2014 at 4:40 PM, Roy Smith  wrote:
>> But you might be able to shortcut it enormously. You say the strings
>> are "about 2500 characters long, hex-encoded". What are the chances of
>> having another constant, somewhere in the test function, that also
>> happens to be roughly that long and hex-encoded?
>
> The chances are exactly zero.
>
>> If the answer is "practically zero", then skip the code, skip
>> co_varnames, and just look through co_consts.
>
> That sounds like it should work, thanks!
>
>> Of course, this whole theory goes out the
>> window if your test functions can reference another test's RECEIPT;
>
> No, they don't do that.

Awesome! Makes it easy then.

ChrisA
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Cameron Simpson
On 16Jan2014 12:46, Chris Angelico  wrote:
> On Thu, Jan 16, 2014 at 12:25 PM, Cameron Simpson  wrote:
> > However, I would also have obvious validity checks in __init__
> > itself on the supplied values. Eg:
> >
> >   def __init__(self, size, lifetime):
> >     if size < 1:
> >       raise ValueError("size must be >= 1, received: %r" % (size,))
> >     if lifetime <= 0:
> >       raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))
> >
> > Trivial, fast. Fails early. Note that the exception reports the
> > received value; very handy for simple errors like passing utterly
> > the wrong thing (eg a filename when you wanted a counter, or something
> > like that).
> 
> With code like this, passing a filename as the size will raise TypeError on 
> Py3:

I thought of this, but had already dispatched my message:-(
I actually thought Py2 would give me a TypeError, but I see it doesn't.
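
A small sketch of how those checks behave on Python 3. The class name Cache is invented; only the two checks mirror the quoted __init__:

```python
class Cache:
    """Hypothetical class used to exercise the checks from the thread."""

    def __init__(self, size, lifetime):
        if size < 1:
            raise ValueError("size must be >= 1, received: %r" % (size,))
        if lifetime <= 0:
            raise ValueError("lifetime must be > 0, received: %r" % (lifetime,))
        self.size = size
        self.lifetime = lifetime


for args in [(10, 2.5), (0, 2.5), ("test.txt", 2.5)]:
    try:
        Cache(*args)
    except (TypeError, ValueError) as exc:
        # On Python 3 the str-vs-int comparison itself raises TypeError.
        print(args, "->", type(exc).__name__ + ":", exc)
    else:
        print(args, "-> ok")
```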
-- 
Cameron Simpson 

The significant problems we face cannot be solved at the same level of
thinking we were at when we created them.   - Albert Einstein
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Chanelling Guido - dict subclasses

2014-01-15 Thread Devin Jeanpierre
On Wed, Jan 15, 2014 at 8:51 AM, John Ladasky
 wrote:
> On Wednesday, January 15, 2014 12:40:33 AM UTC-8, Peter Otten wrote:
>> Personally I feel dirty whenever I write Python code that defeats duck-
>> typing -- so I would not /recommend/ any isinstance() check.
>
> While I am inclined to agree, I have yet to see a solution to the problem of 
> flattening nested lists/tuples which avoids isinstance().  If anyone has 
> written one, I would like to see it, and consider its merits.

As long as you're the one that created the nested list structure, you
can choose to create a different structure instead, one which doesn't
require typechecking values inside your structure.

For example, os.walk has a similar kind of problem; it uses separate
lists for the subdirectories and the rest of the files, rather than
requiring you to check each child to see if it is a directory. It can
do it this way because it doesn't need to preserve the interleaved
order of directories and files, but there's other solutions for you if
you do want to preserve that order. (Although they won't be as clean
as they would be in a language with ADTs)
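
As a sketch of that idea (the Node class and its attribute names are invented here): keep leaves and sub-structures in separate lists, the way os.walk separates files from directories, and flattening needs no type checks. Like os.walk, this version does not preserve any interleaving of leaves and children.

```python
class Node:
    """Hypothetical container: leaves and child nodes are stored separately."""

    def __init__(self, leaves=(), children=()):
        self.leaves = list(leaves)
        self.children = list(children)


def flatten(node):
    """Yield every leaf in the tree, depth first, without isinstance()."""
    for leaf in node.leaves:
        yield leaf
    for child in node.children:
        yield from flatten(child)


tree = Node([1, 2], [Node([3], [Node([4])]), Node([5])])
print(list(flatten(tree)))  # [1, 2, 3, 4, 5]
```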

-- Devin
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-15 Thread Steven D'Aprano
On Thu, 16 Jan 2014 16:01:56 +1100, Chris Angelico wrote:

> On Thu, Jan 16, 2014 at 1:13 PM, Steven D'Aprano
>  wrote:
>> if sig.startswith((b'\xFE\xFF', b'\xFF\xFE')):
>>     return 'utf_16'
>> elif sig.startswith((b'\x00\x00\xFE\xFF', b'\xFF\xFE\x00\x00')):
>>     return 'utf_32'
> 
> I'd swap the order of these two checks. If the file starts FF FE 00 00,
> your code will guess that it's UTF-16 and begins with a U+0000.

Good catch, thank you.


-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Guessing the encoding from a BOM

2014-01-15 Thread Steven D'Aprano
On Thu, 16 Jan 2014 14:47:00 +1100, Ben Finney wrote:

> Steven D'Aprano  writes:
> 
>> enc = guess_encoding_from_bom("filename")
>> if enc == something:
>>  # Can't guess, fall back on an alternative strategy
>>  ...
>> else:
>>  f = open("filename", encoding=enc)
>>
>>
>> If I forget to check the returned result, I should get an explicit
>> failure as soon as I try to use it, rather than silently returning the
>> wrong results.
> 
> Yes, agreed.
> 
>> What should I return as the default default? I have four possibilities:
>>
>> (1) 'undefined', which is a standard encoding guaranteed to
>> raise an exception when used;
> 
> +0.5. This describes the outcome of the guess.
> 
>> (2) 'unknown', which best describes the result, and currently
>> there is no encoding with that name;
> 
> +0. This *better* describes the outcome, but I don't think adding a new
> name is needed nor very helpful.

And there is a chance -- albeit a small chance -- that someday the std 
lib will gain an encoding called "unknown".


>> (4) Don't return anything, but raise an exception. (But
>> which exception?)
> 
> +1. I'd like a custom exception class, sub-classed from ValueError.

Why ValueError? It's not really an "invalid value" error, it's more of a "my 
heuristic isn't good enough" failure. (Maybe the file starts with another 
sort of BOM which I don't know about.)

If I go with an exception, I'd choose RuntimeError, or a custom error 
that inherits directly from Exception.



Thanks to everyone for the feedback.



-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: data validation when creating an object

2014-01-15 Thread Cameron Simpson
On 16Jan2014 15:53, Ben Finney  wrote:
> Roy Smith  writes:
> >  Ben Finney  wrote:
> > > Who says it's frowned on to do work in the initialiser? Where are they
> > > saying it? That seems over-broad, I'd like to read the context of that
> > > advice.
> >
> > There are some people who advocate that C++ constructors should not do
> > a lot of work and/or should be incapable of throwing exceptions. The
> > pros and cons of that argument are largely C++ specific. […]
> 
> Even simpler: They are mistaken in what the constructor is named, in
> Python.
> Python classes have the constructor, ‘__new__’. I would agree with
> advice not to do anything but allocate the resources for a new instance
> in the constructor. [...]
> 
> Python instances have an initialiser, ‘__init__’. That function is for
> setting up the specific instance for later use. This is commonly
> over-ridden and many classes define a custom initialiser, which normally
> does some amount of work.
> 
> I don't think ‘__init__’ is subject to the conventions of a constructor,
> because *‘__init__’ is not a constructor*.

99% of the time this distinction is moot. When I call ClassName(blah,...),
both the constructor and initialiser are called.

Informally, there's a rule of thumb that making an object (allocate,
construct and initialise) shouldn't be needlessly expensive. Beyond
that, what happens depends on the use patterns.

This rule of thumb will be what Rita's encountered, perhaps stated
without any qualification regarding what's appropriate.

Cheers,
-- 
Cameron Simpson 

The problem with keeping an open mind is that my ideas all tend to fall out...
- Bill Garrett 
-- 
https://mail.python.org/mailman/listinfo/python-list


Re: Is it possible to get string from function?

2014-01-15 Thread Steven D'Aprano
On Wed, 15 Jan 2014 22:46:54 -0500, Roy Smith wrote:

> I've got some unit tests that look like:
> 
> class Foo(TestCase):
>   def test_t1(self):
>     RECEIPT = "some string"
> 
>   def test_t2(self):
>     RECEIPT = "some other string"
> 
>   def test_t3(self):
>     RECEIPT = "yet a third string"
> 
> and so on.  It's important that the strings be mutually unique.  In the
> example above, it's trivial to look at them and observe that they're all
> different, but in real life, the strings are about 2500 characters long,
> hex-encoded.  It even turns out that a couple of the strings are
> identical in the first 1000 or so characters, so it's not trivial to do
> by visual inspection.

Is the mapping of receipt string to test fixed? That is, is it important 
that test_t1 *always* runs with "some string", test_t2 "some other 
string", and so forth?

If not, I'd start by pushing all those strings into a global list (or 
possibly a class attribute). Then:

LIST_OF_GIANT_STRINGS = [blah blah blah]  # Read it from a file perhaps?
assert len(LIST_OF_GIANT_STRINGS) == len(set(LIST_OF_GIANT_STRINGS))


Then, change each test case to:

def test_t1(self):
    RECEIPT = random.choice(LIST_OF_GIANT_STRINGS)


Even if two tests happen to pick the same string on this run, they are 
unlikely to pick the same string on the next run.

If that's not good enough, if the strings *must* be unique, you can use a 
helper like this:

def choose_without_replacement(alist):
    random.shuffle(alist)
    return alist.pop()

class Foo(TestCase):
    def test_t1(self):
        RECEIPT = choose_without_replacement(LIST_OF_GIANT_STRINGS)


All this assumes that you don't care which giant string matches which 
test method. If you do, then:

DICT_OF_GIANT_STRINGS = {
    'test_t1': ..., 
    'test_t2': ..., 
    }  # Again, maybe read them from a file.

assert len(list(DICT_OF_GIANT_STRINGS.values())) == \
   len(set(DICT_OF_GIANT_STRINGS.values()))


You can probably build up the dict from the test class by inspection, 
e.g.:

DICT_OF_GIANT_STRINGS = {}
for name in Foo.__dict__:
    if name.startswith("test_"):
        key = name[5:]
        if key.startswith("t"):
            DICT_OF_GIANT_STRINGS[name] = get_giant_string(key)

I'm sure you get the picture. Then each method just needs to know its 
own name:


class Foo(TestCase):
    def test_t1(self):
        RECEIPT = DICT_OF_GIANT_STRINGS["test_t1"]


which I must admit is much easier to read than 

RECEIPT = "...2500 hex encoded characters..."
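
Put together, the dict-based layout might look roughly like this. The short hex values are placeholders, and the receipt() helper (which derives the key from self.id()) is an addition for the sketch, not something proposed in the thread:

```python
import unittest

RECEIPTS = {  # placeholder values; the real ones could be read from a file
    "test_t1": "0A1B",
    "test_t2": "0A1C",
}


class Foo(unittest.TestCase):
    def receipt(self):
        # self.id() looks like "module.Foo.test_t1"; keep the method name.
        return RECEIPTS[self.id().rsplit(".", 1)[-1]]

    def test_receipts_are_unique(self):
        values = list(RECEIPTS.values())
        self.assertEqual(len(values), len(set(values)))

    def test_t1(self):
        RECEIPT = self.receipt()
        self.assertEqual(RECEIPT, "0A1B")

    def test_t2(self):
        RECEIPT = self.receipt()
        self.assertEqual(RECEIPT, "0A1C")


suite = unittest.defaultTestLoader.loadTestsFromTestCase(Foo)
result = unittest.TextTestRunner().run(suite)
```

The uniqueness check then runs as an ordinary test alongside the others.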


-- 
Steven
-- 
https://mail.python.org/mailman/listinfo/python-list

