please critique my thread code

2008-06-15 Thread winston
I wrote a Python program (103 lines, below) to download developer data
from SourceForge for research about social networks.

Please critique the code and let me know how to improve it.

An example use of the program:

prompt> python download.py 1 24

The above command downloads data for the projects with IDs between 1
and 24, inclusive. As it runs, it prints status messages: a plus sign
means the project ID exists, and a minus sign means it does not.

Questions:

--- Are my setup and use of threads, the queue, and "while True" loop
correct or conventional?

--- Should the program sleep sometimes, to be nice to the SourceForge
servers, and so they don't think this is a denial-of-service attack?

--- Someone told me that popen is not thread-safe, and to use
mechanize. I installed it and followed an example on the web site.
There wasn't a good description of it on the web site, or I didn't
find it. Could someone explain what mechanize does?

--- How do I choose the number of threads? I am using a MacBook Pro
2.4GHz Intel Core 2 Duo with 4 GB 667 MHz DDR2 SDRAM, running OS
10.5.3.

Thank you.

Winston



#!/usr/bin/env python

# Winston C. Yang
# Created 2008-06-14

from __future__ import with_statement

import mechanize
import os
import Queue
import re
import sys
import threading
import time

lock = threading.RLock()

# Make the dot match even a newline.
error_pattern = re.compile(".*\n\n.*", re.DOTALL)


def now():
    return time.strftime("%Y-%m-%d %H:%M:%S")


def worker():
    while True:
        try:
            id = queue.get()
        except Queue.Empty:
            continue

        request = mechanize.Request("http://sourceforge.net/project/"
                                    "memberlist.php?group_id=%d" % id)
        response = mechanize.urlopen(request)
        text = response.read()

        valid_id = not error_pattern.match(text)

        if valid_id:
            f = open("%d.csv" % id, "w+")
            f.write(text)
            f.close()

        with lock:
            print "\t".join((str(id), now(), "+" if valid_id else "-"))


def fatal_error():
    print "usage: python application start_id end_id"
    print
    print "Get the usernames associated with each SourceForge project with"
    print "ID between start_id and end_id, inclusive."
    print
    print "start_id and end_id must be positive integers and satisfy"
    print "start_id <= end_id."
    sys.exit(1)


if __name__ == "__main__":

    if len(sys.argv) == 3:
        try:
            start_id = int(sys.argv[1])

            if start_id <= 0:
                raise Exception

            end_id = int(sys.argv[2])

            if end_id < start_id:
                raise Exception
        except:
            fatal_error()
    else:
        fatal_error()

    # Print the start time.
    start_time = now()
    print start_time

    # Create a directory whose name contains the start time.
    dir = start_time.replace(" ", "_").replace(":", "_")
    os.mkdir(dir)
    os.chdir(dir)

    queue = Queue.Queue(0)

    for i in xrange(32):
        t = threading.Thread(target=worker, name="worker %d" % (i + 1))
        t.setDaemon(True)
        t.start()

    for id in xrange(start_id, end_id + 1):
        queue.put(id)

    # When the queue has size zero, exit in three seconds.
    while True:
        if queue.qsize() == 0:
            time.sleep(3)
            break

    print now()
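One conventional answer to the first question above is worth sketching: a blocking queue.get() never raises Queue.Empty, so the try/except does nothing, and the qsize()-polling loop at the end is usually replaced by task_done()/join() plus one sentinel value per thread. A minimal sketch using Python 3 module names (the module is Queue on the Python 2.5 above), with a made-up doubling step standing in for the real download:

```python
import queue
import threading

q = queue.Queue()
results = []
results_lock = threading.Lock()

def worker():
    while True:
        item = q.get()               # blocking get: Queue.Empty is never raised
        if item is None:             # sentinel value tells this thread to exit
            q.task_done()
            break
        with results_lock:
            results.append(item * 2)  # stand-in for the real work
        q.task_done()

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()

for item in range(10):
    q.put(item)
q.join()                             # block until every item is task_done()

for _ in threads:
    q.put(None)                      # one sentinel per worker
for t in threads:
    t.join()
```

The join() call blocks until every queued item has been marked task_done(), so no polling loop or sleep is needed before exiting.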
--
http://mail.python.org/mailman/listinfo/python-list


Revisiting Generators and Subgenerators

2010-03-25 Thread Winston
I have been reading PEP 380 because I am writing a video game/
simulation in Jython and I need cooperative multitasking. PEP 380 hits
on my problem, but does not quite solve it for me. I have the
following proposal as an alternative to PEP380. I don't know if this
is the right way for me to introduce my idea, but below is my writeup.
Any thoughts?


Proposal for a new Generator Syntax in Python 3K --
A Baton object for generators, to allow subfunctions to yield, and to
make them symmetric.

Abstract
--------
Generators can be used to make coroutines, but they require the
programmer to take special care in how he writes his generator. In
particular, only the generator function itself may yield a value. We
propose a modification to generators in Python 3 where a "Baton"
object is given to both sides of a generator. Both sides use the
baton object to pass execution to the other side, and also to pass
values to the other side.

The advantages of a baton object over the current scheme are: (1) the
generator function can pass the baton to a subfunction, solving the
needs of PEP 380; (2) after creation, both sides of the generator
function are symmetric--they both can call yield(), send(), and
next(), and these do the same thing. This means programming with
generators is the same as programming with normal functions. No
special contortions are needed to pass values back up to a yield
command at the top.

Motivation
----------
Generators make certain programming tasks easier, such as (a) an
iterator of infinite length, (b) emulating coroutines and cooperative
multitasking via a "trampoline function", and (c) making both sides
of a producer-consumer pattern easy to write--both sides can appear
to be the caller.

On the down side, generators as currently implemented in Python 3.1
require the programmer to take special care in how he writes his
generator. In particular, only the generator function may yield a
value--subfunctions called by the generator function may not.

Here are two use cases in which generators are commonly used, but
where the current limitation causes less readable code:

1) A long generator function which the programmer wants to split into
   several functions. The subfunctions should be able to yield a
   result. Currently the subfunctions have to pass values up to the
   main generator and have it yield the results back. Similarly,
   subfunctions cannot receive values that the caller sends with
   generator.send().

2) Generators are great for cooperative multitasking. A common use
   case is agent simulators, where many small "tasklets" need to run
   and then pass execution over to other tasklets. Video games are a
   common scenario, as is SimPy. Without cooperative multitasking,
   each tasklet must be contorted to run in a small piece and then
   return. Generators help with this, but a more complicated
   algorithm that is best decomposed into several functions must be
   contorted, because the subfunctions cannot yield or receive data
   from generator.send().

Here is also a nice description of how coroutines make programs
easier to read and write:
http://www.chiark.greenend.org.uk/~sgtatham/coroutines.html

Proposal
--------
If there is a way to make a subfunction of a generator yield and
receive data from generator.send(), then the two problems above are
solved.

For example, this declares a generator. The first parameter of the
generator is the "context", which represents the other side of the
execution frame.

A Baton object represents a passing of execution from one line of
code to another. A program creates a Baton like so:

    generator f( baton ):
        # compute something
        baton.yield( result )
        # compute something
        baton.yield( result )

    baton = f()
    while True:
        print( baton.yield() )

A generator function, denoted with the keyword "generator" instead of
"def", will return a "baton". Generators have the following methods:

    __call__( args... ) --
        This creates a Baton object which is passed back to the
        caller, i.e. the code that executed the Baton() command. Once
        the baton starts working, the two sides are symmetric, so we
        will call the first frame, frame A, and the code inside
        'function', frame B. Frame A is returned a baton object. As
        soon as frame A calls baton.yield(), frame B begins, i.e.
        'function' starts to run. function is passed the baton as its
        first argument, and any additional arguments are also passed
        in. When frame B yields, any value that it yields will be
        returned to frame A as the result of its yield().

Batons have the following methods:

    yield( arg=None ) -- This method will save the current
        execution stat
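For comparison, the delegation half of what this proposal aims at is what PEP 380's yield from expression eventually provided in Python 3.3: a subgenerator's yields (and the caller's send() values) pass through the delegating generator. A minimal sketch:

```python
def subtask():
    # This subfunction can yield on behalf of task() below.
    yield 1
    yield 2

def task():
    yield 0
    yield from subtask()   # PEP 380 delegation (Python 3.3+)
    yield 3

print(list(task()))   # [0, 1, 2, 3]
```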


Re: Revisiting Generators and Subgenerators

2010-03-26 Thread Winston
On Mar 26, 7:29 am, Sebastien Binet  wrote:
> > Proposal for a new Generator Syntax in Python 3K--
> > A Baton object for generators to allow subfunction to yield, and to
> > make
> > them symetric.
>
> isn't a Baton what CSP calls a channel ?
>
> there is this interesting PyCSP library (which implements channels
> over greenlets, os-processes (via multiprocessing) or python threads)
> http://code.google.com/p/pycsp
>
> cheers,
> sebastien.

Thanks for the link. After reading about Greenlets, it seems my Baton
is a Greenlet. It is not passed in to the new greenlet as I wrote
above, but both sides use it to pass execution to the other, and to
send a value on switching.

I'm glad my thinking is matching other people's thinking. Now I have
to search for a greenlet written for Jython.

And thanks to others for their thoughts on this subject.
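The switch-and-pass-a-value behavior described above can be approximated one frame deep with plain generator.send(), which may help readers without greenlet installed. A sketch (pingpong is a made-up example, not a greenlet API):

```python
def pingpong(value):
    # Each side hands a value to the other on every switch.
    while True:
        received = yield value   # pass 'value' out, wait for a send()
        value = received + 1

g = pingpong(0)
print(next(g))       # 0: run to the first yield
print(g.send(10))    # 11: 10 goes in, 10 + 1 comes back out
```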

-Winston


Re: interrupted system call w/ Queue.get

2011-03-22 Thread Philip Winston
On Feb 18, 10:23 am, Jean-Paul Calderone
 wrote:
> The exception is caused by a syscall returning EINTR.  A syscall will
> return EINTR when a signal arrives and interrupts whatever that
> syscall
> was trying to do.  Typically a signal won't interrupt the syscall
> unless you've installed a signal handler for that signal.  However,
> you can avoid the interruption by using `signal.siginterrupt` to
> disable interruption on that signal after you've installed the
> handler.
>
> As for the other questions - I don't know, it depends how and why it
> happens, and whether it prevents your application from working
> properly.

We did not try "signal.siginterrupt" because we were not installing
any signal handlers ourselves; perhaps some library code does it
without our knowledge.  Plus I still don't know which signal was
causing the problem.

Instead, based on Dan Stromberg's reply
(http://code.activestate.com/lists/python-list/595310/) I wrote a
drop-in replacement for Queue called RetryQueue which fixes the
problem for us:

from multiprocessing.queues import Queue
import errno

def retry_on_eintr(function, *args, **kw):
    while True:
        try:
            return function(*args, **kw)
        except IOError, e:
            if e.errno == errno.EINTR:
                continue
            else:
                raise

class RetryQueue(Queue):
    """Queue which will retry if interrupted with EINTR."""
    def get(self, block=True, timeout=None):
        return retry_on_eintr(Queue.get, self, block, timeout)
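The retry loop can be exercised without real signals by raising EINTR artificially. A Python 3 rendering of the same loop ("except ... as" syntax), with a made-up flaky_get standing in for the interrupted syscall:

```python
import errno

def retry_on_eintr(function, *args, **kw):
    # Same loop as above: swallow EINTR, re-raise everything else.
    while True:
        try:
            return function(*args, **kw)
        except IOError as e:
            if e.errno == errno.EINTR:
                continue
            raise

calls = []

def flaky_get():
    # Hypothetical stand-in: fails twice with EINTR, then succeeds.
    calls.append(1)
    if len(calls) < 3:
        raise IOError(errno.EINTR, "Interrupted system call")
    return "payload"

print(retry_on_eintr(flaky_get))   # "payload", after two silent retries
```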

As to whether this is a bug or just our own malignant signal-related
settings I'm not sure. Certainly it's not desirable to have your
blocking waits interrupted. I did see several EINTR issues in Python
but none obviously about Queue exactly:
http://bugs.python.org/issue1068268
http://bugs.python.org/issue1628205
http://bugs.python.org/issue10956

-Philip


reading argv argument of unittest.main()

2007-05-10 Thread winston . yang
I've read that unittest.main() can take an optional argv argument, and
that if it is None, it will be assigned sys.argv.

Is there a way to pass command line arguments through unittest.main()
to the setUp method of a class derived from unittest.TestCase?

Thank you in advance.

Winston
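unittest.main() itself has no channel to setUp, so a common workaround is to pop custom arguments off sys.argv into module-level state before handing the rest to unittest. A sketch (CONFIG, the "host" key, and main() are all hypothetical names for illustration):

```python
import sys
import unittest

# Hypothetical module-level configuration that setUp reads.
CONFIG = {"host": "localhost"}

class ExampleTest(unittest.TestCase):
    def setUp(self):
        # setUp sees whatever was stored before unittest.main() ran.
        self.host = CONFIG["host"]

    def test_host_is_configured(self):
        self.assertEqual(self.host, CONFIG["host"])

def main():
    # Split sys.argv: bare words go to CONFIG, flags stay for unittest.
    flags = [a for a in sys.argv[1:] if a.startswith("-")]
    extras = [a for a in sys.argv[1:] if not a.startswith("-")]
    if extras:
        CONFIG["host"] = extras[0]
    unittest.main(argv=[sys.argv[0]] + flags)
```

Invoked as "python test_example.py db.example.com -v", main() would store the host in CONFIG and pass only the -v flag through to unittest.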



using PyUnit to test with multiple threads

2007-03-28 Thread winston . yang
Is it possible to use PyUnit to test with multiple threads?

I want to send many commands to a database at the same time. The order
of execution of the commands is indeterminate, and therefore, so is
the status message returned.

For example, say that I send the commands "get" and "delete" for a
given record to the database at the same time. If the get executes
before the delete, I expect a success message (assuming that the
record exists in the database). If the delete executes before the get,
I expect a failure message.

Is there a way to write tests in PyUnit for this type of situation?
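One common approach is to assert membership in the set of acceptable outcomes rather than a single expected value. A sketch with an in-memory dict standing in for the database (all names here are made up for illustration):

```python
import threading
import unittest

class RaceTest(unittest.TestCase):
    def test_get_vs_delete(self):
        # Hypothetical in-memory store standing in for the database.
        store = {"key": "value"}
        lock = threading.Lock()
        outcome = []

        def get():
            with lock:
                outcome.append(store.get("key"))

        def delete():
            with lock:
                store.pop("key", None)

        threads = [threading.Thread(target=get),
                   threading.Thread(target=delete)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()

        # Either interleaving is legal, so assert membership in the
        # set of acceptable results instead of one fixed result.
        self.assertIn(outcome[0], ("value", None))
```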

Thank you in advance.

Winston



MakeBot - IDE for learning Python

2006-05-05 Thread Winston Wolff
I have just released a Windows and Macintosh OS X version of MakeBot, an 
IDE intended for students learning Python.  It includes a very nice 
graphics/video game package based on PyGame.  You can read all about it 
here:

http://stratolab.com/misc/makebot/

-Winston


Re: Revisiting Generators and Subgenerators

2010-03-25 Thread Winston Wolff
Coroutines achieve very similar things to threads, but avoid problems resulting 
from the pre-emptive nature of threads. Specifically, a coroutine indicates 
where it will yield to the other coroutine. This avoids lots of problems 
related to synchronization. Also the lightweight aspect is apparently important 
for some simulations when they have many thousands of agents to simulate--this 
number of threads becomes a problem.
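The explicit yield points can be seen in a tiny round-robin trampoline: each agent runs only until its own yield, and the scheduler interleaves them deterministically, with no locks needed (agent and run are made-up names for the sketch):

```python
def agent(name, steps):
    # Each iteration runs to the yield, then control returns to the scheduler.
    for i in range(steps):
        yield (name, i)

def run(tasks):
    # Round-robin trampoline: resume each tasklet in turn until all finish.
    order = []
    while tasks:
        task = tasks.pop(0)
        try:
            order.append(next(task))
            tasks.append(task)
        except StopIteration:
            pass
    return order

print(run([agent("a", 2), agent("b", 1)]))   # [('a', 0), ('b', 0), ('a', 1)]
```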

-Winston

Winston Wolff
Stratolab - Games for Learning
tel: (646) 827-2242
web: www.stratolab.com

On Mar 25, 2010, at 5:23 PM, Cameron Simpson wrote:

> 
> Having quickly read the Abstract and Motivation, why is this any better
> than a pair of threads and a pair of Queue objects? (Aside from
> co-routines being more lightweight in terms of machine resources?)
> 
> On the flipside, given that generators were recently augumented to
> support coroutines I can see your motivation within that framework.
> 
> Cheers,
> -- 
> Cameron Simpson  DoD#743
> http://www.cskk.ezoshosting.com/cs/
> 
> C makes it easy for you to shoot yourself in the foot.  C++ makes that
> harder, but when you do, it blows away your whole leg.
> - Bjarne Stroustrup



interrupted system call w/ Queue.get

2011-02-17 Thread Philip Winston
We have a multiprocess Python program that uses Queue to communicate
between processes.  Recently we've seen some errors while blocked
waiting on Queue.get:

IOError: [Errno 4] Interrupted system call

What causes the exception?  Is it necessary to catch this exception
and manually retry the Queue operation?  Thanks.

We have some Python 2.5 and 2.6 machines that have run this program
for many thousands of hours with no errors.  But we have one 2.5
machine and one 2.7 machine that seem to get the error very often.



Regarding the error: TypeError: can’t pickle _thread.lock objects

2017-12-21 Thread Winston Manuel Vijay
Hi,

It would be of immense help if someone could provide a suitable solution or
related information that helps to sort out the issue stated below:

- I installed Python version 3.6.4.

- Then I installed the package TensorFlow.

- I installed g2p.exe by downloading it from GitHub.

- Then I tried running the command below:

g2p-seq2seq --interactive --model <model_folder_path> (model_folder_path is
the path to an English model, a 2-layer LSTM with 512 hidden units trained
on the CMU Sphinx dictionary, downloaded from the CMU Sphinx website)

Following the above procedure, I encountered the following error:
TypeError: can't pickle _thread.lock objects. Please find the attached
screenshot for your reference.
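For reference, the error itself is easy to reproduce: any object graph containing a threading lock (and TensorFlow session/model objects hold several) will fail to pickle. A minimal sketch:

```python
import pickle
import threading

lock = threading.Lock()
try:
    pickle.dumps(lock)   # locks wrap OS state that cannot be serialized
except TypeError as e:
    print(e)             # the TypeError from the original report
```

The usual fix is to pickle (or pass to multiprocessing) only plain data, not objects that internally hold locks.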


Thanks,
A.Winston Manuel Vijay



