Avoiding deadlocks in concurrent programming

2005-06-22 Thread Eloff
This is not really Python-specific, but I know Python programmers are
among the best in the world. I have a fair understanding of the
concepts involved, enough to realize that I would benefit from the
experience of others :)

I have a shared series of objects in memory that may be > 100MB. Often
to perform a task for a client several of these objects must be used.
Since many clients can be making requests at once (>100 per second
during peak load) these objects must be protected with some method of
thread synchronization (threading.Lock would do fine). This is somewhat
complicated by a backup thread which takes the data every 10 minutes
and exports it all to a format that can be saved to disk. Clearly the
backup thread must have exclusive access to all of these objects at
once, and there must be no half-completed transactions underway.

The easiest way from a design standpoint is to have a single lock and
let clients have exclusive access to everything, even though they only
ever need access to a fraction of the data. This makes it easy for the
backup thread, and it ensures that there's no trouble with deadlocks or
with partially completed transactions. However, imagine what would
happen if there are 100 clients in 100 threads waiting for access to
that lock. One could be almost finished with it, and then 100 threads
could get switched in and out, all doing nothing since they're all
waiting for the lock held by the one thread.

So I think you would need multiple locks so clients only acquire what
they need. This would let multiple threads access the data at once. But
now I have to deal with deadlocks since clients will usually acquire a
resource and then block acquiring another. It is very likely that one
client locks A, another locks B, then the guy with B waits for A and
the guy with A waits for B. Worse yet, the backup thread will go around
trying to lock everything and will no doubt deadlock everybody.

How do you avoid this? I thought that instead of blocking forever you
could try non-blocking acquires in a loop with a timeout of a few
seconds; if the timeout is reached, release any locks held and try
again later, with the backup thread being the only one to use blocking
acquires, since it would never complete its task otherwise.
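
Something like this rough sketch is what I have in mind (the names and
timeouts are made up, and Lock.acquire() takes no timeout argument
here, so it polls with non-blocking acquires):

import time
import threading

def acquire_all(locks, timeout=3.0, spin=0.01):
    # Try to take every lock; on failure release everything acquired
    # so far and return False so the caller can back off and retry.
    deadline = time.time() + timeout
    held = []
    for lock in locks:
        while not lock.acquire(False):
            if time.time() > deadline:
                for h in held:
                    h.release()
                return False
            time.sleep(spin)
        held.append(lock)
    return True

def release_all(locks):
    for lock in locks:
        lock.release()

def run_transaction(locks, work):
    # Client threads back off and retry; the backup thread would use
    # plain blocking acquires on every lock instead.
    while not acquire_all(locks):
        time.sleep(0.1)
    try:
        work()
    finally:
        release_all(locks)

lock_a = threading.Lock()
lock_b = threading.Lock()
run_transaction([lock_a, lock_b], lambda: None)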

No doubt this is a common problem; how would you people deal with it?

-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Avoiding deadlocks in concurrent programming

2005-06-22 Thread Eloff
Hi Paul,

>Do you mean a few records of 20+ MB each, or millions of records of a
>few dozen bytes, or what?

Well they're objects with lists and dictionaries and data members and
other objects inside of them. Some are very large, maybe bigger than
20MB, while others are very numerous and small (a hundred thousand tiny
objects in a large list). Some of the large objects could have many
locks inside of them to allow simultaneous access to different parts,
while others would need to be accessed as a unit.

>If the 100 threads are blocked waiting for the lock, they shouldn't
>get awakened until the lock is released.  So this approach is
>reasonable if you can minimize the lock time for each transaction.

Now that is interesting, because if 100 clients have to go through the
system in a second, and the server is capable of sending 100 clients
through in a second, then it doesn't matter whether they all go through
"at once" or one at a time, so long as nobody gets stuck waiting for
much longer than a few seconds. It would be very simple and painless
for me to send them all through one at a time. It is also possible that
certain objects are never accessed in the same action, and those could
have separate locks as an optimization (this would require careful
analysis of the different actions).

>You're doing what every serious database implementation needs to do ...
>Are you sure you don't want to just use an RDBMS?

It was considered, but we decided that abstracting the data into tables
to be manipulated with SQL queries is substantially more complex for
the programmer, and way too expensive for the system, since the average
action would require 20-100 queries.

Thanks,
-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: MySQLdb reconnect

2005-06-22 Thread Eloff
I don't believe that it does. You can, however, call ping() on the
connection, which should attempt an automatic reconnection.

See the docs for mysql_ping:

http://dev.mysql.com/doc/mysql/en/mysql-ping.html

I've never tested that, but I have a need for it also, so let me know
if it works or not.
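
Untested, but presumably something along these lines (the connection
parameters are placeholders):

import MySQLdb

def ensure_connected(conn):
    # ping() asks the server if it's alive and should reconnect if it
    # can; if it raises instead, open a fresh connection.
    try:
        conn.ping()
        return conn
    except MySQLdb.OperationalError:
        return MySQLdb.connect(host='localhost', user='me',
                               passwd='secret', db='mydb')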

-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Avoiding deadlocks in concurrent programming

2005-06-22 Thread Eloff
Hi Steve,

The backup thread only holds the lock long enough to create an
in-memory representation of the data. It writes to disk on its own
time after it has released the lock, so this is not an issue.

If you're saying what I think you are, then a single lock is actually
better for performance than multiple locks so long as one avoids
waiting for other resources like I/O.

-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: Avoiding deadlocks in concurrent programming

2005-06-22 Thread Eloff
Thanks for all of the replies. I'm glad I posted here; you guys have
been very helpful.

>Obviously, I only know what you've told us about your data, but 20-100
>queries?  That doesn't sound right ... RDBMSes are well-studied and
>well-understood; they are also extremely powerful when used to their
>potential.

Well I doubt there'd be even as many as 20 unique queries in any one
action, but 20-100 queries executed is about average, and it can be
over ten thousand for one action, clearly painful for any RDBMS or
programmer.

And as Paul McGuire points out, RDBMSes don't help avoid deadlocks; in
fact, they make them easier to create, in my opinion.

Paul Rubin, I think you're right; a single lock is the approach I'm
going to take. I don't want to seem stupid, but I don't understand what
you're talking about with the Queues. Isn't a Queue just a synchronized
sequence? If you've got the time and you don't mind, I'd love a more
detailed explanation.

Niel, thanks for the link, I read through the article.

>I try to pick up crumbs of knowledge from my co-workers, and one of the
>smarter ones gave me this rubric for testing for deadlocks.

That's true, I'm going to remember that one.

Thanks a lot, guys. I'm leaving on a trip tomorrow, so I won't be able
to reply to this thread again, but I will read any other posts on it
when I come back.

-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


Re: "also" to balance "else" ?

2005-06-13 Thread Eloff
My first reaction was that this is terrible; else clauses on loops are
confusing enough. But if I think about it more, I'm warming up to the
idea. Also/else for loops is clear, symmetrical, and would be useful.

Reversing the meaning of else will break code, but it's not used that
frequently, and it was a confusing thing to begin with. There's nothing
wrong in breaking something (slowly!) if it was 'broken' to begin with.

Alifs look evil; I couldn't deduce the control flow very easily, but
then they are a new idea. I'll keep an open mind on that one.

I think the best thing would be to compare the also/else syntax to what
identical functionality looks like in Python now [hint to someone with
more time than me right now ;)]; a start on the current-Python half is
sketched below. I'd vote for whichever is the more concise and readable
of the two.
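
For reference, here's today's behavior under discussion: the else
clause on a loop runs only when the loop finishes without break, shown
next to the flag-based spelling of the same thing:

items = [1, 2, 3]

# Current else: runs, because the loop never hit break.
for item in items:
    if item == 99:
        break
else:
    print "no match found"

# The same logic written out with a flag.
found = False
for item in items:
    if item == 99:
        found = True
        break
if not found:
    print "no match found"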

-Dan

-- 
http://mail.python.org/mailman/listinfo/python-list


What do _ast Load | Store | Del | AugLoad | AugStore | Param mean?

2008-10-11 Thread Eloff
In the _ast module, Attribute, Subscript, Name, List, and Tuple all
have an expr_context associated with them, which is defined as:

expr_context = Load | Store | Del | AugLoad | AugStore | Param

I have no idea what they mean, or what the difference between them is.
I'm porting _ast to IronPython and this is the only part I've had
difficulty understanding. The only clue I've found is that these
expression types are unique in that they can all occur in assignment
context.

Any insights at all?

Thanks,
-Dan
--
http://mail.python.org/mailman/listinfo/python-list


Re: What do _ast Load | Store | Del | AugLoad | AugStore | Param mean?

2008-10-11 Thread Eloff
On Oct 11, 9:22 pm, Benjamin <[EMAIL PROTECTED]> wrote:
> Store:
> a = 2
> Module(body=[Assign(targets=[Name(id='a', ctx=Store())],
> value=Num(n=2))])
>
> Load:
> print a
> Module(body=[Print(dest=None, values=[Name(id='a', ctx=Load())],
> nl=True)])
>
> Del:
> del a
> Module(body=[Delete(targets=[Name(id='a', ctx=Del())])])
>
> Param:
> def f(a): pass
> Module(body=[FunctionDef(name='f', args=arguments(args=[Name(id='a',
> ctx=Param())], vararg=None, kwarg=None, defaults=[]), body=[Pass()],
> decorator_list=[])])
>
> I hope that helps!
>

Very much! Thank you for such a clear explanation!
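
For anyone finding this thread later: if I'm reading the 2.6 docs
right, you can reproduce dumps like these yourself with the new ast
module:

import ast
print ast.dump(ast.parse("a = 2"))
# Module(body=[Assign(targets=[Name(id='a', ctx=Store())],
#              value=Num(n=2))])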

Cheers,
-Dan

--
http://mail.python.org/mailman/listinfo/python-list


How about adding slice notation to iterators/generators?

2009-10-15 Thread Eloff
I was just working with a generator for a tree, where I wanted to skip
the first result (the root node).

And it occurs to me, why do we need to do:

from itertools import islice

my_iter = islice(my_iter, 1, None)

When we could simply add slice operations to generators?

for x in my_iter[1:]:
    pass

The way I figure it, only if there is no __getitem__ defined and the
object has an __iter__ (i.e. not a list, but an iter(list) would be OK)
should there be a default __getitem__ that is really just islice, with
similar limitations (no negative indices).

The idea might need a bit of polish, but fundamentally it seems like
it could make it easier to deal with slicing generators?
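
Here's a rough sketch of what that default could look like, written as
a wrapper class (the class name, and walk() in the usage comment, are
just illustrative):

from itertools import islice

class SliceableIter(object):
    # Slicing delegates to islice, so it consumes the iterator and
    # rejects negative indices, just as islice does.
    def __init__(self, iterable):
        self._it = iter(iterable)
    def __iter__(self):
        return self._it
    def __getitem__(self, index):
        if isinstance(index, slice):
            return islice(self._it, index.start, index.stop, index.step)
        # A single index skips ahead; IndexError if exhausted.
        try:
            return next(islice(self._it, index, index + 1))
        except StopIteration:
            raise IndexError(index)

# Usage: skip the root node from a tree generator.
# for node in SliceableIter(walk(tree))[1:]:
#     ...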

At the least, islice ought to accept keyword arguments, so you could
write islice(my_iter, start=1) to skip the first element; that's an
oversight imho.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How about adding slice notation to iterators/generators?

2009-10-16 Thread Eloff
On Oct 16, 3:54 am, Terry Reedy wrote:
> There is already an obvious standard way to do this.
>
> it = 
> next(it) #toss first item
> for item in it:
>   
>

That fails if there is no first item (next() raises StopIteration).
You're taking one corner case and saying there's an easy way to handle
it, which is more or less true, but you miss my point: I'm suggesting
we make the general case easier.

By giving iterators a default, overridable __getitem__ that is just
syntactic sugar for islice, they would share a slightly larger
interface subset with the builtin container types. In a duck-typed
language like Python, that's almost always a good thing. You could use
iterators in more situations which expect something more like a list.

As long as it breaks no reasonably written existing code, I can think
of no good reason not to do this in a future Python.

-Dan
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: How about adding slice notation to iterators/generators?

2009-10-18 Thread Eloff
On Oct 16, 7:38 pm, Carl Banks wrote:

> You would burden everyone who writes a custom iterator to provide a
> __getitem__ method just because you're too lazy to type out the word
> islice?

No, of course not. That would be stupid. Custom iterators are
iterators, so they would also get the default __getitem__, which would
work perfectly well for them. If they needed to override it, they
could. Remember, my proposal is to add a default __getitem__ for
iterators that is really just islice.

Ryles makes a good point though:

>I think Python programmers have learned to expect certain things from
>objects that support __getitem__. For example, indexing and slicing is
>repeatable on the same object:

That indexing/slicing iterators is not repeatable is likely to negate
much of the advantage of supporting a larger interface (because it
would be supported inconsistently; passing an iterator to something
that expects __getitem__ could be surprising).

That does not mean it has no value in working with iterators; it
offers the same value islice does, just more conveniently.

Steven:
>If you want to support slicing, go right ahead, but please, I beg you,
>don't force me to support it in my iterators!

As I said to Carl Banks, forcing you to do something differently is not
my aim here. Your other points apply equally to islice, which already
exists, and they do not invalidate its purpose.

The main use case I'm thinking about here is skipping elements or
limiting results from iterators, usually in for loops:

for x in my_iter[5:]:
    ...

for x in my_iter[:5]:
    ...
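
For comparison, the islice spellings of the same two loops:

from itertools import islice

for x in islice(my_iter, 5, None):  # my_iter[5:]
    ...

for x in islice(my_iter, 5):        # my_iter[:5]
    ...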

Clearly that's easier and more readable than using islice. Nobody has
yet provided a concrete reason why that would be a _bad thing_. That
doesn't make it a _good thing_, either.

-Dan
-- 
http://mail.python.org/mailman/listinfo/python-list


Find first matching substring

2009-07-17 Thread Eloff
Almost every time I've had to do parsing of text over the last 5 years
I've needed this function:

def find_first(s, subs, start=None, end=None):
    results = [s.find(sub, start, end) for sub in subs]
    results = [r for r in results if r != -1]
    if results:
        return min(results)
    return -1

It finds the first matching substring in the target string; if more
than one of the substrings matches, it returns the position of the
match that occurs at the lowest index in the string.
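
For example, searching for whichever of a space or a comma comes
first:

print find_first("import os, sys", (" ", ","))  # prints 6 (the space)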

Has anyone else had problems where they could have applied this
function?

It seems to me that Python's find (and rfind, index, rindex) could be
modified (the same way that startswith and endswith have been) to
behave this way if a tuple were passed. Do others agree that this
would be desirable?

Thanks,
Dan
-- 
http://mail.python.org/mailman/listinfo/python-list


Dynamically pass a function arguments from a dict

2005-02-23 Thread Dan Eloff
You can take a dictionary of key/value pairs and pass it to a function
as keyword arguments:

def func(foo, bar):
    print foo, bar

args = {'foo': 1, 'bar': 2}
func(**args)

will print "1 2". But what if you try passing those arguments to a
function like

def func2(bar, zoo=''):
    print bar, zoo

How can you determine that func2 will only accept bar and zoo, but not
foo, and call the function with bar as an argument?
I know CherryPy does this, but it's not obvious from the source
how they pulled it off.
-Dan
--
http://mail.python.org/mailman/listinfo/python-list


Re: Dynamically pass a function arguments from a dict

2005-02-23 Thread Dan Eloff
Awesome. Wrapping that into a function:

def getargs(func):
    numArgs = func.func_code.co_argcount
    names = func.func_code.co_varnames
    return names[:numArgs]

Short, concise, and it works :)

Variables declared inside the function body are also in co_varnames,
but only after the arguments, it seems.
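
So, to answer my original question, here's a sketch of filtering the
dict down to what the function accepts (CherryPy presumably does
something fancier; this ignores *args/**kwargs):

def call_with_known_args(func, args):
    accepted = getargs(func)
    kwargs = dict((k, v) for k, v in args.items() if k in accepted)
    return func(**kwargs)

def func2(bar, zoo=''):
    print bar, zoo

call_with_known_args(func2, {'foo': 1, 'bar': 2})  # prints "2 "
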
A good module (huge, very comprehensive)
for this kinda thing can be found at:
http://lfw.org/python/inspect.py
Thanks very much!
-Dan
--
http://mail.python.org/mailman/listinfo/python-list


Re: How Do I get Know What Attributes/Functions In A Class?

2005-02-23 Thread Dan Eloff
Use the dir() function:

dir(myClass)

returns a list of strings.
-Dan
--
http://mail.python.org/mailman/listinfo/python-list


Re: Python v PHP for web, and restarting Apache?

2006-11-17 Thread sandra . eloff
On Nov 17, 12:07 pm, "walterbyrd" <[EMAIL PROTECTED]> wrote:
> I think I have read somewhere that using Python to develop
> web-applications requires some restarting of the Apache server, whereas
> PHP does not.

It depends what you do. CGIs operate much like PHP. mod_python has
auto-reloading (and, by implication, so do most frameworks that build
on it). A poorly designed web application may require a restart. (Some
people also disable auto-reload on production servers for the
performance advantages.)

> Also, I seem to remember reading something about PHP being able to
> recover from Apache restarting  more easily than Python.

I can think of no reason why that would be true. Perhaps these are
poorly designed Python applications?

> I am not trying to suggest anything here. I'm just asking.

Ask away. The only bad question is an unasked question. Most of us can
act like adults here (although we all forget that from time to time.)

Python is much better suited to writing and maintaining large web
applications, though. Being an experienced PHP and Python programmer,
I can tell you I don't use PHP any more unless I have no other choice.

-Sandra

-- 
http://mail.python.org/mailman/listinfo/python-list


How to test if an object IS another object?

2005-06-12 Thread dan . eloff
If two objects are of equal value you can compare them with ==. What I
want to do is find out if two references actually point to the same
object. How can I do this in Python?

Thanks

-- 
http://mail.python.org/mailman/listinfo/python-list