Stefan,

Yes, techniques like the one you mention are among the ways I am aware of, but 
they may not be the easiest to automate.

I was thinking of a style of solving problems. I note Python (on various 
platforms) has an amazing number of ways to do things in parallel and quite a 
few ways for those tasks to communicate with the parent(s) or each other.

My goal was not so much to kill processes but to make a killing.

OK, maybe not. Just a pun.

The obvious goal is to have a framework where you have multiple parties trying 
to solve a problem using divergent methods with no obvious way in advance to 
figure out which would be first with a good answer. Some of the algorithms used 
might be heuristic and you might not want the first one that finishes but 
rather the first one that returns a value, say, within 2% of the best possible 
result. An example would be designing an airplane that comes within 2% of the 
results of one with negligible friction flying in fairly empty space. If the 
best current plane is only within 20% of that, then any algorithm getting 
within 10% might be good enough.

And some algorithms may not reliably terminate on the data supplied. Eventually 
they need to be stopped even if no other solution comes in.

Your message seems to be talking about threads running within the same process 
as the parent and being time-sliced. If writing code for that environment, I 
am aware of many ways of building the module as described. Yes, you want the 
threads to regularly yield back control for many reasons, including preventing 
some from dominating at the expense of others. So if they are regularly pausing 
and yielding, you can set it up so they get a message and perform carefully 
controlled apoptosis. They might carefully clean up what is needed, such as 
closing open files, releasing any locks, deleting any objects exclusively used 
by them, perhaps scribbling away some data, and then exit. Sure, the parent 
could just
kill them but that can leave a mess.
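
Just to be concrete about what I mean by apoptosis, here is a minimal sketch 
using the standard threading module. The names (worker, stop_event, results) 
and the cleanup steps are purely illustrative, not a real design:

import threading
import time

def worker(name, stop_event, results):
    """Illustrative worker: loops over 'steps' of its algorithm and regularly
    checks whether it has been asked to shut itself down."""
    partial = 0
    try:
        while not stop_event.is_set():
            partial += 1      # one step of the real algorithm would go here
            time.sleep(0.1)   # pause/yield regularly
    finally:
        # Controlled apoptosis: close files, release locks, scribble away
        # the last partial result, then exit.
        results[name] = partial

stop_event = threading.Event()
results = {}
threads = [threading.Thread(target=worker, args=(f"t{i}", stop_event, results))
           for i in range(3)]
for t in threads:
    t.start()
time.sleep(1)        # pretend a good answer arrived from somewhere else
stop_event.set()     # politely ask the rest to clean up and stop
for t in threads:
    t.join()
print(results)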

And your other suggestions also make sense in some scenarios. Yes, a thread 
that seems to be making slow progress might even suspend itself for a while and 
check to see if others have made more progress.

Let me be very concrete here. I am thinking of an example such as wanting to 
calculate pi to some enormous number of decimal places. Don't ask why or it 
will spoil the mood.

There are a stupendous number of ways to calculate pi. Many involve summing the 
results of an infinite series. Some do very little on each iteration and some 
do very complex things. Most of them can be viewed as gradually having more and 
more digits trailing off to the "right" that no longer change with each 
iteration, while the ones further to the right are clearly not significant yet 
since they keep changing. So an algorithm might stop every hundred iterations 
and calculate how many digits seem quite firm since the last check. One 
algorithm might report to a central bulletin board that it has gained five 
more digits, another might report 0, and yet another might report 352. It
might make sense for only the fastest ones to continue at full pace. But if an 
algorithm later slows down, perhaps another one should be revived. A quick 
search online shows lots of ways to do such a set of analyses but note many do 
not return pi directly but something like pi/4 or 1/pi. There is even a formula 
that calculates the nth digit of pi without calculating any that came before!
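
To make that concrete, here is a rough sketch of one such series wrapped in a 
generator, plus a helper that reports roughly how many leading digits have 
stopped changing between checks. I picked the very slow Leibniz series for 
pi/4 only because it is short, and the digit counting is deliberately crude:

from itertools import count, islice

def leibniz_pi():
    """Yield successive estimates of pi from the (very slowly
    converging) Leibniz series for pi/4."""
    total = 0.0
    for k in count():
        total += (-1) ** k / (2 * k + 1)
        yield 4 * total

def firm_digits(a, b):
    """Count how many leading characters of two formatted estimates agree."""
    n = 0
    for x, y in zip(f"{a:.15f}", f"{b:.15f}"):
        if x != y:
            break
        n += 1
    return n

estimates = leibniz_pi()
prev_est, prev_firm = next(estimates), 0
for round_no in range(5):
    for est in islice(estimates, 100):   # advance 100 more iterations
        pass
    firm = firm_digits(prev_est, est)
    print(f"check {round_no}: about {firm} firm digits "
          f"({firm - prev_firm:+d} since last check)")
    prev_est, prev_firm = est, firm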

Clearly you might be able to start hundreds of such methods at once. It may be 
better not to have them all within one process. Since none of them would 
terminate before reaching, say, the billionth digit, and even the fastest may 
run for a very long time, you can imagine a scheme where each one checks its 
progress periodically and the slowest one quits every 10 minutes until only the 
fastest survive and perhaps now run faster.
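
A very rough sketch of that culling idea across processes, using 
multiprocessing, might look like the following. Everything here is a 
placeholder: the workers just pretend to converge at random speeds, and the 
two-second interval stands in for "every 10 minutes":

import multiprocessing as mp
import random
import time

def worker(name, speed, progress):
    """Placeholder worker: pretends to converge at a given speed and
    reports its progress into a shared dict."""
    digits = 0.0
    while True:
        digits += speed
        progress[name] = digits
        time.sleep(0.2)

if __name__ == "__main__":
    with mp.Manager() as manager:
        progress = manager.dict()
        procs = {}
        for i in range(5):
            name = f"algo{i}"
            p = mp.Process(target=worker,
                           args=(name, random.uniform(0.5, 2.0), progress))
            procs[name] = p
            p.start()

        # Every interval, stop whichever worker reports the least progress,
        # until only one survivor remains.
        while len(procs) > 1:
            time.sleep(2)    # stand-in for "every 10 minutes"
            slowest = min(procs, key=lambda n: progress.get(n, 0.0))
            doomed = procs.pop(slowest)
            doomed.terminate()
            doomed.join()
            print("stopped", slowest, "- remaining:", sorted(procs))

        name = next(iter(procs))
        print("survivor:", name, "progress:", round(progress.get(name, 0.0), 1))
        procs[name].terminate()
        procs[name].join()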

But none of this applies to my original question. I am talking about whether 
anyone has created a general framework for a much simpler scenario. Loosely 
speaking, I might be asked to create a list of objects or functions that 
somehow can be started independently in the same way. I hand this over to some 
module that takes my list and does the rest and eventually returns results. 
That module might accept additional choices and instructions that fine-tune 
things. But the overall code might look like:

Import some module
Make a new object that will run things for me as some kind of controller.
Invoke incantations on that controller that include setting things and 
especially giving it the list of things to do.
I ask it to get started and either wait or go do something else.
At some later point I find it has completed. The results of the problem are 
somewhere I can reach. Perhaps there are logs I can search to see more details 
like which methods quit or were stopped or what their last result was before ...

Please see the above as an outline or example, not exact details or designs.
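
For the simplest "first answer back wins" version of that outline, the 
standard library's concurrent.futures already comes fairly close. A minimal 
sketch follows; the solver functions are made-up stand-ins, and note that 
cancel() only stops tasks that have not started yet:

import concurrent.futures as cf
import time

# Stand-ins for divergent methods attacking the same problem.
def slow_but_sure(x):
    time.sleep(3)
    return x * x

def fast_heuristic(x):
    time.sleep(1)
    return x * x + 0.001   # near-enough answer that arrives first

def first_result(funcs, arg):
    """Hand a list of callables to a 'controller', return the first result
    that comes back, and cancel any that have not started yet.  Note the
    pool still waits for already-running tasks when the with-block exits."""
    with cf.ProcessPoolExecutor() as pool:
        futures = {pool.submit(f, arg): f.__name__ for f in funcs}
        done, not_done = cf.wait(futures, return_when=cf.FIRST_COMPLETED)
        for fut in not_done:
            fut.cancel()   # best effort; running tasks keep going
        winner = next(iter(done))
        return futures[winner], winner.result()

if __name__ == "__main__":
    name, value = first_result([slow_but_sure, fast_heuristic], 7)
    print("first answer came from", name, "->", value)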

I simply think such a utility may already have been built out there, as well as 
many variations. Just as an example, someone recently mentioned to me that 
there is a module with a function called islice() that might do exactly what 
was wanted. So, of course, I read about everything else in that module: 
https://docs.python.org/2/library/itertools.html
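
As a trivial illustration, islice() lets you take a bounded slice from any 
iterator, even an endless one:

from itertools import count, islice

squares = (n * n for n in count(1))   # an endless iterator
print(list(islice(squares, 5)))       # [1, 4, 9, 16, 25]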

And I found quite a variety of other functions to create combinations of 
iterables and ways to combine them to make even more. Others have already 
considered what might be of use and many are being shared. And many are well 
debugged, so unless I am deliberately trying to figure out how to do something 
by myself (which I admit I often do), this is a good way to go.

So although I am happy to discuss ways to do this kind of task, I am really 
asking if this is a common enough way of looking at solving problems that many 
solutions have already appeared including some that are out there and ready.

Clearly though, some such solutions would be quite specific to particular 
methods of dividing up tasks such as using processes versus threads or even 
distributing computation across the cloud. Some might be even more dynamic and 
allow some of the processes running to throw additional ones into the mix as 
variants of their methods.

I will stop here even if it means I have to kill myself. 😉

Avi
-----Original Message-----
From: Python-list <python-list-bounces+avigross=verizon....@python.org> On 
Behalf Of Stefan Behnel
Sent: Monday, December 24, 2018 9:46 AM
To: python-list@python.org
Subject: Re: Fastest first

Avi Gross wrote on 17.12.18 at 01:00:
> SHORT VERSION: a way to automatically run multiple algorithms in 
> parallel and kill the rest when one returns an answer.

One (somewhat seasonal) comment on this: it doesn't always have to be about 
killing (processes or threads). You might also consider a cooperative 
implementation, where each of the algorithms is allowed to advance by one 
"step" in each "round", and is simply discarded when a solution is found 
elsewhere, or when it becomes unlikely that this specific algorithm will 
contribute a future solution. This could be implemented via a sequence of 
generators or coroutines in Python. Such an approach is often used in 
simulations (e.g. SimPy and other "discrete event" simulators), where exact 
control over the concurrency pattern is desirable.

Stefan

--
https://mail.python.org/mailman/listinfo/python-list
