I thought I'd share this read with you guys: http://coopsoft.com/ar/CalamityArticle.html
I'm not sure how closely these problems relate to what [math] is trying to do, but it's an interesting read.

Gary

On Fri, Apr 17, 2015 at 9:01 AM, Gilles <gil...@harfang.homelinux.org> wrote:

> On Fri, 17 Apr 2015 08:35:42 -0700, Phil Steitz wrote:
>
>> On 4/17/15 3:14 AM, Gilles wrote:
>>
>>> Hello.
>>>
>>> On Thu, 16 Apr 2015 17:06:21 -0500, James Carman wrote:
>>>
>>>> Consider me poked!
>>>>
>>>> So, the Java answer to "how do I run things in multiple threads"
>>>> is to use an Executor (java.util.concurrent). This doesn't
>>>> necessarily mean that you *have* to use a separate thread (the
>>>> implementation could execute inline). However, in order to
>>>> accommodate the separate thread case, you would need to code to a
>>>> Future-like API. Now, I'm not saying to use Executors directly,
>>>> but I'd provide some abstraction layer above them or in lieu of
>>>> them, something like:
>>>>
>>>> public interface ExecutorThingy {
>>>>     Future<T> execute(Function<T> fn);
>>>> }
>>>>
>>>> One could imagine implementing different ExecutorThingy
>>>> implementations which allow you to parallelize things in different
>>>> ways (simple threads, JMS, Akka, etc, etc.)
>>>>
>>>
>>> I did not understand what is being suggested: parallelization of a
>>> single algorithm or concurrent calls to multiple instances of an
>>> algorithm?
>>>
>>
>> Really both. It's probably best to look at some concrete examples.
>>
>
> Certainly...
>
>> The two I mentioned in my ApacheCon talk are:
>>
>> 1. Threads managed by some external process / application gathering
>> statistics to be aggregated.
>>
>> 2. Allowing multiple threads to concurrently execute GA
>> transformations within the GeneticAlgorithm "evolve" method.
>>
>
> I could not view the presentation from the link previously mentioned
> (it did not work with my browser...).
> Can I download the PDF file from somewhere?
>
>> It would be instructive to think about how to handle both of these
>> use cases using something like what James is suggesting. What is
>> nice about his idea is that it could give us a way to let users /
>> systems decide whether they want to have [math] algorithms spawn
>> threads to execute concurrently or to allow an external execution
>> framework to handle task distribution across threads.
>>
>
> Some (all?) cases of "external" parallelism are trivial for the CM
> developers: the user must chop his data, pass the chunks as arguments
> to the CM methods, then collect and reassemble the results, all by
> himself.
> IIUC the scenario, this cannot be deemed a "feature".
>
>> Since 2. above is a good example of "internal" parallelism and it
>> also has data sharing / transfer challenges, maybe it's best to start
>> with that one.
>>
>
> That's the scenario where usage is simple and performance can match
> the user's machine capability when running CM algorithms that are
> inherently parallel.
>
> There is an example in CM: see
>   testTravellerSalesmanSquareTourParallelSolver()
> in
>   org.apache.commons.math4.ml.neuralnet.sofm.KohonenTrainingTaskTest
>
>> I have just started thinking about this and would
>> love to get better ideas than my own hacking about how to do it
>>
>> a) Using Spark with RDDs to maintain population state data
>> b) Hadoop with HDFS (or something else?)
>>
>
> I have zero experience with this but I'm interested to know more.
> :-)
>
> Regards,
> Gilles

-- 
E-Mail: garydgreg...@gmail.com | ggreg...@apache.org
Java Persistence with Hibernate, Second Edition <http://www.manning.com/bauer3/>
JUnit in Action, Second Edition <http://www.manning.com/tahchiev/>
Spring Batch in Action <http://www.manning.com/templier/>
Blog: http://garygregory.wordpress.com
Home: http://garygregory.com/
Tweet! http://twitter.com/GaryGregory
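
To make the ExecutorThingy idea quoted above a bit more concrete, here is a minimal sketch of how such an abstraction might look. It rests on several assumptions: Callable<T> stands in for the Function<T> in the quoted interface, and the names InlineExecutorThingy, PooledExecutorThingy and ChunkedSum are hypothetical, made up for illustration. None of this is existing Commons Math API. The same "chop the data, run the chunks, reassemble" code runs either inline or on a caller-supplied thread pool:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical abstraction from the thread; Callable replaces Function<T> (an assumption). */
interface ExecutorThingy {
    <T> Future<T> execute(Callable<T> fn);
}

/** Runs each task in the calling thread; the returned Future is already completed. */
final class InlineExecutorThingy implements ExecutorThingy {
    @Override
    public <T> Future<T> execute(Callable<T> fn) {
        CompletableFuture<T> result = new CompletableFuture<>();
        try {
            result.complete(fn.call());
        } catch (Exception e) {
            result.completeExceptionally(e);
        }
        return result;
    }
}

/** Delegates to an ExecutorService owned and shut down by the caller. */
final class PooledExecutorThingy implements ExecutorThingy {
    private final ExecutorService pool;

    PooledExecutorThingy(ExecutorService pool) {
        this.pool = pool;
    }

    @Override
    public <T> Future<T> execute(Callable<T> fn) {
        return pool.submit(fn);
    }
}

/** Illustrative client: sums an array chunk by chunk through any ExecutorThingy. */
final class ChunkedSum {
    static double sum(ExecutorThingy executor, double[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        // Submit one task per chunk; each task returns a partial sum.
        List<Future<Double>> partials = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            final int from = start;
            final int to = Math.min(start + chunkSize, data.length);
            Callable<Double> task = () -> {
                double s = 0;
                for (int i = from; i < to; i++) {
                    s += data[i];
                }
                return s;
            };
            partials.add(executor.execute(task));
        }
        // Collect the partial sums in submission order and combine them.
        double total = 0;
        try {
            for (Future<Double> partial : partials) {
                total += partial.get();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        }
        return total;
    }

    public static void main(String[] args) {
        double[] data = new double[100];
        for (int i = 0; i < data.length; i++) {
            data[i] = i + 1;
        }
        System.out.println(sum(new InlineExecutorThingy(), data, 10));

        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            System.out.println(sum(new PooledExecutorThingy(pool), data, 10));
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the design is that ChunkedSum never knows whether a thread was spawned: the caller picks the execution strategy, which is roughly the choice between "internal" and "external" parallelism discussed in the thread. A real design would also have to deal with shared state, which this sketch avoids by giving each task a disjoint slice of the input.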