On 04/17/2015 05:35 PM, Phil Steitz wrote:
> On 4/17/15 3:14 AM, Gilles wrote:
>> Hello.
>>
>> On Thu, 16 Apr 2015 17:06:21 -0500, James Carman wrote:
>>> Consider me poked!
>>>
>>> So, the Java answer to "how do I run things in multiple threads"
>>> is to use an Executor (java.util). This doesn't necessarily mean
>>> that you *have* to use a separate thread (the implementation could
>>> execute inline). However, in order to accommodate the separate
>>> thread case, you would need to code to a Future-like API. Now, I'm
>>> not saying to use Executors directly, but I'd provide some
>>> abstraction layer above them or in lieu of them, something like:
>>>
>>> public interface ExecutorThingy {
>>>   Future<T> execute(Function<T> fn);
>>> }
>>>
>>> One could imagine implementing different ExecutorThingy
>>> implementations which allow you to parallelize things in different
>>> ways (simple threads, JMS, Akka, etc, etc.)
>>
>> I did not understand what is being suggested: parallelization of a
>> single algorithm or concurrent calls to multiple instances of an
>> algorithm?
>
> Really both. It's probably best to look at some concrete examples.
> The two I mentioned in my apachecon talk are:
>
> 1. Threads managed by some external process / application gathering
>    statistics to be aggregated.
>
> 2. Allowing multiple threads to concurrently execute GA
>    transformations within the GeneticAlgorithm "evolve" method.
>
> It would be instructive to think about how to handle both of these
> use cases using something like what James is suggesting. What is
> nice about his idea is that it could give us a way to let users /
> systems decide whether they want to have [math] algorithms spawn
> threads to execute concurrently or to allow an external execution
> framework to handle task distribution across threads.
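For reference, here is a rough sketch of what the abstraction quoted above might look like once it compiles. It is only an illustration under a few assumptions: the method is made generic, java.util.concurrent.Callable stands in for the quoted Function<T>, and the class names InlineExecutorThingy and PooledExecutorThingy as well as the sumOfSquares stand-in computation are hypothetical. It uses Java 8 lambda syntax for brevity.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.FutureTask;

public class ExecutorThingyDemo {

    /** Hypothetical abstraction; Callable replaces the quoted Function<T>. */
    interface ExecutorThingy {
        <T> Future<T> execute(Callable<T> task);
    }

    /** Runs the task inline on the calling thread; no extra thread is created. */
    static final class InlineExecutorThingy implements ExecutorThingy {
        @Override
        public <T> Future<T> execute(Callable<T> task) {
            FutureTask<T> future = new FutureTask<>(task);
            future.run(); // completes (or captures the exception) before returning
            return future;
        }
    }

    /** Delegates to a caller-supplied thread pool. */
    static final class PooledExecutorThingy implements ExecutorThingy {
        private final ExecutorService pool;

        PooledExecutorThingy(ExecutorService pool) {
            this.pool = pool;
        }

        @Override
        public <T> Future<T> execute(Callable<T> task) {
            return pool.submit(task);
        }
    }

    /** Stand-in for a [math] computation: sum of squares of 1..n. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    /** The "algorithm" is written once against the Future-like API. */
    static long viaThingy(ExecutorThingy thingy, final int n) {
        try {
            return thingy.execute(() -> sumOfSquares(n)).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(viaThingy(new InlineExecutorThingy(), 10));
            System.out.println(viaThingy(new PooledExecutorThingy(pool), 10));
        } finally {
            pool.shutdown();
        }
    }
}
```

The point of the sketch is that viaThingy does not know whether a thread was spawned; that decision sits with whoever supplies the ExecutorThingy.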
I think a more viable option is to take advantage of the ForkJoin
mechanism that we can use now in math 4. For example, the
GeneticAlgorithm could quite easily be changed to use a ForkJoinTask to
perform each evolution. I will try to come up with an example soon, as
I plan to work on the genetics package anyway.

The idea outlined above sounds nice, but it is very unclear how an
algorithm or function would perform its parallelization in such a way,
and whether it would still be efficient.

Thomas

> Since 2. above is a good example of "internal" parallelism and it
> also has data sharing / transfer challenges, maybe it's best to start
> with that one. I have just started thinking about this and would
> love to get better ideas than my own hacking about how to do it
>
> a) Using Spark with RDD's to maintain population state data
> b) Hadoop with HDFS (or something else?)
>
> Phil
>>
>> Gilles
>>
>>>> [...]

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org
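As a rough illustration of the ForkJoin approach mentioned above, here is a minimal sketch that splits a population and evaluates fitness in parallel with a RecursiveTask. It does not use the [math] genetics API and is not a full evolution step: the double[] population, the fitness function, the threshold of 4, and all class and method names are assumptions made up for the example.

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class ForkJoinFitnessDemo {

    /** Toy fitness function (hypothetical): peaks at 1.0 when x == 3. */
    static double fitness(double x) {
        return 1.0 / (1.0 + (x - 3.0) * (x - 3.0));
    }

    /** Finds the best fitness in population[from, to) by recursive splitting. */
    static final class BestFitnessTask extends RecursiveTask<Double> {
        private static final long serialVersionUID = 1L;
        private static final int THRESHOLD = 4; // at or below this size, evaluate sequentially

        private final double[] population;
        private final int from;
        private final int to;

        BestFitnessTask(double[] population, int from, int to) {
            this.population = population;
            this.from = from;
            this.to = to;
        }

        @Override
        protected Double compute() {
            if (to - from <= THRESHOLD) {
                double best = Double.NEGATIVE_INFINITY;
                for (int i = from; i < to; i++) {
                    best = Math.max(best, fitness(population[i]));
                }
                return best;
            }
            int mid = (from + to) >>> 1;
            BestFitnessTask left = new BestFitnessTask(population, from, mid);
            BestFitnessTask right = new BestFitnessTask(population, mid, to);
            left.fork();                        // schedule the left half asynchronously
            double rightBest = right.compute(); // work on the right half in this thread
            return Math.max(left.join(), rightBest);
        }
    }

    /** Evaluates the whole population in a short-lived pool. */
    static double bestFitness(double[] population) {
        ForkJoinPool pool = new ForkJoinPool();
        try {
            return pool.invoke(new BestFitnessTask(population, 0, population.length));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        double[] population = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9};
        System.out.println(bestFitness(population));
    }
}
```

The population array is only read here, so the subtasks share no mutable state. A real evolution step that mutates or replaces chromosomes would need to address the data sharing concerns raised in the thread.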