Hi All Please find my comments below:
>The build fails because of CheckStyle errors: >https://app.travis-ci.com/github/apache/commons-math/builds/246683712 --Fixed the issues >>>> [...] >> >> > >> >> >I did not suggest to remove any Javadoc, only to rephrase it as: [...] >> >As hinted by my comment is the previous message, I've still to >> >clarify my own expectations; but I vaguely sense some lost >> >opportunity for simpler usage simpler and increased performance >> >through the caller just needs to specify the number of "worker >> >threads". >> -- We should have both options. Users can execute the algorithm by >> specifying only the number of worker threads with a single population as >> well as optimize multiple populations in a parallel fashion. > >Another misunderstanding (probably); we must figure out where >the parallelism will be implemented. >IIUC the current state of the code, optimizing multiple populations >in parallel would be the same as launching multiple JVMs; I'd want >to explore low-level parallelism (i.e. at the "Chromosome" level). -- I have implemented both muti-threading and multi-population parallelism. > >> For the multi-population option the common thread pool would be reused for >> all populations. >> > >> >Do we at least agree that >> >1. Adding/retrieving a "Chromosome" to/from a "Population" >> >must be thread-safe (and is not trivial) >> >2. Fitness computation is where most time is usually spent >> >(so that multi-threading must be achieved at that granularity) >> >? >> --The way I am thinking of designing a task is that it should accept the >> current population and return an instance of ChromosomePair. >> The chromosomes within this pair would be added to the population by the >> caller thread. Population won't be updated by multiple threads. > >Then, it would not be a multi-threaded library. >This kind of parallelism does not need support from the library, >and can be implemented at the application level. > >> The code snippet below shows the body of the method which will be executed >> inside the task. >> --CUT-- >> >> //selection >> ChromosomePair pair = getSelectionPolicy().select(current); >> >> // crossover >> if (randGen.nextDouble() < getCrossoverRate()) { >> // apply crossover policy to create two offspringoport >> pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond()); >> } >> >> // mutation >> if (randGen.nextDouble() < getMutationRate()) { >> // apply mutation policy to the chromosomes >> pair = new ChromosomePair( >> getMutationPolicy().mutate(pair.getFirst()), >> getMutationPolicy().mutate(pair.getSecond())); >> } >> >> return pair; >> >> --CUT-- > >One of the issue with above code is, again, that "randGen" >must be thread-safe (and it is not, usually). >Also, it doesn't say anything about how to ensure that >the fitness computation is thread-safe (and if you assume >that it will be computed outside that "task", then the >performance gain will be very low). > -- Current implementation is using a Thread local version of random number generator. >> > >> >I'd surmise that "multiple instances of AbstractGeneticAlgorithm" >> >is an application concern; unless I'm missing something, it's >> >not what I've in mind when talking about multi-threading. >> >Actually, I was wondering whether we could implement the >> >analog of what is in the "commons-math-neuralnet" module, >> >where >> >* "Neuron" is the counterpart "Chromosome" >> >* "Network" is the counterpart of "Population". >> --"multiple instance of AbstractGeneticAlgorithm" is related to parallel GA >> with multiple populations not multi-threading. > >Yes, as I also mentioned above. >But I'm interested in where multi-threading can be implemented to >be used in both cases (single population and multiple populations). > >> Users can also implement parallel GA in a synchronous manner although that >> won't be a recommended way. >> Multi-threading is only a way to improve performance using a user's multi >> core CPU. >> The threads in the thread pool would only be used to execute the task as >> mentioned in the previous comment. > >That's where I've some doubt. >But be free to implement benchmarks that demonstrate the >expected performance improvement. > >> I think we have some misunderstanding over here. It is better to do an >> implementation first and start the discussion. >> It would be more productive. > >Agreed. ;-) --I have checked-in the code for multi-threading along with parallel GA. Kindly review and share your thoughts. Thanks & Regards --Avijit Basak On Sun, 20 Feb 2022 at 05:25, Gilles Sadowski <gillese...@gmail.com> wrote: > Hello. > > Le ven. 18 févr. 2022 à 11:32, Avijit Basak <avijit.ba...@gmail.com> a > écrit : > > > > Hi All > > > > Please see my comments below. > > > >> [...] > > > > --I have modified the exception class name to > > GeneticIllegalArgumentException. > > The build fails because of CheckStyle errors: > https://app.travis-ci.com/github/apache/commons-math/builds/246683712 > > >>> [...] > > >> > > > >> >I did not suggest to remove any Javadoc, only to rephrase it as: > > >> >---CUT--- > > >> >/** "Message template". */ > > >> >---CUT--- > > >> > > > >> -- Just for confirmation. Do you need the comment to be changed from > > >> ---CUT--- > > >> /** Error message for "out of range" condition. */ > > >> ---CUT--- > > >> *to* > > >> ---CUT--- > > >> /** "out of range" */ > > >> ---CUT--- > > > > > >No, sorry, I just meant that e.g. > > >---CUT--- > > >/** Message template. */ > > >private static final String OUT_OF_RANGE = "Out of range: min={0} > max={1}"; > > >---CUT--- > > >is sufficient. Since the field is private, the comment will mostly ever > > >be read while looking at the source code. > > > > --I think we have a misunderstanding here. The field is public, not > > private. > > Sorry, my mistake. > > > Will a hard coded "Message Template" comment be sufficient? > > You are right, better be somewhat more specific. > However, it would even better if we can manage to keep > the internal classes private. [This can be explored further > when the design has stabilized.] > > > > > >> JDK(*ProviderBuilder.RandomSourceInternal.JDK*), > > >> ... > > >> } > > >> ---CUT--- > > >> I am not sure how to achieve this for our algorithm classes. > > > > > >I'm a bit lost here; I don't get the relationship of this with my > > >remark above (constant vs adaptive)... Maybe you could file > > >JIRA report to clarify and further discussion? > > > > --Let me try to clarify my original understanding. I think the proposal > is > > to introduce an Enum and declare each type of algorithm classes as > > instances of enum. So it might follow this template: > > --CUT-- > > enum GeneticAlgorithms { > > SGA([args...]),//Simple genetic algorithm equivalent to > > GeneticAlgorithm class > > AGA([args...]) //Adaptive genetic algorithm. > > } > > --CUT-- > > Where [args...] are selection, crossover, mutation operators and > > probalities. > > I didn't mean anything like that; AFAICT there may have > been some mixing between two comments. ;-) > > > > > As the AdaptiveGeneticAlgorithm is a generalization of GeneticAlgorithm > it > > would be possible to represent the later by former using constant > crossover > > and mutation rate generators. > > But the problem lies in a different area. > > In enumeration the required arguments ([args...]) need to be specified > > during the declaration using constant values. > > But for our implementation those arguments need to be provided by the > > client programs. I don't know how to achieve that using enum. > > Let me know if you have any different views regarding this. > > We can revisit this later. > > > >> [...] > > > > > >As hinted by my comment is the previous message, I've still to > > >clarify my own expectations; but I vaguely sense some lost > > >opportunity for simpler usage simpler and increased performance > > >through the caller just needs to specify the number of "worker > > >threads". > > -- We should have both options. Users can execute the algorithm by > > specifying only the number of worker threads with a single population as > > well as optimize multiple populations in a parallel fashion. > > Another misunderstanding (probably); we must figure out where > the parallelism will be implemented. > IIUC the current state of the code, optimizing multiple populations > in parallel would be the same as launching multiple JVMs; I'd want > to explore low-level parallelism (i.e. at the "Chromosome" level). > > > For the multi-population option the common thread pool would be reused > for > > all populations. > > > > > >Do we at least agree that > > >1. Adding/retrieving a "Chromosome" to/from a "Population" > > >must be thread-safe (and is not trivial) > > >2. Fitness computation is where most time is usually spent > > >(so that multi-threading must be achieved at that granularity) > > >? > > --The way I am thinking of designing a task is that it should accept the > > current population and return an instance of ChromosomePair. > > The chromosomes within this pair would be added to the population by the > > caller thread. Population won't be updated by multiple threads. > > Then, it would not be a multi-threaded library. > This kind of parallelism does not need support from the library, > and can be implemented at the application level. > > > The code snippet below shows the body of the method which will be > executed > > inside the task. > > --CUT-- > > > > //selection > > ChromosomePair pair = getSelectionPolicy().select(current); > > > > // crossover > > if (randGen.nextDouble() < getCrossoverRate()) { > > // apply crossover policy to create two offspringoport > > pair = getCrossoverPolicy().crossover(pair.getFirst(), pair.getSecond()); > > } > > > > // mutation > > if (randGen.nextDouble() < getMutationRate()) { > > // apply mutation policy to the chromosomes > > pair = new ChromosomePair( > > getMutationPolicy().mutate(pair.getFirst()), > > getMutationPolicy().mutate(pair.getSecond())); > > } > > > > return pair; > > > > --CUT-- > > One of the issue with above code is, again, that "randGen" > must be thread-safe (and it is not, usually). > Also, it doesn't say anything about how to ensure that > the fitness computation is thread-safe (and if you assume > that it will be computed outside that "task", then the > performance gain will be very low). > > > > > > >I'd surmise that "multiple instances of AbstractGeneticAlgorithm" > > >is an application concern; unless I'm missing something, it's > > >not what I've in mind when talking about multi-threading. > > >Actually, I was wondering whether we could implement the > > >analog of what is in the "commons-math-neuralnet" module, > > >where > > >* "Neuron" is the counterpart "Chromosome" > > >* "Network" is the counterpart of "Population". > > --"multiple instance of AbstractGeneticAlgorithm" is related to parallel > GA > > with multiple populations not multi-threading. > > Yes, as I also mentioned above. > But I'm interested in where multi-threading can be implemented to > be used in both cases (single population and multiple populations). > > > Users can also implement parallel GA in a synchronous manner although > that > > won't be a recommended way. > > Multi-threading is only a way to improve performance using a user's multi > > core CPU. > > The threads in the thread pool would only be used to execute the task as > > mentioned in the previous comment. > > That's where I've some doubt. > But be free to implement benchmarks that demonstrate the > expected performance improvement. > > > I think we have some misunderstanding over here. It is better to do an > > implementation first and start the discussion. > > It would be more productive. > > Agreed. ;-) > > Gilles > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >