Re: [MATH][GA] Issues in "commons-math4-ga2" design

Avijit Basak Sat, 15 Oct 2022 07:40:44 -0700

Hi All

        Please see my comments below. Kindly share further thoughts.


> [...]
>I'm not sure what you mean: The examples just run a GA-like algorithm,
>but (AFAICT) do not compare the output to some expected outcome.
-- I have made some code changes in the "examples-ga-math-functions" module to
compare the results of the two modules, "commons-math4-ga" and
"commons-math4-ga2". The comparison is graphical and uses JFreeChart. A new
value, "COMPARE", has been introduced for the "--api" input argument to
initiate the comparison.
The "commons-math4-ga" module consistently provided better results than
"commons-math4-ga2".
The code is kept in my repo:
https://github.com/avijit-basak/commons-math/tree/feature/MATH-1563_comparison
I have not raised a PR so far; the branch is only kept in my repo for
comparison.
Could you please check whether the feature__MATH-1563__genetic_algorithm branch
contains the changes from master of the Apache repo?

>> This variant design is more appropriate for a *generalized population based
>> stochastic optimizer* which can accommodate other algorithms like
>> multi-agent gradient descent/simulated annealing, genetic algorithm (already
>> implemented), particle swarm optimization and large neighbourhood search
>> etc.
>> If we want to stick to this new design I would rather suggest *renaming* of
>> the existing interfaces so that the API can be more generic and can be used
>> for all other algorithms. GA should be a specific implementation for that
>> API.
>> However, we might have to think more on the multiple operator scenarios.
>
>An interesting suggestion.  If the generalized API can be achieved
>easily, I'm all for it.
>However, I wonder how useful it will be, as every actual optimizer
>implementation may
> * require substantial adaptations to fit the common API
> * need extensions to provide access to specific features (which
>   would decrease the usefulness of the common API for users).
[...]
-- We can avoid that for now as that will be a bigger task.

>
>[1] My main argument for the "GA variant" is that it is much simpler, for
>     what seems equivalent functionality (bugs, or misinterpretation of
>     expected behaviour, notwithstanding): Current counts of lines of
>     code is 696 vs 2038.
-- The variant only contains options for the binary genotype, whereas the
"commons-math4-ga" module provides options for other genotypes too. So the
line counts are not directly comparable. However, considering both the
optimization results and the available genotypes, I would still vote for
"commons-math4-ga" instead of its new variant.

Thanks & Regards
--Avijit Basak

On Thu, 29 Sept 2022 at 22:42, Gilles Sadowski <[email protected]> wrote:

> Hello.
>
> On Thu, 29 Sept 2022 at 14:07, Avijit Basak <[email protected]> wrote:
> >
> > Hi All
> >
> >          Please find my comments below:
> >
> > >
> > >> Hi All
> > >>
> > >>          The newly proposed design of "commons-math4-ga2" has two primary
> > >> issues which I would like to mention here.
> > >>
> > >> *1) GA logic*: The design does not conform to the basic genetic algorithm
> > >I understand the concern about providing the standard ("historical") GA.
> > >The theorem assumes the standard GA, but the example shows that
> > >convergence is also achieved with the variant.
> >
> > -- Yes the new variant can accommodate the standard GA too.
> >
> > >
> > >>     However, the new design proposed as part of "commons-math4-ga2"
> > >> deviates from the basic logic. It does not distinguish the operators i.e.
> > >> crossover and mutation and treats them uniformly. The order of
> > >> operator application is also not considered.
> > >
> > >All intended as "features". ;-)
> > >[One being that, in the variant implementation, it is possible to apply
> > >any number of operators, not just one specific crossover followed by
> > >one mutation.]
> > >
> > >Shouldn't we be able (IIUC) to define the standard GA procedure by
> > >an extension of the API like the following (untested):
> > >---CUT---
> > >public class CrossoverThenMutate<G>
> > >    extends AbstractCrossover<G> {
> > >    private AbstractCrossover<G> c;
> > >    private AbstractMutation<G> m;
> > > [...]
> > >    private List<G> mutate(G parent, UniformRandomProvider rng) {
> > >        final List<G> p = new ArrayList<G>(1);
> > >        p.add(parent);
> > >        return m.apply(p, rng);
> > >    }
> > >}
> > >---CUT---
> > >
> > >AFAICT, a standard GA would thus be performed if this combined
> > >operator would be used as a unique operator in the GA variant.
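A minimal, self-contained sketch of the combined-operator idea above. It does not use the real "commons-math4-ga2" classes: the GeneticOperator interface and the CrossoverThenMutateSketch class are hypothetical stand-ins, and java.util.Random replaces UniformRandomProvider only to keep the example free of dependencies. The elided part of the quoted snippet is not reconstructed here; this only illustrates applying one crossover and then mutating each offspring.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical stand-in for a genetic operator: maps parents to offspring. */
interface GeneticOperator<G> {
    List<G> apply(List<G> parents, Random rng);
}

/** Applies one crossover, then passes each offspring through one mutation. */
final class CrossoverThenMutateSketch<G> implements GeneticOperator<G> {
    private final GeneticOperator<G> crossover;
    private final GeneticOperator<G> mutation;

    CrossoverThenMutateSketch(GeneticOperator<G> crossover,
                              GeneticOperator<G> mutation) {
        this.crossover = crossover;
        this.mutation = mutation;
    }

    @Override
    public List<G> apply(List<G> parents, Random rng) {
        final List<G> offspring = new ArrayList<>();
        // Step 1: recombine the selected parents.
        for (G child : crossover.apply(parents, rng)) {
            // Step 2: mutate each child on its own, as in the quoted mutate(...) helper.
            final List<G> single = new ArrayList<>(1);
            single.add(child);
            offspring.addAll(mutation.apply(single, rng));
        }
        return offspring;
    }
}
```

Used as the only operator of the algorithm, such a wrapper fixes the order "crossover, then mutation" and needs a single parent selection per application, which is the behaviour the standard GA expects.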
> >
> > --If we consider this approach we may need to modify our examples which
> > assume the standard GA.
>
> I'm not sure what you mean: The examples just run a GA-like algorithm,
> but (AFAICT) do not compare the output to some expected outcome.
>
> > This variant design is more appropriate for a *generalized population based
> > stochastic optimizer* which can accommodate other algorithms like
> > multi-agent gradient descent/simulated annealing, genetic algorithm (already
> > implemented), particle swarm optimization and large neighbourhood search
> > etc.
> > If we want to stick to this new design I would rather suggest *renaming* of
> > the existing interfaces so that the API can be more generic and can be used
> > for all other algorithms. GA should be a specific implementation for that
> > API.
> > However, we might have to think more on the multiple operator scenarios.
>
> An interesting suggestion.  If the generalized API can be achieved
> easily, I'm all for it.
> However, I wonder how useful it will be, as every actual optimizer
> implementation may
>  * require substantial adaptations to fit the common API
>  * need extensions to provide access to specific features (which
>    would decrease the usefulness of the common API for users).
> I'm mentioning this because we tried to design such a common API
> for the optimizers implemented in package "o.a.c.m.optim", with
> eventual shortcomings.
> Another counter-argument is that the "abstract" optimization recipe
> that would be defined in terms of the high-level API is generally
> fairly simple (compared to an algorithm concrete implementation);
> we'd just save a few lines of code that can otherwise be easily
> provided in the documentation.
>
> Anyways, let us know whether you want to explore this further
> (through providing actual code).
> I think (?) that it could be done in a separate (maven) module
> which the GA module would depend on.
> [IIUC, another "population-based" algorithm to depend on this
> API would be the "CMAESOptimizer" that currently is adapted
> to the "optim" API which I mentioned above...]
>
> > >
> > >> Along with that it executes
> > >> parent selection two times instead of one.
> > >
> > >That would also be taken care of with the above combined operator.
> > >
> > >> These are clear deviations from the standard approach used so far and would
> > >> require a fix.
> > >>
> > >>
> > >> *2) Determination of mutation probability*: The newly proposed design of
> > >> "commons-math4-ga2" determines the probability of mutation at the algorithm
> > >> level. Same approach was used in math 3.x implementation. However, this
> > >> approach considers the probability of mutation at the chromosome level not
> > >> at the allele/gene level. I have found a considerable difference in the
> > >> quality of optimization between two cases. Determining the mutation
> > >> probability at the gene/allele level has given a
> > >> considerably better result.
> > >
> > >A runnable test case (that creates a comparison) would be quite useful
> > >to illustrate the feature.
> > >
> > >> Usage of mutation probability at the chromosome
> > >> level would only ensure mutation of a single allele irrespective of
> > >> probability
> > >
> > >?
> > >In the basic implementation for the "binary" genotype (in class
> > >"o.a.c.m.ga2.gene.binary.Mutation"), there is a loop over all the
> > >alleles.
> > >
> > >> or chromosome size. There is no such limitation in case the
> > >> mutation probability is decided at the allele level and can be easily
> > >> controlled by users for fine tuning. This has helped to improve the
> > >> optimization quality thus providing better results. This is only related to
> > >> mutation not crossover. But we can maintain an uniform approach and let the
> > >> operator decide on the probability.
> > >
> > >I don't understand.
> > >Please refer to the class mentioned above and describe the required
> > >modifications.
> > -- E.g. assume the user is having a chromosome population of size 10 and
> > chromosome length is 10.
> >
> > mutation      no of alleles modified   no of alleles modified
> > probability   per chromosome           in population
> >   .2                  2                       20
> >   .1                  1                       10
> >   .05                 --                      5
> >   .02                 --                      2
> >   .2                  2                       20
> >
> > This way users can have more freedom over allele variation in the entire
> > population.
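The arithmetic behind the table can be sketched as follows. This is an illustration only: MutationRateSketch and its methods are hypothetical and not part of either module. The per-chromosome scheme is modelled as described in the message above (a selected chromosome gets exactly one allele changed), which is an assumption and not a statement about what "o.a.c.m.ga2.gene.binary.Mutation" actually does.

```java
import java.util.Random;

/** Hypothetical helper: counts modified alleles under two mutation schemes. */
final class MutationRateSketch {
    private MutationRateSketch() {}

    /** Per-allele scheme: every allele is modified independently with probability p. */
    static int flipsPerAllele(int populationSize, int chromosomeLength,
                              double p, Random rng) {
        int flips = 0;
        for (int c = 0; c < populationSize; c++) {
            for (int a = 0; a < chromosomeLength; a++) {
                if (rng.nextDouble() < p) {
                    flips++;
                }
            }
        }
        return flips;
    }

    /**
     * Per-chromosome scheme (assumed): each chromosome is selected with
     * probability p and, if selected, exactly one of its alleles is modified.
     */
    static int flipsPerChromosome(int populationSize, double p, Random rng) {
        int flips = 0;
        for (int c = 0; c < populationSize; c++) {
            if (rng.nextDouble() < p) {
                flips++;
            }
        }
        return flips;
    }

    /** Expected number of modified alleles in the population, per-allele scheme. */
    static double expectedPerAllele(int populationSize, int chromosomeLength, double p) {
        return populationSize * chromosomeLength * p;
    }

    /** Expected number of modified alleles in the population, per-chromosome scheme. */
    static double expectedPerChromosome(int populationSize, double p) {
        return populationSize * p;
    }
}
```

For a population of 10 chromosomes of length 10, expectedPerAllele gives 20, 10, 5 and 2 for p = .2, .1, .05 and .02, matching the last column of the table. Under the one-allele-per-chromosome assumption, expectedPerChromosome gives 2, 1, 0.5 and 0.2 for the same probabilities, so the total variation no longer scales with the chromosome length.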
>
> We really need code (and unit tests to assert the expected
> functionality)...
>
> I hope that a "beta" release of CM can occur as soon as the components
> which it depends on have had their own ("beta" or not).
> So the question is:  What to release as the GA module?
> More to the point, what exactly do we need to change, or add, in the "GA
> variant" code proposal in order to make it suitable for general usage?[1]
> [I'm referring here to actual functionality (operators, representations, ...),
> not to the hypothetical framework possibly shared by other algorithms.]
>
> Regards,
> Gilles
>
> [1] My main argument for the "GA variant" is that it is much simpler, for
>      what seems equivalent functionality (bugs, or misinterpretation of
>      expected behaviour, notwithstanding): Current counts of lines of
>      code is 696 vs 2038.
>
> > > [...]
>