Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
> Would't it be simpler to deal with the serialization issue by bumping the 
> serialVersionID?
simpler yes, but it's a different thing
The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
compatible.

Given that serialization has been broken for several versions in commons-csv 
and given that "fixing" it now is a labor of love which does not add much value,
I am in favor of throwing it out altogether now rather than in some future 
(major) version. Clearly inform about it in the release notes and be done with 
it.

> Also note the PR will throw an NPE in the builder
> when instead of using the validate() method.

setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO

kind regards,
Markus

From: Gary Gregory 
Sent: Thursday, October 20, 2022 16:43
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
Would't it be simpler to deal with the serialization issue by bumping the
serialVersionID? We can just say that you only serialized and deserialize
for the same version. Also note the PR will throw an NPE in the builder
when instead of using the validate() method.

Gary

On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:

> I've commented on the PR.
> TY.
> Gary
>
> On 2022/10/19 16:51:57 Gary Gregory wrote:
> > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> wrote:
> > >
> > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> wrote:
> > > >
> > > > My +1
> > > >
> > > > Gary
> > >
> > > Gary,
> > >
> > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > >
> > > AllowDuplicates enum may be set to the incorrect value when setting
> > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > valid.
> >
> > I will re-read later tonight...
> >
> > Gary
> >
> > >
> > > Alex
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sebb
On Fri, 21 Oct 2022 at 11:57,  wrote:
>
> > Would't it be simpler to deal with the serialization issue by bumping the 
> > serialVersionID?
> simpler yes, but it's a different thing
> The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
> compatible.
>
> Given that serialization has been broken for several versions in commons-csv 
> and given that "fixing" it now is a labor of love which does not add much 
> value,
> I am in favor of throwing it out altogether now rather than in some future 
> (major) version. Clearly inform about it in the release notes and be done 
> with it.

+1

> > Also note the PR will throw an NPE in the builder
> > when instead of using the validate() method.
>
> setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO
>
> kind regards,
> Markus
>
> From: Gary Gregory 
> Sent: Thursday, October 20, 2022 16:43
> To: Commons Developers List 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> Would't it be simpler to deal with the serialization issue by bumping the
> serialVersionID? We can just say that you only serialized and deserialize
> for the same version. Also note the PR will throw an NPE in the builder
> when instead of using the validate() method.
>
> Gary
>
> On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:
>
> > I've commented on the PR.
> > TY.
> > Gary
> >
> > On 2022/10/19 16:51:57 Gary Gregory wrote:
> > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> > wrote:
> > > >
> > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> > wrote:
> > > > >
> > > > > My +1
> > > > >
> > > > > Gary
> > > >
> > > > Gary,
> > > >
> > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1.
> > > >
> > > > AllowDuplicates enum may be set to the incorrect value when setting
> > > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > > valid.
> > >
> > > I will re-read later tonight...
> > >
> > > Gary
> > >
> > > >
> > > > Alex
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary D. Gregory
I updated git master with Javadoc to signal that Serialization in CSVFormat is 
not supported from one version to the next. I also bumped the serial version ID 
from 1 to 2. All of this is noted in changes.xml.

Gary

On 2022/10/21 11:27:43 sebb wrote:
> On Fri, 21 Oct 2022 at 11:57,  wrote:
> >
> > > Would't it be simpler to deal with the serialization issue by bumping the 
> > > serialVersionID?
> > simpler yes, but it's a different thing
> > The PR makes the serialized forms for commons-csv versions 1.9.0 and 1.10.0 
> > compatible.
> >
> > Given that serialization has been broken for several versions in 
> > commons-csv and given that "fixing" it now is a labor of love which does 
> > not add much value,
> > I am in favor of throwing it out altogether now rather than in some future 
> > (major) version. Clearly inform about it in the release notes and be done 
> > with it.
> 
> +1
> 
> > > Also note the PR will throw an NPE in the builder
> > > when instead of using the validate() method.
> >
> > setDuplicateHeaderMode(null) is illegal and should be fail-fast IMO
> >
> > kind regards,
> > Markus
> >
> > From: Gary Gregory 
> > Sent: Thursday, October 20, 2022 16:43
> > To: Commons Developers List 
> > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> >
> > Would't it be simpler to deal with the serialization issue by bumping the
> > serialVersionID? We can just say that you only serialized and deserialize
> > for the same version. Also note the PR will throw an NPE in the builder
> > when instead of using the validate() method.
> >
> > Gary
> >
> > On Wed, Oct 19, 2022, 18:27 Gary D. Gregory  wrote:
> >
> > > I've commented on the PR.
> > > TY.
> > > Gary
> > >
> > > On 2022/10/19 16:51:57 Gary Gregory wrote:
> > > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert 
> > > wrote:
> > > > >
> > > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory 
> > > wrote:
> > > > > >
> > > > > > My +1
> > > > > >
> > > > > > Gary
> > > > >
> > > > > Gary,
> > > > >
> > > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 
> > > > > RC1.
> > > > >
> > > > > AllowDuplicates enum may be set to the incorrect value when setting
> > > > > the allow duplicates boolean. Have you reviewed this? I believe it is
> > > > > valid.
> > > >
> > > > I will re-read later tonight...
> > > >
> > > > Gary
> > > >
> > > > >
> > > > > Alex
> > > > >
> > > > > -
> > > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > > >
> > > >
> > > > -
> > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > > For additional commands, e-mail: dev-h...@commons.apache.org
> > > >
> > > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary D. Gregory
On 2022/10/20 22:56:05 Alex Herbert wrote:
> On Thu, 20 Oct 2022 at 23:43, Alex Herbert  wrote:
> >
> > I did not have time to track through whether this behaviour changed
> > after the initial implementation of the flag. I would think not as the
> > original behaviour is from 1.0. This would map to:
> >
> > true -> ALLOW_ALL
> > false -> ALLOW_EMPTY
> > new -> DISALLOW
> >
> > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > behavioural compatibility (to 1.7, or back to 1.0).
> 
> PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> for allowDuplicates=false and does not change any tests.
> 
> So the test suite is currently not enforcing behavioural
> compatibility. This seems like a glaring hole in the tests and should
> be addressed to prevent regressions.

In git master, there are many tests for senders of 
withAllowDuplicateHeaderNames(boolean) and 
setAllowDuplicateHeaderNames(boolean). 

I filled in the coverage in testGetAllowDuplicateHeaderNames() for 
getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or 
wrong. 

Let's reassess now git master now please. If the PR matters still, it should be 
rebased on git master.

Gary

> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
I've removed serialization stuff from the PR and rebased it to master
https://github.com/apache/commons-csv/pull/276

kind regards,
Markus

From: Gary D. Gregory 
Sent: Friday, October 21, 2022 14:18
To: dev@commons.apache.org 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
On 2022/10/20 22:56:05 Alex Herbert wrote:
> On Thu, 20 Oct 2022 at 23:43, Alex Herbert  wrote:
> >
> > I did not have time to track through whether this behaviour changed
> > after the initial implementation of the flag. I would think not as the
> > original behaviour is from 1.0. This would map to:
> >
> > true -> ALLOW_ALL
> > false -> ALLOW_EMPTY
> > new -> DISALLOW
> >
> > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > behavioural compatibility (to 1.7, or back to 1.0).
> 
> PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> for allowDuplicates=false and does not change any tests.
> 
> So the test suite is currently not enforcing behavioural
> compatibility. This seems like a glaring hole in the tests and should
> be addressed to prevent regressions.

In git master, there are many tests for senders of 
withAllowDuplicateHeaderNames(boolean) and 
setAllowDuplicateHeaderNames(boolean). 

I filled in the coverage in testGetAllowDuplicateHeaderNames() for 
getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or 
wrong. 

Let's reassess now git master now please. If the PR matters still, it should be 
rebased on git master.

Gary

> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
> 
> 

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary Gregory
Hi Markus,

The PR has conflicts and does not test its changes.

I'd like feedback from the community on whether or not git master as it is
now is OK for the duplicate header behavior or if changing the default is
needed and if doing so is just ping-pongonging from one incompatibility to
another, based on previous versions.

TY!

Gary

On Fri, Oct 21, 2022, 08:52  wrote:

> I've removed serialization stuff from the PR and rebased it to master
> https://github.com/apache/commons-csv/pull/276
>
> kind regards,
> Markus
>
> From: Gary D. Gregory 
> Sent: Friday, October 21, 2022 14:18
> To: dev@commons.apache.org 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> On 2022/10/20 22:56:05 Alex Herbert wrote:
> > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> wrote:
> > >
> > > I did not have time to track through whether this behaviour changed
> > > after the initial implementation of the flag. I would think not as the
> > > original behaviour is from 1.0. This would map to:
> > >
> > > true -> ALLOW_ALL
> > > false -> ALLOW_EMPTY
> > > new -> DISALLOW
> > >
> > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > behavioural compatibility (to 1.7, or back to 1.0).
> >
> > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > for allowDuplicates=false and does not change any tests.
> >
> > So the test suite is currently not enforcing behavioural
> > compatibility. This seems like a glaring hole in the tests and should
> > be addressed to prevent regressions.
>
> In git master, there are many tests for senders of
> withAllowDuplicateHeaderNames(boolean) and
> setAllowDuplicateHeaderNames(boolean).
>
> I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or
> wrong.
>
> Let's reassess now git master now please. If the PR matters still, it
> should be rebased on git master.
>
> Gary
>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [math] contribution proposal for multivariate functions optimization (2)

2022-10-21 Thread François Laferrière
  Hello Alex,
Ichecked the architecture of MultivariateOptimizer family to compareto what I 
have done so far. I think that what I have done can berefactored to fit in the 
general framework as extension ofGradientMultivariateOptimizer even though, 
main differences are :   
   -
There is no need to provide explicit gradient function, as it can be computed 
with finite difference (secant approximation to partial derivative).
 
   -
There is a need to compute the Hessian matrix. In the same way, it is computed 
with finite differences.
 
   -
My approach uses of an extension of MultivariateFunction that have methods to 
provide gradient and hessian. But that is perhaps not a so good idea: it would 
be better to provide generic gradient and hessian classes that are operator 
applied to plain MultivariateFunction.


I also have already written tests to cover at least all nominalcases and a few 
error cases. Now that I know where to put my code, Iwill integrate it in ACM 
clone (with all tests) and try to refactorit until it fit seamlessly in ACM 
classes and concepts.

This may take a while.

Yours truly 
François Laferrière

Le jeudi 20 octobre 2022 à 20:18:46 UTC+2, Alex Herbert 
 a écrit :  
 
 Hi,

Thanks for the interest in Commons Math.

Currently all the optimisation code is in commons-math-legacy. I think
the gradient based methods would fit in:

org.apache.commons.math4.legacy.optim.nonlinear.scalar.gradient

Can your implementations be adapted to work with the existing
interfaces? The decision to move the entire 'optim' package to a new
module allows a redesign of interfaces. The old and new can coexist
but ideally we would want to support only one optimisation
architecture. Have a look at the current classes and let us know what
you think.

Regards,

Alex



On Thu, 20 Oct 2022 at 15:36, François Laferrière
 wrote:
>
> Hello,
> Sorry, previous message was a mess
> Based on Apache common math, I have implemented some commonplace optimization 
> algorithms that could be integrated in ACM. This includes
>
>    - Gradient Descent (en.wikipedia.org/wiki/Gradient_descent)
>
>    - Newton Raphson 
>(https://en.wikipedia.org/wiki/Newton's_method_in_optimization)
>
>    - BFGS 
>(https://en.wikipedia.org/wiki/Broyden–Fletcher–Goldfarb–Shanno_algorithm)
>
> They are implemented in such a way that other algorithms of the same family 
> (Newton) can be implemented easily from existing building blocks.
> I clone http://gitbox.apache.org/repos/asf/commons-math.git but I am a bit 
> lost in the module structure. Should I put my code in one existing 
> commons-math4-* module (if so which one?) or should I create a new module 
> (for instance commons-math-optimization) ?
> Many thanks in advance
> François Laferrière
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org

  

Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread sman81
Hi Gary,

the PR did not have conflicts 30 minutes ago. So we must have encountered a 
race condition between you updating master and me rebasing and force-pushing my 
pr.
I've done that again just now. There are no conflicts.
At the same time I see parts of my PR have dribbled into master already, so if 
you rather implement the changes yourself, please go ahead. I am fine with 
closing my pr.

regards,
Markus


From: Gary Gregory 
Sent: Friday, October 21, 2022 15:17
To: Commons Developers List 
Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 
 
Hi Markus,

The PR has conflicts and does not test its changes.

I'd like feedback from the community on whether or not git master as it is
now is OK for the duplicate header behavior or if changing the default is
needed and if doing so is just ping-pongonging from one incompatibility to
another, based on previous versions.

TY!

Gary

On Fri, Oct 21, 2022, 08:52  wrote:

> I've removed serialization stuff from the PR and rebased it to master
> https://github.com/apache/commons-csv/pull/276
>
> kind regards,
> Markus
>
> From: Gary D. Gregory 
> Sent: Friday, October 21, 2022 14:18
> To: dev@commons.apache.org 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> On 2022/10/20 22:56:05 Alex Herbert wrote:
> > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> wrote:
> > >
> > > I did not have time to track through whether this behaviour changed
> > > after the initial implementation of the flag. I would think not as the
> > > original behaviour is from 1.0. This would map to:
> > >
> > > true -> ALLOW_ALL
> > > false -> ALLOW_EMPTY
> > > new -> DISALLOW
> > >
> > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to
> > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > behavioural compatibility (to 1.7, or back to 1.0).
> >
> > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > for allowDuplicates=false and does not change any tests.
> >
> > So the test suite is currently not enforcing behavioural
> > compatibility. This seems like a glaring hole in the tests and should
> > be addressed to prevent regressions.
>
> In git master, there are many tests for senders of
> withAllowDuplicateHeaderNames(boolean) and
> setAllowDuplicateHeaderNames(boolean).
>
> I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right or
> wrong.
>
> Let's reassess now git master now please. If the PR matters still, it
> should be rebased on git master.
>
> Gary
>
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>
-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [math] contribution proposal for multivariate functions optimization (2)

2022-10-21 Thread Gilles Sadowski
Hello.

I haven't looked at the code, but thanks for your interest in contributing
to Commons Math.
The more straightforward path is indeed to adapt your code to what is
currently in the "...legacy" module.  However, you can really consider
the other venue, i.e. a design specific to this family of optimizers to be
defined in a separate module.  [In the current "legacy" code we tried
to share as much code as possible, but this has led to various design
issues.]  Which is best is certainly not clear-cut.
My hope is that starting from a clean slate may help us make progress
on several aspects of the library that we wanted to change anyways[1]
(as per the JIRA records).

Regards,
Gilles

[1] For example, usage of external matrix algebra implementations.

Le ven. 21 oct. 2022 à 15:30, François Laferrière
 a écrit :
>
>   Hello Alex,
> Ichecked the architecture of MultivariateOptimizer family to compareto what I 
> have done so far. I think that what I have done can berefactored to fit in 
> the general framework as extension ofGradientMultivariateOptimizer even 
> though, main differences are :
>-
> There is no need to provide explicit gradient function, as it can be computed 
> with finite difference (secant approximation to partial derivative).
>
>-
> There is a need to compute the Hessian matrix. In the same way, it is 
> computed with finite differences.
>
>-
> My approach uses of an extension of MultivariateFunction that have methods to 
> provide gradient and hessian. But that is perhaps not a so good idea: it 
> would be better to provide generic gradient and hessian classes that are 
> operator applied to plain MultivariateFunction.
>
>
> I also have already written tests to cover at least all nominalcases and a 
> few error cases. Now that I know where to put my code, Iwill integrate it in 
> ACM clone (with all tests) and try to refactorit until it fit seamlessly in 
> ACM classes and concepts.
>
> This may take a while.
>
> Yours truly
> François Laferrière
>
> Le jeudi 20 octobre 2022 à 20:18:46 UTC+2, Alex Herbert 
>  a écrit :
>
>  Hi,
>
> Thanks for the interest in Commons Math.
>
> Currently all the optimisation code is in commons-math-legacy. I think
> the gradient based methods would fit in:
>
> org.apache.commons.math4.legacy.optim.nonlinear.scalar.gradient
>
> Can your implementations be adapted to work with the existing
> interfaces? The decision to move the entire 'optim' package to a new
> module allows a redesign of interfaces. The old and new can coexist
> but ideally we would want to support only one optimisation
> architecture. Have a look at the current classes and let us know what
> you think.
>
> Regards,
>
> Alex
>
>
>
> On Thu, 20 Oct 2022 at 15:36, François Laferrière
>  wrote:
> >
> > Hello,
> > Sorry, previous message was a mess
> > Based on Apache common math, I have implemented some commonplace 
> > optimization algorithms that could be integrated in ACM. This includes
> >
> >- Gradient Descent (en.wikipedia.org/wiki/Gradient_descent)
> >
> >- Newton Raphson 
> > (https://en.wikipedia.org/wiki/Newton's_method_in_optimization)
> >
> >- BFGS 
> > (https://en.wikipedia.org/wiki/Broyden–Fletcher–Goldfarb–Shanno_algorithm)
> >
> > They are implemented in such a way that other algorithms of the same family 
> > (Newton) can be implemented easily from existing building blocks.
> > I clone http://gitbox.apache.org/repos/asf/commons-math.git but I am a bit 
> > lost in the module structure. Should I put my code in one existing 
> > commons-math4-* module (if so which one?) or should I create a new module 
> > (for instance commons-math-optimization) ?
> > Many thanks in advance
> > François Laferrière
> >
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>

-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.apache.org



Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Gary Gregory
Ok, I don't see any conflict anymore but the PR title does not match the
code: "CSVFormat.duplicateHeaderMode requires default DISALLOW"

I am ok with the changes as they stand now but we have to resolve if the
default should be changed.

Gary

On Fri, Oct 21, 2022, 09:44  wrote:

> Hi Gary,
>
> the PR did not have conflicts 30 minutes ago. So we must have encountered
> a race condition between you updating master and me rebasing and
> force-pushing my pr.
> I've done that again just now. There are no conflicts.
> At the same time I see parts of my PR have dribbled into master already,
> so if you rather implement the changes yourself, please go ahead. I am fine
> with closing my pr.
>
> regards,
> Markus
>
>
> From: Gary Gregory 
> Sent: Friday, October 21, 2022 15:17
> To: Commons Developers List 
> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
>
> Hi Markus,
>
> The PR has conflicts and does not test its changes.
>
> I'd like feedback from the community on whether or not git master as it is
> now is OK for the duplicate header behavior or if changing the default is
> needed and if doing so is just ping-pongonging from one incompatibility to
> another, based on previous versions.
>
> TY!
>
> Gary
>
> On Fri, Oct 21, 2022, 08:52  wrote:
>
> > I've removed serialization stuff from the PR and rebased it to master
> > https://github.com/apache/commons-csv/pull/276
> >
> > kind regards,
> > Markus
> >
> > From: Gary D. Gregory 
> > Sent: Friday, October 21, 2022 14:18
> > To: dev@commons.apache.org 
> > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> >
> > On 2022/10/20 22:56:05 Alex Herbert wrote:
> > > On Thu, 20 Oct 2022 at 23:43, Alex Herbert 
> > wrote:
> > > >
> > > > I did not have time to track through whether this behaviour changed
> > > > after the initial implementation of the flag. I would think not as
> the
> > > > original behaviour is from 1.0. This would map to:
> > > >
> > > > true -> ALLOW_ALL
> > > > false -> ALLOW_EMPTY
> > > > new -> DISALLOW
> > > >
> > > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7]
> to
> > > > change the use of the flag to 'false -> DISALLOW' is not maintaining
> > > > behavioural compatibility (to 1.7, or back to 1.0).
> > >
> > > PS. I just verified that PR 276 changes the DuplicateHeaderMode value
> > > for allowDuplicates=false and does not change any tests.
> > >
> > > So the test suite is currently not enforcing behavioural
> > > compatibility. This seems like a glaring hole in the tests and should
> > > be addressed to prevent regressions.
> >
> > In git master, there are many tests for senders of
> > withAllowDuplicateHeaderNames(boolean) and
> > setAllowDuplicateHeaderNames(boolean).
> >
> > I filled in the coverage in testGetAllowDuplicateHeaderNames() for
> > getAllowDuplicateHeaderNames() to reflect the _current_ behavior, right
> or
> > wrong.
> >
> > Let's reassess now git master now please. If the PR matters still, it
> > should be rebased on git master.
> >
> > Gary
> >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > > For additional commands, e-mail: dev-h...@commons.apache.org
> > >
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> > For additional commands, e-mail: dev-h...@commons.apache.org
> >
> >
> -
> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
> For additional commands, e-mail: dev-h...@commons.apache.org
>
>


Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1

2022-10-21 Thread Alex Herbert
>From my investigation the behaviour from 1.0 to 1.6 was to throw on
duplicates, but ignore empty duplicates (so allowing pad columns in a
CSV).

The allowDuplicates flag was added in 1.7 to allow duplicates. DEFAULT
behaviour was true! So the behavioural compatibility was broken in
release 1.7. Note there is also an allowMissingColumnNames flag. Logic
was:

- Store column headers as we go
- For each new column header, check in the existing headers
- If present:
-- If not empty then throw if allowDuplicateHeaderNames=true
-- If empty then throw if allowMissingColumnNames=false

There is some logic failure here. It would allow the first missing
column name for example; only the second empty missing column name
would trigger an exception.

The current master has updated this logic somewhat but I think the
concept is still the same. There are tests to show that the default
enum value is ALLOW_ALL. This is the same behaviour as 1.7, but not
1.0 - 1.6.

Setting the boolean flag to true -> ALLOW_ALL.
Setting the boolean flag to false -> ALLOW_EMPTY.

So I think the default settings are correct, and the use of the old
flag flips the behaviour correctly. But when investigating I found
another issue.

I looked at the CSVFormat tests and found two fixtures missing the
@Test annotation:

testDuplicateHeaderElementsTrue()
testDuplicateHeaderElementsTrue_Deprecated()

I added these to master.

The tests for duplicate headers use the builder and setHeader("A",
"A"). So there are no tests for empty headers. Trying this throws an
exception in CSVFormat.validate for all of these:

CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false).setHeader("A",
"", "B", "").build();
CSVFormat.DEFAULT.builder().setDuplicateHeaderMode(DuplicateHeaderMode.ALLOW_EMPTY).setHeader("A",
"", "B", "").build();
CSVFormat.DEFAULT.builder().setAllowDuplicateHeaderNames(false)
.setAllowMissingColumnNames(true).setHeader("A", "", "B", "").build();

So it seems the logic when building a CSVFormat with explicit headers
is different from when parsing a format in CSVParser.createHeaders().
That method has a lot of logic to set up the header map. If a
CSVFormat is created with an explicit header the behaviour is
different with respect to duplicate headers, and
allowMissingColumnNames which is not even considered in
CSVFormat.validate().

This is the current behaviour in master for the parser:

@Test
public void testEmptyColumnsAndDuplicateHeadersNotAllowed() throws Exception {
assertThrows(IllegalArgumentException.class,
() -> CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",
CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false)));
}

@Test
public void 
testEmptyColumnsAndDuplicateHeadersNotAllowedWithMissingColumnsAllowed()
throws Exception {
try (CSVParser parser = CSVParser.parse("a,,c,\n1,2,3,4\nw,x,y,z",

CSVFormat.DEFAULT.withHeader().withAllowDuplicateHeaderNames(false).withAllowMissingColumnNames(true)))
{
// noop
}
}

So the duplicate headers cannot be considered on its own, it requires
use of the allow missing column names flag.

Also, if a CSVFormat has an explicit header then the CSVParser will
not construct a header Map to allow column names to
look up column indices.

I am not sure of the best path to fix this. I would document the
CSVParser header map as not being created if the header is not read
from the CSV input (current behaviour). Then maybe change the logic in
CSVFormat.validate() to respect the ALLOW_ALL, ALLOW_EMPTY, ALLOW_NONE
duplicates choice, possibly also considering allowEmptyColumnNames
which defaults to false. Ideally we should get the same exceptions
when building a CSVFormat with explicitly set headers, and parsing a
CSV file and getting the header from the file. Currently I do not
think that is true.

Alex

On Fri, 21 Oct 2022 at 15:54, Gary Gregory  wrote:
>
> Ok, I don't see any conflict anymore but the PR title does not match the
> code: "CSVFormat.duplicateHeaderMode requires default DISALLOW"
>
> I am ok with the changes as they stand now but we have to resolve if the
> default should be changed.
>
> Gary
>
> On Fri, Oct 21, 2022, 09:44  wrote:
>
> > Hi Gary,
> >
> > the PR did not have conflicts 30 minutes ago. So we must have encountered
> > a race condition between you updating master and me rebasing and
> > force-pushing my pr.
> > I've done that again just now. There are no conflicts.
> > At the same time I see parts of my PR have dribbled into master already,
> > so if you rather implement the changes yourself, please go ahead. I am fine
> > with closing my pr.
> >
> > regards,
> > Markus
> >
> >
> > From: Gary Gregory 
> > Sent: Friday, October 21, 2022 15:17
> > To: Commons Developers List 
> > Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
> >
> > Hi Markus,
> >
> > The PR has conflicts and does not test its changes.
> >
> > I'd like feedback from the community on whether or not git master as it is
> > now is OK for the duplic