[math] contribution proposal for multivariate functions optimization
Hello, Based on Apache common math, I have implemented some commonplace optimization algorithms that could be integrated in ACM. This includes Gradient Descent Gradient descent - Wikipedia | | | | | | | | | | | Gradient descent - Wikipedia Gradient descent is generally attributed to Augustin-Louis Cauchy, who first suggested it in 1847.[1] Jacques Ha... | | | Newton Raphson Newton's method in optimization - Wikipedia | | | | | | | | | | | Newton's method in optimization - Wikipedia The central problem of optimization is minimization of functions. Let us first consider the case of univariate f... | | | BFGS Broyden–Fletcher–Goldfarb–Shanno algorithm - Wikipedia | | | | Broyden–Fletcher–Goldfarb–Shanno algorithm - Wikipedia Since the updates of the BFGS curvature matrix do not require matrix inversion, its computational complexity is ... | | | They are implemented in such a way that other algorithms of the same family (Newton) can be implemented easily from existing building blocks. I clone http://gitbox.apache.org/repos/asf/commons-math.git but I am a bit lost in the module structure. Should I put my code in one existing commons-math4-* module (if so which one?) or should I create a new module (for instance commons-math-optimization) ? Many thanks in advance François Laferrière
[math] contribution proposal for multivariate functions optimization (2)
Hello, Sorry, previous message was a mess Based on Apache common math, I have implemented some commonplace optimization algorithms that could be integrated in ACM. This includes - Gradient Descent (en.wikipedia.org/wiki/Gradient_descent) - Newton Raphson (https://en.wikipedia.org/wiki/Newton's_method_in_optimization) - BFGS (https://en.wikipedia.org/wiki/Broyden–Fletcher–Goldfarb–Shanno_algorithm) They are implemented in such a way that other algorithms of the same family (Newton) can be implemented easily from existing building blocks. I clone http://gitbox.apache.org/repos/asf/commons-math.git but I am a bit lost in the module structure. Should I put my code in one existing commons-math4-* module (if so which one?) or should I create a new module (for instance commons-math-optimization) ? Many thanks in advance François Laferrière
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
Would't it be simpler to deal with the serialization issue by bumping the serialVersionID? We can just say that you only serialized and deserialize for the same version. Also note the PR will throw an NPE in the builder when instead of using the validate() method. Gary On Wed, Oct 19, 2022, 18:27 Gary D. Gregory wrote: > I've commented on the PR. > TY. > Gary > > On 2022/10/19 16:51:57 Gary Gregory wrote: > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert > wrote: > > > > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory > wrote: > > > > > > > > My +1 > > > > > > > > Gary > > > > > > Gary, > > > > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1. > > > > > > AllowDuplicates enum may be set to the incorrect value when setting > > > the allow duplicates boolean. Have you reviewed this? I believe it is > > > valid. > > > > I will re-read later tonight... > > > > Gary > > > > > > > > Alex > > > > > > - > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
On Thu, 20 Oct 2022 at 15:43, Gary Gregory wrote: > > Would't it be simpler to deal with the serialization issue by bumping the > serialVersionID? We can just say that you only serialized and deserialize > for the same version. Are we willing to continue supporting serialisation going forward? It is not easy to ensure compatibility and avoid security issues. Are there any use-cases for allowing serialisation? > Also note the PR will throw an NPE in the builder > when instead of using the validate() method. > > Gary > > On Wed, Oct 19, 2022, 18:27 Gary D. Gregory wrote: > > > I've commented on the PR. > > TY. > > Gary > > > > On 2022/10/19 16:51:57 Gary Gregory wrote: > > > On Wed, Oct 19, 2022 at 10:01 AM Alex Herbert > > wrote: > > > > > > > > On Wed, 19 Oct 2022 at 14:57, Gary D. Gregory > > wrote: > > > > > > > > > > My +1 > > > > > > > > > > Gary > > > > > > > > Gary, > > > > > > > > PR #276 highlights a behavioural compatibility error in the 1.10.0 RC1. > > > > > > > > AllowDuplicates enum may be set to the incorrect value when setting > > > > the allow duplicates boolean. Have you reviewed this? I believe it is > > > > valid. > > > > > > I will re-read later tonight... > > > > > > Gary > > > > > > > > > > > Alex > > > > > > > > - > > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > > > > > > - > > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > > > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
On Thu, 20 Oct 2022 at 17:05, sebb wrote: > > On Thu, 20 Oct 2022 at 15:43, Gary Gregory wrote: > > > > Would't it be simpler to deal with the serialization issue by bumping the > > serialVersionID? We can just say that you only serialized and deserialize > > for the same version. > > Are we willing to continue supporting serialisation going forward? > It is not easy to ensure compatibility and avoid security issues. > > Are there any use-cases for allowing serialisation? Serialisation was broken for CSVRecord before and partially fixed in 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The release notes for 1.8 state that serialisation will not be supported going forward. So this breakage of serialisation for CSVFormat has precedent. I think Gary's suggestion to change the serial version ID commits to the same path as not supporting serialisation from 2.0. Regarding the release, I am not concerned about serialisation as it seems to be a lost cause. The issue is the behavioural compatibility of switching from a boolean flag for duplicate headers to an enum with 3 options. We should get this correct to avoid a future release having to explain a behavioural compatibility change. Alex - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [math] contribution proposal for multivariate functions optimization (2)
Hi, Thanks for the interest in Commons Math. Currently all the optimisation code is in commons-math-legacy. I think the gradient based methods would fit in: org.apache.commons.math4.legacy.optim.nonlinear.scalar.gradient Can your implementations be adapted to work with the existing interfaces? The decision to move the entire 'optim' package to a new module allows a redesign of interfaces. The old and new can coexist but ideally we would want to support only one optimisation architecture. Have a look at the current classes and let us know what you think. Regards, Alex On Thu, 20 Oct 2022 at 15:36, François Laferrière wrote: > > Hello, > Sorry, previous message was a mess > Based on Apache common math, I have implemented some commonplace optimization > algorithms that could be integrated in ACM. This includes > >- Gradient Descent (en.wikipedia.org/wiki/Gradient_descent) > >- Newton Raphson > (https://en.wikipedia.org/wiki/Newton's_method_in_optimization) > >- BFGS > (https://en.wikipedia.org/wiki/Broyden–Fletcher–Goldfarb–Shanno_algorithm) > > They are implemented in such a way that other algorithms of the same family > (Newton) can be implemented easily from existing building blocks. > I clone http://gitbox.apache.org/repos/asf/commons-math.git but I am a bit > lost in the module structure. Should I put my code in one existing > commons-math4-* module (if so which one?) or should I create a new module > (for instance commons-math-optimization) ? > Many thanks in advance > François Laferrière > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
Hi All (below) On 2022/10/20 18:08:31 Alex Herbert wrote: > On Thu, 20 Oct 2022 at 17:05, sebb wrote: > > > > On Thu, 20 Oct 2022 at 15:43, Gary Gregory wrote: > > > > > > Would't it be simpler to deal with the serialization issue by bumping the > > > serialVersionID? We can just say that you only serialized and deserialize > > > for the same version. > > > > Are we willing to continue supporting serialisation going forward? > > It is not easy to ensure compatibility and avoid security issues. > > > > Are there any use-cases for allowing serialisation? > > Serialisation was broken for CSVRecord before and partially fixed in > 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The > release notes for 1.8 state that serialisation will not be supported > going forward. So this breakage of serialisation for CSVFormat has > precedent. > > I think Gary's suggestion to change the serial version ID commits to > the same path as not supporting serialisation from 2.0. I think this is the simplest solution. We can Javadoc the class and say that we do not support serialization from one version to the next and that it will be removed in 2.0. Check? If this OK, then I'll update git master and we can move on to the duplicate headers enum. > > Regarding the release, I am not concerned about serialisation as it > seems to be a lost cause. The issue is the behavioural compatibility > of switching from a boolean flag for duplicate headers to an enum with > 3 options. We should get this correct to avoid a future release having > to explain a behavioural compatibility change. Duplicate headers enum: We are leaning towards canceling RC1, updating the PR, or creating a new PR. I'll wait to cancel RC1 until it is clear what is being proposed, with a PR. Gary > > Alex > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
On Thu, 20 Oct 2022 at 22:45, Gary D. Gregory wrote: > > Hi All (below) > > On 2022/10/20 18:08:31 Alex Herbert wrote: > > On Thu, 20 Oct 2022 at 17:05, sebb wrote: > > > > > > On Thu, 20 Oct 2022 at 15:43, Gary Gregory wrote: > > > > > > > > Would't it be simpler to deal with the serialization issue by bumping > > > > the > > > > serialVersionID? We can just say that you only serialized and > > > > deserialize > > > > for the same version. > > > > > > Are we willing to continue supporting serialisation going forward? > > > It is not easy to ensure compatibility and avoid security issues. > > > > > > Are there any use-cases for allowing serialisation? > > > > Serialisation was broken for CSVRecord before and partially fixed in > > 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The > > release notes for 1.8 state that serialisation will not be supported > > going forward. So this breakage of serialisation for CSVFormat has > > precedent. > > > > I think Gary's suggestion to change the serial version ID commits to > > the same path as not supporting serialisation from 2.0. > > I think this is the simplest solution. We can Javadoc the class and say that > we do not support serialization from one version to the next and that it will > be removed in 2.0. Check? If this OK, then I'll update git master and we can > move on to the duplicate headers enum. +1 > > > > > Regarding the release, I am not concerned about serialisation as it > > seems to be a lost cause. The issue is the behavioural compatibility > > of switching from a boolean flag for duplicate headers to an enum with > > 3 options. We should get this correct to avoid a future release having > > to explain a behavioural compatibility change. > > Duplicate headers enum: We are leaning towards canceling RC1, updating the > PR, or creating a new PR. I'll wait to cancel RC1 until it is clear what is > being proposed, with a PR. I think we need to check what happened before we had the duplicate headers flag. This is what I have found: CSV-239 added the flag (1.7.0) [1] in commit [2]. This was added to allow the CSVRecord getHeaderNames to return all headers including repeats. Before that duplicates threw an exception (see CSV-236 [3], which predates CSV-239). Throwing an exception for duplicate headers is mentioned in the changes log for release 1.0 [5]. Note the original behavior was to throw for non-empty duplicates due to a fix implemented in CSV-121 for release 1.0 [6]. So this is the original behaviour. CSV-264 added the enum (1.10.0) [4]. So if the duplicate headers flag has been in since 1.7 then we should just map the behaviour to the new enum. The commit when the boolean flag was added has this text in CSVParser: "This will always allow a duplicate header if the header is empty" So behaviour when the flag was added: true - allow duplicates false - only allow empty duplicates, throw for non-empy duplicates I did not have time to track through whether this behaviour changed after the initial implementation of the flag. I would think not as the original behaviour is from 1.0. This would map to: true -> ALLOW_ALL false -> ALLOW_EMPTY new -> DISALLOW Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to change the use of the flag to 'false -> DISALLOW' is not maintaining behavioural compatibility (to 1.7, or back to 1.0). The original review by Markus Span also found inconsistent settings of the quotedNullString. But this was removed from PR #276 and I lost track of whether that change was required. Alex [1] https://issues.apache.org/jira/browse/CSV-236 [2] https://github.com/apache/commons-csv/commit/030fb8e37c4024b24fac2b5404300449a6741699 [3] https://issues.apache.org/jira/browse/CSV-236 [4] https://issues.apache.org/jira/browse/CSV-264 [5] https://commons.apache.org/proper/commons-csv/changes-report.html [6] https://issues.apache.org/jira/browse/CSV-121 [7] https://github.com/apache/commons-csv/pull/276 - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
On Thu, 20 Oct 2022 at 23:43, Alex Herbert wrote: > > I did not have time to track through whether this behaviour changed > after the initial implementation of the flag. I would think not as the > original behaviour is from 1.0. This would map to: > > true -> ALLOW_ALL > false -> ALLOW_EMPTY > new -> DISALLOW > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to > change the use of the flag to 'false -> DISALLOW' is not maintaining > behavioural compatibility (to 1.7, or back to 1.0). PS. I just verified that PR 276 changes the DuplicateHeaderMode value for allowDuplicates=false and does not change any tests. So the test suite is currently not enforcing behavioural compatibility. This seems like a glaring hole in the tests and should be addressed to prevent regressions. - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1
I had just started to look into this and was going to call out the same thing. I'm concerned with those changes, especially the ones regarding the allowDuplicates change, I made a note in my ticket for work to make sure we have appropriate test cases on our end, with the RC, we didn't see any issues with the compatibility between 1.9.0 and 1.10.0.RC. David On Thu, Oct 20, 2022 at 5:56 PM Alex Herbert wrote: > On Thu, 20 Oct 2022 at 23:43, Alex Herbert > wrote: > > > > I did not have time to track through whether this behaviour changed > > after the initial implementation of the flag. I would think not as the > > original behaviour is from 1.0. This would map to: > > > > true -> ALLOW_ALL > > false -> ALLOW_EMPTY > > new -> DISALLOW > > > > Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to > > change the use of the flag to 'false -> DISALLOW' is not maintaining > > behavioural compatibility (to 1.7, or back to 1.0). > > PS. I just verified that PR 276 changes the DuplicateHeaderMode value > for allowDuplicates=false and does not change any tests. > > So the test suite is currently not enforcing behavioural > compatibility. This seems like a glaring hole in the tests and should > be addressed to prevent regressions. > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > >