On Thu, 20 Oct 2022 at 22:45, Gary D. Gregory <ggreg...@apache.org> wrote: > > Hi All (below) > > On 2022/10/20 18:08:31 Alex Herbert wrote: > > On Thu, 20 Oct 2022 at 17:05, sebb <seb...@gmail.com> wrote: > > > > > > On Thu, 20 Oct 2022 at 15:43, Gary Gregory <garydgreg...@gmail.com> wrote: > > > > > > > > Would't it be simpler to deal with the serialization issue by bumping > > > > the > > > > serialVersionID? We can just say that you only serialized and > > > > deserialize > > > > for the same version. > > > > > > Are we willing to continue supporting serialisation going forward? > > > It is not easy to ensure compatibility and avoid security issues. > > > > > > Are there any use-cases for allowing serialisation? > > > > Serialisation was broken for CSVRecord before and partially fixed in > > 1.8 to support 1.0 to 1.6. Fields from 1.7 are not supported. The > > release notes for 1.8 state that serialisation will not be supported > > going forward. So this breakage of serialisation for CSVFormat has > > precedent. > > > > I think Gary's suggestion to change the serial version ID commits to > > the same path as not supporting serialisation from 2.0. > > I think this is the simplest solution. We can Javadoc the class and say that > we do not support serialization from one version to the next and that it will > be removed in 2.0. Check? If this OK, then I'll update git master and we can > move on to the duplicate headers enum.
+1 > > > > > Regarding the release, I am not concerned about serialisation as it > > seems to be a lost cause. The issue is the behavioural compatibility > > of switching from a boolean flag for duplicate headers to an enum with > > 3 options. We should get this correct to avoid a future release having > > to explain a behavioural compatibility change. > > Duplicate headers enum: We are leaning towards canceling RC1, updating the > PR, or creating a new PR. I'll wait to cancel RC1 until it is clear what is > being proposed, with a PR. I think we need to check what happened before we had the duplicate headers flag. This is what I have found: CSV-239 added the flag (1.7.0) [1] in commit [2]. This was added to allow the CSVRecord getHeaderNames to return all headers including repeats. Before that duplicates threw an exception (see CSV-236 [3], which predates CSV-239). Throwing an exception for duplicate headers is mentioned in the changes log for release 1.0 [5]. Note the original behavior was to throw for non-empty duplicates due to a fix implemented in CSV-121 for release 1.0 [6]. So this is the original behaviour. CSV-264 added the enum (1.10.0) [4]. So if the duplicate headers flag has been in since 1.7 then we should just map the behaviour to the new enum. The commit when the boolean flag was added has this text in CSVParser: "This will always allow a duplicate header if the header is empty" So behaviour when the flag was added: true - allow duplicates false - only allow empty duplicates, throw for non-empy duplicates I did not have time to track through whether this behaviour changed after the initial implementation of the flag. I would think not as the original behaviour is from 1.0. This would map to: true -> ALLOW_ALL false -> ALLOW_EMPTY new -> DISALLOW Which is what we currently have in 1.10.0 RC1. Thus the PR #276 [7] to change the use of the flag to 'false -> DISALLOW' is not maintaining behavioural compatibility (to 1.7, or back to 1.0). The original review by Markus Span also found inconsistent settings of the quotedNullString. But this was removed from PR #276 and I lost track of whether that change was required. Alex [1] https://issues.apache.org/jira/browse/CSV-236 [2] https://github.com/apache/commons-csv/commit/030fb8e37c4024b24fac2b5404300449a6741699 [3] https://issues.apache.org/jira/browse/CSV-236 [4] https://issues.apache.org/jira/browse/CSV-264 [5] https://commons.apache.org/proper/commons-csv/changes-report.html [6] https://issues.apache.org/jira/browse/CSV-121 [7] https://github.com/apache/commons-csv/pull/276 --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org