Hello > This is the logic from the current builder: > DuplicateHeaderMode mode = allowDuplicateHeaderNames ? > DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY
This is true only if Builder.setAllowDuplicateHeaderNames is actually called, which is at the library user's discretion. Otherwise CSVFormat.Builder.duplicateHeaderMode remains null. Also, the method Builder.setDuplicateHeaderMode(DuplicateHeaderMode) accepts null. A non-null parameter should be enforced. Besides, shouldn't the above statement be: allowDuplicateHeaderNames ? ALLOW_ALL : DISALLOW ? duplicateHeaderMode should default to DISALLOW for backward compatibility. Found another small bug in setNullString where member quotedNullString was inconsistently written (missing write in setQuote): this.quotedNullString = quoteCharacter + nullString + quoteCharacter; All of the above in pull request https://github.com/apache/commons-csv/pull/276 @Alex would you review my PR please? Kind regards, Markus ________________________________ From: Alex Herbert <alex.d.herb...@gmail.com> Sent: Monday, October 17, 2022 13:46 To: Commons Developers List <dev@commons.apache.org> Subject: Re: [VOTE] Release Apache Commons CSV 1.10.0 based on RC1 On Mon, 17 Oct 2022 at 11:00, <sma...@outlook.de> wrote: > > Hello > > CSV-264 (Add DuplicateHeaderMode) introduces bugs that should be fixed before > shipping 1.10.0 IMO > - missing default > - broken serialization of class CSVFormat > > I raised these issues in CSV-302. > > The serialization issue is caught by Revapi. I had suggested to include > Revapi in the project (CSV-303) or alternatively add revapi and drop clirr in > commons-parent but that more or less fell on deaf ears. > > mvn org.revapi:revapi-maven-plugin:check > > [INFO] > ------------------------------------------------------------------------ > [INFO] BUILD FAILURE > [INFO] > ------------------------------------------------------------------------ > [INFO] Total time: 3.560 s > [INFO] Finished at: 2022-10-17T11:30:21+02:00 > [INFO] > ------------------------------------------------------------------------ > [ERROR] Failed to execute goal org.revapi:revapi-maven-plugin:0.14.7:check > (default-cli) on project commons-csv: The following API problems caused the > build to fail: > [ERROR] java.field.serialVersionUIDUnchanged: field > org.apache.commons.csv.CSVFormat.serialVersionUID: The class changed in an > incompatible way with regards to serialization but the serialVersionUID field > stayed unchanged. This might be ok and/or desired but is suspicious. > https://revapi.org/revapi-java/differences.html#java.field.serialVersionUIDUnchanged This is an allowed source and binary compatible error; it has potentially breaking semantic severity which do apply here. The CSVFormat object is deserialized using the defaultReadObject method. This will be able to deserialize older versions of CSVFormat. The new fields added in this version will be absent from the ObjectInputStream and set to their default initialised value. IIUC this is the change: remove: private boolean allowDuplicateHeaderNames add: private DuplicateHeaderMode duplicateHeaderMode Deserialization would mean that the new field 'duplicateHeaderMode' is set to null and the previous field 'allowDuplicateHeaderNames' is ignored. To fix this would require setting duplicateHeaderMode appropriately using the field allowDuplicateHeaderName. This is the logic from the current builder: DuplicateHeaderMode mode = allowDuplicateHeaderNames ? DuplicateHeaderMode.ALLOW_ALL : DuplicateHeaderMode.ALLOW_EMPTY However, serialization was broken previously in CSV. This was for the CSVRecord. The release notes contain this in the description for 1.8: "This release fixes serialization compatibility of CSVRecord with versions 1.0 to 1.6. New fields added since 1.7 are not serialized. Support for Serializable is scheduled to be removed in version 2.0." The Jira ticket is 248: https://issues.apache.org/jira/browse/CSV-248 In that case we made sure we could (de)serialize without throwing an exception to minimise errors for downstream users. But we did not fix behavioural compatibility across versions. In this instance I do not think exceptions need to be addressed as missing fields will be ignored and the new field will be initialized as null. However I have not verified this with a serialization test. If this new field is null then the format will behave as if duplicates is set to DISALLOW. I do not know the user case for serializing a CSVFormat. It would not be within the library. A reference is kept in CSVParser and CSVPrinter but these are not Serializable. The serialization support dates back to the legacy codebase used to create the project. Serialization can be replaced by other more configurable libraries to read/write binary objects. Perhaps we just need to add to the release notes that CSVFormat serialization is no longer supported. > [ERROR] java.method.exception.checkedRemoved: method > java.util.List<org.apache.commons.csv.CSVRecord> > org.apache.commons.csv.CSVParser::getRecords(): Method no longer throws > checked exceptions: java.io.IOException. > https://revapi.org/revapi-java/differences.html#java.method.exception.checkedRemoved This is a source incompatible error. It is binary compatible. Alex --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org