Hi All, Alex, more below: On 2022/10/22 21:23:13 Alex Herbert wrote: > On Sat, 22 Oct 2022 at 20:05, Gary D. Gregory <ggreg...@apache.org> wrote: > > > > Thank you for the new tests Alex! > > > > Here is one area that is easy to overlook: As Commons CSV has evolved, _not > > all settings in CSVFormat_ apply to both writing and parsing. To wit, the > > existing Javadoc for CSVFormat.getAllowMissingColumnNames(): "Gets whether > > missing column names are allowed when parsing the header line." > > > > I've updated the Javadoc for > > CSVFormat.Builder.setAllowMissingColumnNames(boolean) to state this is a > > parser setting. > > > > I've also updated the test since the commented-out test data is valid for > > parsing. > > > > In git master now. > > OK, thanks for the explanation. If the CSVFormat ignores the > allowEmptyColumnsNames in the validate() method it should act as if > the setting is true, i.e. only the DuplicateHeaderMode should be used > to validate the header.
That's how it is now, allowEmptyColumnsNames cannot be considered in the validate method because a CSVFormat does not know if it is going to be used for parsing or writing. So the test can be updated to stream the full > set of cases (reinstating those you commented out) and filter to those > where the flag is set to true, e.g. using: I already had added a new test data provider method to do that so, all I had to do now is make sure that when I deleted the commented out test elements, these were covered in the 2nd data provider called duplicateHeaderParseOnlyData(). > > --- > > static Stream<Arguments> duplicateHeaderAllowsMissingColumnsNamesData() { > return duplicateHeaderData().filter(arg -> > Boolean.TRUE.equals(arg.get()[1])); > } I did not do this because it seemed (to me) clearer (less magical) to create a duplicateHeaderParseOnlyData() and list explicitly test cases there. WDYT? > > @ParameterizedTest > @MethodSource(value = {"duplicateHeaderAllowsMissingColumnsNamesData"}) > public void testCSVFormat(final DuplicateHeaderMode duplicateHeaderMode, ... > > --- > > This finds the following failure: > > Arguments.of(DuplicateHeaderMode.ALLOW_EMPTY, true, new String[] {" > ", " "}, true), > > This is due to the difference in the CSVParser and CSVFormat in the > use of the trim() method to detect an empty header: > > CSVParser: > final boolean emptyHeader = header == null || header.trim().isEmpty(); > > CSVFormat: > final boolean empty = header == null || header.isEmpty(); > > This means that the definition of 'empty' is different for the > CSVFormat and the CSVParser. Should this be corrected in CSVFormat to > use trim()? I refactored the logic in CSVFormat#isBlank(String) where the "blank" concept vs. "empty" is the same as in Apache Commons Lang where "empty" means size 0 and "blank" means "trims to size 0"; see org.apache.commons.lang3.StringUtils.isBlank(CharSequence). > > If so then I can push the updated test and fix to master. That would > remove the requirement to comment out any of the test cases and moving > those to a different test stream method. Ah, well, let's have you review git master now and feel free to refactor. I think we are close if not done for another RC. WDYT? TY! Gary > > Alex > > --------------------------------------------------------------------- > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org