+Damon as well.

On Thu, Oct 17, 2024 at 11:49 PM Ahmet Altay via dev <dev@beam.apache.org> wrote:
> Adding @Yi Hu <ya...@google.com> who might know more about the expected
> behavior.
>
> If you see a gap, feel free to fix it with a PR. And thank you for your
> contributions!
>
> On Wed, Oct 9, 2024 at 12:06 AM LDesire <two_som...@icloud.com> wrote:
>
>> In the CsvIO.parseRows method, does it matter if the number of CSV
>> headers is not the same as the number of fields in the Schema?
>>
>> I'm looking at that method and I don't see any logic anywhere that
>> validates this.
>>
>> I've looked for related tests, but they don't seem to be validated
>> properly.
>>
>> ```
>> @Test
>> public void givenMismatchedCsvFormatAndSchema_throws() {
>>   Pipeline pipeline = Pipeline.create();
>>   CSVFormat csvFormat =
>>       CSVFormat.DEFAULT
>>           .withHeader("a_string", "an_integer", "a_double")
>>           .withAllowDuplicateHeaderNames(true);
>>   Schema schema =
>>       Schema.builder().addStringField("a_string").addDoubleField("a_double").build();
>>   assertThrows(IllegalArgumentException.class, () -> CsvIO.parseRows(schema, csvFormat));
>>   pipeline.run();
>> }
>> ```
>>
>> The above test always passes the assertThrows test because
>> withAllowDuplicateHeaderNames is true.
>>
>> In other words, it doesn't seem to be validating properly because the
>> exception is thrown in a different part of the test than intended.
>>
>> If this is unintended, would it be okay if I add logic to validate that
>> the number of CSV headers is the same as the number of fields in the Schema?
>>
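
To make the proposal in the quoted message concrete, here is a minimal sketch of the kind of header-versus-schema check being asked about. It is plain Java with no Beam or Commons CSV dependency. The class name `HeaderSchemaCheck` and the method `validateHeaderMatchesSchema` are hypothetical and not part of CsvIO, and the strict rule it applies (same count, same names) is only one possible reading of the expected behavior, which is still the open question in this thread.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; this is not Beam's CsvIO code.
final class HeaderSchemaCheck {
  private HeaderSchemaCheck() {}

  // Assumption: the CSV header and the schema must contain exactly the same
  // names. Whether extra header columns should be tolerated is undecided.
  static void validateHeaderMatchesSchema(
      List<String> csvHeader, List<String> schemaFieldNames) {
    // Schema fields that have no matching CSV header column.
    Set<String> missingFromHeader = new LinkedHashSet<>(schemaFieldNames);
    missingFromHeader.removeAll(csvHeader);

    // CSV header columns that have no matching schema field.
    Set<String> missingFromSchema = new LinkedHashSet<>(csvHeader);
    missingFromSchema.removeAll(schemaFieldNames);

    // The size comparison also catches duplicated header names.
    if (csvHeader.size() != schemaFieldNames.size()
        || !missingFromHeader.isEmpty()
        || !missingFromSchema.isEmpty()) {
      throw new IllegalArgumentException(
          String.format(
              "CSV header %s does not match schema fields %s; "
                  + "not in header: %s, not in schema: %s",
              csvHeader, schemaFieldNames, missingFromHeader, missingFromSchema));
    }
  }

  public static void main(String[] args) {
    // Matching header and schema: no exception.
    validateHeaderMatchesSchema(
        List.of("a_string", "a_double"), List.of("a_string", "a_double"));

    // Same mismatch as the quoted test: three headers, two schema fields.
    try {
      validateHeaderMatchesSchema(
          List.of("a_string", "an_integer", "a_double"),
          List.of("a_string", "a_double"));
      throw new AssertionError("expected the mismatch to be rejected");
    } catch (IllegalArgumentException expected) {
      System.out.println("Rejected: " + expected.getMessage());
    }
  }
}
```

A test for a check like this would leave `withAllowDuplicateHeaderNames` at a value the format validation accepts, so that the only possible source of the `IllegalArgumentException` is the header and schema mismatch itself.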