Adding @Yi Hu <ya...@google.com>, who might know more about the expected
behavior.

If you see a gap, feel free to fix it with a PR. And thank you for your
contributions!

On Wed, Oct 9, 2024 at 12:06 AM LDesire <two_som...@icloud.com> wrote:

> In the CsvIO.parseRows method, does it matter if the number of CSV headers
> is not the same as the number of fields in the Schema?
>
> I'm looking at that method and I don't see any logic anywhere that
> validates this.
>
> I've looked at the related tests, but they don't seem to validate this
> properly.
>
> ```
>
> @Test
> public void givenMismatchedCsvFormatAndSchema_throws() {
>   Pipeline pipeline = Pipeline.create();
>   CSVFormat csvFormat =
>       CSVFormat.DEFAULT
>           .withHeader("a_string", "an_integer", "a_double")
>           .withAllowDuplicateHeaderNames(true);
>   Schema schema =
>       Schema.builder().addStringField("a_string").addDoubleField("a_double").build();
>   assertThrows(IllegalArgumentException.class, () -> CsvIO.parseRows(schema, csvFormat));
>   pipeline.run();
> }
>
> ```
>
> The above test always satisfies the assertThrows assertion because
> withAllowDuplicateHeaderNames is set to true.
>
> In other words, it doesn't seem to validate the mismatch at all: the
> exception is thrown by a different check than the one the test intends
> to exercise.
>
> If this is unintended, would it be okay if I add logic to validate that
> the number of CSV headers is the same as the number of fields in the Schema?
>

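For discussion, the check proposed above could look roughly like the sketch below. It is plain Java with no Beam dependency: `HeaderSchemaCheck` and `validateHeaderMatchesSchema` are hypothetical names, and the header and schema are modelled as lists of field names rather than as `CSVFormat` and `Schema` objects. Whether CsvIO is meant to require an exact match is the open question in this thread, so treat the rule itself as an assumption.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Beam: a stand-in for the proposed validation.
final class HeaderSchemaCheck {
  private HeaderSchemaCheck() {}

  /**
   * Throws IllegalArgumentException when the CSV header and the schema differ
   * in the number of columns, or when a schema field has no matching header.
   */
  static void validateHeaderMatchesSchema(List<String> csvHeader, List<String> schemaFields) {
    if (csvHeader.size() != schemaFields.size()) {
      throw new IllegalArgumentException(
          String.format(
              "CSV header has %d columns but schema has %d fields",
              csvHeader.size(), schemaFields.size()));
    }
    // Same count: make sure every schema field is actually named in the header.
    Set<String> missing = new LinkedHashSet<>(schemaFields);
    missing.removeAll(csvHeader);
    if (!missing.isEmpty()) {
      throw new IllegalArgumentException("Schema fields missing from CSV header: " + missing);
    }
  }
}
```

A test for this would probably also want to assert on the exception message, and leave withAllowDuplicateHeaderNames unset, so that it cannot pass because of an unrelated IllegalArgumentException.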