Hello Facu,

Thank you for bringing this up. This GitHub issue tracks the same request: 
https://github.com/apache/beam/issues/32485. I wrote a candidate solution 
that sketches how CsvIOParse might be refactored to support this feature.

Feel free to continue the discussion on that issue if you would like to open 
a pull request. I would be grateful to collaborate with you if you have the 
time and energy.
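
To illustrate the underlying idea only: the sketch below uses plain JDK 
charset calls to decode CSV bytes from one encoding and re-encode them in 
another. It is not CsvIO's API and not the candidate solution from the issue; 
the class name CsvCharsetSketch and its helper methods are hypothetical, and 
it assumes the JDK in use provides the windows-1250 charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Apache Beam.
public class CsvCharsetSketch {

    // Decode raw CSV bytes using the charset the file was written in.
    static String decode(byte[] raw, Charset source) {
        return new String(raw, source);
    }

    // Encode a CSV line in the charset the output file should use.
    static byte[] encode(String line, Charset target) {
        return line.getBytes(target);
    }

    public static void main(String[] args) {
        Charset latin1 = StandardCharsets.ISO_8859_1;
        // Assumption: this charset is available in the running JDK.
        Charset cp1250 = Charset.forName("windows-1250");

        // "name,caf" followed by 0xE9, which is e-acute in ISO-8859-1.
        byte[] input = {'n', 'a', 'm', 'e', ',', 'c', 'a', 'f', (byte) 0xE9};

        String line = decode(input, latin1);
        byte[] output = encode(line, cp1250);

        System.out.println(line + " re-encoded as " + output.length + " bytes");
    }
}
```

The point of the sketch is that the charset matters only at the file 
boundary, where bytes become Strings and back. How that would be exposed in 
CsvIO is what the issue above is meant to settle.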

Thank you, again,

Best,

Damon

On 2024/11/26 22:30:05 Facundo Tomatis wrote:
> Hello everyone!
> 
> I've been developing a CSV connector that wraps CsvIO: the read 
> operation outputs PCollection<Row> and the write operation takes 
> PCollection<Row>. I am having trouble setting the encoding of the 
> input and output files. For example, I would like to write a CSV 
> with ISO-8859-1 or windows-1250 encoding, among others, and read 
> files in those encodings as well.
> 
> Reading the source code, I found that a Row's String fields (generated 
> with RowCoder.of(schema)) have a StringUtf8Coder associated with them. 
> Is there a way to replace it with a custom coder while still working 
> with PCollection<Row>?
> 
> Thanks for your time.
> 
> Facu.
> 
> 
