Re: [Question] Change default file encoding in Dataflow runners

2023-06-17 Thread Bruno Volpato via user
Hi Ramana, Interesting -- I see it too when not using Runner v2. Runner v2 shows UTF-8 as expected, but without it, I get ANSI_X3.4-1968 for file.encoding. I'd say it's probably undesired, but we'd need to look further. Curious why it would cause data corruption. Are you relying on Charset.defaul
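One way an ASCII default can corrupt data is sketched below. It assumes the text passes through an encode/decode step that uses the JVM default charset, and that ANSI_X3.4-1968 behaves as US-ASCII (Java registers it as an alias). The class and method names are hypothetical and are not taken from the thread.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetDemo {
    // Encodes and then decodes the text with one charset, mimicking a
    // write/read round trip through bytes.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café", contains one non-ASCII character

        // US-ASCII cannot represent the accented character, so getBytes
        // substitutes the charset's replacement byte and the original is lost.
        System.out.println("US-ASCII: " + roundTrip(text, StandardCharsets.US_ASCII));

        // UTF-8 can represent every character, so the text survives.
        System.out.println("UTF-8:    " + roundTrip(text, StandardCharsets.UTF_8));
    }
}
```

The substitution is silent: no exception is thrown, which may explain why this kind of corruption is only discovered by accident.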

Re: [Question] Change default file encoding in Dataflow runners

2023-06-15 Thread Ramana Venkata
Hi Bruno, I have added a log statement in a DoFn: logger.info(System.getProperty("file.encoding")), and that showed ANSI as the file encoding. There isn't anything in our code that sets ANSI file encoding. I will check with Google Support. On Fri, Jun 16, 2023 at 7:27 AM Bruno Volpato via user w
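The check described above can be expanded into a slightly fuller diagnostic. This is a minimal sketch in plain Java without any Beam dependency; the class and method names are hypothetical. It reports both the file.encoding property and the charset the JVM actually resolved, and it looks up the canonical name behind the ANSI_X3.4-1968 alias.

```java
import java.nio.charset.Charset;

public class EncodingReport {
    // Resolves a charset name (possibly an alias) to its canonical Java name.
    static String canonicalName(String charsetName) {
        return Charset.forName(charsetName).name();
    }

    // Builds a one-line summary that could be passed to a logger in a worker.
    static String report() {
        return "file.encoding=" + System.getProperty("file.encoding")
            + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(report());
        // ANSI_X3.4-1968 is an alias of US-ASCII, not a Windows "ANSI" code page.
        System.out.println(canonicalName("ANSI_X3.4-1968")); // prints "US-ASCII"
    }
}
```

Calling report() from inside a DoFn in place of the single property lookup would show whether the property and the effective default charset agree on the worker.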

Re: [Question] Change default file encoding in Dataflow runners

2023-06-15 Thread Bruno Volpato via user
Hi Ramana, Curious where you got ANSI_X3.4-1968 from -- I don't think there's any trace of this encoding anywhere in Dataflow Workers (as far as I am aware and looked around). The default encoding for JVM is UTF-8, and Dataflow doesn't appear to set it anywhere. I was able to check using: $ docke

[Question] Change default file encoding in Dataflow runners

2023-06-15 Thread Ramana Venkata
Hi, I accidentally discovered that the default file encoding in my Dataflow runners is ANSI_X3.4-1968. We expected it to be UTF-8, and as a result, some of our data has been corrupted. I came across this Stack Overflow answer (link: https://stackoverflow.com/a/362006), but to the best of my knowl
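Independently of how the worker default is configured, one defensive option is to name the charset at every I/O boundary so that file.encoding no longer matters. The sketch below assumes the affected code reads and writes text files; the class and method names are hypothetical. Calls that take no charset argument, such as String.getBytes() or new InputStreamReader(in), fall back to the JVM default and are the ones worth auditing.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

public class ExplicitCharsetIo {
    // Writes lines as UTF-8 regardless of the JVM's file.encoding setting.
    static void writeUtf8(Path path, List<String> lines) throws IOException {
        Files.write(path, lines, StandardCharsets.UTF_8);
    }

    // Reads lines back as UTF-8, again independent of the JVM default.
    static List<String> readUtf8(Path path) throws IOException {
        return Files.readAllLines(path, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("encoding-demo", ".txt");
        try {
            writeUtf8(tmp, List.of("caf\u00e9", "na\u00efve"));
            System.out.println(readUtf8(tmp));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```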