Hi Ramana,
Interesting -- I see it too when not using Runner v2.
Runner v2 shows UTF-8 as expected, but without it, I get ANSI_X3.4-1968 for
file.encoding.
I'd say it's probably undesired, but we'd need to look further.
Curious why it would cause data corruption. Are you relying
on Charset.defaul
Hi Bruno,
I added a log statement in a DoFn:
logger.info(System.getProperty("file.encoding"));
It showed ANSI as the file encoding. Nothing in our code
sets an ANSI file encoding. I will check with Google Support.
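A slightly fuller version of that check might look like the sketch below. It is plain Java with no Beam dependency; the class name EncodingCheck and its methods are hypothetical, and the same calls could be placed inside a DoFn. It also resolves the reported name through Charset.forName, since ANSI_X3.4-1968 is an alias that the JDK maps to US-ASCII.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Beam or Dataflow.
class EncodingCheck {

    // Reports the system property alongside the charset the JVM uses by default.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    // True when the given charset name (or alias) resolves to US-ASCII.
    static boolean isAscii(String charsetName) {
        return Charset.forName(charsetName).equals(StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println("ANSI_X3.4-1968 is US-ASCII: " + isAscii("ANSI_X3.4-1968"));
    }
}
```

Logging Charset.defaultCharset() next to the property is useful because the property alone does not show which charset the JVM resolved it to.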
On Fri, Jun 16, 2023 at 7:27 AM Bruno Volpato via user
w
Hi Ramana,
Curious where you got ANSI_X3.4-1968 from -- I don't think there's any
trace of this encoding anywhere in Dataflow workers (as far as I am aware,
having looked around).
The default encoding for JVM is UTF-8, and Dataflow doesn't appear to set
it anywhere. I was able to check using:
$ docke
Hi,
I accidentally discovered that the default file encoding in my Dataflow
runners is ANSI_X3.4-1968. We expected it to be UTF-8, and as a result,
some of our data has been corrupted.
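To illustrate how a US-ASCII default can lead to this kind of corruption: ANSI_X3.4-1968 is another name for US-ASCII, and when Java encodes a string with a charset that cannot represent a character, String.getBytes substitutes a "?" byte. The sketch below is only an illustration, not the pipeline code; the class name and sample string are hypothetical. It shows the loss, and that passing StandardCharsets.UTF_8 explicitly avoids depending on the JVM default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class for illustration only.
class CharsetRoundTrip {

    // Encodes the text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset);
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "caf" followed by U+00E9, written as an escape so the source stays ASCII.
        String sample = "caf\u00e9";

        String viaAscii = roundTrip(sample, StandardCharsets.US_ASCII);
        String viaUtf8 = roundTrip(sample, StandardCharsets.UTF_8);

        // The non-ASCII character is replaced during encoding, so the text changes.
        System.out.println("US-ASCII preserves text: " + sample.equals(viaAscii));
        // UTF-8 can represent every Unicode character, so the text survives.
        System.out.println("UTF-8 preserves text: " + sample.equals(viaUtf8));
    }
}
```

Any API that falls back to the default charset, such as String.getBytes() with no argument, behaves like the US-ASCII case when the worker's default is ANSI_X3.4-1968, which is why passing the charset explicitly is the safer choice.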
I came across this Stack Overflow answer (link:
https://stackoverflow.com/a/362006), but to the best of my knowl