Makes sense.
I can look into expanding on what we have at the following location and adding
links to some of the existing work as a first step.
https://beam.apache.org/roadmap/connectors-multi-sdk/
Created https://issues.apache.org/jira/browse/BEAM-8553
We also need more detailed documentation for c
The only combination that I can think of is to use this hack[1] combined
with a JvmInitializer[2].
1: https://stackoverflow.com/a/14987992/4368200
2: https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/harness/JvmInitializer.java
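For reference, a minimal sketch of what that combination might look like. The
class name and the @AutoService registration are my own additions (it assumes
the auto-service dependency is on the classpath), and the reflection trick is
the one from [1], which may be rejected by the module system on newer JVMs:

import com.google.auto.service.AutoService;
import java.lang.reflect.Field;
import java.nio.charset.Charset;
import org.apache.beam.sdk.harness.JvmInitializer;

/** Hypothetical initializer that forces the worker JVM's default charset to UTF-8. */
@AutoService(JvmInitializer.class)
public class Utf8JvmInitializer implements JvmInitializer {
  @Override
  public void onStartup() {
    try {
      // Set the property first so later readers of file.encoding see UTF-8.
      System.setProperty("file.encoding", "UTF-8");
      // Reflection hack from [1]: clear the cached value so Charset.defaultCharset()
      // re-resolves it from file.encoding. Newer JVMs may block this.
      Field defaultCharset = Charset.class.getDeclaredField("defaultCharset");
      defaultCharset.setAccessible(true);
      defaultCharset.set(null, null);
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException("Could not reset the cached default charset", e);
    }
  }
}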
Thanks, Eddie.
Just to add to the discussion, I logged the following information:
Charset.defaultCharset(): US-ASCII
System.getProperty("file.encoding"): ANSI_X3.4-1968
OutputStreamWriter writer = new OutputStreamWriter(new ByteArrayOutputStream());
writer.getEncoding(): ASCII
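For anyone who wants to check the same thing on their workers, a standalone
sketch along these lines prints those diagnostics (the class name is mine; on
Dataflow you would log these from a DoFn or a JvmInitializer rather than main):

import java.io.ByteArrayOutputStream;
import java.io.OutputStreamWriter;
import java.nio.charset.Charset;

/** Prints the charset settings the JVM resolved at startup. */
public class CharsetDiagnostics {
  public static void main(String[] args) {
    System.out.println("Charset.defaultCharset(): " + Charset.defaultCharset());
    System.out.println("file.encoding: " + System.getProperty("file.encoding"));
    // OutputStreamWriter falls back to the platform default when no charset is given.
    OutputStreamWriter writer = new OutputStreamWriter(new ByteArrayOutputStream());
    System.out.println("writer.getEncoding(): " + writer.getEncoding());
  }
}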
In our case, a
Adding to what Jeff just pointed out previously, I'm dealing with the same issue
writing Parquet files using the ParquetIO module in Dataflow, and the same thing
happens even when forcing all String objects to UTF-8. Maybe it is related to
behind-the-scenes decoding/encoding within the previously mentio
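To be explicit about what I mean by "forcing" UTF-8, the pattern is roughly the
following (a hypothetical illustration, not the actual pipeline code): decode
and encode with an explicit charset instead of the platform default, which on
the workers above resolves to US-ASCII.

import java.nio.charset.StandardCharsets;

public class ForceUtf8Example {
  public static void main(String[] args) {
    byte[] raw = {(byte) 0xC3, (byte) 0xA9}; // UTF-8 bytes for 'é'
    // new String(raw) or value.getBytes() would silently use the ASCII default.
    String value = new String(raw, StandardCharsets.UTF_8);
    byte[] encoded = value.getBytes(StandardCharsets.UTF_8);
    System.out.println(value + " -> " + encoded.length + " bytes");
  }
}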