[ https://issues.apache.org/jira/browse/BEAM-10785?focusedWorklogId=776666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776666 ]
ASF GitHub Bot logged work on BEAM-10785:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 01/Jun/22 05:19
            Start Date: 01/Jun/22 05:19
    Worklog Time Spent: 10m
      Work Description:
pabloem commented on PR #17518:
URL: https://github.com/apache/beam/pull/17518#issuecomment-1143128237

   My apologies for the delay in reviewing this, but it is still not clear to
   me why we need to add this feature to WriteToBQ.

   I see in https://issues.apache.org/jira/browse/BEAM-10785 that non-ASCII
   characters are replaced when formatting in JSON. Is that correct? Does this
   cause a problem when inserting into BigQuery? Can you explain the problem
   in more detail?

   Generally, I'd prefer that we not define a new parameter, but rather fix
   the existing coder if it has any issues. Can you please share the use case
   that this would address?

Issue Time Tracking
-------------------

    Worklog Id:     (was: 776666)
    Time Spent: 1h 40m  (was: 1.5h)

> Support for coder argument in WriteToBigQuery
> ---------------------------------------------
>
>                 Key: BEAM-10785
>                 URL: https://issues.apache.org/jira/browse/BEAM-10785
>             Project: Beam
>          Issue Type: Bug
>          Components: io-py-gcp
>            Reporter: Nakamura Yu
>            Assignee: Seunghwan Hong
>            Priority: P1
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> When using WriteToBigQuery to transfer data to BigQuery, non-ASCII characters
> are replaced with replacement characters.
> This is caused by RowAsDictJsonCoder being set as the coder for the
> BigQueryBatchFileLoads called inside WriteToBigQuery.
> I want to add a coder argument to WriteToBigQuery so that I can set a
> coder other than RowAsDictJsonCoder.
> If there are no objections, I will create a pull request next weekend.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)