[ https://issues.apache.org/jira/browse/BEAM-13184?focusedWorklogId=712116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-712116 ]
ASF GitHub Bot logged work on BEAM-13184:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jan/22 14:48
            Start Date: 20/Jan/22 14:48
    Worklog Time Spent: 10m
      Work Description: steveniemitz edited a comment on pull request #15863:
URL: https://github.com/apache/beam/pull/15863#issuecomment-1017579519

   Random musings from me, because we've tried to do something like this as well with our own SQL-ish IO.

   If you introduce an (implicit) reshuffle between the producer of the rows being written and the writer, you may break an implicit contract that users have been relying on: that mutations are applied to the JDBC destination in the order they were produced.

   For example, if a GroupByKey (GBK) is triggering every 10 seconds and the next transform is a JdbcIO write, by default the GBK trigger will fuse with the JdbcIO writer and the write will run inline, so all trigger firings are applied in order. If you apply batching (with autosharding or not), multiple mutations for the same row may land in different batches, which are then applied in a non-deterministic order. Depending on that order, an older firing can overwrite a newer one.
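
   The ordering hazard described above can be illustrated with a small simulation. This is only a sketch in plain Python, not Beam or JdbcIO code: the in-memory dictionary stands in for the JDBC table, and the names `apply_batches`, `older`, and `newer` are hypothetical. It assumes the write behaves like a plain last-write-wins upsert.

```python
import itertools


def apply_batches(batches):
    """Apply upsert batches to an in-memory 'table' in the given order."""
    table = {}
    for batch in batches:
        for key, value in batch:
            # Last write wins, as with a plain upsert (assumption).
            table[key] = value
    return table


# Two trigger firings for the same row: an older pane and a newer one.
older = [("user-1", "count=5")]
newer = [("user-1", "count=9")]

# Fused, inline writes apply the firings in the order they were produced.
in_order = apply_batches([older, newer])

# After a reshuffle the two batches may reach the destination in either
# order, so collect the final value of the row for every possible ordering.
outcomes = {
    apply_batches(list(order))["user-1"]
    for order in itertools.permutations([older, newer])
}
```

   With in-order application the row always ends at the newer value. Once the batches can be reordered, `outcomes` contains both values, which is the "older firing overwrites newer one" case.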

Issue Time Tracking
-------------------

    Worklog Id:     (was: 712116)
    Time Spent: 8h 50m  (was: 8h 40m)

> Support autosharding for JdbcIO writers
> ---------------------------------------
>
>                 Key: BEAM-13184
>                 URL: https://issues.apache.org/jira/browse/BEAM-13184
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-jdbc
>            Reporter: Pablo Estrada
>            Assignee: Pablo Estrada
>            Priority: P2
>             Fix For: 2.37.0
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This should improve efficiency for Jdbc writers on streaming pipelines