[ 
https://issues.apache.org/jira/browse/BEAM-13184?focusedWorklogId=712116&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-712116
 ]

ASF GitHub Bot logged work on BEAM-13184:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 20/Jan/22 14:48
            Start Date: 20/Jan/22 14:48
    Worklog Time Spent: 10m 
      Work Description: steveniemitz edited a comment on pull request #15863:
URL: https://github.com/apache/beam/pull/15863#issuecomment-1017579519


   Some random musings from me, since we've tried to do something like this 
with our own SQL-ish IO.
   
   If you introduce an (implicit) reshuffle between the producer of the rows 
being written and the writer, you may break an implicit contract that users 
have been relying on: that the mutations produced are applied in order to the 
JDBC destination.
   
   For example, if a GBK is triggering every 10 seconds and the next transform 
is a JdbcIO write, by default the GBK will fuse with the JdbcIO writer and 
apply "inline", so all trigger firings will be applied in order.  If you apply 
batching (with autosharding or not), multiple mutations for the same row may 
end up in different batches, which will then be applied in a 
non-deterministic order.  This can cause older firings to overwrite newer ones, 
depending on the order in which the batches are applied.
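   
   To make the concern concrete, here is a minimal sketch in plain Python. It 
is not Beam or JdbcIO code: the dict standing in for the table, the 
`apply_batches` helper, and the sample firings are all hypothetical, and it 
assumes last-write-wins (upsert) semantics at the destination.

```python
from itertools import permutations


def apply_batches(batches):
    """Apply batches of (key, value, firing) upserts to a dict standing in for a table."""
    table = {}
    for batch in batches:
        for key, value, _firing in batch:
            table[key] = value  # last write wins, like an upsert
    return table


# Three trigger firings for the same row, in the order they were produced.
firings = [("row-1", 10, 1), ("row-1", 20, 2), ("row-1", 30, 3)]

# Fused ("inline") writer: firings are applied in the order produced.
inline_result = apply_batches([firings])

# Batched writer: each firing lands in its own batch, and the batches
# may be applied in any order.
batched = [[f] for f in firings]
outcomes = {apply_batches(order)["row-1"] for order in permutations(batched)}

print(inline_result)  # the newest firing always wins
print(outcomes)       # any firing can end up as the final value
```

   In the inline case the final value is always the one from the newest 
firing; once the firings are split across independently applied batches, any 
of the three values can be the one left in the table.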



Issue Time Tracking
-------------------

    Worklog Id:     (was: 712116)
    Time Spent: 8h 50m  (was: 8h 40m)

> Support autosharding for JdbcIO writers
> ---------------------------------------
>
>                 Key: BEAM-13184
>                 URL: https://issues.apache.org/jira/browse/BEAM-13184
>             Project: Beam
>          Issue Type: Improvement
>          Components: io-java-jdbc
>            Reporter: Pablo Estrada
>            Assignee: Pablo Estrada
>            Priority: P2
>             Fix For: 2.37.0
>
>          Time Spent: 8h 50m
>  Remaining Estimate: 0h
>
> This should improve efficiency for Jdbc writers on streaming pipelines


