[ 
https://issues.apache.org/jira/browse/FLINK-37604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yanquan Lv reassigned FLINK-37604:
----------------------------------

    Assignee: Sergei Morozov

> Flink CDC pipeline composer doesn't assign UIDs to operators
> ------------------------------------------------------------
>
>                 Key: FLINK-37604
>                 URL: https://issues.apache.org/jira/browse/FLINK-37604
>             Project: Flink
>          Issue Type: Improvement
>          Components: Flink CDC
>    Affects Versions: cdc-3.2.0
>            Reporter: Sergei Morozov
>            Assignee: Sergei Morozov
>            Priority: Major
>              Labels: pull-request-available
>
> From the 
> [documentation|https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/savepoints/#assigning-operator-ids]:
> {quote}It is *highly recommended* that you specify operator IDs via the 
> *{{uid(String)}}* method.
> {quote}
> The pipeline composer currently doesn't specify the UIDs.
> Steps to reproduce:
>  # Enable debug logging on 
> {{{}org.apache.flink.streaming.api.graph.StreamGraphHasherV2{}}}.
>  # Start a CDC pipeline
>  # Observe messages like the following  in the log
> {noformat}
> Generated hash 'cbc357ccb763df2852fee8c4fc7d55f2' for node 'Source: Value 
> Source-1' {id: 1, parallelism: 1, user function: }
> Generated hash '78be0dd8677bc2711e2a56947a5ea048' for node 'PrePartition-3' 
> {id: 3, parallelism: 1, user function: }
> Generated hash '0deb1b26a3d9eb3c8f0c11f7110b2903' for node 'PostPartition-5' 
> {id: 5, parallelism: 1, user function: 
> org.apache.flink.cdc.runtime.partitioning.PostPartitionProcessor}
> Generated hash 'c63277377682ee6b89910f3fdc3b3a1e' for node 'Sink Writer: 
> Sink-6' {id: 6, parallelism: 1, user function: }
> Generated hash '25f5e74a29ec629de3d4a0e9f84185e9' for node 'Sink Committer: 
> Sink-9' {id: 9, parallelism: 1, user function: }
> {noformat}
> It means that Flink generates these UIDs based on the job graph (the input 
> and output nodes of a given node). If the job graph changes (by adding more 
> operators), these hashes will be regenerated, and Flink will be unable to 
> restore the state of the affected operators.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to