zhangyang created FLINK-29167:
---------------------------------

             Summary:  Time out-of-order optimization for merging multiple data 
streams into one data stream
                 Key: FLINK-29167
                 URL: https://issues.apache.org/jira/browse/FLINK-29167
             Project: Flink
          Issue Type: Improvement
          Components: API / DataStream
    Affects Versions: 1.14.2
            Reporter: zhangyang
             Fix For: 1.14.2


Problem Description: 

     I have many scenarios in which more than two data streams (DataStreams) 
must be combined into a single stream. The business logic that processes the 
combined stream depends on events arriving in event-time order. I currently 
merge the streams with Flink's union operator, but the merged stream does not 
preserve the original event-time order of the events.
{code:java}
dataStream0 = dataStream0.union(dataStreamArray);
{code}
Design suggestion: 

    The union implementation could merge the input streams in the order in 
which they appear in dataStreamArray, rather than in an arbitrary order.
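
    One possible reading of the suggestion above is a merge that emits events 
by event time and falls back to the position in the input array when 
timestamps are equal. The following is only a plain-Java sketch of that idea 
and not Flink code: OrderedMergeSketch, Event and mergeByEventTime are 
hypothetical names, and the sketch assumes every input is finite and already 
sorted by timestamp, which a real unbounded stream does not guarantee.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Plain-Java illustration only. None of these names are Flink API.
final class OrderedMergeSketch {

    /** Hypothetical event: a payload tagged with an event-time timestamp. */
    static final class Event {
        final long timestamp;
        final String payload;

        Event(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    /** The next unread event of one input, plus the remainder of that input. */
    private static final class Head {
        final Event event;
        final int inputIndex;
        final Iterator<Event> rest;

        Head(Event event, int inputIndex, Iterator<Event> rest) {
            this.event = event;
            this.inputIndex = inputIndex;
            this.rest = rest;
        }
    }

    /**
     * K-way merge of inputs that are each already sorted by timestamp.
     * Ties are broken by the position of the input in the list.
     */
    static List<Event> mergeByEventTime(List<List<Event>> inputs) {
        PriorityQueue<Head> heads = new PriorityQueue<>((a, b) -> {
            int byTime = Long.compare(a.event.timestamp, b.event.timestamp);
            return byTime != 0 ? byTime : Integer.compare(a.inputIndex, b.inputIndex);
        });

        // Seed the queue with the first event of every non-empty input.
        for (int i = 0; i < inputs.size(); i++) {
            Iterator<Event> it = inputs.get(i).iterator();
            if (it.hasNext()) {
                heads.add(new Head(it.next(), i, it));
            }
        }

        // Repeatedly emit the earliest head and refill from the same input.
        List<Event> merged = new ArrayList<>();
        while (!heads.isEmpty()) {
            Head head = heads.poll();
            merged.add(head.event);
            if (head.rest.hasNext()) {
                heads.add(new Head(head.rest.next(), head.inputIndex, head.rest));
            }
        }
        return merged;
    }
}
```

    The merge can only emit an event once it has seen the head of every 
input, so in a streaming setting it would have to wait for the slowest input. 
That is the main cost of an ordered union compared with an unordered one.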

 

Solution suggestion: 

   At present, I use windowAll to sort the merged data by event time, which 
makes the overall scenario work. However, windowAll can only run with a 
parallelism of 1, which hurts the performance of the entire directed acyclic 
graph. I also have two scenarios that need a merge followed by a sort, and I 
have not found a good workaround for them. My only suggestion is therefore 
that union itself preserve order, which would save a lot of unnecessary 
trouble when merging event-time streams.
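
   For reference, the sorting step described above can be modeled as a buffer 
that holds events until a watermark passes them and then releases them in 
timestamp order. This is a minimal plain-Java sketch and not Flink code: 
WatermarkSortBufferSketch, Event, add and advanceWatermark are hypothetical 
names. If ordering were only needed per key (an assumption, the report does 
not say so), one such buffer per key could in principle run in parallel 
instead of in a single windowAll task.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Plain-Java illustration only. None of these names are Flink API.
final class WatermarkSortBufferSketch {

    /** Hypothetical event: a payload tagged with an event-time timestamp. */
    static final class Event {
        final long timestamp;
        final String payload;

        Event(long timestamp, String payload) {
            this.timestamp = timestamp;
            this.payload = payload;
        }
    }

    // Min-heap on timestamp. Order among equal timestamps is not guaranteed.
    private final PriorityQueue<Event> buffer =
        new PriorityQueue<>(Comparator.comparingLong((Event e) -> e.timestamp));

    /** Buffers an event that arrived in arbitrary order. */
    void add(Event event) {
        buffer.add(event);
    }

    /**
     * Releases, in timestamp order, every buffered event whose timestamp is
     * at or below the watermark. Output across calls stays ordered only if
     * no event is added later with a timestamp at or below a watermark that
     * has already been passed (that is, no late events).
     */
    List<Event> advanceWatermark(long watermark) {
        List<Event> ready = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().timestamp <= watermark) {
            ready.add(buffer.poll());
        }
        return ready;
    }
}
```

   The buffer trades latency for order: nothing is emitted until the 
watermark advances, so the delay is bounded by how far the watermark lags 
behind the newest event.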

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
