becketqin commented on a change in pull request #6594: [FLINK-9311] [pubsub] Added PubSub source connector with support for checkpointing (ATLEAST_ONCE)
URL: https://github.com/apache/flink/pull/6594#discussion_r269677433
 
 

 ##########
 File path: flink-streaming-java/src/main/java/org/apache/flink/streaming/api/functions/source/SerializedCheckpointData.java
 ##########
 @@ -172,9 +178,44 @@ public int getNumIds() {
                                ids.add(serializer.deserialize(deser));
                        }
 
-                       deque.addLast(new Tuple2<Long, Set<T>>(checkpoint.checkpointId, ids));
+                       map.put(checkpoint.checkpointId, ids);
                }
 
+               return map;
+       }
+
+       /**
+        * Combines multiple ArrayDeques with checkpoint data by checkpointId.
+        * This could happen when a job rescales to a lower parallelism and states of multiple tasks are combined.
+        *
+        * @param data The data to be combined.
+        * @param <T> The type of the elements.
+        * @return An ArrayDeque of combined element checkpoints.
+        */
+       public static <T> ArrayDeque<Tuple2<Long, Set<T>>> combine(List<Map<Long, Set<T>>> data) {
+               Map<Long, Set<T>> accumulator = new TreeMap<>();
+               for (Map<Long, Set<T>> element : data) {
+                       accumulator = combine(accumulator, element);
+               }
+
+               //Convert map to deque sorted by checkpointId
+               ArrayDeque<Tuple2<Long, Set<T>>> deque = new ArrayDeque<>(accumulator.size());
+               accumulator.entrySet()
+                               .stream()
+                               .sequential()
+                               .sorted(Comparator.comparing(Map.Entry::getKey))
 
 Review comment:
   The `TreeMap` is already sorted by key. Is this sort necessary?
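
   To illustrate the point, here is a minimal standalone sketch, not the PR's code. It assumes the accumulator is still a `TreeMap` when it is converted (in the hunk above it is declared as `Map` and reassigned by `combine(accumulator, element)`, so that only holds if that overload returns a sorted map). The class name `TreeMapOrderSketch` and the helper `toDeque` are hypothetical, and `Map.Entry` stands in for Flink's `Tuple2` so the snippet has no dependencies.

```java
import java.util.AbstractMap;
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class TreeMapOrderSketch {

    // Hypothetical helper: copies the entries into a deque in iteration order.
    // TreeMap.entrySet() iterates in ascending key order, so no explicit sort is applied.
    public static <T> ArrayDeque<Map.Entry<Long, Set<T>>> toDeque(TreeMap<Long, Set<T>> accumulator) {
        ArrayDeque<Map.Entry<Long, Set<T>>> deque = new ArrayDeque<>(accumulator.size());
        for (Map.Entry<Long, Set<T>> entry : accumulator.entrySet()) {
            // Copy into an immutable entry so the deque does not hold live map nodes.
            deque.addLast(new AbstractMap.SimpleImmutableEntry<>(entry.getKey(), entry.getValue()));
        }
        return deque;
    }

    public static void main(String[] args) {
        TreeMap<Long, Set<String>> accumulator = new TreeMap<>();
        // Insert checkpoint ids out of order on purpose.
        accumulator.put(3L, new HashSet<>(Arrays.asList("c")));
        accumulator.put(1L, new HashSet<>(Arrays.asList("a")));
        accumulator.put(2L, new HashSet<>(Arrays.asList("b")));

        for (Map.Entry<Long, Set<String>> entry : toDeque(accumulator)) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
        }
    }
}
```

   If the accumulator could end up as a plain `HashMap` after the pairwise `combine`, the explicit sort would still be needed, so the cleaner fix may be to type the accumulator as `TreeMap` (or `SortedMap`) and drop the `sorted(...)` call.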
