syhily commented on code in PR #20725: URL: https://github.com/apache/flink/pull/20725#discussion_r960280066
##########
flink-connectors/flink-connector-pulsar/src/main/java/org/apache/flink/connector/pulsar/source/enumerator/PulsarSourceEnumStateSerializer.java:
##########

@@ -54,57 +55,37 @@ private PulsarSourceEnumStateSerializer() {
 
     @Override
     public int getVersion() {
-        // We use PulsarPartitionSplitSerializer's version because we use reuse this class.
-        return PulsarPartitionSplitSerializer.CURRENT_VERSION;
+        return CURRENT_VERSION;
     }
 
     @Override
     public byte[] serialize(PulsarSourceEnumState obj) throws IOException {
-        // VERSION 0 serialization
         try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                 DataOutputStream out = new DataOutputStream(baos)) {
             serializeSet(
                     out, obj.getAppendedPartitions(), SPLIT_SERIALIZER::serializeTopicPartition);
-            serializeSet(
-                    out,
-                    obj.getPendingPartitionSplits(),
-                    SPLIT_SERIALIZER::serializePulsarPartitionSplit);

Review Comment:
   This is almost the same logic as in Kafka. Serializing the assigned-splits state is useless because it would change after scaling. The `appendedPartition` set in the checkpoint assumes that all operations which happened before the snapshot have completed successfully, which means no pending splits exist at checkpoint time.

   See: https://github.com/apache/flink/blob/8b8245ba46b25c2617d91cff3d3a44b99879d9f2/flink-core/src/main/java/org/apache/flink/api/connector/source/SplitEnumerator.java#L73
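To illustrate the point of the review, the enumerator checkpoint only needs the set of appended partitions, since pending splits are guaranteed to be flushed before a snapshot completes. Below is a minimal, self-contained sketch of that idea; the class and method names are hypothetical, and plain `String` partition names stand in for Flink's `TopicPartition` type:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch of a versioned enumerator-state serializer that, like
// the PR under review, persists only the appended partitions.
public class EnumStateSerializerSketch {
    static final int CURRENT_VERSION = 1;

    static byte[] serialize(Set<String> appendedPartitions) throws IOException {
        try (ByteArrayOutputStream baos = new ByteArrayOutputStream();
                DataOutputStream out = new DataOutputStream(baos)) {
            // Only the appended partitions are written. Pending splits are
            // intentionally dropped: a snapshot assumes all split assignments
            // issued before it have already completed.
            out.writeInt(appendedPartitions.size());
            for (String partition : appendedPartitions) {
                out.writeUTF(partition);
            }
            out.flush();
            return baos.toByteArray();
        }
    }

    static Set<String> deserialize(byte[] bytes) throws IOException {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            int size = in.readInt();
            Set<String> partitions = new TreeSet<>();
            for (int i = 0; i < size; i++) {
                partitions.add(in.readUTF());
            }
            return partitions;
        }
    }

    public static void main(String[] args) throws IOException {
        Set<String> state = new TreeSet<>(Set.of("persistent://tenant/ns/topic-0",
                "persistent://tenant/ns/topic-1"));
        byte[] bytes = serialize(state);
        System.out.println(deserialize(bytes).equals(state)); // prints "true"
    }
}
```

Because the on-disk layout is now a single length-prefixed set, a version bump (as in the diff's `getVersion()` change) is what lets a restored job distinguish this format from the older one that also carried pending splits.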