fredia commented on PR #24079: URL: https://github.com/apache/flink/pull/24079#issuecomment-1893527794
> I've been able to pin down the issue to https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorStateRestoreOperation.java#L190
>
> the problem is that the sorting breaks on empty state ... we should always restore in the order state was written
>
> fix in: [8d02807](https://github.com/apache/flink/commit/8d02807e3b8b12c010cde64e2518fe49044eda14)
>
> cc @fredia @ruibinx since you've worked on #23938

@dmvk Thanks for the investigation and the fix. Sorry for overlooking the `length=0` case (https://github.com/apache/flink/pull/23938#discussion_r1432321260) at the time.

> It seems that we should not really have snappy headers per-state, but we should write them just once at the beginning ...

I have an inelegant solution: record `startPos` before deserializing the states of each handle, then seek back to `startPos` every time a `CompressibleFSDataInputStream` is constructed. That way the order of the offsets no longer matters, and compatibility needs no special handling.

```java
void restore() {
    for (OperatorStateHandle stateHandle : stateHandles) {
        ...
        // Remember where this handle's data begins.
        long startPos = in.getPos();
        for (String stateName : toRestore) {
            // Rewind before wrapping, so each state is read from the same origin.
            in.seek(startPos);
            try (final CompressibleFSDataInputStream compressedIn =
                    new CompressibleFSDataInputStream(in, compressionDecorator)) {
                // deserialize
            }
        }
        ...
    }
}
```
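To make the idea easier to test in isolation, here is a minimal, self-contained sketch of the same pattern using only the JDK. It is not Flink code: `SeekableInput`, `FramedReader`, the `MAGIC` header and the length-prefixed record layout are all hypothetical stand-ins, assumed only to mimic a stream whose decoder has to start at a header. The point is that seeking back to `startPos` before building each reader lets the records be restored in any order, including an empty one.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class SeekBeforeRestoreSketch {

    static final int MAGIC = 0xCAFE; // hypothetical stream header

    /** Hypothetical seekable input over a byte array (not Flink's FSDataInputStream). */
    static final class SeekableInput {
        private final byte[] data;
        private int pos;

        SeekableInput(byte[] data) { this.data = data; }

        long getPos() { return pos; }

        void seek(long newPos) { pos = (int) newPos; }

        int readInt() {
            int value = ByteBuffer.wrap(data, pos, 4).getInt();
            pos += 4;
            return value;
        }

        String readString(int len) {
            String s = new String(data, pos, len, StandardCharsets.UTF_8);
            pos += len;
            return s;
        }
    }

    /** Hypothetical decoder that is only valid when opened at the header. */
    static final class FramedReader {
        private final SeekableInput in;

        FramedReader(SeekableInput in) {
            this.in = in;
            if (in.readInt() != MAGIC) {
                throw new IllegalStateException("reader was not opened at the header");
            }
        }

        /** Skips the records before {@code index}, then returns that record's payload. */
        String readRecord(int index) {
            for (int i = 0; i < index; i++) {
                in.readString(in.readInt());
            }
            return in.readString(in.readInt());
        }
    }

    /** Writes the header once, followed by length-prefixed records. */
    public static byte[] write(String... records) {
        int size = 4;
        for (String r : records) {
            size += 4 + r.getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        buf.putInt(MAGIC);
        for (String r : records) {
            byte[] bytes = r.getBytes(StandardCharsets.UTF_8);
            buf.putInt(bytes.length).put(bytes);
        }
        return buf.array();
    }

    /** Restores records in the requested order, rewinding to startPos each time. */
    public static List<String> restore(byte[] bytes, int[] order) {
        SeekableInput in = new SeekableInput(bytes);
        long startPos = in.getPos();
        List<String> restored = new ArrayList<>();
        for (int index : order) {
            in.seek(startPos); // without this, the next reader would not start at the header
            FramedReader reader = new FramedReader(in);
            restored.add(reader.readRecord(index));
        }
        return restored;
    }

    public static void main(String[] args) {
        byte[] bytes = write("a", "", "ccc");
        System.out.println(restore(bytes, new int[] {2, 1, 0}));
    }
}
```

In this toy version, removing the `in.seek(startPos)` line makes the second `FramedReader` fail its header check, which is roughly the kind of order dependence the rewind is meant to remove.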