fredia commented on PR #24079:
URL: https://github.com/apache/flink/pull/24079#issuecomment-1893527794

   > I've been able to pin down the issue to https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/state/OperatorStateRestoreOperation.java#L190
   > 
   > the problem is that the sorting breaks on empty state ... we should always restore in the order state was written
   > 
   > fix in: [8d02807](https://github.com/apache/flink/commit/8d02807e3b8b12c010cde64e2518fe49044eda14)
   > 
   > cc @fredia @ruibinx since you've worked on #23938
   
   @dmvk Thanks for the investigation and the fix. Sorry for overlooking the `length=0` case (https://github.com/apache/flink/pull/23938#discussion_r1432321260) at that time.
   
   
   > It seems that we should not really have snappy headers per-state, but we should write them just once at the beginning ...
   
   
   I have an inelegant solution: before deserializing the states of a handle, record `startPos`, and then seek back to `startPos` each time a `CompressibleFSDataInputStream` is constructed. That way the order of the offsets no longer matters, and compatibility does not need any special handling.
   
   ```java
   void restore() {
       for (OperatorStateHandle stateHandle : stateHandles) {
           ...
           // Remember where this handle's data begins.
           long startPos = in.getPos();
           for (String stateName : toRestore) {
               // Rewind before wrapping, so the wrapper always starts at startPos.
               in.seek(startPos);
               try (final CompressibleFSDataInputStream compressedIn =
                       new CompressibleFSDataInputStream(in, compressionDecorator)) {
                   // deserialize
               }
           }
           ...
       }
   }
   ```
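
To make the idea concrete, here is a toy model of it. It does not use any Flink classes: `SeekableBytes`, `HeaderCheckingInput`, the 4-byte `HDR!` header, and the sample offsets are all hypothetical stand-ins. It also assumes that the wrapper consumes a header at the current stream position when it is constructed, which is my reading of the problem rather than something verified against the Snappy decorator. Under that assumption, rewinding to `startPos` before each wrap lets the states be restored in any offset order, including a zero-length one.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

public class SeekBeforeWrapSketch {

    // Hypothetical 4-byte header standing in for a compression stream header.
    static final byte[] HEADER = {'H', 'D', 'R', '!'};

    /** Minimal seekable input over a byte array (stand-in for a seekable file stream). */
    static final class SeekableBytes {
        private final byte[] data;
        private int pos;

        SeekableBytes(byte[] data) { this.data = data; }

        long getPos() { return pos; }

        void seek(long newPos) { pos = (int) newPos; }

        byte[] read(int n) {
            if (pos + n > data.length) {
                throw new IllegalStateException("read past end of data");
            }
            byte[] out = Arrays.copyOfRange(data, pos, pos + n);
            pos += n;
            return out;
        }
    }

    /** Wrapper that consumes and validates the header at the current position on construction. */
    static final class HeaderCheckingInput {
        private final SeekableBytes in;

        HeaderCheckingInput(SeekableBytes in) {
            if (!Arrays.equals(in.read(HEADER.length), HEADER)) {
                throw new IllegalStateException("no header at wrap position");
            }
            this.in = in;
        }

        String readAt(long offset, int length) {
            in.seek(offset);
            return new String(in.read(length), StandardCharsets.UTF_8);
        }
    }

    /** Header, then "aaa" at offset 4, an empty state at offset 7, and "bb" at offset 7. */
    static byte[] sampleData() {
        return "HDR!aaabb".getBytes(StandardCharsets.UTF_8);
    }

    /** State name -> {offset, length}, deliberately not sorted by offset. */
    static Map<String, long[]> sampleEntries() {
        Map<String, long[]> entries = new LinkedHashMap<>();
        entries.put("b", new long[] {7, 2});
        entries.put("empty", new long[] {7, 0});
        entries.put("a", new long[] {4, 3});
        return entries;
    }

    static Map<String, String> restoreAll(
            byte[] data, Map<String, long[]> entries, boolean seekToStart) {
        SeekableBytes in = new SeekableBytes(data);
        long startPos = in.getPos();
        Map<String, String> restored = new LinkedHashMap<>();
        for (Map.Entry<String, long[]> e : entries.entrySet()) {
            if (seekToStart) {
                in.seek(startPos); // rewind so the wrapper sees the header again
            }
            HeaderCheckingInput wrapped = new HeaderCheckingInput(in);
            restored.put(e.getKey(), wrapped.readAt(e.getValue()[0], (int) e.getValue()[1]));
        }
        return restored;
    }

    public static void main(String[] args) {
        System.out.println(restoreAll(sampleData(), sampleEntries(), true));
        try {
            restoreAll(sampleData(), sampleEntries(), false);
        } catch (IllegalStateException ex) {
            System.out.println("without seek: " + ex.getMessage());
        }
    }
}
```

In this model the run with `seekToStart = true` restores all three entries, while the run without the seek fails on the second wrap, because the stream is no longer positioned at the header.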
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
