[ 
https://issues.apache.org/jira/browse/FLINK-8715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16451953#comment-16451953
 ] 

ASF GitHub Bot commented on FLINK-8715:
---------------------------------------

Github user StefanRRichter commented on a diff in the pull request:

    https://github.com/apache/flink/pull/5885#discussion_r183996477
  
    --- Diff: flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java ---
    @@ -1125,59 +1125,62 @@ private void restoreKeyGroupsShardWithTemporaryHelperInstance(
         * that we checkpointed, i.e. is already in the map of column families.
         */
        @SuppressWarnings("rawtypes, unchecked")
    -   protected <N, S> ColumnFamilyHandle getColumnFamily(
    +   protected <N, S> Tuple2<ColumnFamilyHandle, RegisteredKeyedBackendStateMetaInfo<N, S>> getColumnFamilyAndStateSerializer(
    --- End diff --
    
    I think this method has grown far too complex over time. The `Tuple2` return type makes it increasingly clear that this code mixes two different concerns and could be untangled a bit. I suggest splitting it into:
    1) Check whether this is a new state (does the map contain the name string?). This is essentially an inlined check in the current calling code.
    2) If yes, do the serializer checks and configuration magic and create the `RegisteredKeyedBackendStateMetaInfo`. This goes into a separate method that is called by the current caller.
    3) Request the column family, either through a new registration or by taking the existing one. This can use the result from step 1 or recheck. It goes into another separate method called by the current caller.
    x) Optional: a helper method that performs steps 1-3, if we would otherwise duplicate them too much.
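
    The split described in steps 1-3 could look roughly like the sketch below. It is only an illustration under stated assumptions, not the actual Flink code: `MetaInfo`, `ColumnFamily`, `Entry`, `reconfigure` and all method names are hypothetical stand-ins (for `RegisteredKeyedBackendStateMetaInfo`, `ColumnFamilyHandle` and the `Tuple2` stored per state name), serializers are modelled as plain strings, and step 2 is read as "run the checks when the name is already in the map".

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: every type and method name here is a hypothetical stand-in.
public class StateRegistrationSketch {

    /** Hypothetical stand-in for RegisteredKeyedBackendStateMetaInfo. */
    public static final class MetaInfo {
        public final String stateName;
        public final String serializer; // a real backend would hold a TypeSerializer
        public MetaInfo(String stateName, String serializer) {
            this.stateName = stateName;
            this.serializer = serializer;
        }
    }

    /** Hypothetical stand-in for ColumnFamilyHandle. */
    public static final class ColumnFamily {
        public final String name;
        public ColumnFamily(String name) { this.name = name; }
    }

    /** Plays the role of the Tuple2 kept per state name. */
    public static final class Entry {
        public final ColumnFamily columnFamily;
        public final MetaInfo metaInfo;
        public Entry(ColumnFamily columnFamily, MetaInfo metaInfo) {
            this.columnFamily = columnFamily;
            this.metaInfo = metaInfo;
        }
    }

    private final Map<String, Entry> kvStateInformation = new HashMap<>();

    // Step 1: does the map already contain the state name?
    public boolean isKnownState(String stateName) {
        return kvStateInformation.containsKey(stateName);
    }

    // Step 2: serializer checks and reconfiguration, producing the meta info.
    public MetaInfo resolveMetaInfo(String stateName, String requestedSerializer) {
        Entry existing = kvStateInformation.get(stateName);
        if (existing == null) {
            return new MetaInfo(stateName, requestedSerializer);
        }
        String resolved = reconfigure(existing.metaInfo.serializer, requestedSerializer);
        return new MetaInfo(stateName, resolved);
    }

    // Placeholder for the real compatibility check (assumption, not Flink logic).
    static String reconfigure(String restored, String requested) {
        return restored.equals(requested) ? restored : requested + "@reconfigured";
    }

    // Step 3: reuse the existing column family or register a new one.
    public ColumnFamily getOrRegisterColumnFamily(String stateName, MetaInfo metaInfo) {
        Entry existing = kvStateInformation.get(stateName);
        ColumnFamily cf = existing != null ? existing.columnFamily : new ColumnFamily(stateName);
        // Store the resolved meta info so later lookups see the reconfigured serializer.
        kvStateInformation.put(stateName, new Entry(cf, metaInfo));
        return cf;
    }

    // Step x: optional helper that chains steps 1-3 for callers.
    public Entry register(String stateName, String requestedSerializer) {
        MetaInfo metaInfo = resolveMetaInfo(stateName, requestedSerializer);
        getOrRegisterColumnFamily(stateName, metaInfo);
        return kvStateInformation.get(stateName);
    }

    public static void main(String[] args) {
        StateRegistrationSketch backend = new StateRegistrationSketch();
        Entry first = backend.register("count", "v1");
        Entry second = backend.register("count", "v2");
        System.out.println(first.metaInfo.serializer + " -> " + second.metaInfo.serializer);
    }
}
```

    With this shape, no method needs to return a pair: each caller asks for the meta info and the column family separately, and the reconfigured serializer is read from the stored meta info.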


> RocksDB does not propagate reconfiguration of serializer to the states
> ----------------------------------------------------------------------
>
>                 Key: FLINK-8715
>                 URL: https://issues.apache.org/jira/browse/FLINK-8715
>             Project: Flink
>          Issue Type: Bug
>          Components: State Backends, Checkpointing
>    Affects Versions: 1.3.2
>            Reporter: Arvid Heise
>            Assignee: Tzu-Li (Gordon) Tai
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> Any changes to the serializer made in #ensureCompatibility are lost during 
> state creation.
> In particular, 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBValueState.java#L68]
>  always uses a fresh copy of the StateDescriptor.
> An easy fix is to pass the reconfigured serializer as an additional parameter 
> in 
> [https://github.com/apache/flink/blob/master/flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/RocksDBKeyedStateBackend.java#L1681]
>  , which can be retrieved through the side-output of getColumnFamily
> {code:java}
> kvStateInformation.get(stateDesc.getName()).f1.getStateSerializer()
> {code}
> I encountered it in 1.3.2 but the code in the master seems unchanged (hence 
> the pointer into master). I encountered it in ValueState, but I suspect the 
> same issue can be observed for all kinds of RocksDB states.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
