[ https://issues.apache.org/jira/browse/FLINK-12693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17120452#comment-17120452 ]
Lisheng Sun commented on FLINK-12693:
-------------------------------------

Hi [~banmoy],

According to the test results, computing the hash in CopyOnWriteStateMap is much slower than in the JDK HashMap. Could you explain what the new hash algorithm is for? Thank you.

CopyOnWriteStateMap#computeHashForOperationAndDoIncrementalRehash#compositeHash#bitMix
{code:java}
public static int bitMix(int in) {
    in ^= in >>> 16;
    in *= 0x85ebca6b;
    in ^= in >>> 13;
    in *= 0xc2b2ae35;
    in ^= in >>> 16;
    return in;
}
{code}

HashMap#hash
{code:java}
static final int hash(Object key) {
    int h;
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}
{code}

> Store state per key-group in CopyOnWriteStateTable
> --------------------------------------------------
>
>                 Key: FLINK-12693
>                 URL: https://issues.apache.org/jira/browse/FLINK-12693
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / State Backends
>            Reporter: Yu Li
>            Assignee: PengFei Li
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Since we propose to use KeyGroup as the unit of spilling/loading, the first
> step is to store state per key-group. Currently {{NestedMapsStateTable}}
> natively supports this, so we only need to refine {{CopyOnWriteStateTable}}.
> The main effort required here is to extract the customized hash map out of
> {{CopyOnWriteStateTable}} and then use such a hash map as the state holder for
> each KeyGroup. Afterwards we could extract some common logic out into
> {{StateTable}}.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
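
For readers comparing the two functions quoted in the comment: {{bitMix}} uses the same constants and shifts as the 32-bit finalizer of MurmurHash3, so it does more work per call than {{HashMap#hash}} but also mixes high and low bits more thoroughly. The following is only a minimal sketch of that difference, not a statement of why Flink chose it and not a performance benchmark. The class name {{HashSpreadDemo}}, the 256-bucket table size, and the choice of keys (multiples of 256) are all assumptions made for illustration; the real {{CopyOnWriteStateMap}} hashes a key together with a namespace.

```java
import java.util.HashSet;
import java.util.Set;

public class HashSpreadDemo {
    // Hypothetical power-of-two table size, chosen only for this illustration.
    public static final int TABLE_SIZE = 256;

    // Copied from the comment above (CopyOnWriteStateMap's bitMix).
    public static int bitMix(int in) {
        in ^= in >>> 16;
        in *= 0x85ebca6b;
        in ^= in >>> 13;
        in *= 0xc2b2ae35;
        in ^= in >>> 16;
        return in;
    }

    // The spreading step of HashMap#hash, applied directly to an int hash code.
    public static int jdkSpread(int h) {
        return h ^ (h >>> 16);
    }

    // Counts how many distinct buckets are hit by the keys 0, 256, 512, ...
    // These keys have all-zero low 8 bits and all-zero high 16 bits.
    public static int distinctBuckets(boolean useBitMix) {
        Set<Integer> buckets = new HashSet<>();
        for (int i = 0; i < TABLE_SIZE; i++) {
            int key = i << 8;
            int hash = useBitMix ? bitMix(key) : jdkSpread(key);
            buckets.add(hash & (TABLE_SIZE - 1)); // index = hash mod table size
        }
        return buckets.size();
    }

    public static void main(String[] args) {
        System.out.println("jdkSpread buckets used: " + distinctBuckets(false));
        System.out.println("bitMix buckets used:    " + distinctBuckets(true));
    }
}
```

With these particular keys, {{jdkSpread}} leaves every key in bucket 0, because the low 8 bits are zero and the high 16 bits it folds in are zero as well, while {{bitMix}} spreads the keys over multiple buckets. Whether that extra mixing justifies the per-call cost for real state keys is exactly the question the comment raises.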