[ https://issues.apache.org/jira/browse/FLINK-13702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16963838#comment-16963838 ]
Dawid Wysakowicz edited comment on FLINK-13702 at 10/31/19 10:23 AM: --------------------------------------------------------------------- I agree the solution suggested by [~lzljs3620320] should work, it is very fragile though. It is based on the assumption that {{MemorySegments}} are immutable which in general case is not true. You can mutate MemorySegment via put methods. It is also possible that with this solution different code paths might work against different MemorySegments for the same {{LazyBinaryFormat}} instances. It does improve the current situation, I agree. We can add that solution, but we should keep in mind it is basically a hack and add a proper documentation with the assumptions that we do there. I was also thinking if it would make sense to make {{MemorySegment[] getSegments}} return {{UnmodifiableMemorySegment}}. This would require more changes, as we would have to extract {{MemorySegment}} interface. Still that would not make the {{MemorySegment}} fully immutable as it still gives direct access to underlying memory array via {{getHeapMemory}}. I would keep mind open for a better solution in a long run though. As for the BinaryGeneric I think we could: * remove the {{copy}} method from the class and move it to the {{Serializer}} as [~lzljs3620320] was suggesting * in the {{materialize}} method call {{duplicate}}. This would ensure that every caller thread uses its own copy of the serializer. * additionally we can think of keeping {{TypeSerializerSnapshot}} instead of {{Serializer} and call {{restoreSerializer}} in the {{materialize}} method. Which should create a new copy of a Serializer. Unfortunately thats not the case for some of the {{Serializer}}s in Blink planner (this should be fixed independently anyway). To be on the safe side we would still call duplicate. What do you think? was (Author: dawidwys): I agree the solution suggested by [~lzljs3620320] should work, it is very fragile though. It is based on the assumption that {{MemorySegments}} are immutable which in general case is not true. You can mutate MemorySegment via put methods. It is also possible that with this solution different code paths might work against different MemorySegments for the same {{LazyBinaryFormat}} instances. It does improve the current situation, I agree. We can add that solution, but we should keep in mind it is basically a hack and add a proper documentation with the assumptions that we do there. I was also thinking if it would make sense to make {{MemorySegment[] getSegments}} return {{UnmodifiableMemorySegment}}. This would require more changes, as we would have to extract {{MemorySegment}} interface. Still that would not make the {{MemorySegment}} fully immutable as it still gives direct access to underlying memory array via {{getHeapMemory}}. As for the BinaryGeneric I think we could: * remove the {{copy}} method from the class and move it to the {{Serializer}} as [~lzljs3620320] was suggesting * in the {{materialize}} method call {{duplicate}}. This would ensure that every caller thread uses its own copy of the serializer. > BaseMapSerializerTest.testDuplicate fails on Travis > --------------------------------------------------- > > Key: FLINK-13702 > URL: https://issues.apache.org/jira/browse/FLINK-13702 > Project: Flink > Issue Type: Bug > Components: Table SQL / Planner > Affects Versions: 1.10.0 > Reporter: Till Rohrmann > Assignee: Dawid Wysakowicz > Priority: Critical > Labels: test-stability > > The {{BaseMapSerializerTest.testDuplicate}} fails on Travis with an > {{java.lang.IndexOutOfBoundsException}}. > https://api.travis-ci.org/v3/job/570973199/log.txt -- This message was sent by Atlassian Jira (v8.3.4#803005)