It should be fairly straightforward to create a SortedMapCoder for TreeMap:
add checks on the map instance type and then change verifyDeterministic.
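
As a rough illustration of why a sorted map can be encoded deterministically,
here is a minimal sketch. It assumes String keys and Long values and uses plain
java.io rather than Beam's Coder API; SortedMapEncoding is a hypothetical name,
not an existing Beam class. A real SortedMapCoder would presumably delegate to
key and value coders the way MapCoder does.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical helper (not part of Beam): encodes a sorted map in key order. */
public class SortedMapEncoding {

  // Writes the entry count, then each (key, value) pair in ascending key order.
  // A SortedMap always iterates in key order, so two equal maps using the same
  // ordering produce identical bytes regardless of insertion order.
  public static byte[] encode(SortedMap<String, Long> map) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(map.size());
      for (Map.Entry<String, Long> entry : map.entrySet()) {
        out.writeUTF(entry.getKey());
        out.writeLong(entry.getValue());
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Reads the pairs back into a TreeMap (instead of a HashMap) so the decoded
  // value keeps its sorted iteration order.
  public static SortedMap<String, Long> decode(byte[] data) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
      int size = in.readInt();
      SortedMap<String, Long> map = new TreeMap<>();
      for (int i = 0; i < size; i++) {
        String key = in.readUTF();
        long value = in.readLong();
        map.put(key, value);
      }
      return map;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Two TreeMaps built with the same entries in different insertion orders encode to
the same bytes here, which is the property GroupByKey needs from a key coder.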

If this is a common need, we could submit it to the Beam repo.

[1]:
https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java#L146

On Thu, Jul 11, 2019 at 2:26 PM Mike Pedersen <m...@mikepedersen.dk> wrote:

> There isn't a coder for deterministic maps in Beam, so even if your
> data structure is deterministic, Beam will assume the serialized bytes
> aren't deterministic.
>
> You could make one using the MapCoder as a guide:
> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java
> Just change it so that the exception in verifyDeterministic is removed
> and decoding instantiates a TreeMap or similar instead of a HashMap.
>
> Alternatively, you could just represent your key as a sorted list of KV
> pairs. Lookups could be done using binary search if necessary.
>
> Mike
>
> On Thu, Jul 11, 2019 at 10:41 PM Shannon Duncan <
> joseph.dun...@liveramp.com>:
>
>> So I'm working on essentially doing a word-count on a complex data
>> structure.
>>
>> I tried just using a HashMap as the structure, but that didn't work
>> because it is non-deterministic.
>>
>> However, when given a LinkedHashMap or TreeMap, which is deterministic,
>> the SDK complains that it's non-deterministic when I try to use it as a
>> key for GroupByKey.
>>
>> What would be an appropriate Map-style data structure that is
>> deterministic enough for Apache Beam to accept it as a key?
>>
>> Thanks,
>> Shannon
>>
>
