Will do. Thanks. A new coder for deterministic Maps would be great in the
future. Thank you!

On Thu, Jul 11, 2019 at 4:58 PM Rui Wang <ruw...@google.com> wrote:

> I think Mike is referring to ListCoder
> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/ListCoder.java>,
> which is deterministic if its element coder is deterministic. Maybe you can
> search the repo for examples of ListCoder?
>
>
> -Rui
>
> On Thu, Jul 11, 2019 at 2:55 PM Shannon Duncan <joseph.dun...@liveramp.com>
> wrote:
>
>> So ArrayList doesn't work either, so just a standard List?
>>
>> On Thu, Jul 11, 2019 at 4:53 PM Rui Wang <ruw...@google.com> wrote:
>>
>>> Shannon, I agree with Mike that a List is a good workaround if the elements
>>> in the list are deterministic and you are eager to get your new pipeline
>>> working.
>>>
>>>
>>> I'll send some pointers on adding a new coder later.
>>>
>>>
>>> -Rui
>>>
>>> On Thu, Jul 11, 2019 at 2:45 PM Shannon Duncan <
>>> joseph.dun...@liveramp.com> wrote:
>>>
>>>> I just started learning Java today to attempt to convert our Python
>>>> pipelines to Java to take advantage of key features that Java has. I have
>>>> no idea how I would create a new coder and register it so that Beam
>>>> recognizes it.
>>>>
>>>> If you can point me in the right direction of where it hooks together, I
>>>> might be able to figure that out. I can duplicate MapCoder and try to make
>>>> changes, but how will Beam know to pick up that coder for a GroupByKey?
>>>>
>>>> Thanks!
>>>> Shannon
>>>>
>>>> On Thu, Jul 11, 2019 at 4:42 PM Rui Wang <ruw...@google.com> wrote:
>>>>
>>>>> It should be straightforward to create a SortedMapCoder for TreeMap:
>>>>> add checks on the map instances and then change
>>>>> verifyDeterministic [1].
>>>>>
>>>>> If this is a common need, we could submit it to the Beam repo.
>>>>>
>>>>> [1]:
>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java#L146
>>>>>
>>>>> On Thu, Jul 11, 2019 at 2:26 PM Mike Pedersen <m...@mikepedersen.dk>
>>>>> wrote:
>>>>>
>>>>>> There isn't a coder for deterministic maps in Beam, so even if your
>>>>>> data structure is deterministic, Beam will assume the serialized bytes
>>>>>> aren't deterministic.
>>>>>>
>>>>>> You could make one using the MapCoder as a guide:
>>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java
>>>>>> Just change it so that the exception in verifyDeterministic is
>>>>>> removed and, when decoding, it instantiates a TreeMap or similar instead
>>>>>> of a HashMap.
>>>>>>
>>>>>> Alternatively, you could just represent your key as a sorted list of
>>>>>> KV pairs. Lookups could be done using binary search if necessary.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Thu, Jul 11, 2019 at 10:41 PM Shannon Duncan <
>>>>>> joseph.dun...@liveramp.com> wrote:
>>>>>>
>>>>>>> So I'm working on essentially doing a word count on a complex data
>>>>>>> structure.
>>>>>>>
>>>>>>> I tried just using a HashMap as the structure, but that didn't work
>>>>>>> because it is non-deterministic.
>>>>>>>
>>>>>>> However, when given a LinkedHashMap or TreeMap, which is deterministic,
>>>>>>> the SDK complains that it's non-deterministic when I try to use it as a
>>>>>>> key for GroupByKey.
>>>>>>>
>>>>>>> What would be an appropriate Map-style data structure that would be
>>>>>>> deterministic enough for Apache Beam to accept it as a key?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shannon
>>>>>>>
>>>>>>
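
To make the determinism point in this thread concrete, here is a minimal sketch that uses only the Java standard library (no Beam classes). It assumes that "deterministic" for a grouping key means that two values which compare as equal must encode to the same bytes. The class name SortedMapEncodingSketch and its methods are hypothetical and are not part of Beam; the byte layout is made up for illustration and is not MapCoder's wire format.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration class; not a Beam Coder.
final class SortedMapEncodingSketch {

  // Writes the entries in whatever order the map iterates them.
  static byte[] encodeInIterationOrder(Map<String, Long> map) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      out.writeInt(map.size());
      for (Map.Entry<String, Long> e : map.entrySet()) {
        out.writeUTF(e.getKey());
        out.writeLong(e.getValue());
      }
    } catch (IOException e) {
      // Not expected for an in-memory stream.
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  // Copies into a TreeMap first, so entries are always written in key order.
  static byte[] encodeSorted(Map<String, Long> map) {
    return encodeInIterationOrder(new TreeMap<>(map));
  }

  public static void main(String[] args) {
    Map<String, Long> first = new LinkedHashMap<>();
    first.put("b", 2L);
    first.put("a", 1L);
    Map<String, Long> second = new LinkedHashMap<>();
    second.put("a", 1L);
    second.put("b", 2L);

    // The two maps are equal, but insertion order differs.
    System.out.println(first.equals(second)); // true
    System.out.println(Arrays.equals(
        encodeInIterationOrder(first), encodeInIterationOrder(second))); // false
    System.out.println(Arrays.equals(
        encodeSorted(first), encodeSorted(second))); // true
  }
}
```

Under that assumption, this also suggests why a LinkedHashMap is not enough on its own: its iteration order is stable for one instance, but two equal maps built in different insertion orders can still serialize differently, whereas key-sorted output does not have that problem.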

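The workaround Mike describes (representing the key as a sorted list of key/value pairs and looking entries up with binary search) could be sketched roughly as follows. This uses plain java.util types so that it runs without Beam; in a real pipeline the entries would presumably be Beam KV values encoded with something like a ListCoder over a KvCoder, which is an assumption and is not shown here. SortedEntryKey and its methods are hypothetical names.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of Beam.
final class SortedEntryKey {

  // Orders entries by key only.
  private static final Comparator<Map.Entry<String, Long>> BY_KEY =
      Map.Entry.comparingByKey();

  // Copies the map into a list of immutable entries sorted by key, so that
  // equal maps always produce equal lists in the same order.
  static List<Map.Entry<String, Long>> toSortedEntries(Map<String, Long> map) {
    List<Map.Entry<String, Long>> entries = new ArrayList<>();
    for (Map.Entry<String, Long> e : map.entrySet()) {
      entries.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
    }
    entries.sort(BY_KEY);
    return entries;
  }

  // Binary search by key; returns null when the key is absent.
  static Long lookup(List<Map.Entry<String, Long>> sortedEntries, String key) {
    // The probe's value is ignored because BY_KEY compares keys only.
    Map.Entry<String, Long> probe = new SimpleImmutableEntry<>(key, 0L);
    int i = Collections.binarySearch(sortedEntries, probe, BY_KEY);
    return i >= 0 ? sortedEntries.get(i).getValue() : null;
  }

  public static void main(String[] args) {
    Map<String, Long> counts = new HashMap<>();
    counts.put("pear", 3L);
    counts.put("apple", 5L);

    List<Map.Entry<String, Long>> key = toSortedEntries(counts);
    System.out.println(key);                 // [apple=5, pear=3]
    System.out.println(lookup(key, "pear")); // 3
    System.out.println(lookup(key, "kiwi")); // null
  }
}
```

The list gives up constant-time lookups in exchange for a fixed element order, which is what makes the encoded key stable across equal inputs.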