Will do. Thanks. A new coder for deterministic Maps would be great in the future. Thank you!
On Thu, Jul 11, 2019 at 4:58 PM Rui Wang <ruw...@google.com> wrote:

> I think Mike is referring to ListCoder
> <https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/ListCoder.java>,
> which is deterministic if its element coder is deterministic. Maybe you can
> search the repo for examples of ListCoder?
>
> -Rui
>
> On Thu, Jul 11, 2019 at 2:55 PM Shannon Duncan <joseph.dun...@liveramp.com> wrote:
>
>> So ArrayList doesn't work either, so just a standard List?
>>
>> On Thu, Jul 11, 2019 at 4:53 PM Rui Wang <ruw...@google.com> wrote:
>>
>>> Shannon, I agree with Mike that a List is a good workaround if the
>>> elements within the list are deterministic and you are eager to get your
>>> new pipeline working.
>>>
>>> Let me send some pointers on adding a new coder later.
>>>
>>> -Rui
>>>
>>> On Thu, Jul 11, 2019 at 2:45 PM Shannon Duncan <joseph.dun...@liveramp.com> wrote:
>>>
>>>> I just started learning Java today to attempt to convert our Python
>>>> pipelines to Java to take advantage of key features that Java has. I have
>>>> no idea how I would create a new coder and include it so that Beam
>>>> recognizes it.
>>>>
>>>> If you can point me in the right direction of where it hooks together, I
>>>> might be able to figure that out. I can duplicate MapCoder and try to make
>>>> changes, but how will Beam know to pick up that coder for a GroupByKey?
>>>>
>>>> Thanks!
>>>> Shannon
>>>>
>>>> On Thu, Jul 11, 2019 at 4:42 PM Rui Wang <ruw...@google.com> wrote:
>>>>
>>>>> It could be fairly straightforward to create a SortedMapCoder for
>>>>> TreeMap. Just add checks on map instances and then change
>>>>> verifyDeterministic.
>>>>>
>>>>> If this is a common need, we could just submit it to the Beam repo.
>>>>>
>>>>> [1]:
>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java#L146
>>>>>
>>>>> On Thu, Jul 11, 2019 at 2:26 PM Mike Pedersen <m...@mikepedersen.dk> wrote:
>>>>>
>>>>>> There isn't a coder for deterministic maps in Beam, so even if your
>>>>>> data structure is deterministic, Beam will assume the serialized bytes
>>>>>> aren't deterministic.
>>>>>>
>>>>>> You could make one using the MapCoder as a guide:
>>>>>> https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/coders/MapCoder.java
>>>>>> Just change it so that the exception in verifyDeterministic is
>>>>>> removed and, when decoding, it instantiates a TreeMap or similar
>>>>>> instead of a HashMap.
>>>>>>
>>>>>> Alternatively, you could just represent your key as a sorted list of
>>>>>> KV pairs. Lookups could be done using binary search if necessary.
>>>>>>
>>>>>> Mike
>>>>>>
>>>>>> On Thu, Jul 11, 2019 at 10:41 PM Shannon Duncan <joseph.dun...@liveramp.com> wrote:
>>>>>>
>>>>>>> So I'm working on essentially doing a word count on a complex data
>>>>>>> structure.
>>>>>>>
>>>>>>> I tried just using a HashMap as the structure, but that didn't work
>>>>>>> because it is non-deterministic.
>>>>>>>
>>>>>>> However, when given a LinkedHashMap or TreeMap, which is deterministic,
>>>>>>> the SDK complains that it's non-deterministic when I try to use it as a
>>>>>>> key for GroupByKey.
>>>>>>>
>>>>>>> What would be an appropriate Map-style data structure that would be
>>>>>>> deterministic enough for Apache Beam to accept it as a key?
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Shannon
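
For reference, Mike's "sorted list of KV pairs" suggestion could look roughly
like the sketch below. It uses only the JDK so it can be run on its own:
Map.Entry stands in for Beam's KV, and the class name SortedEntries, the
methods fromMap and lookup, and the String/Long types are all made up for
illustration (assuming a word-count-style key); none of them come from Beam
or from this thread.

```java
import java.util.AbstractMap.SimpleImmutableEntry;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Beam class: turns a map into a key-sorted list
// so that equal maps always produce the same element order.
final class SortedEntries {

    /** Copies the map into a list of (key, value) pairs sorted by key. */
    static List<Map.Entry<String, Long>> fromMap(Map<String, Long> map) {
        List<Map.Entry<String, Long>> entries = new ArrayList<>();
        for (Map.Entry<String, Long> e : map.entrySet()) {
            // Immutable copies, so the list does not depend on the source map.
            entries.add(new SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        entries.sort(Map.Entry.comparingByKey());
        return entries;
    }

    /** Binary search by key over a list built by fromMap; null if absent. */
    static Long lookup(List<Map.Entry<String, Long>> entries, String key) {
        int lo = 0;
        int hi = entries.size() - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = entries.get(mid).getKey().compareTo(key);
            if (cmp == 0) {
                return entries.get(mid).getValue();
            } else if (cmp < 0) {
                lo = mid + 1;
            } else {
                hi = mid - 1;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Same contents, different map types and insertion orders.
        Map<String, Long> a = new HashMap<>();
        a.put("b", 2L);
        a.put("a", 1L);
        Map<String, Long> b = new LinkedHashMap<>();
        b.put("a", 1L);
        b.put("b", 2L);

        System.out.println(fromMap(a).equals(fromMap(b)));
        System.out.println(lookup(fromMap(a), "b"));
    }
}
```

In a pipeline the analogous key type would presumably be List<KV<String, Long>>
encoded with ListCoder.of(KvCoder.of(StringUtf8Coder.of(), VarLongCoder.of())),
since ListCoder defers the determinism check to its element coder. That part
is untested here and the element types are an assumption.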