Hi Fabian,

My GroupReduce function sums one column of the input rows in each group.
My key is an array of fields of several types, in this case String and long. The results I posted are just a sample of the output dataset.

Thank you in advance!

Anissa

On Thu, Aug 22, 2019 at 11:24, Fabian Hueske <fhue...@gmail.com> wrote:

> Hi Anissa,
>
> This looks strange. If I understand your code correctly, your GroupReduce
> function is summing up a field.
> Looking at the results that you posted, it seems as if there is some data
> missing (the total sum does not seem to match).
>
> For groupReduce it is important that the grouping keys are deterministic.
> Since you provide a String array as key definition, there is no
> KeyExtractor function.
> However, something that can cause random results are key attributes with
> random hash values.
> What is the type of your key fields?
>
> Another thing you might want to check is if the input (inputTable) to the
> groupReduce function is the same with both parallelism settings.
>
> Best, Fabian
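
On the point about key attributes with random hash values, here is a minimal plain-Java sketch (no Flink involved) of the difference between a value-based and an identity-based hashCode(). The class names KeyHashCheck and OpaqueKey and the helper compositeHash are hypothetical, made up for illustration only; this is not how Flink hashes keys internally. It only suggests that String and long key fields hash by value, so they are unlikely to be the cause here.

```java
import java.util.Arrays;

final class KeyHashCheck {

    // Hypothetical key type that does NOT override hashCode(): it falls back
    // to Object.hashCode(), which is identity-based and can differ between
    // equal-looking instances and between JVM runs.
    static final class OpaqueKey {
        final String value;
        OpaqueKey(String value) { this.value = value; }
    }

    // Hypothetical helper: a value-based hash for a composite (String, long)
    // key. String.hashCode() and Long.hashCode() are both computed from the
    // value, so the result is the same in every JVM.
    static int compositeHash(String name, long id) {
        return Arrays.hashCode(new Object[] {name, id});
    }

    public static void main(String[] args) {
        // Equal values give equal hashes, even for distinct String instances.
        System.out.println(
            compositeHash("a", 1L) == compositeHash(new String("a"), 1L)); // true

        // Two instances with identical content usually get different
        // identity hashes, so no expected output is given here.
        System.out.println(new OpaqueKey("a").hashCode());
        System.out.println(new OpaqueKey("a").hashCode());
    }
}
```

If the key fields really are only String and long, the second check in the quoted message (whether inputTable is identical under both parallelism settings) is probably the more promising one to follow up.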