We have a coGroup that we sometimes write like this: DataSet z = larger.coGroup(small).where...
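To make the shape of the operation concrete: a coGroup hands the user function, once per key, all records from both inputs that share that key. Below is a minimal plain-Java sketch of that per-key contract with no Flink dependency. The record layout and the names Rec, OPEN, milestone and coGroup are hypothetical and only illustrate the idea; this is not our actual job. It assumes at most one update per key in small and closes the open record's validity interval when an update arrives.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

class CoGroupSketch {
    // Hypothetical record: key, payload, validity interval [from, to).
    static final class Rec {
        final String key; final String value; final long from; final long to;
        Rec(String key, String value, long from, long to) {
            this.key = key; this.value = value; this.from = from; this.to = to;
        }
        @Override public String toString() {
            return key + ":" + value + "[" + from + "," + (to == OPEN ? "open" : to) + ")";
        }
    }

    static final long OPEN = Long.MAX_VALUE; // marks a record that is still valid

    // Per-key logic: receives every record for one key from both sides.
    static List<Rec> milestone(List<Rec> previous, List<Rec> updates) {
        List<Rec> out = new ArrayList<>();
        if (updates.isEmpty()) {          // key untouched: carry records forward
            out.addAll(previous);
            return out;
        }
        long cut = updates.get(0).from;   // assumption: at most one update per key
        for (Rec p : previous) {
            // Close the currently open record at the update's start time.
            out.add(p.to == OPEN ? new Rec(p.key, p.value, p.from, cut) : p);
        }
        out.addAll(updates);
        return out;
    }

    static Map<String, List<Rec>> groupByKey(List<Rec> in) {
        Map<String, List<Rec>> groups = new TreeMap<>();
        for (Rec r : in) {
            groups.computeIfAbsent(r.key, k -> new ArrayList<>()).add(r);
        }
        return groups;
    }

    // Stand-in for larger.coGroup(small): one milestone() call per distinct key.
    static List<Rec> coGroup(List<Rec> larger, List<Rec> small) {
        Map<String, List<Rec>> l = groupByKey(larger);
        Map<String, List<Rec>> s = groupByKey(small);
        TreeSet<String> keys = new TreeSet<>(l.keySet());
        keys.addAll(s.keySet());
        List<Rec> out = new ArrayList<>();
        for (String k : keys) {
            out.addAll(milestone(l.getOrDefault(k, List.of()),
                                 s.getOrDefault(k, List.of())));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Rec> larger = List.of(new Rec("a", "v1", 0, OPEN), new Rec("b", "w1", 0, OPEN));
        List<Rec> small = List.of(new Rec("a", "v2", 10, OPEN));
        coGroup(larger, small).forEach(System.out::println);
    }
}
```

The sketch groups both sides in memory by key; the question in this mail is about how the engine does that grouping at scale for each input.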
The strategy is printed as a hash on one key and an ascending sort on the other key. Which is which? Naively, we'd want to hash larger and sort small. Or is that wrong? What factors affect the performance of the coGroup?

We use coGroup to calculate a new set of records for each key from the previously calculated set, with some modifications from small. The use case, by the way, is temporal milestoning of records.

Thanks
Billy Newport
Data Architecture, Goldman, Sachs & Co.
30 Hudson | 37th Floor | Jersey City, NJ
Tel: +1 (212) 8557773 | Cell: +1 (507) 254-0134
Email: billy.newp...@gs.com<mailto:edward.new...@gs.com>, KD2DKQ