Hi Arnaud,
To join two datasets, the community recommends using the join operation rather
than the cogroup operation. For a left join, you can use the leftOuterJoin
method. Flink's optimizer chooses the distributed join execution strategy based
on statistics of the datasets, such as their size. Additio
Hello,
I have a very big dataset A to left join with a dataset B that is half its
size. That is to say, half of A's records will each be matched with one record
of B, and the other half with null values.
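
To make the expected result concrete, here is a minimal sketch of those left
join semantics in plain Java. It is not Flink code and says nothing about how
Flink executes the join; the class name, method name and record layout are
hypothetical, and it assumes at most one B record per key. In a Flink job the
same result would come from the leftOuterJoin method.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LeftJoinSketch {

    // Left outer join of A with B on a string key.
    // Input records are {key, payload}. Output rows are {key, aPayload, bPayload},
    // where bPayload is null when B has no record for that key.
    static List<String[]> leftOuterJoin(List<String[]> a, List<String[]> b) {
        // Build side: index the smaller dataset B by key
        // (assumption: at most one B record per key).
        Map<String, String> bByKey = new HashMap<>();
        for (String[] rec : b) {
            bByKey.put(rec[0], rec[1]);
        }

        // Probe side: every A record produces exactly one output row.
        List<String[]> out = new ArrayList<>();
        for (String[] rec : a) {
            // Map.get returns null for a missing key, which is the "null values" case.
            out.add(new String[] {rec[0], rec[1], bByKey.get(rec[0])});
        }
        return out;
    }

    public static void main(String[] args) {
        List<String[]> a = new ArrayList<>();
        a.add(new String[] {"1", "a1"});
        a.add(new String[] {"2", "a2"});
        a.add(new String[] {"3", "a3"});
        a.add(new String[] {"4", "a4"});

        // B is half the size of A, so half of A's records find a match.
        List<String[]> b = new ArrayList<>();
        b.add(new String[] {"1", "b1"});
        b.add(new String[] {"3", "b3"});

        for (String[] row : leftOuterJoin(a, b)) {
            System.out.println(row[0] + " " + row[1] + " " + row[2]);
        }
    }
}
```

Only B is held in memory here while A is streamed through, which is why the
size of each dataset matters when a join strategy is chosen.
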
I used a CoGroup for that, but my batch fails because YARN kills the container
due to memory prob