Hello Flinkers! I've run into some strange behavior when reading from a folder of input files.
When the number of input files in the folder exceeds the number of task slots, the size of my datasets varies from run to run. It seems as if the transformations don't wait for all input files to be read. When there are at least as many task slots as files, there is no problem.

I'm using a custom input format. Could the problem be in my custom input format, and if so, what might I be forgetting?

Kind regards and thank you for your time!
Pieter