Hey,

I have a very specific use case: a history of records stored as Parquet files in S3 that I would like to read and process with Flink. The issue is that the number of files is quite large (>100k).

If I provide the full list of files to the HadoopInputFormat I am using, the job fails with an AskTimeoutException. This is strange, because I am running on YARN and setting -yD akka.ask.timeout=600s; according to the logs the setting is picked up correctly, yet the job still fails with an AskTimeoutException after 10s.

I have managed to work around this by grouping the files and reading each group in a loop, so that I end up with a Seq[DataSet[Record]]. But as soon as I try to union those datasets, I get the AskTimeoutException again.

So my question is: what can be the reason behind this exception, and why is the akka.ask.timeout setting ignored even though it is parsed properly?
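For reference, here is a simplified sketch of what the job does. The group size, AvroParquetInputFormat, and the listS3Paths helper are stand-ins for my actual record type and file enumeration, not the exact code:

import org.apache.avro.generic.GenericRecord
import org.apache.flink.api.scala._
import org.apache.flink.hadoopcompatibility.scala.HadoopInputs
import org.apache.hadoop.fs.Path
import org.apache.hadoop.mapreduce.Job
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
import org.apache.parquet.avro.AvroParquetInputFormat

object GroupedParquetRead {
  def main(args: Array[String]): Unit = {
    val env = ExecutionEnvironment.getExecutionEnvironment

    // Placeholder: enumerates the >100k Parquet object keys on S3.
    val allFiles: Seq[String] = listS3Paths()

    // Workaround: read the files in groups (1000 is an arbitrary choice)
    // instead of handing the whole listing to a single HadoopInputFormat.
    val groups: Seq[DataSet[GenericRecord]] =
      allFiles.grouped(1000).toSeq.map { group =>
        val job = Job.getInstance()
        group.foreach(p => FileInputFormat.addInputPath(job, new Path(p)))
        env
          .createInput(HadoopInputs.createHadoopInput(
            new AvroParquetInputFormat[GenericRecord],
            classOf[Void],
            classOf[GenericRecord],
            job))
          .map(_._2) // keep the record, drop the Void key
      }

    // Unioning the per-group datasets is where the
    // AskTimeoutException shows up again.
    val all: DataSet[GenericRecord] = groups.reduce(_ union _)

    println(all.count())
  }

  // Placeholder for the actual S3 listing logic.
  def listS3Paths(): Seq[String] = ???
}

The loop of grouped reads runs fine on its own; it is the reduce(_ union _) at the end that brings the exception back.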
I will be glad for any help.

Best regards,
Dom