Why does a Spark job that hits a FetchFailedException end up with both duplicate records and missing records in its output? We have a scenario where a Spark job throws FetchFailedException because one of the data nodes was rebooted in the middle of the run.

After the job completes, we see some duplicated data and some missing data. Why doesn't Spark handle this scenario correctly? Our expectation is that the output should contain no duplicates and no missing records. I am using Spark 3.2.0.
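To make the question concrete, here is a minimal, Spark-free sketch of the failure mode I suspect we are hitting. The assumption (which may not match our actual job) is that the output sink is not idempotent or transactional, e.g. it appends rows, so when a FetchFailedException causes Spark to re-run a stage, re-executed tasks append their rows a second time, while a task whose retry is interrupted on the lost node leaves its rows incomplete:

```python
from collections import Counter

# Hypothetical non-transactional sink: tasks blindly append to it.
sink = []

def run_task(partition, rows):
    # An at-least-once task attempt: it has no way to know whether a
    # previous attempt already appended these rows.
    sink.extend((partition, r) for r in rows)

# First attempt: tasks 0 and 1 finish and append before the node reboots.
run_task(0, ["a", "b"])
run_task(1, ["c"])

# FetchFailedException -> the stage is re-run. Task 0's earlier output is
# still in the sink, so its retry appends duplicates; task 1's retry is
# assumed to be lost with the rebooted node and never appends again.
run_task(0, ["a", "b"])          # duplicate append
# run_task(1, ["c"])             # retry never completes

counts = Counter(sink)
duplicates = [k for k, v in counts.items() if v > 1]
print(duplicates)                # task 0's rows appear twice
```

If this is roughly what is happening, the question becomes: why doesn't Spark's stage-retry machinery (or the output committer) make these re-executed writes idempotent in our setup?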