Hi Team,

We are using foreachPartition to send Dataset row data to a third-party system via an HTTP client. The operation is not idempotent, so I want to ensure that, in case of a failure, previously processed data does not get processed again.
Is there a way to checkpoint progress in a Spark batch job?

1. Checkpoint processed partitions: if there are 1000 partitions and 100 were processed before a failure, those 100 should not be processed again on retry.
2. Checkpoint partial progress within foreachPartition: if I have processed 100 records out of a partition's 1000, is there a way to checkpoint that offset so those 100 are not processed again?

Regards,
Abhishek Singla
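To make question 2 concrete, here is a minimal, hypothetical sketch (not a Spark API) of offset-based resume for one partition. The checkpoint "store" is just a local JSON file keyed by partition id; in a real job it would have to be shared storage (a database, HDFS, etc.), since a retried task can run on a different executor. All names here (`process_partition`, `commit_offset`, the `send` callback) are illustrative assumptions, not Spark functions.

```python
import json
import os
import tempfile

# Illustrative checkpoint store: one small JSON file per partition.
# In production this would be shared, durable storage, not local disk.
CHECKPOINT_DIR = tempfile.mkdtemp()

def _ckpt_path(partition_id):
    return os.path.join(CHECKPOINT_DIR, f"partition-{partition_id}.json")

def committed_offset(partition_id):
    """Return how many records of this partition were already sent."""
    path = _ckpt_path(partition_id)
    if not os.path.exists(path):
        return 0
    with open(path) as f:
        return json.load(f)["offset"]

def commit_offset(partition_id, offset):
    """Record that `offset` records have been sent (atomic rename)."""
    tmp = _ckpt_path(partition_id) + ".tmp"
    with open(tmp, "w") as f:
        json.dump({"offset": offset}, f)
    os.replace(tmp, _ckpt_path(partition_id))

def process_partition(partition_id, records, send):
    """Send each record once, skipping those committed by a prior attempt.

    This is the body you would run inside foreachPartition, with
    `send` being the non-idempotent HTTP call.
    """
    start = committed_offset(partition_id)
    for offset, record in enumerate(records):
        if offset < start:
            continue  # already sent before the previous failure
        send(record)
        commit_offset(partition_id, offset + 1)

# Simulate a failure after 3 records, then a retry of the same partition.
sent = []
def failing_send(record):
    if record == 3:
        raise IOError("connection reset")  # task dies mid-partition
    sent.append(record)

try:
    process_partition(0, range(6), failing_send)
except IOError:
    pass  # first attempt sent records 0..2 and committed offset 3

process_partition(0, range(6), sent.append)  # retry resumes at offset 3
print(sent)
```

Note the remaining gap: if `send` succeeds but the commit fails before it completes, the retry still resends that one record. With a truly non-idempotent sink, closing that window needs either downstream deduplication (e.g. a unique request id per record) or a transactional commit spanning the send and the offset write.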