Yes. Operate your producer in transactional mode. Checkpointing is an abstract concept only applicable to streaming.
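
A minimal sketch of one way to read "transactional mode" for an HTTP sink, not a definitive implementation: keep a ledger of record keys that have already been sent, and skip them on a rerun. Everything named here is an assumption for illustration: `process_partition`, `open_ledger`, the `send` callable standing in for your HTTP client, the `key_of` function that derives a stable key per row, and the SQLite file used as the ledger. On a real cluster the ledger would need to be a store that every executor can reach, since a local file is only visible to one machine.

```python
import sqlite3


def open_ledger(path):
    """Open (or create) the ledger of record keys that were already sent."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS sent (record_key TEXT PRIMARY KEY)")
    conn.commit()
    return conn


def process_partition(rows, ledger_path, send, key_of):
    """Send every row whose key is not yet in the ledger; return the count sent.

    `send` is a hypothetical stand-in for the non-idempotent HTTP call and
    `key_of` must return the same string for the same row on every run.
    """
    conn = open_ledger(ledger_path)
    sent_now = 0
    try:
        for row in rows:
            key = key_of(row)
            seen = conn.execute(
                "SELECT 1 FROM sent WHERE record_key = ?", (key,)
            ).fetchone()
            if seen:
                continue  # handled by an earlier attempt, so skip it
            send(row)  # may raise; in that case nothing is recorded for this row
            conn.execute("INSERT INTO sent (record_key) VALUES (?)", (key,))
            conn.commit()  # persist progress right after the send succeeds
            sent_now += 1
    finally:
        conn.close()
    return sent_now
```

With PySpark this could be wired in as `df.foreachPartition(lambda it: process_partition(it, path, send, key_of))`. Because progress is recorded per record, this covers both cases in the question: finished partitions and partially processed ones are skipped on a rerun. It is not exactly-once: if the task dies after `send` returns but before the commit, that one record is sent again, so closing that gap would need the receiving system to accept an idempotency key.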
-dan

On Wed, Apr 16, 2025 at 7:02 AM Abhishek Singla <abhisheksingla...@gmail.com> wrote:

> Hi Team,
>
> We are using foreachPartition to send dataset row data to third system via
> HTTP client. The operation is not idempotent. I wanna ensure that in case
> of failures the previously processed dataset should not get processed
> again.
>
> Is there a way to checkpoint in Spark batch
> 1. checkpoint processed partitions so that if there are 1000 partitions
> and 100 were processed in the previous batch, they should not get processed
> again.
> 2. checkpoint partial partition in foreachPartition, if I have processed
> 100 records from a partition which have 1000 total records, is there a way
> to checkpoint offset so that those 100 should not get processed again.
>
> Regards,
> Abhishek Singla