Yes. Operate your producer in transactional mode. Checkpointing of
processing progress is a concept that applies only to streaming, not to
batch jobs.
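
If the HTTP endpoint offers no transactional mode, one possible substitute
is to track progress yourself in an external store. The following is only a
minimal sketch in Python, not a Spark feature: the `progress` table and the
names `open_ledger`, `process_partition`, and `send` are hypothetical, and
SQLite stands in for whatever shared store your executors can reach. It
assumes a rerun produces the same partitions with the same row order.

```python
import sqlite3


def open_ledger(path=":memory:"):
    """Open (or create) the hypothetical progress ledger."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS progress ("
        "job_id TEXT, partition_id INTEGER, next_offset INTEGER, "
        "PRIMARY KEY (job_id, partition_id))"
    )
    conn.commit()
    return conn


def process_partition(conn, job_id, partition_id, rows, send):
    """Send rows not yet recorded as done; return how many were sent."""
    row = conn.execute(
        "SELECT next_offset FROM progress WHERE job_id = ? AND partition_id = ?",
        (job_id, partition_id),
    ).fetchone()
    start = row[0] if row else 0  # first offset that still needs sending

    sent = 0
    for offset, record in enumerate(rows):
        if offset < start:
            continue  # already delivered in an earlier attempt
        send(record)  # the non-idempotent call, e.g. an HTTP POST
        # Record progress only after the send succeeded.
        conn.execute(
            "INSERT OR REPLACE INTO progress VALUES (?, ?, ?)",
            (job_id, partition_id, offset + 1),
        )
        conn.commit()
        sent += 1
    return sent
```

Inside foreachPartition the partition id could come from
`TaskContext.get().partitionId()`. Note that the send and the ledger write
are not atomic: a crash between the two can still resend one record, so an
idempotency key on the receiving side would be needed to close that gap.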

-dan


On Wed, Apr 16, 2025 at 7:02 AM Abhishek Singla <abhisheksingla...@gmail.com>
wrote:

> Hi Team,
>
> We are using foreachPartition to send dataset row data to a third-party
> system via an HTTP client. The operation is not idempotent. I want to
> ensure that, in case of failures, previously processed data does not get
> processed again.
>
> Is there a way to checkpoint in Spark batch?
> 1. Checkpoint processed partitions, so that if there are 1000 partitions
> and 100 were processed in the previous run, those 100 are not processed
> again.
> 2. Checkpoint a partial partition in foreachPartition: if I have processed
> 100 records from a partition that has 1000 records in total, is there a
> way to checkpoint the offset so that those 100 are not processed again?
>
> Regards,
> Abhishek Singla
>
