Hi Yegor,

If you are not using the Delta format (i.e. you write plain Avro/JSON/Parquet/CSV/etc. files), then you have two options:

#1 Clean up the WAL files (afaik this is the _spark_metadata folder inside your output data folder), which requires that the SSS job is stopped before you clean the WAL.
#2 Use foreachBatch to write your data, but then your SSS job is no longer exactly-once, only at-least-once (see the sketch below).
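To make #2 concrete, here is a minimal sketch of a foreachBatch sink, assuming an Avro file source (via the spark-avro package) and the year/month/day/hour partition layout from your error message; the paths, schema and column names are placeholders, not taken from your actual job:

import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, dayofmonth, hour, month, year}
import org.apache.spark.sql.streaming.Trigger
import org.apache.spark.sql.types.{LongType, StructField, StructType, TimestampType}

val spark = SparkSession.builder().appName("foreachBatchAvroSink").getOrCreate()

// Hypothetical input schema and source path; replace with your real ones.
val schema = StructType(Seq(
  StructField("id", LongType),
  StructField("ts", TimestampType)))

val input = spark.readStream
  .format("avro")                 // needs the spark-avro package on the classpath
  .schema(schema)
  .load("/data/incoming")         // placeholder source path

val query = input.writeStream
  .foreachBatch { (batch: DataFrame, batchId: Long) =>
    // Plain batch write: no _spark_metadata log is created in the output folder,
    // so old partitions can later be deleted without breaking downstream batch reads.
    batch
      .withColumn("year", year(col("ts")))
      .withColumn("month", month(col("ts")))
      .withColumn("day", dayofmonth(col("ts")))
      .withColumn("hour", hour(col("ts")))
      .write
      .mode("append")
      .partitionBy("year", "month", "day", "hour")
      .format("avro")
      .save("/data/avro")         // placeholder output path
  }
  .option("checkpointLocation", "/data/checkpoints/avro-sink")  // placeholder
  .trigger(Trigger.ProcessingTime("1 minute"))
  .start()

query.awaitTermination()

Because a failed micro-batch can be replayed and written again inside foreachBatch, the output may contain duplicates; that is why the guarantee drops from exactly-once to at-least-once.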
Best regards

> On 3 Dec 2024, at 17:07, Дубинкин Егор <dubinkine...@gmail.com> wrote:
>
> Hello Community,
>
> I need to delete old source data created by Spark Structured Streaming.
> Just deleting the relevant folder throws an exception while reading a batch
> dataframe from the file system:
> java.io.FileNotFoundException: File
> file:/data/avro/year=2020/month=3/day=13/hour=12/part-00000-0cc84e65-3f49-4686-85e3-1ecf48952794.c000.avro
> does not exist
> The issue is actually the same as the one described here:
> https://stackoverflow.com/questions/60773445/how-to-delete-old-data-that-was-created-by-spark-structured-streaming?newreg=5cc791c48358491c88d9b2dae1e436d9
>
> I didn't find a way to delete it via the Spark API.
> Are there any solutions to do it via the API instead of editing the metadata manually?
>
> Your help would be appreciated.
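PS: regarding option #1, here is a rough sketch of one blunt way to do the cleanup programmatically instead of editing the metadata by hand, assuming the streaming job is stopped first. It simply removes the file sink's metadata log so that later batch reads fall back to a plain directory listing; all paths are placeholders, restarting the same query against the same output folder may still complain about missing log history, so try it on a copy of the data first:

import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("cleanupSinkMetadata").getOrCreate()

val outputDir = new Path("/data/avro")                    // placeholder sink path
val metadataDir = new Path(outputDir, "_spark_metadata")  // the file sink's WAL / metadata log

// Delete the metadata log recursively; after this, readers list the directory
// directly, so the partitions you removed are no longer referenced.
val fs = metadataDir.getFileSystem(spark.sparkContext.hadoopConfiguration)
if (fs.exists(metadataDir)) {
  fs.delete(metadataDir, true)
}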