Hello!
We are facing the same issue.
Could you please elaborate on how to clean up the WALs and delete irrelevant
partitions once the SSS job is stopped?
Direct usage of FileStreamSinkLog, as in the workaround from the mentioned
StackOverflow question, looks "hacky".
Yes, but your SSS job has to be stopped gracefully.
Originally I raised this SPIP request:
https://issues.apache.org/jira/browse/SPARK-42485
Then I requested "Adding pause() method to
pyspark.sql.streaming.StreamingQuery".
I believe both are still open.
HTH
Mich Talebzadeh
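
For reference, a minimal sketch of one common graceful-stop pattern in PySpark; the helper name and polling interval are assumptions of mine, not something from the thread:

import time

def stop_gracefully(query, poll_seconds=5.0):
    # Stop a pyspark.sql.streaming.StreamingQuery only when it is idle,
    # so the last micro-batch is fully committed before shutdown.
    while query.isActive:
        status = query.status  # dict with isDataAvailable / isTriggerActive
        if not status["isDataAvailable"] and not status["isTriggerActive"]:
            query.stop()
            break
        time.sleep(poll_seconds)

Calling stop() only between triggers avoids interrupting a half-written micro-batch.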
Hi Yegor,
If you are not using the Delta format (e.g. avro/json/parquet/csv/etc.) then you
have two options:
#1 Clean up the WAL files (afaik it's the _spark_metadata folder inside your data
folder), which requires that the SSS job is stopped before you clean the WAL.
#2 You can use foreachBatch for writing (see the sketch below).
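
A minimal sketch of option #2, using the rate source as a stand-in input and the /data/avro partition layout from the original message; because foreachBatch performs a plain batch write, no _spark_metadata log is created in the output folder, so old partitions can later be deleted without breaking batch reads:

from pyspark.sql import SparkSession
from pyspark.sql.functions import year, month, dayofmonth, hour

spark = SparkSession.builder.appName("foreachBatchSink").getOrCreate()

def write_batch(batch_df, batch_id):
    # Plain batch write: unlike the built-in streaming file sink, nothing
    # is recorded in a _spark_metadata log.
    (batch_df
     .withColumn("year", year("timestamp"))
     .withColumn("month", month("timestamp"))
     .withColumn("day", dayofmonth("timestamp"))
     .withColumn("hour", hour("timestamp"))
     .write.mode("append")
     .partitionBy("year", "month", "day", "hour")
     .format("avro")  # requires the spark-avro package on the classpath
     .save("/data/avro"))

query = (
    spark.readStream.format("rate").load()  # stand-in for the real source
    .writeStream
    .foreachBatch(write_batch)
    .option("checkpointLocation", "/chk/avro_job")  # hypothetical path
    .start()
)
query.awaitTermination()

The trade-off is that foreachBatch gives at-least-once rather than exactly-once file output, since there is no sink-side transaction log to deduplicate replayed batches.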
Forgot to mention: Spark 3.5.2 is used
On 2024/12/03 15:05:18 Дубинкин Егор wrote:
Hello Community,
I need to delete old source data created by Spark Structured Streaming.
Just deleting the relevant folder throws an exception when the data is then read
as a batch dataframe from the file system:
java.io.FileNotFoundException: File
file:/data/avro/year=2020/month=3/day=13/hour=12/part-0-0cc84e65-3f49-4
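
For context, a minimal sketch of the failing read, on the assumption that /data/avro was produced by the built-in streaming file sink and therefore contains a _spark_metadata log:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# When the input folder contains _spark_metadata, the batch reader lists
# files from that log rather than from the file system, so a manually
# deleted partition is still returned by the listing and the scan fails
# with java.io.FileNotFoundException.
df = spark.read.format("avro").load("/data/avro")  # requires spark-avro
df.show()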
Dear all,
We are happy to report that we have released Apache Sedona 1.7.0.
Thank you again for your help.
Apache Sedona is a cluster computing system for processing large-scale
spatial data.
Vote thread (Permalink from https://lists.apache.org/list.html):
https://lists.apache.org/thread/5hvcr80