Re: [PySpark Structured Streaming] How to tune .repartition(N) ?

2023-10-04 Thread Shao Yang Hong
hat there is an alternative option of using coalesce() instead of
> repartition().
> --
> Raghavendra
>
>
> On Thu, Oct 5, 2023 at 10:15 AM Shao Yang Hong
> wrote:
>>
>> Hi all on user@spark:
>>
>> We are looking for advice and suggestions on how to tune the
>
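
To illustrate the coalesce() versus repartition() trade-off raised in the reply above, here is a rough toy model in plain Python. It is only a sketch: lists stand in for partitions, the names toy_repartition and toy_coalesce are hypothetical, and the merge order is an assumption made for illustration (Spark chooses which partitions to merge by its own rules). The point it tries to show is that a full shuffle evens out partition sizes, while merging partitions without a shuffle is cheaper but can leave skew in place and cannot raise the partition count.

```python
# Toy model only: plain Python lists stand in for Spark partitions.
# This is NOT Spark's implementation; both function names are hypothetical.

def toy_repartition(partitions, n):
    """Full shuffle: every row may move; rows are dealt out round-robin."""
    rows = [row for part in partitions for row in part]
    out = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        out[i % n].append(row)
    return out


def toy_coalesce(partitions, n):
    """No shuffle: whole partitions are merged, so the count can only go down."""
    if n >= len(partitions):
        # Asking for more partitions than exist leaves the count unchanged.
        return [list(part) for part in partitions]
    out = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        # Assumed merge order for illustration; rows never leave their group.
        out[i % n].extend(part)
    return out


# Four input partitions with uneven sizes (6, 1, 1, 2 rows).
skewed = [[1, 2, 3, 4, 5, 6], [7], [8], [9, 10]]

print([len(p) for p in toy_repartition(skewed, 2)])  # sizes after a full shuffle
print([len(p) for p in toy_coalesce(skewed, 2)])     # sizes after merging only
```

In this toy run the shuffled result is evenly sized while the merged result stays skewed, which is one reason the choice between the two calls depends on how uneven the incoming micro-batches are.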

[PySpark Structured Streaming] How to tune .repartition(N) ?

2023-10-04 Thread Shao Yang Hong
Name(APP_NAME)
    .outputMode("append")
    .format("delta")
    .partitionBy(CREATED_DATE)
    .option("checkpointLocation", os.environ["CHECKPOINT"])
    .start(os.environ["DELTA_PATH"])
)
query.awaitTermination()
sp
