ShuffleDataIO: where is the reading part of the API?

2024-10-30 Thread Enrico Minack
Hi devs, the docs of org.apache.spark.shuffle.api.ShuffleDataIO read: "An interface for plugging in modules for storing and reading temporary shuffle data." But the API only provides interfaces for writing shuffle data: - ShuffleExecutorComponents.createMapOutputWriter - ShuffleExecutorCom

[Spark SQL] KafkaWriteTask: allow customising timestamp - PR

2024-10-30 Thread Peter Fischer
Hi! We wrote a wrapper around the Kafka writer to add client-side schema validation. In the process, we noticed that there was no way to change a Kafka record's timestamp when writing, so we extended spark-sql-kafka to support it and would love to hear your feedback. JIRA: https://issues.apache.or

Re: [DISCUSS] Apache Spark 3.5 LTS Period (~ 2026 April)

2024-10-30 Thread Dongjoon Hyun
Here is a concrete PR for this.
- https://github.com/apache/spark-website/pull/565
Dongjoon
On 2024/10/29 17:49:12 Dongjoon Hyun wrote:
> Hi, All.
>
> Thank you again everyone for making together
> Apache Spark 4.0.0-preview2 and Apache Spark 3.4.4 EOL releases.
>
> I believe it's time to deci