I think signing the artifacts produced from a secure CI sounds like a good
idea. I know we've been asked to reduce our GitHub Actions usage, but perhaps
someone interested could volunteer to set that up.
Hi,
Thanks for the reply.
From my experience, a build on a build server would be much more
predictable and less error-prone than building on someone's laptop, and of
course much faster for producing builds: snapshots, early preview releases,
release candidates, or final releases.
It will
Indeed. We could conceivably build the release in CI/CD, but the final
verification and signing should be done locally to keep the keys safe (there
was some concern about this from earlier release processes).
Thank you so much for the update, Wenchen!
Dongjoon.
On Tue, May 7, 2024 at 10:49 AM Wenchen Fan wrote:
> UPDATE:
>
> Unfortunately, it took me quite some time to set up my laptop and get it
> ready for the release process (Docker Desktop doesn't work anymore, my PGP
> key is lost, etc.). I'll
Hello Folks,
In Spark I have read a file, done some transformations, and am finally
writing the result to HDFS.
Now I am interested in writing the same DataFrame to MapR-FS, but for this
Spark will execute the full DAG again (recomputing all the previous
steps: the read plus all the transformations).
I don't want this recomputation.
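A common way to avoid that recomputation (a minimal sketch; the input path, the transformation, and the output paths below are placeholders, not from the original message) is to persist the DataFrame before the first write, so the second write reuses the cached result:
```
import org.apache.spark.storage.StorageLevel

// Persist the transformed DataFrame so both writes reuse the same
// computed result instead of re-running the whole DAG from the source.
val result = spark.read.parquet("hdfs:///input/path")        // placeholder input
  .withColumn("flag", org.apache.spark.sql.functions.lit(1)) // placeholder transformation
  .persist(StorageLevel.MEMORY_AND_DISK)

result.write.parquet("hdfs:///output/path")   // first sink: HDFS
result.write.parquet("maprfs:///output/path") // second sink: MapR-FS
result.unpersist()                            // release the cache when done
```
The first write materializes and caches the result; the second write then reads from the cache instead of recomputing from the source.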
Hi,
Sorry for the novice question, Wenchen: is the release done manually from
a laptop, rather than using a CI/CD process on a build server?
Thanks,
Nimrod
On Tue, May 7, 2024 at 8:50 PM Wenchen Fan wrote:
> UPDATE:
>
> Unfortunately, it took me quite some time to set up my laptop and get it
> ready
UPDATE:
Unfortunately, it took me quite some time to set up my laptop and get it
ready for the release process (Docker Desktop doesn't work anymore, my PGP
key is lost, etc.). I'll start the RC process tomorrow my time. Thanks for
your patience!
Wenchen
On Fri, May 3, 2024 at 7:47 AM yangjie01 wrote:
Hi Folks,
I wanted to check why Spark doesn't create a staging dir while doing an
insertInto on partitioned tables. I'm running the example code below:
```
spark.sql("set hive.exec.dynamic.partition.mode=nonstrict")
val rdd = sc.parallelize(Seq((1, 5, 1), (2, 1, 2), (4, 4, 3)))
val df = spark.createDataFrame(rdd)
```
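For context, a minimal sketch of how such a snippet typically continues; the column names and table name are hypothetical, not from the original message:
```
// Hypothetical continuation: name the columns and insert into an existing
// partitioned Hive table (column and table names are illustrative).
val named = df.toDF("id", "amount", "part")
named.write.insertInto("db.partitioned_table")
```
With hive.exec.dynamic.partition.mode=nonstrict set as above, insertInto performs a dynamic-partition insert, with the partition columns taken from the trailing columns of the DataFrame.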