I was surprised to find out that when a streaming DataFrame is joined with a
static DataFrame, the static DataFrame is re-shuffled for every
micro-batch, which adds considerable overhead.
Wouldn't it make more sense to reuse the shuffle files?
Or, if that is not possible, then load the static da
Severity: important
Description:
The Apache Spark UI allows ACLs to be enabled via the
configuration option spark.acls.enable. With an authentication filter in
place, this option checks whether a user has access permissions to view or
modify the application. If ACLs are enabled, a code path in HttpSe
We are happy to announce the availability of Apache Spark 3.2.2!
Spark 3.2.2 is a maintenance release containing stability fixes. This
release is based on the branch-3.2 maintenance branch of Spark. We strongly
recommend that all 3.2 users upgrade to this stable release.
To download Spark 3.2.2, he