Seems to make sense to have it false by default.

(I agree this deserves a dev list mention, though, even if consensus comes
easily.) We should make sure we mark the Jira with releasenotes so we can
add it to the upgrade guide.
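For anyone skimming the thread, here's a toy sketch (plain Python with
threads, not actual Spark code) of the semantics being discussed: with
`blocking=False` the call returns immediately while cleanup proceeds in the
background, while `blocking=True` waits until the resource is actually freed.

```python
import threading
import time

class Resource:
    """Toy stand-in for a cached RDD/Dataset whose cleanup runs elsewhere."""

    def __init__(self):
        self.freed = threading.Event()

    def _release(self):
        time.sleep(0.1)  # simulate slow cleanup on the executors
        self.freed.set()

    def unpersist(self, blocking=False):
        # Kick off cleanup asynchronously, as the driver-side call would.
        t = threading.Thread(target=self._release)
        t.start()
        if blocking:
            t.join()  # blocking=True: don't return until the resource is freed
        return self   # blocking=False: return right away

r1 = Resource()
r1.unpersist(blocking=False)
print(r1.freed.is_set())  # False: cleanup is still in flight

r2 = Resource()
r2.unpersist(blocking=True)
print(r2.freed.is_set())  # True: the call waited for the free
```

The trade-off Sean describes falls out directly: the non-blocking default
keeps the caller (the driver) responsive, at the cost of memory possibly not
being reclaimed by the time the next operation runs.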

On Mon, Jan 28, 2019 at 8:47 AM Sean Owen <sro...@gmail.com> wrote:

> Interesting notion at https://github.com/apache/spark/pull/23650 :
>
> .unpersist() takes an optional 'blocking' argument. If true, the call
> waits until the resource is freed. Otherwise it doesn't.
>
> The default looks pretty inconsistent:
> - RDD: true
> - Broadcast: true
> - Dataset / DataFrame: false
> - Graph (in GraphX): false
> - Pyspark RDD: (no option)
> - Pyspark Broadcast: false
> - Pyspark DataFrame: false
>
> I think false is a better default, as I'd expect it's much more likely
> that the caller doesn't want to wait around while resources are freed,
> especially as this happens on the driver. The possible downside is
> that if the resources don't free up quickly, other operations might
> not have as much memory available as they otherwise might have.
>
> What about making the default false everywhere for Spark 3?
> I raised it to dev@ just because that seems like a nontrivial behavior
> change, but maybe it isn't controversial.
>
> Sean
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>
>