There is always a running session. I replied in the PR.

On Tue, 23 Jul 2024 at 23:32, Dongjoon Hyun <dongj...@apache.org> wrote:
> I'm bumping up this thread because the overhead bites us back already.
> Here is a commit merged 3 hours ago.
>
> https://github.com/apache/spark/pull/47453
> [SPARK-48970][PYTHON][ML] Avoid using SparkSession.getActiveSession in
> spark ML reader/writer
>
> In short, unlike the original PRs' claims, this commit starts to create
> `SparkSession` in this layer. Although I understand why Hyukjin and
> Martin claim that a `SparkSession` will be there anyway, this is an
> architectural change which we need to decide explicitly, not implicitly.
>
> On 2024/07/13 05:33:32 Hyukjin Kwon wrote:
> > We actually get the active Spark session, so it doesn't cause overhead.
> > Also, even if we create one, it will be created once, which should be
> > pretty trivial overhead.
>
> If this architectural change is inevitably required and needs to happen in
> Apache Spark 4.0.0, can we have a dev document about it? If there is no
> proper place, we can simply add it to the ML migration guide.
>
> Dongjoon.
>
> ---------------------------------------------------------------------
> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
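
For readers following along, here is a minimal sketch (not the actual pyspark.ml code; the helper name `_get_session` is hypothetical) contrasting the two approaches discussed in this thread: reusing the already-active session versus falling back to creating one at the reader/writer layer.

```python
from pyspark.sql import SparkSession


def _get_session() -> SparkSession:
    # Approach assumed by the original PRs: reuse the session that is
    # already active on the current thread, if any.
    active = SparkSession.getActiveSession()
    if active is not None:
        return active
    # Fallback as described for SPARK-48970: create (or reuse) a session
    # at this layer. getOrCreate() returns the existing default session
    # when one exists, so the creation cost is paid at most once.
    return SparkSession.builder.getOrCreate()


if __name__ == "__main__":
    spark = _get_session()
    print(spark.version)
```

The discussion above is about whether this fallback (creating a session inside the ML reader/writer layer) should be treated as an explicit architectural decision rather than an incidental one.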