Echoing Sean's earlier comment … What is the functionality that would go into a
2.5.0 release that can't be in a 2.4.7 release?
On Fri, Jun 12, 2020 at 11:14 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
Can I suggest we maybe decouple this conversation a bit? First, see whether
there is agreement in principle on making a transitional release, and then
folks who feel strongly about specific backports can have their respective
discussions. It's not like we normally know or have agreement on everything
going
I understand the argument to add JDK 11 support just to extend the EOL, but
the other things seem kind of arbitrary and are not supported by your
arguments, especially DSv2, which is a massive change. DSv2, IIUC, is not API
stable yet and will continue to evolve in the 3.x line.
Spark is designed in
+1 for a 2.x release with DSv2, JDK11, and Scala 2.11 support
We had an internal preview version of Spark 3.0 for our customers to try
out for a while, and then we realized that it's very challenging for
enterprise applications in production to move to Spark 3.0. For example,
many of our customers
I guess we already went through the same discussion, right? If anyone missed
it, please go through the discussion thread. [1] The consensus was not in
favor of migrating the new DSv2 into the Spark 2.x line, because the change is
very large and also backward incompatible.
What I c
+1 for a 2.x release with a DSv2 API that matches 3.0.
There are a lot of big differences between the API in 2.4 and 3.0, and I
think a release to help migrate would be beneficial to organizations like
ours that will be supporting 2.x and 3.0 in parallel for quite a while.
Migration to Spark 3 is
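To give a concrete sense of how different the two APIs are, here is a minimal
sketch of the read-path entry points in each line. The interface and package
names are the public DSv2 APIs of 2.4 and 3.0; MySource and MyTableProvider
are hypothetical placeholders, and each snippet compiles only against its own
Spark version.

    // Spark 2.4 DSv2 read entry point (org.apache.spark.sql.sources.v2)
    import org.apache.spark.sql.sources.v2.{DataSourceOptions, DataSourceV2, ReadSupport}
    import org.apache.spark.sql.sources.v2.reader.DataSourceReader

    class MySource extends DataSourceV2 with ReadSupport {
      // 2.4 hands back a DataSourceReader in a single step
      override def createReader(options: DataSourceOptions): DataSourceReader = ???
    }

    // Spark 3.0 DSv2 read entry point (org.apache.spark.sql.connector)
    import org.apache.spark.sql.connector.catalog.{Table, TableProvider}
    import org.apache.spark.sql.connector.expressions.Transform
    import org.apache.spark.sql.types.StructType
    import org.apache.spark.sql.util.CaseInsensitiveStringMap

    class MyTableProvider extends TableProvider {
      override def inferSchema(options: CaseInsensitiveStringMap): StructType = ???
      override def getTable(
          schema: StructType,
          partitioning: Array[Transform],
          properties: java.util.Map[String, String]): Table = ???
      // the returned Table then layers SupportsRead -> ScanBuilder -> Scan -> Batch,
      // a structure that has no counterpart in the 2.4 interfaces
    }

A source written against one set of interfaces does not compile against the
other, which is why supporting 2.x and 3.0 in parallel means maintaining two
implementations.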
Based on my understanding, DSv2 is not stable yet. It is still missing various
features. Even our built-in file sources are still unable to fully migrate
to DSv2. We plan to enhance it in the next few releases to close the gap.
Also, the changes to DSv2 in Spark 3.0 did not break any existing
applicat
So one of the things we're planning on backporting internally is DSv2, which I
think would be more broadly useful if available in a community release on a
2.x branch. Anything else on top of that would be on a case-by-case basis,
depending on whether it makes for an easier upgrade path to 3.
If we're worri
What is the functionality that would go into a 2.5.0 release that can't be
in a 2.4.7 release? I think that's the key question. 2.4.x is the 2.x
maintenance branch, and I personally could imagine being open to more
freely backporting a few new features for 2.x users, whereas usually it's
only bug
Which new functionalities are you referring to? In Spark SQL, most of the
major features in Spark 3.0 are difficult/time-consuming to backport; adaptive
query execution, for example. Releasing a new version is not hard, but
backporting/reviewing/maintaining these features is very time-consuming.
Hi Folks,
As we're getting closer to Spark 3 I'd like to revisit a Spark 2.5 release.
Spark 3 brings a number of important changes, and by its nature is not
backward compatible. I think we'd all like to have as smooth an upgrade
experience to Spark 3 as possible, and I believe that having a Spark
Hi Nasrulla,
Without details of your code/configuration, it's a bit hard to tell what
exactly went wrong, since there are a lot of places that could go wrong...
But one thing is for sure: the interpreted code path (non-WSCG) and the WSCG
path are two separate things, and it wouldn't surp
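For anyone trying to reproduce the difference between the two paths,
whole-stage codegen can be toggled with the standard SQL conf. The config key
is real; the input path below is just a placeholder.

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder()
      .appName("wscg-toggle")
      .master("local[*]")
      .getOrCreate()

    // Interpreted (non-WSCG) path
    spark.conf.set("spark.sql.codegen.wholeStage", "false")
    spark.read.parquet("/tmp/sample").show()

    // WSCG path, for comparison
    spark.conf.set("spark.sql.codegen.wholeStage", "true")
    spark.read.parquet("/tmp/sample").show()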
Thanks Kris for your inputs. Yes, I have a new data source that wraps around
the built-in Parquet data source. What I do not understand is: with WSCG
disabled, the output is not a columnar batch. If my changes do not handle
columnar support, shouldn't the behavior remain the same with or without WSCG?
Hi,
Just noticed an inconsistency between the time at which a BlockManager is
about to be registered [1][2] and the time listeners are informed of it [3],
and got curious whether it's intentional or not.
Why is the `time` value not used for the SparkListenerBlockManagerAdded message?
[1]
https://gith
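For context on where that value surfaces: SparkListenerBlockManagerAdded does
carry a time field, so whatever timestamp is stamped at post time is what
listeners observe. A minimal listener using only the public SparkListener API:

    import org.apache.spark.scheduler.{SparkListener, SparkListenerBlockManagerAdded}

    class BlockManagerTimingListener extends SparkListener {
      override def onBlockManagerAdded(event: SparkListenerBlockManagerAdded): Unit = {
        // event.time is the timestamp stamped when the event was posted,
        // which is the value whose origin this message is asking about
        println(s"BlockManager ${event.blockManagerId} added at t=${event.time}")
      }
    }

It can be registered via spark.sparkContext.addSparkListener(...) or the
spark.extraListeners conf.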
Hi Nasrulla,
Not sure what your new code is doing, but the symptom looks like you're
creating a new data source that wraps around the built-in Parquet data
source?
The problem here is that whole-stage codegen generated code for row-based
input, but the actual input is columnar.
In other words, in your
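One place this kind of row/columnar mismatch typically shows up when wrapping
the built-in Parquet source is the supportBatch decision on the FileFormat. A
sketch of keeping the wrapper's columnar capability in agreement with what its
reader actually emits; WrappedParquetFileFormat is hypothetical, but
FileFormat.supportBatch is the real hook.

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat
    import org.apache.spark.sql.types.StructType

    // Hypothetical wrapper around the built-in Parquet FileFormat.
    // supportBatch tells the planner whether this scan produces ColumnarBatch
    // or InternalRow; it must match what the reader really returns, or the
    // generated (or interpreted) consumer will read the wrong shape of input.
    class WrappedParquetFileFormat extends ParquetFileFormat {
      override def supportBatch(sparkSession: SparkSession, schema: StructType): Boolean =
        super.supportBatch(sparkSession, schema)
    }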