Tested on both Linux and Windows, as a package.
Found StackOverflowError with ALS on Windows:
https://issues.apache.org/jira/browse/SPARK-20402
This is part of the R CRAN check to build the vignettes. Very simple, quick, and
consistent repro on Windows. The exact same code works fine on Linux. Rep
Hello,
I am in the process of putting together a PR that introduces a new hint
called NO_COLLAPSE. This hint is essentially identical to Oracle's NO_MERGE
hint.
Let me first give an example of why I am proposing this.
df1 = spark.createDataFrame([(1, "abc")], ["id", "user_agent"])
df2 = df1.wit
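
[The example above is cut off; as a hedged sketch of the motivation — the
UDF, column names, and final select below are illustrative, not from the
original message — the concern is that the optimizer's CollapseProject rule
can merge adjacent projections and duplicate an expensive expression once
its output is referenced several times:]

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf
    from pyspark.sql.types import StringType

    spark = SparkSession.builder.getOrCreate()

    # Stand-in for a costly UDF, e.g. parsing a user-agent string.
    parse_user_agent = udf(lambda ua: ua.lower(), StringType())

    df1 = spark.createDataFrame([(1, "abc")], ["id", "user_agent"])
    df2 = df1.withColumn("parsed", parse_user_agent(df1["user_agent"]))

    # After CollapseProject merges the two projections, each reference to
    # "parsed" can become its own copy of the UDF call, so the expensive
    # function may run three times per row instead of once.
    df3 = df2.select(df2["parsed"].substr(1, 1).alias("a"),
                     df2["parsed"].substr(2, 1).alias("b"),
                     df2["parsed"].substr(3, 1).alias("c"))
    df3.explain(True)

[A NO_COLLAPSE hint, like Oracle's NO_MERGE, would let the author pin df2's
projection in place so the UDF is evaluated once per row.]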
Hi Michael,
This sounds like a good idea. Can you open a JIRA to track this?
My initial feedback on your proposal would be that you might want to
express the no_collapse at the expression level and not at the plan level.
HTH
On Thu, Apr 20, 2017 at 3:31 PM, Michael Styles
wrote:
> Hello,
>
>
+1 (non-binding), looks good
Tested on RHEL 7.2, 7.3, CentOS 7.2, Ubuntu 14.04 and 16.04, SUSE 12, x86,
IBM Linux on Power and IBM Linux on Z (big-endian)
No problems with the latest IBM Java, Hadoop 2.7.3, and Scala 2.11.8; no
performance concerns to report either (spark-sql-perf and HiBench).
Buil
Steve,
I think you're a good person to ask about this. Is the below any cause for
concern? Or did I perhaps test this incorrectly?
Nick
On Tue, Apr 18, 2017 at 11:50 PM Nicholas Chammas <
nicholas.cham...@gmail.com> wrote:
> I had trouble starting up a shell with the AWS package loaded
> (spec
Doesn't common subexpression elimination address this issue as well?
On Thu, Apr 20, 2017 at 6:40 AM Herman van Hövell tot Westerflier <
hvanhov...@databricks.com> wrote:
> Hi Michael,
>
> This sounds like a good idea. Can you open a JIRA to track this?
>
> My initial feedback on your proposal w
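
[For context on the subexpression question, a hedged sketch — the session
and column names are illustrative: when one projection references the same
deterministic expression several times, the generated code can evaluate it
once, which is what common subexpression elimination would buy here.]

    from pyspark.sql import SparkSession
    from pyspark.sql.functions import col

    spark = SparkSession.builder.getOrCreate()
    df = spark.createDataFrame([(1,)], ["x"])

    # Stand-in for a costly deterministic expression shared by both columns.
    expensive = col("x") * 2

    # Inspecting the plan (or generated code) shows whether the shared
    # subexpression is computed once or once per output column.
    df.select((expensive + 1).alias("a"), (expensive + 2).alias("b")).explain(True)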
I want to caution that in testing a build from this morning's branch-2.1 we
found that Hive partition pruning was not working. We found that Spark SQL was
fetching all Hive table partitions for a very simple query, whereas in a build
from several weeks ago it was fetching only the required partit
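
[A hedged way to check for the regression described above — the table and
partition column are hypothetical: a healthy plan should show a partition
filter being pushed down rather than a scan that lists every partition.]

    # If pruning works, the physical plan for this query should restrict
    # the Hive metastore lookup to the matching partitions only.
    spark.sql(
        "SELECT * FROM some_partitioned_table WHERE ds = '2017-04-20'"
    ).explain(True)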
We've identified the cause of the change in behavior. It is related to the SQL
conf key "spark.sql.hive.caseSensitiveInferenceMode". This key and its related
functionality were absent from our previous build. The default setting in the
current build was causing Spark to attempt to scan all table
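
[As a hedged sketch — the session setup is illustrative, while the key and
the NEVER_INFER value are the Spark SQL conf described above — one way to
avoid the full scan is to disable the case-sensitive schema inference:]

    from pyspark.sql import SparkSession

    # NEVER_INFER skips the case-sensitive schema inference that triggers
    # the scan of every table partition described above.
    spark = (SparkSession.builder
             .enableHiveSupport()
             .config("spark.sql.hive.caseSensitiveInferenceMode", "NEVER_INFER")
             .getOrCreate())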