[Spark SQL] Overload the + binary operator for String concatenation

2019-04-03 Thread Mark Le Noury
Hi, I've tried searching the archives and haven't found anything relevant, but apologies if this has been discussed before. I was wondering how viable it would be to alter the behavior of Spark to allow: "String1" + "String2" = "String1String2". Currently it tries to cast both strings to double
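A sketch of the behavior being described, assuming an active SparkSession named `spark`: Spark SQL resolves `+` as numeric addition, so both string operands are cast to double (which yields NULL), and `concat` or the `||` operator is the supported way to concatenate:

```scala
// Assumes an active SparkSession named `spark` (hypothetical setup).
// `+` resolves to numeric addition: each string is cast to double,
// the casts produce NULL, and so does the sum.
spark.sql("SELECT 'String1' + 'String2'").show()        // NULL

// Supported ways to concatenate strings in Spark SQL:
spark.sql("SELECT concat('String1', 'String2')").show() // String1String2
spark.sql("SELECT 'String1' || 'String2'").show()       // String1String2
```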

Re: [DISCUSS] Enable blacklisting feature by default in 3.0

2019-04-03 Thread Steve Loughran
On Tue, Apr 2, 2019 at 9:39 PM Ankur Gupta wrote:
> Hi Steve,
> Thanks for your feedback. From your email, I could gather the following two important points:
> 1. Report failures to something (cluster manager) which can opt to destroy the node and request a new one
> 2. Pluggable
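For reference, the blacklisting feature under discussion is driven by configuration. A sketch of the relevant `spark-defaults.conf` settings, assuming the property names as of Spark 2.4 (where the feature is off by default):

```
# Enable the blacklisting feature (disabled by default in 2.x)
spark.blacklist.enabled                           true

# Task failures allowed per executor / per node before blacklisting
spark.blacklist.task.maxTaskAttemptsPerExecutor   1
spark.blacklist.task.maxTaskAttemptsPerNode       2

# Let Spark ask the cluster manager to kill blacklisted executors,
# which relates to point 1 above (destroying and replacing bad nodes)
spark.blacklist.killBlacklistedExecutors          true
```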

Re: Closing a SparkSession stops the SparkContext

2019-04-03 Thread Vinoo Ganesh
Yeah, so I think there are two separate issues here:
1. The coupling of the SparkSession + SparkContext in their current form seems unnatural.
2. The current memory leak, which I do believe is a case where the session is added onto the spark context, but is only needed by the session (but wo

CfP VHPC19: HPC Virtualization-Containers: Paper due May 1, 2019 (extended)

2019-04-03 Thread VHPC 19
CALL FOR PAPERS: 14th Workshop on Virtualization in High-Performance Cloud Computing (VHPC '19), held in conjunction with the International Supercomputing Conference - High Performance, June 16-20, 2019, Frankfurt, Germany. (Spri

Re: Closing a SparkSession stops the SparkContext

2019-04-03 Thread Ryan Blue
For #1, do we agree on the behavior? I think that closing a SparkSession should not close the SparkContext unless it is the only session. Evidently, that's not what happens, and I consider the current behavior a bug. For more context, we're working on the new catalog APIs and how to gua
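A minimal sketch of the behavior being debated, assuming Spark 2.4's public API: sessions created via `newSession()` share one SparkContext, and stopping any one of them stops that shared context for all of them:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .master("local[1]")
  .appName("session-vs-context")   // hypothetical app name
  .getOrCreate()

// newSession() returns a fresh session over the SAME SparkContext
val other = spark.newSession()
assert(other.sparkContext eq spark.sparkContext)

// Stopping either session stops the shared context...
other.stop()

// ...so the first session's context is now stopped too,
// which is the coupling this thread argues is a bug.
assert(spark.sparkContext.isStopped)
```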

Re: [DISCUSS] Spark Columnar Processing

2019-04-03 Thread Bobby Evans
I am still working on the SPIP and should get it up in the next few days. I have the basic text more or less ready, but I want to get a high-level API concept ready too, just to have something more concrete. I have not really done much with contributing new features to Spark, so I am not sure where

Resolving generated expressions in catalyst

2019-04-03 Thread Nikolas Vanderhoof
Hey everyone! I'm trying to implement a custom catalyst optimization that I think may be useful to others who make frequent use of the arrays_overlap and array_contains functions in joins. Consider this first query joining on overlapping arrays. ``` import org.apache.spark.sql.functions._ val
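The snippet above is cut off in the archive. A hedged reconstruction of the kind of query being described (the DataFrame names and data are illustrative, assuming Spark 2.4+ where `arrays_overlap` is available) might look like:

```scala
import org.apache.spark.sql.functions._
import spark.implicits._  // assumes an active SparkSession named `spark`

val left  = Seq((1, Seq("a", "b")), (2, Seq("c"))).toDF("l_id", "l_tags")
val right = Seq((10, Seq("b", "x")), (20, Seq("y"))).toDF("r_id", "r_tags")

// Join rows whose tag arrays share at least one element.
// Without a custom rule this plans as a Cartesian product plus filter,
// which is exactly the shape a catalyst optimization could rewrite.
val joined = left.join(right, arrays_overlap($"l_tags", $"r_tags"))
joined.show()
```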