Re: Stable Release for Spark-on-k8s

2021-01-28 Thread Ankit Gupta
Hey Dongjoon, thanks a lot for confirming. Sure, let me try it out. Regards, Ankit On Fri, Jan 29, 2021 at 12:20 AM Dongjoon Hyun wrote: > Hi, Ankit > > Apache Spark 3.1.1 will do. Please see the following efforts. The Apache > Spark community is going to move to the next level in K8s area at 3.

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Hyukjin Kwon
FYI, exposing methods with a Column signature only is already documented at the top of functions.scala, and I believe that has been the current dev direction, if I am not mistaken. Another point is that we should rather expose commonly used expressions. It's best if it considers language specific conte

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Matthew Powers
Thanks for the thoughtful responses. I now understand why adding all the functions across all the APIs isn't the default. To Nick's point, relying on heuristics to gauge user interest, in addition to personal experience, is a good idea. The regexp_extract_all SO thread has 16,000 views

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Reynold Xin
There's another thing that's not mentioned … it's primarily a problem for Scala. Due to static typing, we need a very large number of function overloads for the Scala version of each function, whereas in SQL/Python they are just one. There's a limit on how many functions we can add, and it also

Public API access to UDTs

2021-01-28 Thread Fitch, Simeon
Hi, First time posting here, so apologies if I need to be directing this topic elsewhere. I'm the author of RasterFrames, and a contributor to GeoMesa's Spark SQL module. Both make use of fairly low-level Catalyst constructs, including custom UDTs; RasterFrames introduces a geospatial raster type

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Maciej
Just my two cents on R side. On 1/28/21 10:00 PM, Nicholas Chammas wrote: > On Thu, Jan 28, 2021 at 3:40 PM Sean Owen > wrote: > > It isn't that regexp_extract_all (for example) is useless outside > SQL, just, where do you draw the line? Supporting 10s of random >

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Nicholas Chammas
On Thu, Jan 28, 2021 at 3:40 PM Sean Owen wrote: > It isn't that regexp_extract_all (for example) is useless outside SQL, > just, where do you draw the line? Supporting 10s of random SQL functions > across 3 other languages has a cost, which has to be weighed against > benefit, which we can never

Re: [Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread Sean Owen
I think I can articulate the general idea here, though I expect it is not applied consistently. Yes, there's a general desire to make APIs consistent across languages. Python and Scala should track pretty closely, even if R isn't really that consistent. SQL is a somewhat different case. There are

[Spark SQL]: SQL, Python, Scala and R API Consistency

2021-01-28 Thread MrPowers
Thank you all for your amazing work on this project. Spark has a great public interface and the source code is clean. The core team has done a great job building and maintaining this project. My emails / GitHub comments focus on the 1% that we might be able to improve. Pull requests / suggestio

Re: Stable Release for Spark-on-k8s

2021-01-28 Thread Dongjoon Hyun
Hi, Ankit. Apache Spark 3.1.1 will do. Please see the following efforts. The Apache Spark community is going to move to the next level in the K8s area at 3.1.1 and wants more contributions in the future. Please try to use 3.1.1-rc1 and let us know your feedback. 3.1.1-rc2 will start soon (next Monday).

Stable Release for Spark-on-k8s

2021-01-28 Thread Ankit Gupta
Hey everyone, when are we planning to publish the stable / Generally Available release of Spark-on-k8s? I see that it is still experimental. Regards, Ankit Prakash Gupta