After talking with people on this thread and offline, I've decided to go
with option 1, i.e. putting everything in a single "functions" object.
On Thu, Apr 30, 2015 at 10:04 AM, Ted Yu wrote:
IMHO I would go with choice #1
Cheers
On Wed, Apr 29, 2015 at 10:03 PM, Reynold Xin wrote:
We definitely still have the name collision problem in SQL.
On Wed, Apr 29, 2015 at 10:01 PM, Punyashloka Biswal wrote:
Do we still have to keep the names of the functions distinct to avoid
collisions in SQL? Or is there a plan to allow "importing" a namespace into
SQL somehow?
I ask because if we have to keep worrying about name collisions then I'm
not sure what the added complexity of #2 and #3 buys us.
Punya
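To make the collision concern concrete, here is a minimal sketch of a flat, SQL-style function registry. It is only an illustration under stated assumptions: FlatRegistry and the package names functions.math and functions.string are hypothetical, and this is not how Spark actually registers functions.

```python
# Hypothetical sketch: NOT Spark's actual function registry.
class FlatRegistry:
    """A SQL-style registry with a single, flat namespace."""

    def __init__(self):
        self._functions = {}

    def register(self, qualified_name, fn):
        # SQL sees only the short name, so the package prefix is dropped.
        short_name = qualified_name.rsplit(".", 1)[-1]
        if short_name in self._functions:
            raise ValueError(f"name collision in SQL: {short_name!r}")
        self._functions[short_name] = fn

    def lookup(self, short_name):
        return self._functions[short_name]


registry = FlatRegistry()

# Package names below are made up purely for illustration.
registry.register(
    "functions.math.length",
    lambda v: sum(x * x for x in v) ** 0.5,  # Euclidean length of a vector
)

try:
    # Same short name in a different (hypothetical) package.
    registry.register("functions.string.length", len)
    collided = False
except ValueError:
    collided = True
```

In this sketch the second registration fails even though the two functions live in different packages, which is the situation where separate namespaces on the Scala/Java/Python side would not remove the need to keep names distinct.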
On
Scaladoc isn't much of a problem because scaladocs are grouped. Java/Python
is the main problem ...
See
https://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.functions$
On Wed, Apr 29, 2015 at 3:38 PM, Shivaram Venkataraman <
shiva...@eecs.berkeley.edu> wrote:
My feeling is that we should have a handful of namespaces (say 4 or 5).
Importing and remembering more package names than that becomes too
cumbersome, while having everything in one package makes the scaladoc hard
to read.
Thanks
Shivaram
On Wed, Apr 29, 2015 at 3:30 PM, Reynold Xin wrote:
To add a little bit more context, some pros/cons I can think of are:
Option 1: Very easy for users to find the function, since they are all in
org.apache.spark.sql.functions. However, there will be quite a large number
of them.
Option 2: I can't tell why we would want this one over Option 3, sinc
Before we make DataFrame non-alpha, it would be great to decide how we want
to namespace all the functions. There are 3 alternatives:
1. Put all in org.apache.spark.sql.functions. This is how SQL does it,
since SQL doesn't have namespaces. I estimate eventually we will have ~ 200
functions.
2. Ha