Hi Anton,

Your example and documentation look great! I left some comments suggesting a few additions, but the PR in its current state is already a great improvement!

Thanks,

Jim

On 12/18/2016 09:09 AM, Anton Okolnychyi wrote:
Any comments/suggestions are more than welcome.

Thanks,
Anton

2016-12-18 15:08 GMT+01:00 Anton Okolnychyi <anton.okolnyc...@gmail.com>:

    Here is the pull request:
    https://github.com/apache/spark/pull/16329



    2016-12-16 20:54 GMT+01:00 Jim Hughes <jn...@ccri.com>:

        I'd be happy to review a PR.  At the minute, I'm still
        learning Spark SQL, so writing documentation might be a bit of
        a stretch, but reviewing would be fine.

        Thanks!


        On 12/16/2016 08:39 AM, Thakrar, Jayesh wrote:

        Yes - that sounds good Anton, I can work on documenting the
        window functions.

        *From: *Anton Okolnychyi <anton.okolnyc...@gmail.com>
        *Date: *Thursday, December 15, 2016 at 4:34 PM
        *To: *Conversant <jthak...@conversantmedia.com>
        *Cc: *Michael Armbrust <mich...@databricks.com>, Jim Hughes
        <jn...@ccri.com>, "dev@spark.apache.org" <dev@spark.apache.org>
        *Subject: *Re: Expand the Spark SQL programming guide?

        I think it will make sense to show a sample implementation of
        UserDefinedAggregateFunction for DataFrames, and an example
        of the Aggregator API for typed Datasets.

        Jim, what if I submit a PR and you join the review process? I
        also do not mind splitting this if you want, but that seems
        like overkill for this part.

        Jayesh, shall I skip the window functions part since you are
        going to work on that?

        2016-12-15 22:48 GMT+01:00 Thakrar, Jayesh
        <jthak...@conversantmedia.com>:

            I too am interested in expanding the documentation for
            Spark SQL.

            For my work I needed to get some info/examples/guidance
            on window functions and have been using
            https://databricks.com/blog/2015/07/15/introducing-window-functions-in-spark-sql.html

            How about divide and conquer?

            *From: *Michael Armbrust <mich...@databricks.com>
            *Date: *Thursday, December 15, 2016 at 3:21 PM
            *To: *Jim Hughes <jn...@ccri.com>
            *Cc: *"dev@spark.apache.org" <dev@spark.apache.org>
            *Subject: *Re: Expand the Spark SQL programming guide?

            Pull requests would be welcome for any major missing
            features in the guide:
            https://github.com/apache/spark/blob/master/docs/sql-programming-guide.md

            On Thu, Dec 15, 2016 at 11:48 AM, Jim Hughes
            <jn...@ccri.com> wrote:

                Hi Anton,

                I'd like to see this as well.  I've been working on
                implementing geospatial user-defined types and
                functions. Having examples of aggregations and window
                functions would be awesome!

                I did test out implementing a distributed convex hull
                as a UserDefinedAggregateFunction, and that seemed to
                work sensibly.

                Cheers,

                Jim

                On 12/15/2016 03:28 AM, Anton Okolnychyi wrote:

                    Hi,

                    I am wondering whether it makes sense to expand
                    the Spark SQL programming guide with examples of
                    aggregations (including user-defined ones via the
                    Aggregator API) and window functions. For
                    instance, there might be a separate
                    subsection under "Getting Started" for each
                    functionality.

                    SPARK-16046 seems to be related, but there has been
                    no activity on it for more than 4 months.

                    Best regards,

                    Anton




