Re: pull request template
It's a good idea. I would add in there the spec for the PR title. I always get the order between Jira and component wrong.

Moreover, CONTRIBUTING.md is also lacking them. Any reason not to add it there? I can open PRs for both, but maybe you want to keep that info on the wiki instead.

iulian

On Thu, Feb 18, 2016 at 4:18 AM, Reynold Xin wrote:
> Github introduced a new feature today that allows projects to define
> templates for pull requests. I pushed a very simple template to the
> repository:
>
> https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE
>
> Over time I think we can see how this works and perhaps add a small
> checklist to the pull request template so contributors are reminded every
> time they submit a pull request of the important things to do in a pull
> request (e.g. having proper tests).
>
> ## What changes were proposed in this pull request?
>
> (Please fill in changes proposed in this fix)
>
> ## How was this patch tested?
>
> (Please explain how this patch was tested. E.g. unit tests, integration
> tests, manual tests)
>
> (If this patch involves UI changes, please attach a screenshot; otherwise,
> remove this)

--
Iulian Dragos

--
Reactive Apps on the JVM
www.typesafe.com
Re: pull request template
All that seems fine. All of this is covered in the contributing wiki, which is linked from CONTRIBUTING.md (and should be from the template), but people don't seem to bother reading it. I don't mind duplicating some key points, and even adding a more explicit exhortation to read the whole wiki before opening a PR. We spend way too much time asking people to fix things they should have taken 60 seconds to do correctly in the first place.

On Fri, Feb 19, 2016 at 10:33 AM, Iulian Dragoș wrote:
> It's a good idea. I would add in there the spec for the PR title. I always
> get the order between Jira and component wrong.
>
> Moreover, CONTRIBUTING.md is also lacking them. Any reason not to add it
> there? I can open PRs for both, but maybe you want to keep that info on the
> wiki instead.
>
> iulian
>
> On Thu, Feb 18, 2016 at 4:18 AM, Reynold Xin wrote:
>> Github introduced a new feature today that allows projects to define
>> templates for pull requests. I pushed a very simple template to the
>> repository:
>>
>> https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE
>>
>> Over time I think we can see how this works and perhaps add a small
>> checklist to the pull request template so contributors are reminded every
>> time they submit a pull request of the important things to do in a pull
>> request (e.g. having proper tests).
>>
>> ## What changes were proposed in this pull request?
>>
>> (Please fill in changes proposed in this fix)
>>
>> ## How was this patch tested?
>>
>> (Please explain how this patch was tested. E.g. unit tests, integration
>> tests, manual tests)
>>
>> (If this patch involves UI changes, please attach a screenshot; otherwise,
>> remove this)
>
> --
> Iulian Dragos
>
> --
> Reactive Apps on the JVM
> www.typesafe.com

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
Re: Write access to wiki
Any chance I could also get write access to the wiki? I'd like to update some of the PySpark documentation in the wiki.

On Tue, Jan 12, 2016 at 10:14 AM, shane knapp wrote:
> > Ok, sounds good. I think it would be great if you could add installing the
> > 'docker-engine' package and starting the 'docker' service in there too. I
> > was planning to update the playbook if there were one in the apache/spark
> > repo, but I didn't see one, hence my question.
>
> we currently have docker 1.5 running on the worker, and after the
> Great Upgrade To CentOS7, we'll be running a much more modern version
> of docker.
>
> shane

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
Re: How to run PySpark tests?
Or wait, I don't have access to the wiki - if anyone can give me wiki access I'll update the instructions.

On Thu, Feb 18, 2016 at 8:45 PM, Holden Karau wrote:
> Great - I'll update the wiki.
>
> On Thu, Feb 18, 2016 at 8:34 PM, Jason White wrote:
>> Compiling with `build/mvn -Pyarn -Phadoop-2.4 -Phive -Dhadoop.version=2.4.0
>> -DskipTests clean package` followed by `python/run-tests` seemed to do the
>> trick! Thanks!

--
Cell : 425-233-8271
Twitter: https://twitter.com/holdenkarau
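For anyone landing on this thread later, the steps Jason reported can be wrapped into a small shell sketch. This is only a sketch under the thread's assumptions: the Maven profiles and Hadoop version are the ones quoted above and may not apply to a current checkout, and a guard skips everything when run outside a Spark source tree.

```shell
# Sketch of the build-then-test sequence from the thread; the flags are the
# ones quoted above, not verified against a current Spark checkout.
if [ -x build/mvn ] && [ -x python/run-tests ]; then
  # Build Spark once, skipping JVM tests; PySpark tests run separately.
  build/mvn -Pyarn -Phadoop-2.4 -Phive -Dhadoop.version=2.4.0 \
    -DskipTests clean package
  # Run the PySpark test suite against the freshly built assembly.
  python/run-tests
else
  echo "Not run from a Spark checkout; skipping."
fi
```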
Re: pull request template
We can add that too - just need to figure out a good way so people don't leave a lot of unnecessary "guideline" messages in the template.

The contributing guide is great, but unfortunately it is not as noticeable and is often ignored. It's good to have the full-fledged contributing guide, and then a very lightweight version of it in the form of a template, to force contributors to think about all the important aspects outlined in the contributing guide.

On Fri, Feb 19, 2016 at 2:36 AM, Sean Owen wrote:
> All that seems fine. All of this is covered in the contributing wiki,
> which is linked from CONTRIBUTING.md (and should be from the
> template), but people don't seem to bother reading it. I don't mind
> duplicating some key points, and even a more explicit exhortation to
> read the whole wiki before opening a PR. We spend way too much time
> asking people to fix things they should have taken 60 seconds to do
> correctly in the first place.
>
> On Fri, Feb 19, 2016 at 10:33 AM, Iulian Dragoș wrote:
> > It's a good idea. I would add in there the spec for the PR title. I always
> > get the order between Jira and component wrong.
> >
> > Moreover, CONTRIBUTING.md is also lacking them. Any reason not to add it
> > there? I can open PRs for both, but maybe you want to keep that info on the
> > wiki instead.
> >
> > iulian
> >
> > On Thu, Feb 18, 2016 at 4:18 AM, Reynold Xin wrote:
> >> Github introduced a new feature today that allows projects to define
> >> templates for pull requests. I pushed a very simple template to the
> >> repository:
> >>
> >> https://github.com/apache/spark/blob/master/.github/PULL_REQUEST_TEMPLATE
> >>
> >> Over time I think we can see how this works and perhaps add a small
> >> checklist to the pull request template so contributors are reminded every
> >> time they submit a pull request of the important things to do in a pull
> >> request (e.g. having proper tests).
> >>
> >> ## What changes were proposed in this pull request?
> >>
> >> (Please fill in changes proposed in this fix)
> >>
> >> ## How was this patch tested?
> >>
> >> (Please explain how this patch was tested. E.g. unit tests, integration
> >> tests, manual tests)
> >>
> >> (If this patch involves UI changes, please attach a screenshot; otherwise,
> >> remove this)
> >
> > --
> > Iulian Dragos
> >
> > --
> > Reactive Apps on the JVM
> > www.typesafe.com
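On the "people leave the guideline text in" problem: one common approach (a hypothetical sketch here, not the actual Spark template) is to put the instructions inside HTML comments, which GitHub does not render in the submitted description, so stale boilerplate stays invisible even when contributors forget to delete it:

```markdown
## What changes were proposed in this pull request?

<!-- Describe the proposed changes here. Text inside an HTML comment like
     this one is not rendered in the submitted PR description. -->

## How was this patch tested?

<!-- E.g. unit tests, integration tests, manual tests. If this patch
     involves UI changes, please attach a screenshot. -->
```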
Re: DataFrame API and Ordering
I am not sure. The Spark SQL, DataFrames and Datasets Guide already has a section about NaN semantics. This could be a good place to add at least some basic description. For the rest, InterpretedOrdering could be a good choice.

On 02/19/2016 12:35 AM, Reynold Xin wrote:
> You are correct and we should document that.
>
> Any suggestions on where we should document this? In DoubleType and
> FloatType?
>
> On Tuesday, February 16, 2016, Maciej Szymkiewicz
> <mailto:mszymkiew...@gmail.com> wrote:
>> I am not sure if I've missed something obvious, but as far as I can tell
>> the DataFrame API doesn't provide clearly defined ordering rules, aside
>> from NaN handling. Methods like DataFrame.sort or sql.functions like min /
>> max provide only a general description. The discrepancy between
>> functions.max (min) and GroupedData.max, where the latter supports only
>> numeric types, makes the current situation even more confusing. With a
>> growing number of orderable types I believe the documentation should
>> clearly define the ordering rules, including:
>>
>> - NULL behavior
>> - collation
>> - behavior on complex types (structs, arrays)
>>
>> While this information can be extracted from the source, it is not easily
>> accessible, and without an explicit specification it is not clear whether
>> the current behavior is contractual. It can also be confusing if a user
>> expects an ordering that depends on the current locale (R).
>>
>> Best,
>> Maciej
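To make the NaN point concrete, here is a small illustrative sketch in plain Python (not Spark code) of why a documented total order matters: a comparison-based sort has no defined NaN placement, while a rule like "NaN is greater than any other value" (the behavior described in the NaN semantics section of the guide) can be made explicit with a sort key.

```python
import math

values = [3.0, float("nan"), 1.0, 2.0]

# sorted() relies on pairwise "<" comparisons; every comparison against NaN
# is False, so NaN's final position depends on the input order rather than
# on any total order.
undefined_order = sorted(values)

# An explicit total order with NaN last, mirroring the "NaN is greater
# than any other value" rule:
def nan_greatest(x: float):
    # (False, v) sorts before (True, NaN), so all real numbers come first
    return (math.isnan(x), x)

total_order = sorted(values, key=nan_greatest)
print(total_order)  # [1.0, 2.0, 3.0, nan]
```

The same idea is what a documented contract would pin down: NULLs, NaNs, and complex types each need a stated position in the total order, not one left to implementation details.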