Re: Disabling Closed -> Reopened transition for non-committers

2017-10-05 Thread lucas.g...@gmail.com
As a casual reader of the dev mailing group / Spark JIRA

This looks reasonable to me.  I read the JIRA in question, and although I
didn't spend enough time to really dig into the issues presented, I think
it's reasonable to close in this situation.

If the person with the grievance feels strongly, the onus is really on them
to email the dev group and find other ways to engage.



On 5 October 2017 at 04:44, Hyukjin Kwon  wrote:

> It's Closed -> Reopened (not Resolved -> Reopened), and I think we mostly
> leave JIRAs as Resolved.
>
> I support this idea. I don't think this is as unfriendly as it sounds
> in practice. This case should be quite occasional, I'd guess.
>
>
> 2017-10-05 20:02 GMT+09:00 Sean Owen :
>
>> Solving it socially is of course ideal. We do already likewise point
>> everyone to http://spark.apache.org/contributing.html . I think this is
>> about what to do in extreme cases. It's a minor point of workflow, but it
>> seems like there's no particular need to let anyone reopen any issue.
>>
>> Speaking to the particular issue, maybe the error can be improved, but it
>> already shows ClosureCleaner calling ensureSerializable and saying "task
>> not serializable" and the object in question. That wasn't quite the issue
>> though, but rather that it was a) an extended question about how batches
>> are processed mixed with b) basic misapprehensions about Spark and c)
>> unacceptable tone for an OSS community from start to finish. Closing it is
>> the right resolution for several reasons, and I don't see that a better
>> exception message would have done anything.
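The check Sean mentions amounts to a trial round of Java serialization over the task and its closure. A minimal standalone sketch of the idea (plain Java, not Spark's actual ClosureCleaner, which does considerably more cleanup before this step):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializabilityCheck {
    // Attempt a round of Java serialization and surface a clearer error
    // naming the offending class, rather than letting the failure happen
    // later when the task is shipped to an executor.
    static void ensureSerializable(Object task) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(task);
        } catch (NotSerializableException e) {
            throw new IllegalArgumentException(
                "Task not serializable: " + e.getMessage()
                + " (everything referenced by the task must implement Serializable)", e);
        } catch (IOException e) {
            throw new IllegalArgumentException("Serialization failed", e);
        }
    }

    public static void main(String[] args) {
        // A capture-free serializable lambda passes the check.
        ensureSerializable((Runnable & Serializable) () -> System.out.println("ok"));
        System.out.println("serializable task accepted");

        // A plain Object is not Serializable and is rejected with the class named.
        try {
            ensureSerializable(new Object());
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at submission time with the offending class named is roughly what makes the real "task not serializable" message actionable, which is why a better message arguably wouldn't have changed the outcome here.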
>>
>> On Thu, Oct 5, 2017 at 11:38 AM Steve Loughran 
>> wrote:
>>
>>> It's a losing battle which you need to deal with socially rather than
>>> through the tooling... if the user is unhappy then at best they don't use
>>> the code and don't contribute in future; at worst, they keep filing the
>>> same JIRA. I'll add a comment.
>>>
>>> In Hadoop we've ended up creating a wiki page that gets added as a link
>>> when closing things as invalid, usually with a polite let-down: "sorry"
>>>
>>> https://wiki.apache.org/hadoop/InvalidJiraIssues
>>>
>>> and a video to go into detail
>>>
>>> https://www.youtube.com/watch?v=NaJlRk5aTRQ
>>>
>>>
>>> There's something to consider though: how can error reporting be
>>> improved? Especially for people new to a system?
>>>
>>>
>>> Serialization errors are ubiquitous in Spark when you run RDD operations
>>> over unserializable data; the first thing people learn when they start writing
>>> them is "this stack trace means I'm invoking something which can't go over
>>> the wire". So: how to help people over the hump there? Catch & wrap with a
>>> pointer to some docs. For networking, feel free to use
>>> org.apache.hadoop.net.NetUtils#wrapException, where the linked wiki
>>> entries list common causes of these networking issues and serve as a
>>> checklist for everyone; the real facts of hostnames and ports are there for
>>> tracking things down.  The core Java io networking errors carry no
>>> meaningful information, so it's up to the layers above to fix that.
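The catch-and-wrap pattern Steve describes can be sketched in a few lines. This is not Hadoop's actual NetUtils code; the hostname, port, and wiki URL below are illustrative placeholders:

```java
import java.io.IOException;
import java.net.ConnectException;

public class WrapNetError {
    // Re-throw a bare java.io networking error with the destination
    // host/port attached and a pointer to a docs page of common causes,
    // in the spirit of org.apache.hadoop.net.NetUtils#wrapException.
    static IOException wrap(String host, int port, IOException e) {
        String msg = e.getClass().getSimpleName() + " connecting to " + host + ":" + port
            + "; see https://wiki.example.org/ConnectionRefused for common causes";
        // Keep the original exception as the cause so its stack trace survives.
        return new IOException(msg, e);
    }

    public static void main(String[] args) {
        try {
            // Simulate the uninformative low-level error Java hands back.
            throw new ConnectException("Connection refused");
        } catch (IOException e) {
            IOException friendly = wrap("worker-3.cluster.local", 7077, e);
            System.out.println(friendly.getMessage());
        }
    }
}
```

The wrapped message carries the two facts people actually need for debugging (which host, which port) plus a checklist link, while the cause chain preserves the original trace.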
>>>
>>>
>>>
>>> On 5 Oct 2017, at 03:51, Sean Owen  wrote:
>>>
>>> Although I assume we could get an account suspended if it started
>>> opening spam issues, yes, we default to letting anyone open issues, which
>>> can potentially be abused. That much is the right default and I don't see
>>> any policy tweak that stops that.
>>>
>>> I see several INFRA tickets asking to *allow* the Closed -> Reopened
>>> transition, which suggests it's not the default. https://issues.apache.org/jira/browse/INFRA-11857?jql=project%20%3D%20INFRA%20AND%20text%20~%20%22reopen%20JIRA%22
>>>
>>> I'm accustomed to Closed being a final state that nobody can reopen as a
>>> matter of workflow -- the idea being that anything else should be a new
>>> discussion if the current issue was deemed formally done.
>>>
>>> Spark pretty much leaves all issues in "Resolved" status which can still
>>> be reopened, and I think that's right. Although I'd like to limit all
>>> reopening to committers, it isn't that important.
>>>
>>> Being able to move a JIRA to Closed permanently seems useful, as it
>>> doesn't interfere with any normal workflow, doesn't actually prevent a new
>>> issue from succeeding it in normal usage, and gives another tool to limit a
>>> specific kind of abuse.
>>>
>>> On Thu, Oct 5, 2017 at 3:28 AM Dongjoon Hyun 
>>> wrote:
>>>
 It can stop reopening, but new JIRA issues with duplicate content will
 be created intentionally instead.

 Is that policy (privileged reopening) used in other Apache communities
 for that purpose?


 On Wed, Oct 4, 2017 at 7:06 PM, Sean Owen  wrote:

> We have this problem occasionally, where a disgruntled user
> continually reopens an issue after it's closed.
>
> https://issues.apache.org/jira/browse/SPARK-21999
>
> (Feel free to comment on this one if anyone disagrees)
>
> Regardless of th

Re: Disabling Closed -> Reopened transition for non-committers

2017-10-05 Thread lucas.g...@gmail.com
I missed Steve's comments :(

That looks like a really healthy process, but also time consuming.

I guess that's what it takes to make a community?

Gary


Re: Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6

2017-06-05 Thread lucas.g...@gmail.com
Very much looking forward to this session!

Raj, it's a spark summit session:
https://spark-summit.org/2017/schedule/
12:50 PM
Lunch
BoF Discussion-Deep Learning on Apache Spark

   - Jason Dai (Intel)

There is increasing interest in, and there are applications for, running deep
learning on the Apache Spark platform (e.g., BigDL, TensorFrames,
Caffe/TensorFlow-on-Spark, etc.) in the community. In this BoF discussion,
we would like to cover related topics such as…
BoF Discussion-Apache Spark on Kubernetes

   - Erik Erlandson (Red Hat)

Come learn about the community development project to add a native
Kubernetes scheduling back-end to Apache Spark! Meet contributors and
network with community members interested in running Spark on Kubernetes.
Learn how to run Spark…


I don't think there'll be a remote link, but could be wrong.

Gary

On 5 June 2017 at 17:37, Raj, Deepu  wrote:

> HI Erik,
>
>
>
> Can you please share the details (Timezone, Webex Details)?
>
>
>
> Thanks,
>
> Deepu Raj
>
>
>
> *From:* Erik Erlandson [mailto:eerla...@redhat.com]
> *Sent:* Tuesday, 6 June 2017 10:28 AM
> *To:* dev@spark.apache.org
> *Subject:* Spark on Kubernetes: Birds-of-a-Feather Session 12:50pm 6/6
>
>
>
> Come learn about the community development project to add a native
> Kubernetes scheduling back-end to Apache Spark!  Meet contributors
> and network with community members interested in running Spark on
> Kubernetes. Learn how to run Spark jobs on your Kubernetes cluster;
> find out how to contribute to the project.
>


Re: SPIP: Spark on Kubernetes

2017-08-15 Thread lucas.g...@gmail.com
From our perspective, we have invested heavily in Kubernetes as our cluster
manager of choice.

We also make quite heavy use of Spark.  We've been experimenting with using
these builds (2.1 with PySpark enabled) quite heavily.  Given that we've
already 'paid the price' to operate Kubernetes in AWS, it seems rational to
move our jobs over to Spark on k8s.  Having this project merged into
master will significantly ease keeping our data-munging toolchain primarily
on Spark.


Gary Lucas
Data Ops Team Lead
Unbounce

On 15 August 2017 at 15:52, Andrew Ash  wrote:

> +1 (non-binding)
>
> We're moving large amounts of infrastructure from a combination of open
> source and homegrown cluster management systems to unify on Kubernetes and
> want to bring Spark workloads along with us.
>
> On Tue, Aug 15, 2017 at 2:29 PM, liyinan926  wrote:
>
>> +1 (non-binding)
>>
>>
>>
>> --
>> View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-Spark-on-Kubernetes-tp22147p22164.html
>> Sent from the Apache Spark Developers List mailing list archive at
>> Nabble.com.
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>


Re: doc patch review

2017-09-21 Thread lucas.g...@gmail.com
https://issues.apache.org/jira/browse/SPARK-20448

On 21 September 2017 at 04:09, Hyukjin Kwon  wrote:

> I think it would have been nicer if the JIRA and PR had been included in
> this email.
>
> 2017-09-21 19:44 GMT+09:00 Steve Loughran :
>
>> I have a doc patch on Spark streaming & object store sources which
>> hit its six-month-unreviewed state this week
>>
>> are there any plans to review this or shall I close it as a wontfix?
>>
>> thanks
>>
>>
>> -
>> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>>
>>
>