Sorry, I missed that the proposal includes majority approval. Why majority instead of consensus? I think we want to build consensus around these proposals and it makes sense to discuss until no one would veto.
rb

On Mon, Oct 10, 2016 at 11:54 AM, Ryan Blue <rb...@netflix.com> wrote:

> +1 to votes to approve proposals. I agree that proposals should have an official mechanism to be accepted, and a vote is an established means of doing that well. I like that it includes a period to review the proposal and I think proposals should have been discussed enough ahead of a vote to survive the possibility of a veto.
>
> I also like the names that are short and (mostly) unique, like SEP.
>
> Where I disagree is with the requirement that a committer must formally propose an enhancement. I don't see the value of restricting this: if someone has the will to write up a proposal then they should be encouraged to do so and start a discussion about it. Even if there is a political reality as Cody says, what is the value of codifying that in our process? I think restricting who can submit proposals would only undermine them by pushing contributors out. Maybe I'm missing something here?
>
> rb
>
> On Mon, Oct 10, 2016 at 7:41 AM, Cody Koeninger <c...@koeninger.org> wrote:
>
>> Yes, users suggesting SIPs is a good thing and is explicitly called out in the linked document under the Who? section. Formally proposing them, not so much, because of the political realities.
>>
>> Yes, implementation strategy definitely affects goals. There are all kinds of examples of this, I'll pick one that's my fault so as to avoid sounding like I'm blaming:
>>
>> When I implemented the Kafka DStream, one of my (not explicitly agreed upon by the community) goals was to make sure people could use the DStream with however they were already using Kafka at work. The lack of explicit agreement on that goal led to all kinds of fighting with committers, that could have been avoided. The lack of explicit up-front strategy discussion led to the DStream not really working with compacted topics.
>> I knew about compacted topics, but don't have a use for them, so had a blind spot there. If there was explicit up-front discussion that my strategy was "assume that batches can be defined on the driver solely by beginning and ending offsets", there's a greater chance that a user would have seen that and said, "hey, what about non-contiguous offsets in a compacted topic".
>>
>> This kind of thing is only going to happen smoothly if we have a lightweight user-visible process with clear outcomes.
>>
>> On Mon, Oct 10, 2016 at 1:34 AM, assaf.mendelson <assaf.mendel...@rsa.com> wrote:
>> > I agree with most of what Cody said.
>> >
>> > Two things:
>> >
>> > First we can always have other people suggest SIPs but mark them as “unreviewed” and have committers basically move them forward. The problem is that writing a good document takes time. This way we can leverage non committers to do some of this work (it is just another way to contribute).
>> >
>> > As for strategy, in many cases implementation strategy can affect the goals. I will give a small example: In the current structured streaming strategy, we group by the time to achieve a sliding window. This is definitely an implementation decision and not a goal. However, I can think of several aggregation functions which have the time inside their calculation buffer. For example, let’s say we want to return a set of all distinct values. One way to implement this would be to make the set into a map and have the value contain the last time seen. Multiplying it across the groupby would cost a lot in performance. So adding such a strategy would have a great effect on the type of aggregations and their performance which does affect the goal. Without adding the strategy, it is easy for whoever goes to the design document to not think about these cases.
>> > Furthermore, it might be decided that these cases are rare enough so that the strategy is still good enough but how would we know it without user feedback?
>> >
>> > I believe this example is exactly what Cody was talking about. Since many times implementation strategies have a large effect on the goal, we should have it discussed when discussing the goals. In addition, while it is often easy to throw out completely infeasible goals, it is often much harder to figure out that the goals are infeasible without fine tuning.
>> >
>> > Assaf.
>> >
>> > From: Cody Koeninger-2 [via Apache Spark Developers List] [mailto:ml-node+[hidden email]]
>> > Sent: Monday, October 10, 2016 2:25 AM
>> > To: Mendelson, Assaf
>> > Subject: Re: Spark Improvement Proposals
>> >
>> > Only committers should formally submit SIPs because in an Apache project only committers have explicit political power. If a user can't find a committer willing to sponsor an SIP idea, they have no way to get the idea passed in any case. If I can't find a committer to sponsor this meta-SIP idea, I'm out of luck.
>> >
>> > I do not believe unrealistic goals can be found solely by inspection. We've managed to ignore unrealistic goals even after implementation! Focusing on APIs can allow people to think they've solved something, when there's really no way of implementing that API while meeting the goals. Rapid iteration is clearly the best way to address this, but we've already talked about why that hasn't really worked. If adding a non-binding API section to the template is important to you, I'm not against it, but I don't think it's sufficient.
>> >
>> > On your PRD vs design doc spectrum, I'm saying this is closer to a PRD. Clear agreement on goals is the most important thing and that's why it's the thing I want binding agreement on.
>> > But I cannot agree to goals unless I have enough minimal technical info to judge whether the goals are likely to actually be accomplished.
>> >
>> > On Sun, Oct 9, 2016 at 5:35 PM, Matei Zaharia <[hidden email]> wrote:
>> >
>> >> Well, I think there are a few things here that don't make sense. First, why should only committers submit SIPs? Development in the project should be open to all contributors, whether they're committers or not. Second, I think unrealistic goals can be found just by inspecting the goals, and I'm not super worried that we'll accept a lot of SIPs that are then infeasible -- we can then submit new ones. But this depends on whether you want this process to be a "design doc lite", where people also agree on implementation strategy, or just a way to agree on goals. This is what I asked earlier about PRDs vs design docs (and I'm open to either one but I'd just like clarity). Finally, both as a user and designer of software, I always want to give feedback on APIs, so I'd really like a culture of having those early. People don't argue about prettiness when they discuss APIs, they argue about the core concepts to expose in order to meet various goals, and then they're stuck maintaining those for a long time.
>> >>
>> >> Matei
>> >>
>> >> On Oct 9, 2016, at 3:10 PM, Cody Koeninger <[hidden email]> wrote:
>> >>
>> >> Users instead of people, sure. Committers and contributors are (or at least should be) a subset of users.
>> >>
>> >> Non goals, sure. I don't care what the name is, but we need to clearly say e.g. 'no we are not maintaining compatibility with XYZ right now'.
>> >>
>> >> API, what I care most about is whether it allows me to accomplish the goals. Arguing about how ugly or pretty it is can be saved for design/implementation imho.
>> >>
>> >> Strategy, this is necessary because otherwise goals can be out of line with reality. Don't propose goals you don't have at least some idea of how to implement.
>> >>
>> >> Rejected strategies, given that committers are the only ones I'm saying should formally submit SPARKLIs or SIPs, if they put junk in a required section then slap them down for it and tell them to fix it.
>> >>
>> >> On Oct 9, 2016 4:36 PM, "Matei Zaharia" <[hidden email]> wrote:
>> >>>
>> >>> Yup, this is the stuff that I found unclear. Thanks for clarifying here, but we should also clarify it in the writeup. In particular:
>> >>>
>> >>> - Goals need to be about user-facing behavior ("people" is broad)
>> >>>
>> >>> - I'd rename Rejected Goals to Non-Goals. Otherwise someone will dig up one of these and say "Spark's developers have officially rejected X, which our awesome system has".
>> >>>
>> >>> - For user-facing stuff, I think you need a section on API. Virtually all other *IPs I've seen have that.
>> >>>
>> >>> - I'm still not sure why the strategy section is needed if the purpose is to define user-facing behavior -- unless this is the strategy for setting the goals or for defining the API. That sounds squarely like a design doc issue. In some sense, who cares whether the proposal is technically feasible right now? If it's infeasible, that will be discovered later during design and implementation. Same thing with rejected strategies -- listing some of those is definitely useful sometimes, but if you make this a *required* section, people are just going to fill it in with bogus stuff (I've seen this happen before).
>> >>>
>> >>> Matei
>> >>>
>> >>> > On Oct 9, 2016, at 2:14 PM, Cody Koeninger <[hidden email]> wrote:
>> >>> >
>> >>> > So to focus the discussion on the specific strategy I'm suggesting, documented at
>> >>> >
>> >>> > https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
>> >>> >
>> >>> > "Goals: What must this allow people to do, that they can't currently?"
>> >>> >
>> >>> > Is it unclear that this is focusing specifically on people-visible behavior?
>> >>> >
>> >>> > Rejected goals - are important because otherwise people keep trying to argue about scope. Of course you can change things later with a different SIP and different vote, the point is to focus.
>> >>> >
>> >>> > Use cases - are something that people are going to bring up in discussion. If they aren't clearly documented as a goal ("This must allow me to connect using SSL"), they should be added.
>> >>> >
>> >>> > Internal architecture - if the people who need specific behavior are implementers of other parts of the system, that's fine.
>> >>> >
>> >>> > Rejected strategies - If you have none of these, you have no evidence that the proponent didn't just go with the first thing they had in mind (or have already implemented), which is a big problem currently. Approval isn't binding as to specifics of implementation, so these aren't handcuffs. The goals are the contract, the strategy is evidence that contract can actually be met.
>> >>> >
>> >>> > Design docs - I'm not touching design docs. The markdown file I linked specifically says of the strategy section "This is not a full design document." Is this unclear? Design docs can be worked on obviously, but that's not what I'm concerned with here.
>> >>> >
>> >>> > On Sun, Oct 9, 2016 at 2:34 PM, Matei Zaharia <[hidden email]> wrote:
>> >>> >> Hi Cody,
>> >>> >>
>> >>> >> I think this would be a lot more concrete if we had a more detailed template for SIPs. Right now, it's not super clear what's in scope -- e.g. are they a way to solicit feedback on the user-facing behavior or on the internals? "Goals" can cover both things. I've been thinking of SIPs more as Product Requirements Docs (PRDs), which focus on *what* a code change should do as opposed to how.
>> >>> >>
>> >>> >> In particular, here are some things that you may or may not consider in scope for SIPs:
>> >>> >>
>> >>> >> - Goals and non-goals: This is definitely in scope, and IMO should focus on user-visible behavior (e.g. "system supports SQL window functions" or "system continues working if one node fails"). BTW I wouldn't say "rejected goals" because some of them might become goals later, so we're not definitively rejecting them.
>> >>> >>
>> >>> >> - Public API: Probably should be included in most SIPs unless it's too large to fully specify then (e.g. "let's add an ML library").
>> >>> >>
>> >>> >> - Use cases: I usually find this very useful in PRDs to better communicate the goals.
>> >>> >>
>> >>> >> - Internal architecture: This is usually *not* a thing users can easily comment on and it sounds more like a design doc item. Of course it's important to show that the SIP is feasible to implement. One exception, however, is that I think we'll have some SIPs primarily on internals (e.g. if somebody wants to refactor Spark's query optimizer or something).
>> >>> >>
>> >>> >> - Rejected strategies: I personally wouldn't put this, because what's the point of voting to reject a strategy before you've really begun designing and implementing something? What if you discover that the strategy is actually better when you start doing stuff?
>> >>> >>
>> >>> >> At a super high level, it depends on whether you want the SIPs to be PRDs for getting some quick feedback on the goals of a feature before it is designed, or something more like full-fledged design docs (just a more visible design doc for bigger changes). I looked at Kafka's KIPs, and they actually seem to be more like design docs. This can work too but it does require more work from the proposer and it can lead to the same problems you mentioned with people already having a design and implementation in mind.
>> >>> >>
>> >>> >> Basically, the question is, are you trying to iterate faster on design by adding a step for user feedback earlier? Or are you just trying to make design docs for key features more visible (and their approval more formal)?
>> >>> >>
>> >>> >> BTW note that in either case, I'd like to have a template for design docs too, which should also include goals. I think that would've avoided some of the issues you brought up.
>> >>> >>
>> >>> >> Matei
>> >>> >>
>> >>> >> On Oct 9, 2016, at 10:40 AM, Cody Koeninger <[hidden email]> wrote:
>> >>> >>
>> >>> >> Here's my specific proposal (meta-proposal?)
>> >>> >>
>> >>> >> Spark Improvement Proposals (SIP)
>> >>> >>
>> >>> >> Background:
>> >>> >>
>> >>> >> The current problem is that design and implementation of large features are often done in private, before soliciting user feedback.
>> >>> >>
>> >>> >> When feedback is solicited, it is often as to detailed design specifics, not focused on goals.
>> >>> >>
>> >>> >> When implementation does take place after design, there is often disagreement as to what goals are or are not in scope.
>> >>> >>
>> >>> >> This results in commits that don't fully meet user needs.
>> >>> >>
>> >>> >> Goals:
>> >>> >>
>> >>> >> - Ensure user, contributor, and committer goals are clearly identified and agreed upon, before implementation takes place.
>> >>> >>
>> >>> >> - Ensure that a technically feasible strategy is chosen that is likely to meet the goals.
>> >>> >>
>> >>> >> Rejected Goals:
>> >>> >>
>> >>> >> - SIPs are not for detailed design. Design by committee doesn't work.
>> >>> >>
>> >>> >> - SIPs are not for every change. We don't need that much process.
>> >>> >>
>> >>> >> Strategy:
>> >>> >>
>> >>> >> My suggestion is outlined as a Spark Improvement Proposal process documented at
>> >>> >>
>> >>> >> https://github.com/koeninger/spark-1/blob/SIP-0/docs/spark-improvement-proposals.md
>> >>> >>
>> >>> >> Specifics of Jira manipulation are an implementation detail we can figure out.
>> >>> >>
>> >>> >> I'm suggesting voting; the need here is for a _clear_ outcome.
>> >>> >>
>> >>> >> Rejected Strategies:
>> >>> >>
>> >>> >> Having someone who understands the problem implement it first works, but only if significant iteration after user feedback is allowed.
>> >>> >>
>> >>> >> Historically this has been problematic due to pressure to limit public API changes.
>> >>> >>
>> >>> >> On Fri, Oct 7, 2016 at 5:16 PM, Reynold Xin <[hidden email]> wrote:
>> >>> >>>
>> >>> >>> Alright looks like there is quite a bit of support.
>> >>> >>> We should wait to hear from more people too.
>> >>> >>>
>> >>> >>> To push this forward, Cody and I will be working together in the next couple of weeks to come up with a concrete, detailed proposal on what this entails, and then we can discuss the specific proposal as well.
>> >>> >>>
>> >>> >>> On Fri, Oct 7, 2016 at 2:29 PM, Cody Koeninger <[hidden email]> wrote:
>> >>> >>>>
>> >>> >>>> Yeah, in case it wasn't clear, I was talking about SIPs for major user-facing or cross-cutting changes, not minor feature adds.
>> >>> >>>>
>> >>> >>>> On Fri, Oct 7, 2016 at 3:58 PM, Stavros Kontopoulos <[hidden email]> wrote:
>> >>> >>>>>
>> >>> >>>>> +1 to the SIP label as long as it does not slow down things and it targets optimizing efforts, coordination etc. For example, really small features (assuming they don't touch public interfaces) or refactorings should not need to go through this process, and I hope it will be kept this way. So a guideline doc should be provided, like in the KIP case.
>> >>> >>>>>
>> >>> >>>>> IMHO so far aside from tagging things and linking them elsewhere simply having design docs and prototype implementations in PRs is not something that has not worked so far. What is really a pain in many projects out there is discontinuity in progress of PRs, missing features, slow reviews which is understandable to some extent... it is not only about Spark but things can be improved for sure for this project in particular as already stated.
>> >>> >>>>>
>> >>> >>>>> On Fri, Oct 7, 2016 at 11:14 PM, Cody Koeninger <[hidden email]> wrote:
>> >>> >>>>>>
>> >>> >>>>>> +1 to adding an SIP label and linking it from the website. I think it needs
>> >>> >>>>>>
>> >>> >>>>>> - template that focuses it towards soliciting user goals / non goals
>> >>> >>>>>> - clear resolution as to which strategy was chosen to pursue. I'd recommend a vote.
>> >>> >>>>>>
>> >>> >>>>>> Matei asked me to clarify what I meant by changing interfaces, I think it's directly relevant to the SIP idea so I'll clarify here, and split a thread for the other discussion per Nicholas' request.
>> >>> >>>>>>
>> >>> >>>>>> I meant changing public user interfaces. I think the first design is unlikely to be right, because it's done at a time when you have the least information. As a user, I find it considerably more frustrating to be unable to use a tool to get my job done, than I do having to make minor changes to my code in order to take advantage of features. I've seen committers be seriously reluctant to allow changes to @experimental code that are needed in order for it to really work right. You need to be able to iterate, and if people on both sides of the fence aren't going to respect that some newer apis are subject to change, then why even mark them as such?
>> >>> >>>>>>
>> >>> >>>>>> Ideally a finished SIP should give me a checklist of things that an implementation must do, and things that it doesn't need to do.
>> >>> >>>>>> Contributors/committers should be seriously discouraged from putting out a version 0.1 that doesn't have at least a prototype implementation of all those things, especially if they're then going to argue against interface changes necessary to get the rest of the things done in the 0.2 version.
>> >>> >>>>>>
>> >>> >>>>>> On Fri, Oct 7, 2016 at 2:18 PM, Reynold Xin <[hidden email]> wrote:
>> >>> >>>>>>> I like the lightweight proposal to add a SIP label.
>> >>> >>>>>>>
>> >>> >>>>>>> During Spark 2.0 development, Tom (Graves) and I suggested using a wiki to track the list of major changes, but that never really materialized due to the overhead. Adding a SIP label on major JIRAs and then linking to them prominently on the Spark website makes a lot of sense.
>> >>> >>>>>>>
>> >>> >>>>>>> On Fri, Oct 7, 2016 at 10:50 AM, Matei Zaharia <[hidden email]> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>> For the improvement proposals, I think one major point was to make them really visible to users who are not contributors, so we should do more than sending stuff to dev@. One very lightweight idea is to have a new type of JIRA called a SIP and have a link to a filter that shows all such JIRAs from http://spark.apache.org. I also like the idea of SIP and design doc templates (in fact many projects have them).
>> >>> >>>>>>>>
>> >>> >>>>>>>> Matei
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Oct 7, 2016, at 10:38 AM, Reynold Xin <[hidden email]> wrote:
>> >>> >>>>>>>>
>> >>> >>>>>>>> I called Cody last night and talked about some of the topics in his email. It became clear to me Cody genuinely cares about the project.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Some of the frustrations come from the success of the project itself becoming very "hot", and it is difficult to get clarity from people who don't dedicate all their time to Spark. In fact, it is in some ways similar to scaling an engineering team in a successful startup: old processes that worked well might not work so well when it gets to a certain size, cultures can get diluted, building culture vs building process, etc.
>> >>> >>>>>>>>
>> >>> >>>>>>>> I would also really like to have a more visible process for larger changes, especially major user-facing API changes. Historically we upload design docs for major changes, but it is not always consistent and it is difficult to ensure the quality of the docs, due to the volunteering nature of the organization.
>> >>> >>>>>>>>
>> >>> >>>>>>>> Some of the more concrete ideas we discussed focus on building a culture to improve clarity:
>> >>> >>>>>>>>
>> >>> >>>>>>>> - Process: Large changes should have design docs posted on JIRA.
>> >>> >>>>>>>> One thing Cody and I didn't discuss but an idea that just came to me is we should create a design doc template for the project and ask everybody to follow it. The design doc template should also explicitly list goals and non-goals, to make design docs more consistent.
>> >>> >>>>>>>>
>> >>> >>>>>>>> - Process: Email dev@ to solicit feedback. We have done this with some changes, but again very inconsistently. Just posting something on JIRA isn't sufficient, because there are simply too many JIRAs and the signal gets lost in the noise. While this is generally impossible to enforce because we can't force all volunteers to conform to a process (or they might not even be aware of this), those who are more familiar with the project can help by emailing dev@ when they see something that hasn't been posted there.
>> >>> >>>>>>>>
>> >>> >>>>>>>> - Culture: The design doc author(s) should be open to feedback. A design doc should serve as the base for discussion and is by no means the final design. Of course, this does not mean the author has to accept every piece of feedback. They should also be comfortable accepting / rejecting ideas on technical grounds.
>> >>> >>>>>>>>
>> >>> >>>>>>>> - Process / Culture: For major ongoing projects, it can be useful to have some monthly Google hangouts that are open to the world.
>> >>> >>>>>>>> I am actually not sure how well this will work, because of the volunteering nature and we need to adjust for timezones for people across the globe, but it seems worth trying.
>> >>> >>>>>>>>
>> >>> >>>>>>>> - Culture: Contributors (including committers) should be more direct in setting expectations, including whether they are working on a specific issue, whether they will be working on a specific issue, and whether an issue or PR or JIRA should be rejected. Most people I know in this community are nice and don't enjoy telling other people no, but it is often more annoying to a contributor to not know anything than to get a no.
>> >>> >>>>>>>>
>> >>> >>>>>>>> On Fri, Oct 7, 2016 at 10:03 AM, Matei Zaharia <[hidden email]> wrote:
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> Love the idea of a more visible "Spark Improvement Proposal" process that solicits user input on new APIs. For what it's worth, I don't think committers are trying to minimize their own work -- every committer cares about making the software useful for users. However, it is always hard to get user input and so it helps to have this kind of process. I've certainly looked at the *IPs a lot in other software I use just to see the biggest things on the roadmap.
>> >>> >>>>>>>>>
>> >>> >>>>>>>>> When you're talking about "changing interfaces", are you talking about public or internal APIs? I do think many people hate changing public APIs and I actually think that's for the best of the project. That's a technical debate, but basically, the worst thing when you're using a piece of software is that the developers constantly ask you to rewrite your app to update to a new version (and thus benefit from bug fixes, etc). Cue anyone who's used Protobuf, or Guava. The "let's get everyone to change their code this release" model works well within a single large company, but doesn't work well for a community, which is why nearly all *very* widely used programming interfaces (I'm talking things like Java standard library, Windows API, etc) almost *never* break backwards compatibility. All this is done within reason though, e.g. we do change things in major releases (2.x, 3.x, etc).
>> >>> >>>>>>
>> >>> >>>>>> ---------------------------------------------------------------------
>> >>> >>>>>> To unsubscribe e-mail: [hidden email]
>> >>> >>>>>
>> >>> >>>>> --
>> >>> >>>>> Stavros Kontopoulos
>> >>> >>>>> Senior Software Engineer
>> >>> >>>>> Lightbend, Inc.
>> >>> >>>>> p: +30 6977967274
>> >>> >>>>> e: [hidden email]

--
Ryan Blue
Software Engineer
Netflix