Hi Eric and Carlo,

Thanks for taking the initiative! I am willing to take this task up for
improving the Ozone codebase.

I have cloned the task and sub-tasks for Ozone -
https://issues.apache.org/jira/browse/HDDS-4050

- Vivek Subramanian

On Thu, Jul 30, 2020 at 3:54 PM Eric Badger
<ebad...@verizonmedia.com.invalid> wrote:

> Thanks for the responses, Jon and Carlo!
>
> It makes sense to me to prevent future patches from re-introducing the
> terminology. I can file a JIRA to add the +1/-1 functionality to the
> precommit builds.
>
> As for splitting up the work, I think it'll probably be easiest and
> cleanest to have an umbrella for each subproject of Hadoop (Hadoop, HDFS,
> YARN, Mapreduce) with smaller tasks (e.g. whitelist/blacklist,
> master/slave) as subtasks of each umbrella. That way each expert can chime
> in on their relative land of expertise and the patches won't be gigantic. I
> can then link the umbrella JIRAs together so everything can be found
> easily. As Carlo pointed out, it's unclear whether fewer, larger patches
> are better or worse than more, smaller patches. But I think that at
> least for the sake of manageability and getting this into Apache, smaller
> patches are likely easier.
>
> Eric
>
> On Thu, Jul 30, 2020 at 5:50 PM Carlo Aldo Curino <carlo.cur...@gmail.com>
> wrote:
>
> > Thanks again Eric for leading the charge. As for whether to chop it up or
> > keep it in fewer patches, I think it primarily impacts the conflict surface
> > with dev branches and other in-flight development. More patches are likely
> > creating more localized clashes (as in I clash with a smaller patch, which
> > might be less daunting, though potentially more of them to deal with). I
> > don't have a strong preference, maybe chunking it into reasonable packages,
> > so that you can involve the right core group of committers to weigh in for
> > each sub-area.
> >
> > Thanks,
> > Carlo
> >
> >
> >
> > On Thu, Jul 30, 2020 at 1:20 PM Jonathan Eagles <jeag...@gmail.com>
> > wrote:
> >
> > > Thanks, Eric. I like this proposal and I'm glad this work is getting
> > > traction. A few thoughts on implementation.
> > >
> > > Once the fix is done, I think it will be necessary to ensure these
> > > language restrictions are enforced at the patch level. This will +1/-1
> > > patches that introduce terminology that violates our policy.
> > >
> > > As to splitting up the patches, it may be necessary to split these up
> > > further in cases where feature experts need to weigh in on compatibility
> > > (usually with regards to persistence or wire compatibility). This can be
> > > done on a case-by-case basis.
> > >
> > > Regards,
> > > jeagles
> > >
> > > On Thu, Jul 30, 2020 at 1:28 PM Eric Badger
> > > <ebad...@verizonmedia.com.invalid> wrote:
> > >
> > >> I have created https://issues.apache.org/jira/browse/HADOOP-17168 to
> > >> remove non-inclusive terminology from Hadoop. However I would like input
> > >> on how to go about putting up patches. This umbrella JIRA is under Hadoop
> > >> Common, but there are sure to be instances in YARN, HDFS, and Mapreduce.
> > >> Should I create an umbrella like this for each subproject? Or should I do
> > >> all whitelist/blacklist fixes in a single JIRA that fixes them across all
> > >> Hadoop subprojects?
> > >>
> > >> Thanks,
> > >>
> > >> Eric
> > >>
> > >> On Thu, Jul 30, 2020 at 8:47 AM Carlo Aldo Curino <carlo.cur...@gmail.com>
> > >> wrote:
> > >>
> > >> > RE Mentorship: I think the Mentorship program is an interesting idea.
> > >> > The concern with these efforts is always the follow-through. If you can
> > >> > find a group of folks that are motivated and will work on this I think
> > >> > it could be a great idea, especially if you focus on a diverse set of
> > >> > mentees, and the focus is on teaching not just code but a bit of the
> > >> > "apache way" of interacting, and conducting yourself in open-source.
> > >> >
> > >> > RE Diversity and representation: Wei-Chiu I think you raise an important
> > >> > problem. The main force behind this is typically a company being deeply
> > >> > invested in a project, valuing OSS, and putting lots of full-time
> > >> > developers on it. Those will naturally become committers. On one side
> > >> > this is good for the project, unless it becomes so unbalanced that the
> > >> > OSS nature of the effort is in question. Attracting more contributors
> > >> > across companies/countries (and any other dimension of diversity) is
> > >> > important. @Vinod I am sure you have been thinking about this issue, any
> > >> > thoughts?
> > >> >
> > >> > Thanks,
> > >> > Carlo
> > >> >
> > >> > On Fri, Jul 10, 2020 at 1:49 PM Ahmed Hussein <a...@ahussein.me>
> > >> > wrote:
> > >> >
> > >> >> +1, this is great folks.
> > >> >>
> > >> >> In addition to that initiative, do you think there is a chance to
> > >> >> launch a "*Hadoop Mentorship Program for Minority Students*"?
> > >> >>
> > >> >> *The program will work as follows:*
> > >> >>
> > >> >>    - Define a programme committee to administer the program and mentor
> > >> >>    candidates.
> > >> >>    - The committee defines a timeline for applications and projects,
> > >> >>    say about 3 months (similar to an internship).
> > >> >>    - Define a list of ideas/projects that can be picked by the
> > >> >>    candidates.
> > >> >>    - Candidates can propose their own ideas as well. This can be a good
> > >> >>    way to inject new blood and research ideas into Hadoop.
> > >> >>    - Pick the top applications and assign them to mentors.
> > >> >>    - If sponsors can allocate money, then candidates with good
> > >> >>    evaluations can get some sort of prize. If no money is allocated,
> > >> >>    then we can discuss any other kind of motivation.
> > >> >>
> > >> >> I remember there were student mentorship programmes in open-source
> > >> >> projects like "JikesRVM", and several proposals were actually merged
> > >> >> and/or transformed into publications.
> > >> >> There are many missing links that need to be filled in, like how to
> > >> >> define the target and the audience of the programme.
> > >> >>
> > >> >> Let me know WDYT guys.
> > >> >>
> > >> >> On Fri, Jul 10, 2020 at 1:45 PM Wei-Chiu Chuang <weic...@apache.org>
> > >> >> wrote:
> > >> >>
> > >> >>> Thanks Carlo and Eric for the initiative.
> > >> >>>
> > >> >>> I am all for it and I'll do my part to mind the code. This is a small
> > >> >>> yet meaningful step we can take. Meanwhile, I'd like to take this
> > >> >>> opportunity to open up conversation around Diversity & Inclusion
> > >> >>> within the community.
> > >> >>>
> > >> >>> If you read this quarter's Hadoop board report, I am starting to
> > >> >>> collect metrics about the composition of our community in order to
> > >> >>> understand if we are building a diverse & inclusive community. Things
> > >> >>> that are obvious to me that I thought I should report are the
> > >> >>> following: affiliation among committers, and demographics of
> > >> >>> committers. As of last quarter, 4 out of 7 newly minted committers are
> > >> >>> affiliated with Cloudera. 4 out of those 7 committers are located in
> > >> >>> Asia. Those facts suggest we have good international participation (I
> > >> >>> am being US-centric), which is good. However, having half of the active
> > >> >>> committers affiliated with one company is a potential problem.
> > >> >>>
> > >> >>> I'd like to hear your thoughts on this. What other metrics should we
> > >> >>> collect, and what actions can we take?
> > >> >>>
> > >> >>>
> > >> >>>
> > >> >>> On Fri, Jul 10, 2020 at 11:29 AM Carlo Aldo Curino <
> > >> >>> carlo.cur...@gmail.com>
> > >> >>> wrote:
> > >> >>>
> > >> >>> > Eric,
> > >> >>> >
> > >> >>> > Thank you so much for the support and for stepping up offering to
> > >> >>> > work on this. I am super +1 on this. Let's give folks a few more days
> > >> >>> > to chime in, in case there is anything to discuss before we get
> > >> >>> > cracking!
> > >> >>> >
> > >> >>> > (Really) Thanks,
> > >> >>> > Carlo
> > >> >>> >
> > >> >>> > On Fri, Jul 10, 2020, 10:38 AM Eric Badger <ebad...@verizonmedia.com>
> > >> >>> > wrote:
> > >> >>> >
> > >> >>> > > Thanks for writing this up, Carlo. I'm +1 (idk if I'm technically
> > >> >>> > > binding on this or not) for the changes moving forward, and I think
> > >> >>> > > we should refactor away any instances that are internal to the code
> > >> >>> > > (i.e. not APIs or other things that would break compatibility) in
> > >> >>> > > all active branches and then also change the APIs in trunk (an
> > >> >>> > > incompatible change).
> > >> >>> > >
> > >> >>> > > I just came across an internal issue related to the NM
> > >> >>> > > whitelist/blacklist. I would be happy to go refactor the code and
> > >> >>> > > look for instances of these and replace them with
> > >> >>> > > allowlist/blocklist. Doing a quick "git grep" of trunk, I see 270
> > >> >>> > > instances of "whitelist" and 1318 instances of "blacklist".
> > >> >>> > >
> > >> >>> > > If there are no objections, I'll create a JIRA to clean this
> > >> >>> > > specific stuff up. It would be wonderful if others could pick up a
> > >> >>> > > different portion (e.g. master/slave) so that we can spread the
> > >> >>> > > work out.
> > >> >>> > >
> > >> >>> > > Eric
> > >> >>> > >
> > >> >>> > > On Tue, Jul 7, 2020 at 6:27 PM Carlo Aldo Curino <carlo.cur...@gmail.com>
> > >> >>> > > wrote:
> > >> >>> > >
> > >> >>> > >> Hello Folks,
> > >> >>> > >>
> > >> >>> > >> I hope you are all doing well...
> > >> >>> > >>
> > >> >>> > >> *The problem*
> > >> >>> > >> The recent protests made me realize that we are not just
> > >> >>> > >> bystanders of the systematic racism that affects our society, but
> > >> >>> > >> we are active participants in it. Being "non-racist" is not enough,
> > >> >>> > >> I strongly feel we should be actively "anti-racist" in our day to
> > >> >>> > >> day lives, and continuously check our biases. I assume most of you
> > >> >>> > >> will agree with the general sentiment, but based on your exposure
> > >> >>> > >> to the recent events and US culture/history might have more or less
> > >> >>> > >> strong feelings about your role in the problem and potential
> > >> >>> > >> solution.
> > >> >>> > >>
> > >> >>> > >> *What can we do about it?* I think a simple action we can take is
> > >> >>> > >> to work on our code/comments/documentation/websites and remove
> > >> >>> > >> racist terminology. Here is an IETF draft to fix up some of the
> > >> >>> > >> most egregious examples (master/slave, whitelist/blacklist) with
> > >> >>> > >> proposed alternatives.
> > >> >>> > >>
> > >> >>> > >> https://tools.ietf.org/id/draft-knodel-terminology-00.html#rfc.section.1.1.1
> > >> >>> > >> Also as we go about this effort, we should also consider other
> > >> >>> > >> "non-inclusive" terminology issues around gender (e.g., binary
> > >> >>> > >> gendered examples, "Alice" doing the wrong security thing
> > >> >>> > >> systematically), and ableism (e.g., referring to misbehaving
> > >> >>> > >> hardware as "lame" or "limping", etc.).
> > >> >>> > >> The easiest action item is to avoid this going forward (ideally
> > >> >>> > >> adding it to the checkstyles if possible), a more costly one is to
> > >> >>> > >> start going back and refactor away existing instances.
> > >> >>> > >>
> > >> >>> > >> I know this requires a bunch of work as refactorings might break
> > >> >>> > >> dev branches and non-committed patches, possibly scripts, etc. but
> > >> >>> > >> I think this is something important and relatively simple we can
> > >> >>> > >> do. The effect goes well beyond some text in github, it signals
> > >> >>> > >> what we believe in, and forces hundreds of users and contributors
> > >> >>> > >> to notice and think about it. Our force-multiplier is huge and it
> > >> >>> > >> matches our responsibility.
> > >> >>> > >>
> > >> >>> > >> What do you folks think?
> > >> >>> > >>
> > >> >>> > >> Thanks,
> > >> >>> > >> Carlo
> > >> >>> > >>
> > >> >>> > >
> > >> >>> >
> > >> >>>
> > >> >>
> > >> >>
> > >> >> --
> > >> >> --
> > >> >> Best Regards,
> > >> >>
> > >> >> *Ahmed Hussein, PhD*
> > >> >>
> > >> >
> > >>
> > >
> >
>
