https://issues.apache.org/jira/browse/HADOOP-17169

I'm not sure who the right people are to review this patch, but it removes
all non-inclusive terminology from Hadoop Common. It also changes a few
things in other projects (mostly HDFS), since they depend on code from
hadoop-common. I believe patch 003 is ready for review, and I would
appreciate some experts letting me know if anything is a bad change
code-wise.

Eric

On Thu, Jul 30, 2020 at 5:59 PM Vivek Ratnavel Subramanian
<vsubraman...@cloudera.com.invalid> wrote:

> Hi Eric and Carlo,
>
> Thanks for taking the initiative! I am willing to take this task up for
> improving the Ozone codebase.
>
> I have cloned the task and sub-tasks for Ozone -
>
> https://issues.apache.org/jira/browse/HDDS-4050
>
> - Vivek Subramanian
>
> On Thu, Jul 30, 2020 at 3:54 PM Eric Badger
> <ebad...@verizonmedia.com.invalid> wrote:
>
> > Thanks for the responses, Jon and Carlo!
> >
> > It makes sense to me to prevent future patches from re-introducing the
> > terminology. I can file a JIRA to add the +1/-1 functionality to the
> > precommit builds.
> >
> > As for splitting up the work, I think it'll probably be easiest and
> > cleanest to have an umbrella for each subproject of Hadoop (Hadoop, HDFS,
> > YARN, MapReduce) with smaller tasks (e.g. whitelist/blacklist,
> > master/slave) as subtasks of each umbrella. That way each expert can
> > chime in on their respective area of expertise and the patches won't be
> > gigantic. I can then link the umbrella JIRAs together so everything can
> > be found easily. As Carlo pointed out, it's unclear whether fewer, larger
> > patches are better or worse than more, smaller patches. But I think that,
> > at least for the sake of manageability and getting this into Apache,
> > smaller patches are likely easier.
> >
> > Eric
> >
> > On Thu, Jul 30, 2020 at 5:50 PM Carlo Aldo Curino
> > <carlo.cur...@gmail.com> wrote:
> >
> > > Thanks again Eric for leading the charge. As for whether to chop it up
> > > or keep it in fewer patches, I think it primarily impacts the conflict
> > > surface with dev branches and other in-flight development. More patches
> > > are likely to create more localized clashes (as in, I clash with a
> > > smaller patch, which might be less daunting, though there are
> > > potentially more of them to deal with). I don't have a strong
> > > preference; maybe chunk it into reasonable packages, so that you can
> > > involve the right core group of committers to weigh in on each
> > > sub-area.
> > >
> > > Thanks,
> > > Carlo
> > >
> > >
> > >
> > > On Thu, Jul 30, 2020 at 1:20 PM Jonathan Eagles <jeag...@gmail.com>
> > > wrote:
> > >
> > > > Thanks, Eric. I like this proposal and I'm glad this work is getting
> > > > traction. A few thoughts on implementation.
> > > >
> > > > Once the fix is done, I think it will be necessary to ensure these
> > > > language restrictions are enforced at the patch level. This will
> > > > +1/-1 patches that introduce terminology that violates our policy.
> > > >
> > > > As to splitting up the patches, it may be necessary to split these
> > > > up further in cases where feature experts need to weigh in on
> > > > compatibility (usually with regard to persistence or wire
> > > > compatibility). This can be done on a case-by-case basis.
> > > >
> > > > Regards,
> > > > jeagles
> > > >
> > > > On Thu, Jul 30, 2020 at 1:28 PM Eric Badger
> > > > <ebad...@verizonmedia.com.invalid> wrote:
> > > >
> > > >> I have created
> > > >> https://issues.apache.org/jira/browse/HADOOP-17168
> > > >> to remove non-inclusive terminology from Hadoop. However, I would like
> > > >> input on how to go about putting up patches. This umbrella JIRA is
> > > >> under Hadoop Common, but there are sure to be instances in YARN, HDFS,
> > > >> and MapReduce. Should I create an umbrella like this for each
> > > >> subproject? Or should I do all whitelist/blacklist fixes in a single
> > > >> JIRA that fixes them across all Hadoop subprojects?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Eric
> > > >>
> > > >> On Thu, Jul 30, 2020 at 8:47 AM Carlo Aldo Curino
> > > >> <carlo.cur...@gmail.com> wrote:
> > > >>
> > > >> > RE Mentorship: I think the Mentorship program is an interesting
> > > >> > idea. The concern with these efforts is always the follow-through.
> > > >> > If you can find a group of folks that are motivated and will work on
> > > >> > this, I think it could be a great idea, especially if you focus on a
> > > >> > diverse set of mentees, and the focus is on teaching not just code
> > > >> > but a bit of the "apache way" of interacting and conducting yourself
> > > >> > in open-source.
> > > >> >
> > > >> > RE Diversity and representation: Wei-Chiu, I think you raise an
> > > >> > important problem. The main force behind this is typically a company
> > > >> > being deeply invested in a project, valuing OSS, and putting lots of
> > > >> > full-time developers on it. Those will naturally become committers.
> > > >> > On one side this is good for the project, unless it becomes so
> > > >> > unbalanced that the OSS nature of the effort is in question.
> > > >> > Attracting more contributors across companies/countries (and any
> > > >> > other dimension of diversity) is important.
> > > >> > @Vinod I am sure you have been thinking about this issue, any
> > > >> > thoughts?
> > > >> >
> > > >> > Thanks,
> > > >> > Carlo
> > > >> >
> > > >> > On Fri, Jul 10, 2020 at 1:49 PM Ahmed Hussein <a...@ahussein.me>
> > > >> > wrote:
> > > >> >
> > > >> >> +1, this is great folks.
> > > >> >>
> > > >> >> In addition to that initiative, do you think there is a chance to
> > > >> >> launch a "*Hadoop Mentorship Program for Minority Students*"?
> > > >> >>
> > > >> >> *The program will work as follows:*
> > > >> >>
> > > >> >>    - Define a programme committee to administer and mentor
> > > >> >>    candidates.
> > > >> >>    - The committee defines a timeline for applications and
> > > >> >>    projects. Let's say it is something like 3 months (similar to
> > > >> >>    an internship).
> > > >> >>    - Define a list of ideas/projects that can be picked by the
> > > >> >>    candidates.
> > > >> >>    - Candidates can propose their own ideas as well. This can be a
> > > >> >>    good way to inject new blood and research ideas into Hadoop.
> > > >> >>    - Pick the top applications and assign them to mentors.
> > > >> >>    - If sponsors can allocate money, then candidates with good
> > > >> >>    evaluations can get some sort of prize. If no money is
> > > >> >>    allocated, then we can discuss any other kind of motivation.
> > > >> >>
> > > >> >> I remember there were student mentorship programmes in open source
> > > >> >> projects like "JikesRVM", and several proposals were actually merged
> > > >> >> and/or transformed into publications.
> > > >> >> There are many missing links that need to be filled in, like how to
> > > >> >> define the target and the audience of the programme.
> > > >> >>
> > > >> >> Let me know WDYT guys.
> > > >> >>
> > > >> >> On Fri, Jul 10, 2020 at 1:45 PM Wei-Chiu Chuang
> > > >> >> <weic...@apache.org> wrote:
> > > >> >>
> > > >> >>> Thanks Carlo and Eric for the initiative.
> > > >> >>>
> > > >> >>> I am all for it and I'll do my part to mind the code. This is a
> > > >> >>> small yet meaningful step we can take. Meanwhile, I'd like to take
> > > >> >>> this opportunity to open up a conversation around Diversity &
> > > >> >>> Inclusion within the community.
> > > >> >>>
> > > >> >>> If you read this quarter's Hadoop board report, I am starting to
> > > >> >>> collect metrics about the composition of our community in order to
> > > >> >>> understand if we are building a diverse & inclusive community. Things
> > > >> >>> that are obvious to me that I thought I should report are the
> > > >> >>> following: affiliation among committers, and demographics of
> > > >> >>> committers. As of last quarter, 4 out of 7 newly minted committers
> > > >> >>> are affiliated with Cloudera. 4 out of the 7 said committers are
> > > >> >>> located in Asia. Those facts suggest we have good international
> > > >> >>> participation (I am being US-centric), which is good. However, having
> > > >> >>> half of the active committers affiliated with one company is a
> > > >> >>> potential problem.
> > > >> >>>
> > > >> >>> I'd like to hear your thoughts on this. What other metrics should we
> > > >> >>> collect, and what actions can we take?
> > > >> >>>
> > > >> >>>
> > > >> >>>
> > > >> >>> On Fri, Jul 10, 2020 at 11:29 AM Carlo Aldo Curino <
> > > >> >>> carlo.cur...@gmail.com>
> > > >> >>> wrote:
> > > >> >>>
> > > >> >>> > Eric,
> > > >> >>> >
> > > >> >>> > Thank you so much for the support and for stepping up and offering
> > > >> >>> > to work on this. I am super +1 on this. Let's give folks a few more
> > > >> >>> > days to chime in, in case there is anything to discuss before we
> > > >> >>> > get cracking!
> > > >> >>> >
> > > >> >>> > (Really) Thanks,
> > > >> >>> > Carlo
> > > >> >>> >
> > > >> >>> > On Fri, Jul 10, 2020, 10:38 AM Eric Badger
> > > >> >>> > <ebad...@verizonmedia.com> wrote:
> > > >> >>> >
> > > >> >>> > > Thanks for writing this up, Carlo. I'm +1 (idk if I'm technically
> > > >> >>> > > binding on this or not) for the changes moving forward, and I
> > > >> >>> > > think we should refactor away any instances that are internal to
> > > >> >>> > > the code (i.e. not APIs or other things that would break
> > > >> >>> > > compatibility) in all active branches and then also change the
> > > >> >>> > > APIs in trunk (an incompatible change).
> > > >> >>> > >
> > > >> >>> > > I just came across an internal issue related to the NM
> > > >> >>> > > whitelist/blacklist. I would be happy to go refactor the code,
> > > >> >>> > > look for instances of these, and replace them with
> > > >> >>> > > allowlist/blocklist. Doing a quick "git grep" of trunk, I see 270
> > > >> >>> > > instances of "whitelist" and 1318 instances of "blacklist".
> > > >> >>> > >
> > > >> >>> > > If there are no objections, I'll create a JIRA to clean this
> > > >> >>> > > specific stuff up. It would be wonderful if others could pick up a
> > > >> >>> > > different portion (e.g. master/slave) so that we can spread the
> > > >> >>> > > work out.
> > > >> >>> > >
> > > >> >>> > > Eric
> > > >> >>> > >
> > > >> >>> > > On Tue, Jul 7, 2020 at 6:27 PM Carlo Aldo Curino
> > > >> >>> > > <carlo.cur...@gmail.com> wrote:
> > > >> >>> > >
> > > >> >>> > >> Hello Folks,
> > > >> >>> > >>
> > > >> >>> > >> I hope you are all doing well...
> > > >> >>> > >>
> > > >> >>> > >> *The problem*
> > > >> >>> > >> The recent protests made me realize that we are not just
> > > >> >>> > >> bystanders of the systematic racism that affects our society, but
> > > >> >>> > >> we are active participants in it. Being "non-racist" is not
> > > >> >>> > >> enough; I strongly feel we should be actively "anti-racist" in our
> > > >> >>> > >> day-to-day lives, and continuously check our biases. I assume most
> > > >> >>> > >> of you will agree with the general sentiment, but based on your
> > > >> >>> > >> exposure to the recent events and US culture/history, you might
> > > >> >>> > >> have more or less strong feelings about your role in the problem
> > > >> >>> > >> and potential solution.
> > > >> >>> > >>
> > > >> >>> > >> *What can we do about it?* I think a simple action we can take is
> > > >> >>> > >> to work on our code/comments/documentation/websites and remove
> > > >> >>> > >> racist terminology. Here is an IETF draft to fix up some of the
> > > >> >>> > >> most egregious examples (master/slave, whitelist/blacklist) with
> > > >> >>> > >> proposed alternatives.
> > > >> >>> > >>
> > > >> >>> > >> https://tools.ietf.org/id/draft-knodel-terminology-00.html#rfc.section.1.1.1
> > > >> >>> > >>
> > > >> >>> > >> Also, as we go about this effort, we should consider other
> > > >> >>> > >> "non-inclusive" terminology issues around gender (e.g., binary
> > > >> >>> > >> gendered examples, "Alice" doing the wrong security thing
> > > >> >>> > >> systematically), and ableism (e.g., referring to misbehaving
> > > >> >>> > >> hardware as "lame" or "limping", etc.).
> > > >> >>> > >> The easiest action item is to avoid this going forward (ideally
> > > >> >>> > >> adding it to the checkstyles if possible); a more costly one is to
> > > >> >>> > >> start going back and refactoring away existing instances.
> > > >> >>> > >>
> > > >> >>> > >> I know this requires a bunch of work, as refactorings might break
> > > >> >>> > >> dev branches and non-committed patches, possibly scripts, etc.,
> > > >> >>> > >> but I think this is something important and relatively simple we
> > > >> >>> > >> can do. The effect goes well beyond some text in GitHub: it signals
> > > >> >>> > >> what we believe in, and forces hundreds of users and contributors
> > > >> >>> > >> to notice and think about it. Our force-multiplier is huge and it
> > > >> >>> > >> matches our responsibility.
> > > >> >>> > >>
> > > >> >>> > >> What do you folks think?
> > > >> >>> > >>
> > > >> >>> > >> Thanks,
> > > >> >>> > >> Carlo
> > > >> >>> > >>
> > > >> >>> > >
> > > >> >>> >
> > > >> >>>
> > > >> >>
> > > >> >>
> > > >> >> --
> > > >> >> --
> > > >> >> Best Regards,
> > > >> >>
> > > >> >> *Ahmed Hussein, PhD*
> > > >> >>
> > > >> >
> > > >>
> > > >
> > >
> >
>
