https://issues.apache.org/jira/browse/HADOOP-17169
I don't really know who the people are to review this patch, but it removes
all non-inclusive terminology from Hadoop Common. This ends up changing some
things in some other projects (mostly HDFS) as well since they depend on
stuff from hadoop-common. I believe patch 003 is ready for review and would
appreciate some experts letting me know if anything is a bad change
code-wise.

Eric

On Thu, Jul 30, 2020 at 5:59 PM Vivek Ratnavel Subramanian
<vsubraman...@cloudera.com.invalid> wrote:

> Hi Eric and Carlo,
>
> Thanks for taking the initiative! I am willing to take this task up for
> improving the Ozone codebase.
>
> I have cloned the task and sub-tasks for Ozone -
> https://urldefense.com/v3/__https://issues.apache.org/jira/browse/HDDS-4050__;!!Op6eflyXZCqGR5I!VdShoY1ZPxJVYhyFFCSuGX4gU-2R6sHAr7G_HH0W5YjeJluizw7npVPF4ULP$
>
> - Vivek Subramanian
>
> On Thu, Jul 30, 2020 at 3:54 PM Eric Badger
> <ebad...@verizonmedia.com.invalid> wrote:
>
> > Thanks for the responses, Jon and Carlo!
> >
> > It makes sense to me to prevent future patches from re-introducing the
> > terminology. I can file a JIRA to add the +1/-1 functionality to the
> > precommit builds.
> >
> > As for splitting up the work, I think it'll probably be easiest and
> > cleanest to have an umbrella for each subproject of Hadoop (Hadoop,
> > HDFS, YARN, Mapreduce) with smaller tasks (e.g. whitelist/blacklist,
> > master/slave) as subtasks of each umbrella. That way each expert can
> > chime in on their relative land of expertise and the patches won't be
> > gigantic. I can then link the umbrella JIRAs together so everything can
> > be found easily. As Carlo pointed out, it's unclear whether fewer, but
> > larger patches is better or worse than more, smaller patches. But I
> > think that at least for the sake of manageability and getting this into
> > Apache, smaller patches is likely easier.
> >
> > Eric
> >
> > On Thu, Jul 30, 2020 at 5:50 PM Carlo Aldo Curino
> > <carlo.cur...@gmail.com> wrote:
> >
> > > Thanks again Eric for leading the charge. As for whether to chop it
> > > up or keep it in fewer patches, I think it primarily impacts the
> > > conflict surface with dev branches and other in-flight development.
> > > More patches are likely creating more localized clashes (as in I
> > > clash with a smaller patch, which might be less daunting, though
> > > potentially more of them to deal with). I don't have a strong
> > > preference, maybe chunking it into reasonable packages, so that you
> > > can involve the right core group of committers to weigh in for each
> > > sub-area.
> > >
> > > Thanks,
> > > Carlo
> > >
> > > On Thu, Jul 30, 2020 at 1:20 PM Jonathan Eagles <jeag...@gmail.com>
> > > wrote:
> > >
> > > > Thanks, Eric. I like this proposal and I'm glad this work is
> > > > getting traction. A few thoughts on implementation.
> > > >
> > > > Once the fix is done, I think it will be necessary to ensure these
> > > > language restrictions are enforced at the patch level. This will
> > > > +1/-1 patches that introduce terminology that violates our policy.
> > > >
> > > > As to splitting up the patches, it may be necessary to split these
> > > > up further in cases where feature experts need to weigh in on
> > > > compatibility (usually with regards to persistence or wire
> > > > compatibility). This can be done on a case-by-case basis.
> > > >
> > > > Regards,
> > > > jeagles
> > > >
> > > > On Thu, Jul 30, 2020 at 1:28 PM Eric Badger
> > > > <ebad...@verizonmedia.com.invalid> wrote:
> > > >
> > > >> I have created
> > > >> https://urldefense.com/v3/__https://issues.apache.org/jira/browse/HADOOP-17168__;!!Op6eflyXZCqGR5I!XjCu5VSFdt2uqyuzlkc53KSBa6IM-M2Wun_FX6uD8fl99OAvaj9wb-0kz4fK$
> > > >> to remove non-inclusive terminology from Hadoop.
> > > >> However I would like input on how to go about putting up patches.
> > > >> This umbrella JIRA is under Hadoop Common, but there are sure to be
> > > >> instances in YARN, HDFS, and Mapreduce. Should I create an umbrella
> > > >> like this for each subproject? Or should I do all
> > > >> whitelist/blacklist fixes in a single JIRA that fixes them across
> > > >> all Hadoop subprojects?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Eric
> > > >>
> > > >> On Thu, Jul 30, 2020 at 8:47 AM Carlo Aldo Curino
> > > >> <carlo.cur...@gmail.com> wrote:
> > > >>
> > > >> > RE Mentorship: I think the Mentorship program is an interesting
> > > >> > idea. The concern with these efforts is always the
> > > >> > follow-through. If you can find a group of folks that are
> > > >> > motivated and will work on this I think it could be a great idea,
> > > >> > especially if you focus on a diverse set of mentees, and the
> > > >> > focus is on teaching not just code but a bit of the "apache way"
> > > >> > of interacting, and conducting yourself in open-source.
> > > >> >
> > > >> > RE Diversity and representation: Wei-Chiu I think you raise an
> > > >> > important problem. The main force behind this is typically for a
> > > >> > company to be deeply invested in a project and valuing OSS and
> > > >> > putting lots of full-time developers on it. Those will naturally
> > > >> > become committers. On one side this is good for the project,
> > > >> > unless it becomes so unbalanced that the OSS nature of the effort
> > > >> > is in question. Attracting more contributors across
> > > >> > companies/countries (and any other dimension of diversity) is
> > > >> > important. @Vinod I am sure you have been thinking about this
> > > >> > issue, any thoughts?
> > > >> >
> > > >> > Thanks,
> > > >> > Carlo
> > > >> >
> > > >> > On Fri, Jul 10, 2020 at 1:49 PM Ahmed Hussein <a...@ahussein.me>
> > > >> > wrote:
> > > >> >
> > > >> >> +1, this is great folks.
> > > >> >>
> > > >> >> In addition to that initiative, do you think there is a chance to
> > > >> >> launch a "*Hadoop Mentorship Program for Minority Students*"?
> > > >> >>
> > > >> >> *The program will work as follows:*
> > > >> >>
> > > >> >>    - Define a programme committee to administrate and mentor
> > > >> >>      candidates.
> > > >> >>    - The Committee defines a timeline for applications and
> > > >> >>      projects. Let's say it is some sort of 3 months. (Similar to
> > > >> >>      an internship)
> > > >> >>    - Define a list of ideas/projects that can be picked by the
> > > >> >>      candidates.
> > > >> >>    - Candidates can propose their idea as well. This can be a
> > > >> >>      good way to inject new blood and research ideas into Hadoop.
> > > >> >>    - Pick top applications and assign them to mentors.
> > > >> >>    - If sponsors can allocate money, then candidates with good
> > > >> >>      evaluation can get some sort of prize. If no money is
> > > >> >>      allocated, then we can discuss any other kind of motivation.
> > > >> >>
> > > >> >> I remember there were Student Mentorship programmes in Open
> > > >> >> source projects like "JikesRVM" and several proposals were
> > > >> >> actually merged and/or transformed into publications.
> > > >> >> There are many missing links that need to be filled like how to
> > > >> >> define the target and the audience of the programme.
> > > >> >>
> > > >> >> Let me know WDYT guys.
> > > >> >>
> > > >> >> On Fri, Jul 10, 2020 at 1:45 PM Wei-Chiu Chuang
> > > >> >> <weic...@apache.org> wrote:
> > > >> >>
> > > >> >>> Thanks Carlo and Eric for the initiative.
> > > >> >>>
> > > >> >>> I am all for it and I'll do my part to mind the code. This is a
> > > >> >>> small yet meaningful step we can take.
> > > >> >>> Meanwhile, I'd like to take this opportunity to open up
> > > >> >>> conversation around the Diversity & Inclusion within the
> > > >> >>> community.
> > > >> >>>
> > > >> >>> If you read this quarter's Hadoop board report, I am starting to
> > > >> >>> collect metrics about the composition of our community in order
> > > >> >>> to understand if we are building a diverse & inclusive
> > > >> >>> community. Things that are obvious to me that I thought I should
> > > >> >>> report are the following: affiliation among committers, and
> > > >> >>> demographics of committers. As of last quarter, 4 out of 7 newly
> > > >> >>> minted committers are affiliated with Cloudera. 4 out of the 7
> > > >> >>> said committers are located in Asia. Those facts suggest we have
> > > >> >>> a good international participation (I am being US-centric),
> > > >> >>> which is good. However, having half of the active committers
> > > >> >>> affiliated with one company is a potential problem.
> > > >> >>>
> > > >> >>> I'd like to hear your thoughts on this. What other metrics
> > > >> >>> should we collect, and what actions can we take?
> > > >> >>>
> > > >> >>> On Fri, Jul 10, 2020 at 11:29 AM Carlo Aldo Curino
> > > >> >>> <carlo.cur...@gmail.com> wrote:
> > > >> >>>
> > > >> >>> > Eric,
> > > >> >>> >
> > > >> >>> > Thank you so much for the support and for stepping up offering
> > > >> >>> > to work on this. I am super +1 on this. Let's give folks a few
> > > >> >>> > more days to chime in, in case there is anything to discuss
> > > >> >>> > before we get cracking!
> > > >> >>> >
> > > >> >>> > (Really) Thanks,
> > > >> >>> > Carlo
> > > >> >>> >
> > > >> >>> > On Fri, Jul 10, 2020, 10:38 AM Eric Badger
> > > >> >>> > <ebad...@verizonmedia.com> wrote:
> > > >> >>> >
> > > >> >>> > > Thanks for writing this up, Carlo. I'm +1 (idk if I'm
> > > >> >>> > > technically binding on this or not) for the changes moving
> > > >> >>> > > forward and I think we should refactor away any instances
> > > >> >>> > > that are internal to the code (i.e. not APIs or other things
> > > >> >>> > > that would break compatibility) in all active branches and
> > > >> >>> > > then also change the APIs in trunk (an incompatible change).
> > > >> >>> > >
> > > >> >>> > > I just came across an internal issue related to the NM
> > > >> >>> > > whitelist/blacklist. I would be happy to go refactor the
> > > >> >>> > > code and look for instances of these and replace them with
> > > >> >>> > > allowlist/blocklist. Doing a quick "git grep" of trunk, I
> > > >> >>> > > see 270 instances of "whitelist" and 1318 instances of
> > > >> >>> > > "blacklist".
> > > >> >>> > >
> > > >> >>> > > If there are no objections, I'll create a JIRA to clean this
> > > >> >>> > > specific stuff up. It would be wonderful if others could
> > > >> >>> > > pick up a different portion (e.g. master/slave) so that we
> > > >> >>> > > can spread the work out.
> > > >> >>> > >
> > > >> >>> > > Eric
> > > >> >>> > >
> > > >> >>> > > On Tue, Jul 7, 2020 at 6:27 PM Carlo Aldo Curino
> > > >> >>> > > <carlo.cur...@gmail.com> wrote:
> > > >> >>> > >
> > > >> >>> > >> Hello Folks,
> > > >> >>> > >>
> > > >> >>> > >> I hope you are all doing well...
> > > >> >>> > >>
> > > >> >>> > >> *The problem*
> > > >> >>> > >> The recent protests made me realize that we are not just
> > > >> >>> > >> bystanders of the systematic racism that affects our
> > > >> >>> > >> society, but we are active participants of it. Being
> > > >> >>> > >> "non-racist" is not enough, I strongly feel we should be
> > > >> >>> > >> actively "anti-racist" in our day to day lives, and
> > > >> >>> > >> continuously check our biases. I assume most of you will
> > > >> >>> > >> agree with the general sentiment, but based on your
> > > >> >>> > >> exposure to the recent events and US culture/history might
> > > >> >>> > >> have more or less strong feelings about your role in the
> > > >> >>> > >> problem and potential solution.
> > > >> >>> > >>
> > > >> >>> > >> *What can we do about it?* I think a simple action we can
> > > >> >>> > >> take is to work on our code/comments/documentation/websites
> > > >> >>> > >> and remove racist terminology. Here is an IETF draft to fix
> > > >> >>> > >> up some of the most egregious examples (master/slave,
> > > >> >>> > >> whitelist/blacklist) with proposed alternatives.
> > > >> >>> > >>
> > > >> >>> > >> https://urldefense.com/v3/__https://tools.ietf.org/id/draft-knodel-terminology-00.html*rfc.section.1.1.1__;Iw!!Op6eflyXZCqGR5I!XjCu5VSFdt2uqyuzlkc53KSBa6IM-M2Wun_FX6uD8fl99OAvaj9wb5A12dpg$
> > > >> >>> > >> <https://urldefense.com/v3/__https://tools.ietf.org/id/draft-knodel-terminology-00.html*rfc.section.1.1.1__;Iw!!Op6eflyXZCqGR5I!W9THsx9iZb2VObBrVY5_8ZRJKCws3YRAXARB-YTUElcUtxOBPWpiHWfGaWE7Lbogn7k$>
> > > >> >>> > >>
> > > >> >>> > >> As we go about this effort, we should also consider other
> > > >> >>> > >> "non-inclusive" terminology issues around gender (e.g.,
> > > >> >>> > >> binary gendered examples, "Alice" doing the wrong security
> > > >> >>> > >> thing systematically), and ableism (e.g., referring to
> > > >> >>> > >> misbehaving hardware as "lame" or "limping", etc.).
> > > >> >>> > >> The easiest action item is to avoid this going forward
> > > >> >>> > >> (ideally adding it to the checkstyles if possible); a more
> > > >> >>> > >> costly one is to start going back and refactor away
> > > >> >>> > >> existing instances.
> > > >> >>> > >>
> > > >> >>> > >> I know this requires a bunch of work as refactorings might
> > > >> >>> > >> break dev branches and non-committed patches, possibly
> > > >> >>> > >> scripts, etc. but I think this is something important and
> > > >> >>> > >> relatively simple we can do. The effect goes well beyond
> > > >> >>> > >> some text in GitHub, it signals what we believe in, and
> > > >> >>> > >> forces hundreds of users and contributors to notice and
> > > >> >>> > >> think about it. Our force-multiplier is huge and it matches
> > > >> >>> > >> our responsibility.
> > > >> >>> > >>
> > > >> >>> > >> What do you folks think?
> > > >> >>> > >>
> > > >> >>> > >> Thanks,
> > > >> >>> > >> Carlo
> > > >> >>
> > > >> >> --
> > > >> >> Best Regards,
> > > >> >>
> > > >> >> *Ahmed Hussein, PhD*