Hi Isabella,
Thanks very much for reaching out to me. It's great to hear that you
are interested in the Apache Software Foundation (ASF) and are doing some
research on the potential lifecycle of incubating projects. I think
that the topics you are looking to gather information on cover a
few areas within the ASF. While Community Development is a general
umbrella covering all Apache communities, we do have specific areas that
focus on D&I and on the Apache Incubator itself.
Our Apache Incubator community oversees the whole Apache incubation
process, while the D&I community has been instrumental in performing the
latest survey of diversity within the Apache communities and may be able
to give you a better indication of what diversity information we have
and can share.
In the meantime I will try to respond inline to your various points below.
A key tool we use to gather statistics and metrics for all Apache
projects is another Apache project called Apache Kibble. It also collects
contribution and statistics information on incubating projects, so
it may be worth taking a look at.
On 2020-11-02 16:10, Isabella Ferreira wrote:
Dear Sharan Foga,
My name is Isabella Ferreira and I attended your presentation at
CHAOSScon Europe 2020. Based on your presentation, I saw that you are
involved in several initiatives to encourage diversity within the ASF.
We, a group of Canadian and Dutch software engineering researchers,
are interested in understanding why some projects joining the Apache
Incubator grow and succeed, and others fail. Based on this study, our
eventual goal is to formulate recommendations, in terms of expectations
and best practices, for projects considering joining Apache. We aim to
share our findings with the Apache community as well as software
practitioners and researchers.
So far we have manually classified the incubator proposals of 292
projects to understand their motivation. We have found that these are
the top-5 reasons for joining the Apache incubator:
1. Community building
2. Community diversity
3. Follow an established development process (such as the "Apache Way")
4. Increase user base
5. Expected collaboration with other projects
As the next step, we would like to evaluate to what extent joining the
Apache ecosystem has enabled projects to achieve their goals. In
particular, we are interested in questions like:
* Did the number of organizations contributing to Apache projects
  increase compared to before joining the Apache incubator?
We have been using Apache Kibble to generate statistics for all our
projects. We don't currently track organisational affiliation
properly, but there have been discussions about ways to improve and
include it.
For projects coming into the Apache Incubator, I believe some organisational
affiliation is captured initially to ensure diversity of project
affiliation and the lack of dependency on one specific company. As you
mention, sometimes a project enters incubation to grow its community,
as it needs to diversify to survive.
* Did the geographical spread of contributions to Apache projects
  increase compared to before joining the Apache incubator?
I don't think Apache Kibble captures the geographical location of
contributions, but it does capture the time and date of each contribution,
if that is any help.
* Did the gender diversity of contributions to Apache projects
  increase compared to before joining the Apache incubator?
We do have the contributor ID, but Apache Kibble doesn't specifically
capture or report on this information. Perhaps our D&I community may be
able to help you here with some relevant details from the last Apache
Diversity survey.
While the GitHub and Subversion repositories of Apache projects
provide information about the kind of contributions made (size,
complexity, etc.), the information needed to address the above
questions is not as readily available.
Hence, since you are the current VP of Apache Community Development, we
would like to have your thoughts on what would be the best way to obtain
access to the above diversity data, without breaching any
confidentiality concerns:
* Is there a means to get access to Apache patch submitters’
  contributor agreements, for research purposes? If so, what is the
  process for this (e.g., NDAs to sign)?
The ASF site publicly lists the people and companies that
have signed an Individual or Corporate Contributor Licence Agreement
(ICLA or CCLA). If you are asking for access to the actual signed
documents, then no - this is not possible.
* Alternatively, is there a way for us to provide R or Python
  analysis scripts that someone with data access could run on our
  behalf, thus exposing only aggregate data to us?
* Another alternative would be to perform a series of interviews
  and/or a survey amongst Apache contributors, although its success
  would depend heavily on a high participation rate.
If your focus is on the Incubator, then by reaching out to them you may be
able to gather enough survey participants. What sort of participation
levels do you need to reach?
What are your thoughts on these points? Of course, we would be
interested in organizing a virtual call to clarify our research
objectives and/or questions.
I think that you have asked some interesting questions, but I am not
sure that we have all the information available. Some of the
information you have asked for, we cannot give you. Perhaps it would be
good to continue this discussion on our mailing list to explore a bit
more what public data we have that could help with your research.
I have copied our VP Apache Incubator, Justin McLean, and our VP Apache
Diversity & Inclusion, Gris Cuevas, who may also be able to respond with
their comments or any additional details that could help you.
Thanks
Sharan
Kind regards,
Isabella Ferreira, Polytechnique Montréal, Canada
Bram Adams, Queen’s University, Canada
Alexander Serebrenik, Eindhoven University of Technology, The Netherlands
Nan Yang, Eindhoven University of Technology, The Netherlands