On Fri, Aug 05, 2016 at 04:38:30PM +0100, Manuel López-Ibáñez wrote:
> I think those conclusions are debatable:

I won't respond to all your points (I'm busy this evening), but I can
regenerate my table with some of your suggestions.

> * GCC has also grown over the years, there is a lot more code and
> areas, specially more targets, which attract their own temporary
> developers who do not contribute to the rest of the compiler (much
> less review patches for the rest of the compiler).
>
> * Your analysis includes Ada, Go and Fortran. I'd suggest to exclude
> them, since in terms of developers and reviewing, they seem to be
> doing fine. They also tend to organise themselves mostly independently
> of the rest of the compiler. This is also mostly true for targets.

Excluding these is tricky, but in principle it is just a matter of
tweaking the git shortlog command. If that's something you want to do,
I'd be interested to see the results. I didn't get reasonable results in
time for the history back to 1998 to present those numbers, but in a few
rough tests they didn't look vastly different (when filtering on
gcc/*.[ch]). I've given the 2012-2015 numbers below, just to show that
(for the files in gcc/*.[ch]) your hypothesis doesn't hold. The vast
majority of committers make <20 commits in a year.

Year            | 2012 | 2013 | 2014 | 2015
Commits         | 1816 | 1632 | 2148 | 2362
Committers      |   98 |  110 |  109 |  114
Average commits |   19 |   15 |   20 |   21

Number of committers achieving N commits by bucket...

1-19            |   78 |   96 |   92 |   94
20-39           |   12 |    4 |    5 |    7
40-59           |    2 |    3 |    7 |    3
60-79           |    1 |    3 |    0 |    0
80-99           |    1 |    1 |    0 |    2
100-199         |    2 |    1 |    2 |    6
200+            |    2 |    2 |    3 |    2

Percentage of committers achieving N commits by bucket...

1-19            |   80 |   87 |   84 |   82
20-39           |   12 |    4 |    5 |    6
40-59           |    2 |    3 |    6 |    3
60-79           |    1 |    3 |    0 |    0
80-99           |    1 |    1 |    0 |    2
100-199         |    2 |    1 |    2 |    5
200+            |    2 |    2 |    3 |    2

> * 100 commits is less than 2%. Quite a low threshold. Perhaps 1%, 25%,
> 50%, 75%, 90% are more informative.

Again, that was just done to save time. I've changed the last two
buckets to 100-199 and 200+ in this run.
If you'd like to do that, I'd be happy to see the results.

> * https://www.openhub.net/p/taezaza/contributors/summary shows that
> more than 25% of the commits in the last 12 months were made by 6
> people. Note that those people are also the most active reviewers.

True, but as you point out below, a few data samples tell us little.

> * If I adjust the numbers by the total number of contributors, then we
> get a different picture:

I've added that to my table.

> that is, most of the commits are done by smaller fraction of the
> total.

For 2015 I found the 4 "25%" marks (share of committers by number of
commits) to be:

  26%   1-4
  25%   5-13
  25%   14-39
  23%   40+

So about three quarters of the people who committed in 2015 made fewer
than 40 commits in the year. Encouragingly, about half of them committed
at least one patch per month (on average).

> * Numbers for other years might shed more light. 2010, 2013 and 2015
> might have been especial in one sense or another.

I compressed this for space. The full table is below (and attached -
just in case your mail client gets zealous with the text and re-wraps
it).

Year            | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 | 2014 | 2015 |
Commits         | 4997 | 5531 | 7031 | 6850 | 6704 | 7961 | 9137 | 7646 | 5039 | 6633 | 5667 | 6244 | 7582 | 8181 | 6463 | 5970 | 7497 | 7742 |
Committers      |   44 |   65 |   89 |  116 |  128 |  153 |  153 |  167 |  163 |  167 |  172 |  174 |  171 |  176 |  181 |  176 |  204 |  190 |
Average commits |  114 |   85 |   79 |   59 |   52 |   52 |   60 |   46 |   31 |   40 |   33 |   36 |   44 |   46 |   36 |   34 |   37 |   41 |

Number of committers achieving N commits by bucket...

1-19            |   16 |   29 |   38 |   55 |   64 |   80 |   71 |   91 |   97 |   91 |  111 |  107 |  103 |  114 |  116 |  110 |  131 |  116 |
20-39           |    8 |   12 |   19 |   19 |   18 |   21 |   26 |   18 |   28 |   26 |   25 |   29 |   28 |   18 |   26 |   28 |   32 |   31 |
40-59           |    4 |    5 |    4 |    9 |    8 |   13 |   10 |   15 |   13 |   19 |   15 |   15 |   13 |    9 |   15 |   16 |   16 |   12 |
60-79           |    3 |    4 |    4 |    7 |   11 |    8 |   10 |   13 |    4 |    9 |    6 |    6 |    3 |   12 |    7 |    4 |    6 |    5 |
80-99           |    2 |    3 |    4 |    6 |    7 |   12 |   10 |    7 |    7 |    5 |    4 |    3 |    4 |    4 |    0 |    4 |    3 |    4 |
100-199         |    6 |    6 |    8 |   11 |   13 |    8 |   13 |   16 |   12 |   10 |    6 |    6 |    9 |    8 |    8 |    8 |    6 |   11 |
200+            |    5 |    6 |   12 |   10 |    8 |   12 |   14 |    8 |    3 |    8 |    6 |    9 |   12 |   12 |   10 |    7 |   11 |   12 |

Percentage of committers achieving N commits by bucket...

1-19            |   36 |   45 |   43 |   47 |   50 |   52 |   46 |   54 |   60 |   54 |   65 |   61 |   60 |   65 |   64 |   62 |   64 |   61 |
20-39           |   18 |   18 |   21 |   16 |   14 |   14 |   17 |   11 |   17 |   16 |   15 |   17 |   16 |   10 |   14 |   16 |   16 |   16 |
40-59           |    9 |    8 |    4 |    8 |    6 |    8 |    7 |    9 |    8 |   11 |    9 |    9 |    8 |    5 |    8 |    9 |    8 |    6 |
60-79           |    7 |    6 |    4 |    6 |    9 |    5 |    7 |    8 |    2 |    5 |    3 |    3 |    2 |    7 |    4 |    2 |    3 |    3 |
80-99           |    5 |    5 |    4 |    5 |    5 |    8 |    7 |    4 |    4 |    3 |    2 |    2 |    2 |    2 |    0 |    2 |    1 |    2 |
100-199         |   14 |    9 |    9 |    9 |   10 |    5 |    8 |   10 |    7 |    6 |    3 |    3 |    5 |    5 |    4 |    5 |    3 |    6 |
200+            |   11 |    9 |   13 |    9 |    6 |    8 |    9 |    5 |    2 |    5 |    3 |    5 |    7 |    7 |    6 |    4 |    5 |    6 |

Personally, I think that looks like a fairly stable and healthy
community, but you're welcome to draw your own conclusions from the
data.
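
For anyone who wants "25%" marks like the ones quoted for 2015, the
sketch below shows one possible way to get them. It is only a sketch
under stated assumptions: the function name quartile_marks is made up
for illustration, and it assumes the marks are the commit counts found
25%, 50% and 75% of the way through the committer list when sorted by
commit count, which may not be exactly how the figures above were
produced.

```shell
# Hypothetical helper (not part of the original mail): reads
# "git shortlog -s -n" style lines on stdin ("<count>\t<name>") and
# prints the commit count found at the 25%, 50% and 75% points of the
# committer list, sorted by ascending commit count.
quartile_marks() {
  awk '{ print $1 }' | sort -n | awk '
    { v[NR] = $1 }                      # sorted commit counts, 1-indexed
    END {
      if (NR == 0) exit                 # no committers, nothing to print
      for (q = 1; q <= 3; q++) {
        idx = int((NR * q + 3) / 4)     # ceil(NR * q / 4)
        printf "%d%%\t%d\n", q * 25, v[idx]
      }
    }'
}
```

In a git checkout this could be fed with something like
"git shortlog -s -n --since=01/01/2015 --until=01/01/2016 | quartile_marks".
Committers at or below the 75% value then make up roughly three
quarters of the population.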

The raw data for these tables can be generated with:

  for i in {1989..2015}; do
    printf "%d\t" $i
    git shortlog -s -n --since=01/01/$i --until=01/01/$((i+1)) \
      | awk '{
          # gccadmin is excluded from the commit and committer totals
          # only; the bucket counts below still include that account.
          if ($2 != "gccadmin") { sum += $1; count += 1 }
          if      ($1 < 20)  { bucket1 += 1 }
          else if ($1 < 40)  { bucket2 += 1 }
          else if ($1 < 60)  { bucket3 += 1 }
          else if ($1 < 80)  { bucket4 += 1 }
          else if ($1 < 100) { bucket5 += 1 }
          else if ($1 < 200) { bucket6 += 1 }
          else               { bucket7 += 1 }
        }
        END {
          # Totals, average, bucket counts, then bucket percentages.
          printf "%d\t%d\t%.0f\t%d\t%d\t%d\t%d\t%d\t%d\t%d\t%.0f\t%.0f\t%.0f\t%.0f\t%.0f\t%.0f\t%.0f\n",
            sum, count, (sum/count),
            bucket1, bucket2, bucket3, bucket4, bucket5, bucket6, bucket7,
            (bucket1/count)*100, (bucket2/count)*100, (bucket3/count)*100,
            (bucket4/count)*100, (bucket5/count)*100, (bucket6/count)*100,
            (bucket7/count)*100
        }'
  done

in your git checkout.

Hope that helps.

Thanks,
James
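
P.S. In the tables above the bucket counts for some years add up to one
more than the committer count (for 2015, 191 against 190), which is
consistent with the gccadmin account being dropped from the totals but
not from the buckets. The sketch below is a variant that skips that
account before anything is counted. The function name
bucket_committers is hypothetical, and its output will differ slightly
from the tables for any year in which gccadmin committed.

```shell
# Hypothetical helper (not part of the original mail): reads
# "git shortlog -s -n" style lines on stdin and prints one
# tab-separated row: commits, committers, average, 7 bucket counts,
# 7 bucket percentages.
bucket_committers() {
  awk '
    $2 == "gccadmin" { next }           # skip before any counting
    {
      sum += $1; count += 1
      if      ($1 < 20)  b[1]++
      else if ($1 < 40)  b[2]++
      else if ($1 < 60)  b[3]++
      else if ($1 < 80)  b[4]++
      else if ($1 < 100) b[5]++
      else if ($1 < 200) b[6]++
      else               b[7]++
    }
    END {
      if (count == 0) exit              # avoid dividing by zero
      printf "%d\t%d\t%.0f", sum, count, sum / count
      for (i = 1; i <= 7; i++) printf "\t%d", b[i]
      for (i = 1; i <= 7; i++) printf "\t%.0f", b[i] / count * 100
      printf "\n"
    }'
}
```

It drops into the loop above in place of the awk command, for example
"git shortlog -s -n --since=01/01/$i --until=01/01/$((i+1)) | bucket_committers".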