Hi Mark,

I tried to get the contributors from the svn side with `svn log --quiet -v http://svn.apache.org/repos/asf/tomcat | grep "^r"` and got 100 contributors in total. After removing duplicates against the git data, we now have 95 contributors up to 2012 and ~150 up to now (compared to 20/90 and 10/60 before). See https://www.apiseven.com/en/contributor-graph?chart=contributorOverTime&repo=apache/tomcat for the graph.
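A minimal sketch of how the `grep "^r"` output could be reduced to one first-commit date per svn author. It assumes each header line has the shape `r<rev> | <author> | <YYYY-MM-DD HH:MM:SS ±HHMM> (...)`; the function names `first_commits` and `new_contributors_per_year` are hypothetical, chosen only for this illustration.

```python
import re
from datetime import datetime

# Assumed header shape from `svn log --quiet`, e.g.
#   r1234 | markt | 2003-12-10 16:29:06 -0500 (Wed, 10 Dec 2003)
HEADER = re.compile(
    r"^r(\d+) \| (.+?) \| (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})"
)


def first_commits(log_lines):
    """Map each svn author to the timestamp of their earliest commit."""
    first = {}
    for line in log_lines:
        m = HEADER.match(line)
        if not m:
            continue  # separator lines, changed-path lines, etc.
        author = m.group(2)  # may be the literal "(no author)"
        when = datetime.strptime(m.group(3), "%Y-%m-%d %H:%M:%S %z")
        if author not in first or when < first[author]:
            first[author] = when
    return first


def new_contributors_per_year(first):
    """Count how many authors made their first commit in each year."""
    counts = {}
    for when in first.values():
        counts[when.year] = counts.get(when.year, 0) + 1
    return dict(sorted(counts.items()))
```

Feeding it the lines of the saved `svn log` output (for example via `open(path)`) gives a per-year count of first-time svn authors. It only sees svn usernames, so patch authors credited in commit messages or the changelog are not counted.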
Since an email address that is not bound to a GitHub account is counted as an "anonymous" contributor, and I could only get 29 anonymous ones from the Tomcat GitHub repo, it looks like quite a lot of contributors were lost during the switch. In fact, the earliest commit I could get from the Tomcat GitHub repo is from 2006, so something was certainly lost.

As for 2012, the data I have still shows no new contributors in that year. There are some commits with "no author", but none in 2012. (I tried to figure out how to deal with "no author", but it seems there is nothing I can do on my side. Please correct me if I'm wrong.)

I'll list the svn-side contributors here, so maybe you can tell whether anything went wrong.

duncan, 1999-10-08 20:05:52 -0400 EDT
(no author), 1999-10-08 20:05:52 -0400 EDT
costin, 1999-10-10 17:19:44 -0400 EDT
craigmcc, 1999-10-12 01:42:09 -0400 EDT
gonzo, 1999-10-12 03:17:47 -0400 EDT
bergsten, 1999-10-12 04:33:51 -0400 EDT
stefano, 1999-10-12 18:43:06 -0400 EDT
akv, 1999-10-12 21:12:44 -0400 EDT
mode, 1999-10-14 21:24:49 -0400 EDT
harishp, 1999-10-14 23:20:35 -0400 EDT
arun, 1999-10-15 18:26:05 -0400 EDT
mandar, 1999-10-17 14:40:06 -0400 EDT
jhunter, 1999-10-17 23:03:40 -0400 EDT
vanitha, 1999-10-18 19:49:24 -0400 EDT
jons, 1999-11-23 14:46:12 -0500 EST
pier, 1999-12-03 07:41:42 -0500 EST
rubys, 1999-12-07 20:37:20 -0500 EST
shemnon, 2000-01-12 01:14:36 -0500 EST
preston, 2000-01-17 05:17:36 -0500 EST
shachor, 2000-02-17 05:37:42 -0500 EST
jon, 2000-03-24 13:43:48 -0500 EST
jluc, 2000-03-29 14:30:26 -0500 EST
nacho, 2000-04-03 20:56:55 -0400 EDT
ed, 2000-06-15 14:58:19 -0400 EDT
alex, 2000-06-22 19:02:53 -0400 EDT
glenn, 2000-07-25 08:13:53 -0400 EDT
jiricka, 2000-07-28 17:41:44 -0400 EDT
pierred, 2000-08-11 17:32:39 -0400 EDT
remm, 2000-08-11 20:17:35 -0400 EDT
dannyc, 2000-08-16 15:53:22 -0400 EDT
horwat, 2000-08-16 20:58:20 -0400 EDT
larryi, 2000-08-26 09:03:38 -0400 EDT
santosh, 2000-10-02 18:44:57 -0400 EDT
arieh, 2000-10-06 16:42:00 -0400 EDT
eduardop, 2000-10-11 20:29:52 -0400 EDT
hgomez, 2000-11-15 06:37:25 -0500 EST
rameshm, 2000-12-15 19:23:33 -0500 EST
marcsaeg, 2000-12-21 14:24:19 -0500 EST
danmil, 2000-12-25 17:31:58 -0500 EST
keith, 2001-02-02 11:41:52 -0500 EST
kief, 2001-02-13 03:58:54 -0500 EST
melaquias, 2001-03-04 17:38:14 -0500 EST
amyroh, 2001-03-21 16:31:46 -0500 EST
clucas, 2001-03-24 01:49:30 -0500 EST
bip, 2001-04-25 21:30:27 -0400 EDT
seguin, 2001-05-12 01:52:38 -0400 EDT
jfclere, 2001-06-05 03:55:52 -0400 EDT
andya, 2001-06-13 17:26:45 -0400 EDT
mmanders, 2001-06-13 17:28:28 -0400 EDT
ccain, 2001-08-31 16:15:12 -0400 EDT
bojan, 2001-09-25 00:33:45 -0400 EDT
billbarker, 2001-10-02 01:38:21 -0400 EDT
kinman, 2001-10-03 15:26:47 -0400 EDT
patrickl, 2001-11-06 16:52:14 -0500 EST
rlubke, 2001-12-12 08:11:47 -0500 EST
manveen, 2002-01-26 15:52:58 -0500 EST
ekr, 2002-05-28 10:19:47 -0400 EDT
dsandberg, 2002-06-05 15:09:17 -0400 EDT
cks, 2002-06-20 12:16:00 -0400 EDT
mturk, 2002-06-23 01:40:29 -0400 EDT
luehe, 2002-06-26 12:50:38 -0400 EDT
morgand, 2002-07-22 14:41:34 -0400 EDT
bobh, 2002-08-14 16:54:57 -0400 EDT
jfarcand, 2002-08-22 08:48:56 -0400 EDT
idarwin, 2002-09-13 12:53:33 -0400 EDT
fhanik, 2003-02-19 15:24:10 -0500 EST
funkman, 2003-06-01 16:57:00 -0400 EDT
yoavs, 2003-06-06 23:35:38 -0400 EDT
ecarmich, 2003-08-23 21:18:44 -0400 EDT
markt, 2003-12-10 16:29:06 -0500 EST
truk, 2004-01-30 16:54:40 -0500 EST
fuankg, 2004-04-06 12:07:58 -0400 EDT
pero, 2004-09-21 03:30:32 -0400 EDT
wrowe, 2005-05-11 19:38:30 -0400 EDT
clar, 2005-06-10 12:24:35 -0400 EDT
bayard, 2005-08-04 20:24:38 -0400 EDT
jim, 2005-11-04 18:47:49 -0500 EST
jhook, 2006-03-05 14:18:11 -0500 EST
rjung, 2006-05-10 04:12:29 -0400 EDT
fcarrion, 2007-03-24 21:08:07 -0400 EDT
kkolinko, 2009-05-15 18:50:29 -0400 EDT
rahul, 2009-08-17 13:39:14 -0400 EDT
timw, 2010-02-07 02:33:25 -0500 EST
kfujino, 2010-03-31 02:08:32 -0400 EDT
jboynes, 2010-07-08 02:12:25 -0400 EDT
schultz, 2010-11-23 17:03:23 -0500 EST
slaurent, 2010-12-02 17:14:23 -0500 EST
eijit, 2011-08-26 00:36:56 -0400 EDT
olamy, 2011-08-30 04:23:49 -0400 EDT
violetagg, 2013-01-31 09:49:04 -0500 EST
kpreisser, 2013-09-24 15:10:44 -0400 EDT
fschumacher, 2014-09-19 11:25:29 -0400 EDT
ognjen, 2015-10-23 11:08:40 -0400 EDT
mgrigorov, 2015-10-27 03:50:01 -0400 EDT
huxing, 2016-08-31 10:04:33 -0400 EDT
csutherl, 2016-10-03 11:55:16 -0400 EDT
ebourg, 2017-01-20 15:24:24 -0500 EST
isapir, 2018-05-21 15:30:01 -0400 EDT
michaelo, 2018-08-21 04:16:42 -0400 EDT
woonsan, 2019-01-08 00:01:45 -0500 EST

I'm not familiar with svn at all :( so I'm not sure whether I did this correctly. Also, I didn't understand how to search with "provided by <name>(<asf-id>)". I'd appreciate any further guidance, and thanks again for your time!

Best,
Shuyang

Mark Thomas <ma...@apache.org> wrote on Mon, May 17, 2021 at 2:16 PM:

> On 17/05/2021 03:55, Shuyang Wu wrote:
> > Hi Mark,
> >
> > I've updated the "anonymous" contributors, and currently there are around
> > 20 contributors in early 2012, and 90 for now (compared to 10/~60
> > separately before). Would those data be more reasonable?
>
> I am afraid these figures are still very misleading.
>
> Taking the year 2012 as an example, it shows no new contributors. A
> quick look at the changelog for Tomcat for 2012 shows that this is not
> the case. I very quickly found a handful of first time contributors and
> I only looked at a couple of months.
>
> I would be interested to see what, if any, difference the switch from
> svn to git made but I don't see a way to generate that data short of
> manually processing the changelog entries (and I am not interested
> enough in the answer to want to do that right now).
>
> For those that are interested, searching for "provided by <name>.
> (<asf-id>) should be a fairly reliable pattern for identifying names of
> contributors.
> You'll also need to look at (<asf-id>) as if the
> contributor was an ASF committer but not a Tomcat committer we'd
> normally recognise them that way.
>
> Mark
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: users-unsubscr...@tomcat.apache.org
> For additional commands, e-mail: users-h...@tomcat.apache.org
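For anyone who wants to try the changelog search Mark describes, here is a rough sketch of matching the "provided by <name>. (<asf-id>)" pattern in plain changelog text. The exact wording of real changelog entries is an assumption here, the regular expression is only one guess at it, and `changelog_contributors` is a hypothetical helper name.

```python
import re

# Assumed entry ending: "... provided by Some Name. (committerid)"
# The name may itself contain periods (initials), so only parentheses
# are excluded from it; the trailing period before "(" is optional.
PROVIDED_BY = re.compile(
    r"provided by\s+(?P<name>[^()]+?)\.?\s*\((?P<asf_id>[A-Za-z0-9_-]+)\)",
    re.IGNORECASE,
)


def changelog_contributors(text):
    """Return (contributor name, ASF id in parentheses) pairs found in text."""
    # Collapse line wraps so an entry split across lines still matches.
    normalized = " ".join(text.split())
    return [
        (m.group("name").strip(), m.group("asf_id"))
        for m in PROVIDED_BY.finditer(normalized)
    ]
```

The first element of each pair is the credited contributor and the second is the id in parentheses. If an entry says "provided by" but the next parenthesis belongs to unrelated text, the captured name will be too long, so the results would still need a manual check.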