Hi Andreas
Lets think about some better fine tuning. "NOT LIKE '%salsa%'" might
catch also Vcs URLs that are intentionally somewhere else. While I'd
love to see all packages on Salsa, it might be sensible to start with
packages that are unintentionally not in Salsa so
udd=> SELECT COUNT(DISTINCT source) FROM sources WHERE release = 'sid' AND
(vcs_url IS NULL OR vcs_url like '%alioth%' OR vcs_url like '%git.debian.org%' OR
vcs_url like '%svn.debian.org%') ;
count
-------
2213
That might make a real challenge to bring that number below 2000 until
end of my term. Any help to approach this is welcome.
Well, let's look at some of these other d.o URLs.
- Not our alioth: There are 16 vcs URLs in there that have 'alioth'
in them but aren't alioth.debian.org; they are git hosted but not
on Debian infrastructure (and perhaps not in a place that facilitates
collaboration in the way being discussed)
- dgit.debian.org: There are 30 in there that are dgit.debian.org.
That surprised me, maybe I don't know enough about dgit.
- git.debian.org: There are 146 with git.debian.org - none of these VCS
URLs work any more
- svn.debian.org: !4 list svn.d.o but like git.d.o that's dead. svn.d.o
doesn't even exist as a hostname any more.
There's 161 packages in sid with old d.o URLs pointing to alioth.
There's a reasonable chance that a good portion of them are also not
maintained.
- 11% of them list their maintainer as Debian QA Group
- 13% of them have a current O bug (another 1 with an RFA)
- who knows how many are otherwise abandoned with MIA maintainers or
maintainers who have just moved on to other things
There was a recent discussion about what to do with VCSes for orphaned
packages. Maybe if it doesn't exist on salsa, it's worth creating one in
the salsa.d.o/debian/ namespace as part of doing the QA upload?
(gbp import-dscs --debsnap) That would be a good outcome and a good
little project for someone...
The vast majority of these packages have seen post-alioth uploads but
with the broken Vcs fields still in place. That's perhaps offering the
opposite of collaborative development? The question is whether the repo
has actually moved to salsa but d/control hasn't been updated, or
whether the repo has just vanished. An MBF that the VCS fields are out
of date is easy, but checking and fixing is likely manual work.
year | count
-----+-------
2011 | 1
2012 | 4
2013 | 3
2014 | 4
2015 | 1
2016 | 1
2017 | 2
2018 | 2 (salsa.d.o general availability)
2019 | 1
2020 | 13
2021 | 95
2022 | 20
2023 | 7
2024 | 6
2025 | 1
I noticed that some teams have some lintian tags checking this from a
team policy perspective - doing this more broadly for other teams would
help provide teams with visibility via lintian.d.o reports.
lintian-explain-tags -t team/pkg-perl/vcs/no-git \
team/pkg-perl/vcs/no-team-url
(I accidentally found 2 python-team packages without Vcs URLs yesterday
- the repos were on salsa, just not listed in d/control)
Over half of these old alioth URLs can be addressed by Teams doing some
data normalisation and uploads:
maintainer_name | count
-------------------------------+-------
Debian Perl Group | 72
Debian Java Maintainers | 10
Debian X Strike Force | 7
Debian XML/SGML Group | 4
Debian Science Maintainers | 3
Debian CLI Applications Team | 2
Debian Ruby Extras Maintainers | 1
Debian Javascript Maintainers | 1
Debian Telepathy maintainers | 1
Debian Fonts Task Force | 1
Debian CLI Libraries Team | 1
Debian-IN Team | 1
Debichem Team | 1
NeuroDebian Team | 1
The Debian Lua Team | 1
So in terms of where to start... perhaps there's a couple of teams that
would like to do some data cleansing?
regards
Stuart
SELECT
s.source, date, vcs_url
FROM
sources AS s
JOIN upload_history AS h
ON s.source = h.source AND s.version = h.version
WHERE
release = 'sid' AND
vcs_url ~ '/(git|svn|alioth).debian.org'
ORDER BY
date DESC;
SELECT
DATE_PART('year', date) AS year,
COUNT(*)
FROM
sources AS s
JOIN upload_history AS h
ON s.source = h.source AND s.version = h.version
WHERE
release = 'sid'
AND vcs_url ~ '/(git|svn|alioth).debian.org'
GROUP BY
year
ORDER BY
year ASC;
SELECT
maintainer, COUNT(*)
FROM sources
WHERE
release = 'sid'
AND vcs_url ~ '/(git|svn|alioth).debian.org'
AND maintainer ~ '(team|group|lists)'
GROUP BY
maintainer
ORDER BY
count DESC;
--
Stuart Prescott http://www.nanonanonano.net/ stu...@nanonanonano.net
Debian Developer http://www.debian.org/ stu...@debian.org
GPG fingerprint 90E2 D2C1 AD14 6A1B 7EBB 891D BBC1 7EBB 1396 F2F7