YA update

On Wed, May 02, 2018 at 02:52:09PM +0100, Steve McIntyre wrote:
>Update again...
>
>On Sun, Apr 29, 2018 at 01:12:51PM +0100, Steve McIntyre wrote:
>>
>>So, an initial run of svn2git just ignoring the people directory took
>>11.5h here, and gave me a .git dir of ~680M. That's quite
>>big. I'm re-doing it now with an "authors" file in place, to get
>>something more usable.
>
>If anybody would like to play with this, I've just uploaded it to
>
>https://salsa.debian.org/93sam/d-i-test1
>
>and I'll leave it up there for now.
>
>>Discussing with KiBi on IRC last night, we're thinking that it's
>>probably worth splitting the manual off into a separate
>>project/repo. I'll try doing that too, and see what we get.
>
>To make things go much faster, I grabbed a copy of the svn repo
>directly and I've been running with that. It goes *much* more quickly
>due to the latency reduction on each revision checkout, but it
>reliably fails with:
>
>Name does not refer to a filesystem directory: Failure opening
>'/trunk/installer/build/pkg-lists/netboot/mipsel/sb1-swarm-bn.cfg':
>'/trunk/installer/build/pkg-lists/netboot/mipsel' is not a directory
>in filesystem '48c42b26-1dd6-0310-b98f-a58d8bce7237' at
>/usr/share/perl5/Git/SVN/Ra.pm line 312
>
>at r35516. Joy. I've not modified the svn data files in any way, and
>this worked from alioth...
>
>Having looked online, I find various recommendations to avoid using
>this version of svn2git (which is a simple wrapper around
>git-svn). I'm now trying the svn2git tool the KDE people used for
>migration:
>
>  https://github.com/svn-all-fast-export/svn2git.git

After some fighting with the config, I've used this tool with the
attached configs: a rules file to control what goes where, and a
mapping file for username -> name/email lookups.
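
For anyone wanting to sanity-check a mapping file of this shape before
a long conversion run, here is a minimal sketch. It assumes one
"username = Full Name <email>" entry per line, as in the attached file.
The function name, the sample entries and the example.org addresses
are hypothetical, and skipping blank and '#' lines is this sketch's own
choice, not documented behaviour of the tool.

```python
import re

# One entry per line: "svnuser = Full Name <email>"
LINE_RE = re.compile(
    r"^(?P<user>\S+)\s*=\s*(?P<name>.*?)\s*<(?P<email>[^>]*)>\s*$"
)


def parse_identity_map(text):
    """Return {svn_username: (full_name, email)} from mapping-file text."""
    mapping = {}
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        # Assumption of this sketch: blank lines and '#' lines are skipped.
        if not line or line.startswith("#"):
            continue
        m = LINE_RE.match(line)
        if m is None:
            raise ValueError(f"line {lineno}: cannot parse {line!r}")
        # Collapse runs of spaces so names come out clean in git.
        name = " ".join(m.group("name").split())
        mapping[m.group("user")] = (name, m.group("email"))
    return mapping


# Hypothetical sample entries, not taken from the real file.
SAMPLE = """\
jdoe = Jane Doe <jane@example.org>
jdoe-guest = Jane  Doe  <jane@example.org>
"""

if __name__ == "__main__":
    for user, (name, email) in parse_identity_map(SAMPLE).items():
        print(f"{user}: {name} <{email}>")
```

Running it over the real file would flag malformed lines early, rather
than partway through a conversion.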

It is *massively* faster than the first tool, roughly 15-20x. That
makes it much more feasible to run this a few times with different
configs and compare the results. For now, I've not filtered any
branches or anything, but I've ignored /people and /README and moved
the manual out into a separate repo. The outputs from this run were
surprisingly much bigger than my first test repo, as du on bare clones
of each shows:

tack:/tmp$ du -s test*
613888  test1-bare.git
3653432 test2-bare.git
714336  test2-manual-bare.git

I've not worked out why yet. In case people might find them useful (or
maybe find time to have a look!), I've pushed these new test repos to
salsa too:

  https://salsa.debian.org/93sam/d-i-test2
  https://salsa.debian.org/93sam/d-i-test2-manual
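
To put rough numbers on the size difference, a small sketch using the
du figures above. It assumes du reported 1 KiB blocks (the GNU default)
and that test1 still contains the manual, so the fair comparison is
test1 against test2 plus test2-manual. The helper names are
hypothetical.

```python
# Sizes from the `du -s` output above, assumed to be in 1 KiB blocks.
SIZES_KIB = {
    "test1-bare.git": 613888,
    "test2-bare.git": 3653432,
    "test2-manual-bare.git": 714336,
}


def to_mib(kib):
    """Convert a size in KiB to MiB."""
    return kib / 1024


def ratio(numerator_kib, denominator_kib):
    """How many times larger the first size is than the second."""
    return numerator_kib / denominator_kib


if __name__ == "__main__":
    for name, kib in SIZES_KIB.items():
        print(f"{name:24s} {to_mib(kib):8.1f} MiB")

    test1 = SIZES_KIB["test1-bare.git"]
    test2 = SIZES_KIB["test2-bare.git"]
    # Assumption: test1 includes the manual, so add it back for test2.
    combined = test2 + SIZES_KIB["test2-manual-bare.git"]
    print(f"test2 alone vs test1:    {ratio(test2, test1):.2f}x")
    print(f"test2 + manual vs test1: {ratio(combined, test1):.2f}x")
```

On those assumptions the second run comes out at roughly six times the
size of the first, or about seven times once the manual is counted.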

Suggestions on what else we might want to separate or prune here would
be helpful. I don't really like the idea of losing our history. We
could maybe prune old branches, but I'm not sure it'll save much. Or
am I worrying too much about the repo sizes already?

-- 
Steve McIntyre, Cambridge, UK.                                [email protected]
You lock the door
And throw away the key
There's someone in my head but it's not me 
93sam = Steve McIntyre <[email protected]>
adioe3-guest = Armin Beširović <[email protected]>
adn = Mohammed Adnène Trojette <[email protected]>
adn-guest = Mohammed Adnène Trojette <[email protected]>
adorosh-guest = Andrei Darashenka <[email protected]>
adrianorg = Adriano Rafael Gomes <[email protected]>
adrianorg-guest = Adriano Rafael Gomes <[email protected]>
agx = Guido Guenther <[email protected]>
aiet-guest = Aiet Kolkhi <[email protected]>
aigarius = Aigars Mahinovs <[email protected]>
ajenbo-guest = Anders Jenbo <[email protected]>
ajt = Anthony Towns <[email protected]>
akumar = Kumar Appaiah <[email protected]>
albbas-guest = Børre Gaup <[email protected]>
alphix-guest = David Härdeman <[email protected]>
amanpreet-guest = A S Alam <[email protected]>
amp-guest = Andrei Popescu <[email protected]>
andersee = Erik B. Andersen <[email protected]>
andread-guest = Andreas Dahl <[email protected]>
andreas = Andreas Schuldei <[email protected]>
andred-guest = André Dahlqvist <[email protected]>
andrelop = Andre Luis Lopes <[email protected]>
anmar-guest = Anmar Oueja <[email protected]>
aph = Adam Di Carlo <[email protected]>
appaji = Giridhar Appaji Nag Yasa <[email protected]>
appaji-guest = Y Giridhar Appaji Nag <[email protected]>
arangel-guest = Arangel Angov <[email protected]>
arashb-guest = Arash Bijanzadeh <[email protected]>
arief-guest = Arief S Fitrianto <[email protected]>
arjunaraoc-guest = Arjuna Rao Chavala <[email protected]>
athornton-guest = Adam Thornton <[email protected]>
aurel32 = Aurelien Jarno <[email protected]>
baptiste-guest = Baptiste Jammet <[email protected]>
barbier = Denis Barbier <[email protected]>
bcollins = Ben Collins <[email protected]>
bdale = Bdale Garbee <[email protected]>
benh = Ben Hutchings <[email protected]>
berserker-guest = Pavel Piatruk <[email protected]>
bibar-guest = Denis ARNAUD <[email protected]>
blade = Eduard Bloch <[email protected]>
borman-guest = Borys Yanovych <[email protected]>
brother = Martin Bagge <[email protected]>
brother-guest = Martin Bagge <[email protected]>
bruno-guest = Bruno Barrera <[email protected]>
bubulle = Christian Perrier <[email protected]>
bug1 = Glenn McGrath <[email protected]>
cajus = Cajus Pollmeier <[email protected]>
carlosliu-guest = Carlos Z.F. Liu <[email protected]>
catman-guest = Bjørn Steensrud <[email protected]>
cjwatson = Colin Watson <[email protected]>
claush = Claus Hindsgaul <[email protected]>
claush-guest = Claus Hindsgaul <[email protected]>
clytie-guest = Clytie Siddall <[email protected]>
cobaco = Bart Cornelis <[email protected]>
cobaco-guest = Bart Cornelis <[email protected]>
cperrier-guest = Christian Perrier (test account) <[email protected]>
cri-guest = Cristian Rigamonti <[email protected]>
cvelbar-guest = Vanja Cvelbar <[email protected]>
cwryu = Changwoo Ryu <[email protected]>
daf-guest = Dafydd Harries <[email protected]>
dam-guest = Damyan Ivanov <[email protected]>
damog-guest = David Moreno Garza <[email protected]>
damu-guest = Damodharan Rajalingam <[email protected]>
dan = Dan Jacobowitz <[email protected]>
dancer = Junichi Uekawa <[email protected]>
daniel = Daniel Baumann <[email protected]>
danishka-guest = Danishka Navin <[email protected]>
dannf = Dann Frazier <[email protected]>
danweber-guest = Dan Weber <[email protected]>
darehanl-guest = Sunjae Park <[email protected]>
ddt1984-guest = Jung Seung-Cheol <[email protected]>
deep-guest = Jaonary Rabarisoa <[email protected]>
demarchi-guest = Innocent De Marchi <[email protected]>
di-l10n-guest = l10n Commit Robot <[email protected]>
diablero-guest = Thomas Poindessous <[email protected]>
dilinger-guest = Andres Salomon <[email protected]>
dmn = Damyan Ivanov <[email protected]>
dnusinow-guest = David Nusinow <[email protected]>
dreamcry-guest = Jia-Wei Jhang <[email protected]>
drssay-guest = Seok-moon Jang <[email protected]>
drtv-guest = Vasudevan Tirumurti <[email protected]>
dsg-guest = Davíð Steinn Geirsson <[email protected]>
dwhedon = David Whedon <[email protected]>
echoray-guest = Alwin Meschede <[email protected]>
eddyp-guest = Eddy Petrisor <[email protected]>
edu-guest = Esko Arajärvi <[email protected]>
eirikub-guest = Eirik U. Birkeland <[email protected]>
eknagy-guest = Elemér Károly Nagy <[email protected]>
elbrus = Paul Mathijs Gevers <[email protected]>
elian-guest = Elian Myftiu <[email protected]>
elmig-guest = Miguel Figueiredo <[email protected]>
eloy = Krzysztof Krzyzaniak <[email protected]>
elric-guest = Omar Campagne <[email protected]>
ender = David Martínez <[email protected]>
enver555-guest = Samuel Gimeno Artigas <[email protected]>
eppesuig = Giuseppe Sacco <[email protected]>
erdal-guest = Erdal Ronahi <[email protected]>
erispre-guest = Eric Spreen <[email protected]>
eugen = Eugeniy Meshcheryakov <[email protected]>
eugeniy-guest = Eugeniy Meshcheryakov <[email protected]>
fabbione = Fabio Massimo Di Nitto <[email protected]>
falk = Falk Hueffner <[email protected]>
faw = Felipe Augusto van de Wiel <[email protected]>
faw-guest = Felipe Augusto van de Wiel <[email protected]>
felipo-guest = Felipe Castro <[email protected]>
fenio = Bartosz Fenski <[email protected]>
fenio-guest = Bartosz Feński <[email protected]>
fiandro-guest = Attilio Fiandrotti <[email protected]>
fjp = Frans Pop <[email protected]>
fjpop-guest = Frans Pop <[email protected]>
flo = Florian Lohoff <[email protected]>
foka = Anthony Fok <[email protected]>
franklin-guest = Frank Lin Piat <[email protected]>
frederik-guest = Frederik Dannemare <[email protected]>
fs = Frederik Schüler <[email protected]>
fschueler-guest = Frederik Schüler <[email protected]>
ftlerror-guest = Guilherme de S. Pastore <[email protected]>
fzielcke-guest = Felix Zielcke <[email protected]>
galaxico-guest = Emmanuel Galatoulas <[email protected]>
gandalfar-guest = Jure Čuhalev <[email protected]>
gaudenz = Gaudenz Steinlin <[email protected]>
gaudenz-guest = Gaudenz Steinlin <[email protected]>
ghaffari-guest = Abdul Rahim Nizamani <[email protected]>
gheyret-guest = Gheyret Kenji <[email protected]>
ghoseb-guest = Baishampayan Ghose <[email protected]>
gladk = Anton Gladky <[email protected]>
gladky-anton-guest = Anton Gladky <[email protected]>
gleydson = Gleydson Mazioli da Silva <[email protected]>
glisha-guest = Georgi Stanojevski <[email protected]>
gordon-guest = Gordon Farquharson <[email protected]>
goswin = Goswin von Brederlow <[email protected]>
goswin-guest = Goswin von Brederlow <[email protected]>
gsacco = Giuseppe Sacco <[email protected]>
guillem = Guillem Jover <[email protected]>
h01ger-guest = Holger Levsen <[email protected]>
hansfn-guest = Hans Fredrik Nordhaug <[email protected]>
helix84-guest = Ivan Masár <[email protected]>
hertzog = Raphaël Hertzog <[email protected]>
holger = Holger Levsen <[email protected]>
holger-guest = Holger Wansing <[email protected]>
holgerw = Holger Wansing <[email protected]>
huggie = Simon Huggins <[email protected]>
ibragimov-guest = Victor Ibragimov <[email protected]>
ijc-guest = Ian Campbell <[email protected]>
ilyas-guest = Ilyas Bakirov <[email protected]>
israt-guest = Israt Jahan <[email protected]>
jamil-guest = Jamil Ahmed <[email protected]>
janos-guest = Janos Guljas <[email protected]>
jbailey = Jeff Bailey <[email protected]>
jcisio-guest = Hai-Nam Nguyen <[email protected]>
jcristau = Julien Cristau <[email protected]>
jdthood-guest = Thomas Hood <[email protected]>
jfs = Javier Fernandez-Sanguino Peña <[email protected]>
jkoenig-guest = Jérémie Koenig <[email protected]>
jnk-guest = Jan Keller <[email protected]>
joedalton-guest = Joe Hansen <[email protected]>
joey = Martin Schulze <[email protected]>
joeyh = Joey Hess <[email protected]>
jolof-guest = Mouhamadou Mamoune Mbacke <[email protected]>
jordi = Jordi Mallach <[email protected]>
joshk = Joshua Kwan <[email protected]>
joshk-guest = Joshua Kwan [obsolete] <[email protected]>
joy = Josip Rodin <[email protected]>
jseidel-guest = Jens Seidel <[email protected]>
jtarrio = Jacobo Tarrio <[email protected]>
judit-guest = Judit Gyimesi <[email protected]>
jurij-guest = Jurij Smakov <[email protected]>
kakada-guest = kakada hok <[email protected]>
kaplan = Lior Kaplan <[email protected]>
kaplan-guest = Lior Kaplan <[email protected]>
karlheg = Karl M. Hegbloom <[email protected]>
karolina-guest = Karolina Kalic <[email protected]>
kartik = Kartik Mistry <[email protected]>
kartikm-guest = Kartik Mistry <[email protected]>
kebil-guest = Kęstutis Biliūnas <[email protected]>
kibi = Cyril Brulebois <[email protected]>
kirfrank = Frank Kirschner <[email protected]>
klausade-guest = Klaus Ade Johnstad <[email protected]>
klfmanik-guest = Peter Mann <[email protected]>
kmuto = Kenshi Muto <[email protected]>
korsvoll-guest = Håvard Korsvoll <[email protected]>
kov = Gustavo Noronha <[email protected]>
kraai = Matt Kraai <[email protected]>
kroeckx = Kurt Roeckx <[email protected]>
kroeckx-guest = Kurt Roeckx <[email protected]>
kruno99-guest = Krunoslav Gernhard <[email protected]>
kubota = Tomohiro KUBOTA <[email protected]>
kulach-guest = Michał Kułach <[email protected]>
kyle = Kyle McMartin <[email protected]>
lamby = Chris Lamb <[email protected]>
lamby-guest = Chris Lamb <[email protected]>
laonux-guest = Anousak Souphavanh <[email protected]>
ley = Sebastian Ley <[email protected]>
lieb-guest = Jim Lieb <[email protected]>
lks1331-guest = Kyungsoon Lee <[email protected]>
luk = Luk Claes <[email protected]>
lunar = Jérémy Bobbio <[email protected]>
luther = Sven Luther <[email protected]>
madduck = Martin F. Krafft <[email protected]>
manphiz-guest = Xiyue Deng <[email protected]>
markos = Konstantinos Margaritis <[email protected]>
marquinos-guest = Marcos Alvarez Costales <[email protected]>
matthai-guest = Matej Kovacic <[email protected]>
mattiaspoldaru-guest = Mattias Põldaru <[email protected]>
mbc = Michael Cardenas <[email protected]>
mck = Miroslav Kure <[email protected]>
mck-guest = Miroslav Kure <[email protected]>
mckinstry = Alastair McKinstry <[email protected]>
medicalwei-guest = Yao Wei <[email protected]>
mejo = Jonas Meurer <[email protected]>
merker = Karsten Merker <[email protected]>
miki-guest = Milan Kupcevic <[email protected]>
milo-guest = Milo Casagrande <[email protected]>
minghua-guest = Ming Hua <[email protected]>
mlang = Mario Lang <[email protected]>
mondo-guest = Luca Monducci <[email protected]>
moshez = Moshe Zadka <[email protected]>
mpalmer = Matthew Palmer <[email protected]>
murat = Murat Demirten <[email protected]>
murj-guest = Rongjun Mu <[email protected]>
mvillarino-guest = Marcelino Villarino <[email protected]>
nabetaro-guest = Nozomu KURASAWA <[email protected]>
nahoo-guest = Rubén Porras Campo <[email protected]>
nayan-guest = nayan nakhare <[email protected]>
nbliang-guest = Nicholas Ng <[email protected]>
nekral-guest = Nicolas FRANÇOIS <[email protected]>
nidd = Peter Novodvorsky <[email protected]>
nishants-guest = Nishant Sharma <[email protected]>
nitrium-guest = Halil Demirezen <[email protected]>
nobse = Norbert Tretkowski <[email protected]>
notclive-guest = Jonathan Price <[email protected]>
odyx-guest = Didier Raboud <[email protected]>
ogi-guest = Ognyan Kulev <[email protected]>
okhayat-guest = Ossama Khayat <[email protected]>
orccl1001-guest = yumi Lee <[email protected]>
otavio = Otavio Salvador <[email protected]>
ottavio-guest = Ottavio Campana <[email protected]>
pabs = Paul Wise <[email protected]>
panzer-guest = Veselin Mijušković <[email protected]>
paras-guest = Paras Pradhan <[email protected]>
pelle = Per Olofsson <[email protected]>
pelle-guest = Per Olofsson <[email protected]>
pere = Petter Reinholdtsen <[email protected]>
peterk = Peter Karlsson <[email protected]>
pgeyleg-guest = Pema Geyleg <[email protected]>
philbat-guest = Philippe Batailler <[email protected]>
philh = Philip Hands <[email protected]>
pi-guest = Piarres Beobide Egaña <[email protected]>
pkern = Philipp Kern <[email protected]>
pmachard = Pierre Machard <[email protected]>
polish = Unknown user polish <[email protected]>
pootle-guest = Pootle Server <[email protected]>
porridge = Marcin Owsiany <[email protected]>
prasad-guest = Prasad Ramamurthy Kadambi <[email protected]>
pravi-guest = Praveen Arimbrathodiyil <[email protected]>
priti-guest = Priti Patil <[email protected]>
progfou-guest = Jean Christophe André <[email protected]>
proguy-guest = Paul Fleischer <[email protected]>
pronik-guest = Nikolai Prokoschenko <[email protected]>
pt-guest = Parlin Imanuel Toh <[email protected]>
pyasi.arun-guest = Arun Pyasi <[email protected]>
rakeshpandit-guest = Rakesh Pandit <[email protected]>
rgh-guest = Richard Hirst <[email protected]>
rhirst = Richard Hirst <[email protected]>
rmh = Robert Millan <[email protected]>
roktas = Recai Oktas <[email protected]>
roktas-guest = Recai Oktas <[email protected]>
rq-guest = Rimas Kudelis <[email protected]>
rudy = Rudy Godoy <[email protected]>
rudy-guest = Rudy Godoy <[email protected]>
rwhitby-guest = Rod Whitby <[email protected]>
ryan52-guest = Ryan Niebur <[email protected]>
sahran-guest = Abduqadir Abliz <[email protected]>
sapphire-guest = Safir Secerovic <[email protected]>
sas-guest = SZERVÁC Attila <[email protected]>
sc-guest = Stefano Canepa <[email protected]>
scannell-guest = Kevin Scannell <[email protected]>
schot-guest = Jeroen Schot <[email protected]>
seppy = Dennis Stampfer <[email protected]>
sesse = Steinar H. Gunderson <[email protected]>
sesse-guest = Steinar H. Gunderson <[email protected]>
sferriol-guest = sylvain ferriol <[email protected]>
shlomil-guest = Shlomi Loubaton <[email protected]>
sinxwal-guest = Segio Cxurbaty <[email protected]>
sjogren = Martin Sjögren <[email protected]>
skx = Steve Kemp <[email protected]>
slackydeb-guest = Luca Favatella <[email protected]>
sleblanc-guest = Serge Leblanc <[email protected]>
sley = Sebastian Ley <[email protected]>
smarenka = Stephen Marenka <[email protected]>
sokhem-guest = Khoem Sokhem <[email protected]>
stappers = Geert Stappers <[email protected]>
stappers-guest = Geert Stappers <[email protected]>
stepgr-guest = George Papamichelakis <[email protected]>
sthibaul-guest = Samuel Thibault <[email protected]>
sthibault = Samuel Thibault <[email protected]>
stultus-guest = Hrishikesh K B <[email protected]>
szjungle-guest = Ji YongGang <[email protected]>
taem-guest = Timur Birsh <[email protected]>
tale = Tapio Lehtonen <[email protected]>
tausq = Randolph Chung <[email protected]>
tbm = Martin Michlmayr <[email protected]>
teferra-guest = tegegne tefera <[email protected]>
tejas-guest = tj g <[email protected]>
tennom-guest = Tennom YK <[email protected]>
tenzin-guest = Tenzin Dendup <[email protected]>
teo = Teófilo Ruiz Suárez <[email protected]>
tetralet-guest = Tetralet <[email protected]>
tfheen = Tollef Fog Heen <[email protected]>
thep = Theppitak Karoonboonyanan <[email protected]>
thep-guest = Theppitak Karoonboonyanan <[email protected]>
thibg-guest = Thibaut GIRKA <[email protected]>
thijs = Thijs Kinkhorst <[email protected]>
ths = Thiemo Seufer <[email protected]>
ths-guest = Thiemo Seufer <[email protected]>
timo = Timo Jyrinki <[email protected]>
toff = Unknown user toff <[email protected]>
tolstoy-guest = behrad eslamifar <[email protected]>
tomislav-guest = Tomislav Krznar <[email protected]>
tomos-guest = SUGIYAMA Tomoaki <[email protected]>
trorrr-guest = Héctor Fernández López <[email protected]>
tsauter = Thorsten Sauter <[email protected]>
tvainika = Tommi Vainikainen <[email protected]>
uden-guest = Uden Sherpa <[email protected]>
udienz-guest = Mahyuddin Susanto <[email protected]>
unknown = Unknown user unknown <[email protected]>
vagrant = Vagrant Cascadian <[email protected]>
vgevorgy-guest = Vardan Gevorgyan <[email protected]>
vi-guest = VERÓK István <[email protected]>
vics-guest = Viktar Siarheichyk <[email protected]>
victory-guest = victory .deb <[email protected]>
viktor-guest = Viktor Horvath <[email protected]>
vince = Vincent Sanders <[email protected]>
vincent-guest = Vikram Vincent <[email protected]>
vivekvc-guest = Vivek Varghese Cherian <[email protected]>
vorlon = Steve Langasek <[email protected]>
vvidic-guest = Valentin Vidic <[email protected]>
waldi = Bastian Blank <[email protected]>
walters = Colin Walters <[email protected]>
wart = Wartan Hachaturow <[email protected]>
windo-guest = Siim Põder <[email protected]>
wookey = Wookey <[email protected]>
wosman-guest = Wolfgang Silbermayr <[email protected]>
wouter = Wouter Verhelst <[email protected]>
wzssyqa-guest = YunQiang Su <[email protected]>
xam = Max Vozeler <[email protected]>
xenos-guest = Eric Pareja <[email protected]>
xnox = Dimitri John Ledkov <[email protected]>
yangfl-guest = Fl Yang <[email protected]>
yeager-guest = Daniel Nylander <[email protected]>
yortx-guest = Xurxo Barreiro González <[email protected]>
yuray-guest = Yuri Kozlov <[email protected]>
zinosat-guest = Davide Viti <[email protected]>
zinoviev = Anton Zinoviev <[email protected]>
zumbi = Hector Oron <[email protected]>
zw = Zhao Way <[email protected]>
#
# Declare the repositories we know about:
#

create repository d-i.git
end repository

create repository d-i-manual.git
end repository

#
# Declare the rules
# Note: rules matching a directory must end in a slash
#

# Ignore this dir:
# Note that rules are applied in order of appearance, so this rule
# must appear before the generic rules
match /people/
end match

match /README
end match

match /trunk/manual/
  repository d-i-manual.git
  branch master
end match

match /branches/([^/]+)/manual/
  repository d-i-manual.git
  branch \1
end match

match /tags/([^/]+)/manual/
  repository d-i-manual.git
  branch refs/tags/\1
end match

match /trunk/
  repository d-i.git
  branch master
end match

match /branches/([^/]+)/
  repository d-i.git
  branch \1
end match

match /tags/([^/]+)/
  repository d-i.git
  branch refs/tags/\1
end match

#
#match /project2/trunk/
#  repository project2
#  branch master
#end match
#
# Note how we can use regexp to capture the repository name
#match /([^/]+)/branches/([^/]+)/
#  repository \1
#  branch \2
#end match

# No tag processing
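
The ordering note in the rules above (specific rules before generic
ones) can be illustrated with a small sketch. It is not the tool's own
code: it assumes each pattern is a regular expression anchored at the
start of the SVN path and that the first matching rule wins, and it
ignores how the real tool recurses into directories. The function name
and the example paths are hypothetical.

```python
import re

# Rules transcribed from the file above, in file order:
# (pattern, repository, branch). A repository of None means "ignore".
RULES = [
    (r"/people/", None, None),
    (r"/README", None, None),
    (r"/trunk/manual/", "d-i-manual.git", "master"),
    (r"/branches/([^/]+)/manual/", "d-i-manual.git", r"\1"),
    (r"/tags/([^/]+)/manual/", "d-i-manual.git", r"refs/tags/\1"),
    (r"/trunk/", "d-i.git", "master"),
    (r"/branches/([^/]+)/", "d-i.git", r"\1"),
    (r"/tags/([^/]+)/", "d-i.git", r"refs/tags/\1"),
]


def route(path):
    """Return (repository, branch) for an SVN path, or None if the path
    is ignored or matches no rule."""
    for pattern, repo, branch in RULES:
        # re.match anchors at the start of the path (assumed semantics).
        m = re.match(pattern, path)
        if m is None:
            continue
        if repo is None:
            return None  # matched an empty "ignore" rule
        # Substitute \1 in the branch template with the captured name.
        return repo, m.expand(branch)
    return None


if __name__ == "__main__":
    for p in (
        "/trunk/manual/en/boot.xml",
        "/trunk/installer/build/Makefile",
        "/branches/sarge/manual/po/x.po",
        "/people/someone/notes.txt",
    ):
        print(p, "->", route(p))
```

If the generic /trunk/ rule came before /trunk/manual/, every manual
path would land in d-i.git instead, which is why the order matters.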
