On Thursday, 10 August 2023, at 14:00:05 (CEST), Andre Heinecke wrote:
> Hi,
>
> tl;dr: po sync is blowing up our repository sizes far more than appears
> to be necessary. We might need a force push across all repos to correct
> that. The Kleopatra repo has increased in size tenfold since po files were
> added less than a year ago.
>
>
> I recently noticed that Kleopatra has gained some weight. She is an old
> lady who took all her history with her when she was split off from the old
> KDEPIM repo, so she was always quite chubby. But not by that much. (I am
> sloppy about mega vs. mebi here since it does not matter for the overall
> picture.)
>
> So let us see:
> A fresh clone of Kleopatra:
> 209M kleopatra
> Running:
> git filter-repo --path po --invert-paths
> 21M kleopatra
>
> Let us do the same for KMail:
> Before:
> 169M kmail
> After:
> 56M kmail
>
> Now yes, Kleopatra has quite a few translations. Their checked-out size is
> about 29 megabytes. But there is something wrong here.
>
> What I don't understand, though, is that if I look at the scripty commits
> in the git log, nothing seems unusual.
>
> But let us take the Low Saxon language. I hope that offends the fewest
> people here. There have been no new translations there in ~10 years.
>
> Its checked-out size is 460KB.
>
> In master we have:
> 715 translated messages, 709 fuzzy translations, 428 untranslated messages.
> Going back to the first revision that added po files:
> 763 translated messages, 645 fuzzy translations, 391 untranslated messages.
>
> The sizes are fairly equal, with master of course a bit larger. Yet this
> language, unchanged in translation, has alone added 10 megabytes. That is
> about half the size of the complete history of the real source code for
> Kleopatra.
>
> du -hs .
> 209M
> git filter-repo --path po/nds --invert-paths --force
> du -hs .
> 199M
>
> Now here is what I don't understand.
> If I look at the changes
> git log -p po/nds/kleopatra.po | wc -c
> 164774
> That seems reasonable for all the automatic scripty updates and even with
> all the context lines, that is just 1,6MB uncompressed.
>
> And this is where my git understanding runs into limits. To understand why
> the history has gotten so large I tried some snippets from Stack Overflow
> and from there with:
> git rev-list --objects --all po/nds/kleopatra.po |
> git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' |
> sed -n 's/^blob //p' |
> sort --numeric-sort --key=2 |
> cut -c 1-12,41-
>
> I think that I can roughly see that apparently each commit in the repo has
> a blob associated with it that is the same size as the file.
>
> So can some git sleuth please investigate what is happening here? This kind
> of repo growth is unsustainable, and at least for Kleopatra I see no
> possible solution other than to figure this out and then remove the po
> history from the last year with a force push :-/
>
> Don't get me wrong, I like that the po files are now also in the repo, and
> I accept that this will of course increase the repo size, but something
> fishy is going on here in my opinion.

As discussed on Matrix, it seems that running git gc --aggressive brings the
Kleopatra repo size down from 223M to 55M.

Is this something worth doing on the server side? Does anyone know of any
potential downsides?

Cheers,
  Albert

>
>
> Best Regards,
> Andre
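
For anyone who wants to reproduce the numbers locally, here is a minimal sketch that compares the packed size before and after the repack. It assumes a fresh clone in a directory named `kleopatra` (or a path passed as the first argument); the helper name `size_pack_kib` is made up for illustration. It relies on the `size-pack` field that `git count-objects -v` reports in KiB.

```shell
#!/bin/sh
# Hypothetical helper: read `git count-objects -v` output on stdin and
# print the value of its size-pack field (reported by git in KiB).
size_pack_kib() {
    awk '/^size-pack:/ { print $2 }'
}

# Assumed location of a local clone; override with the first argument.
repo="${1:-kleopatra}"

# Only run when the clone actually exists, so the script is harmless otherwise.
if [ -d "$repo/.git" ]; then
    before=$(git -C "$repo" count-objects -v | size_pack_kib)
    # Repack with more thorough delta compression.
    git -C "$repo" gc --aggressive --quiet
    after=$(git -C "$repo" count-objects -v | size_pack_kib)
    echo "size-pack before: ${before} KiB, after: ${after} KiB"
fi
```

This only shows what happens on a local clone; whether the server-side repositories would shrink in the same way is not something the sketch can tell.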