Re: [gentoo-dev] Breakage and frustration

Robin H. Johnson Sun, 13 Dec 2015 13:41:42 -0800

TL;DR summary:
Yes, stuff has broken, but I'd call these reasonable teething issues, well
distributed through the stack. They should be compared to the CVS server
moves from a decade ago, rather than to CVS just before the Git switch.


On Sun, Dec 13, 2015 at 06:36:41PM +0100, Patrick Lauer wrote:
... (mail re-ordered for related issues)
> Once that was 'fixed' there was the fun of [2], which made emerge --sync
> very expensive because it refetched lots of files. Every time.
The fix for this caused these two bugs:
> And fixing that introduces [7] some more regressions that broke updating
> @system for about 3.5 days.
> - a few days of grub being uninstallable (iow, making installing
> impossible for many users)
> And the manifest issues are still [9] making life exciting.
Bug #567920 describes the issue very succinctly (mtime of a deleted file
needs to be included in the new Manifest mtime calculation).
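
A minimal sketch of one reading of that bug, assuming (hypothetically; the
real generation scripts are not shown here) that the Manifest mtime is taken
as the newest mtime among the files it covers. The function name and the
integer timestamps are made up for illustration:

```python
def manifest_mtime(remaining, deleted=()):
    """Newest timestamp among remaining files and (optionally) deletions.

    Assumption: the Manifest mtime is max() over its inputs. This is an
    illustration only, not the actual tree-generation code.
    """
    return max([*remaining, *deleted])


# Before: two files, last touched at t=100 and t=200.
old = manifest_mtime([100, 200])

# The t=100 file is deleted at t=300; the other file is untouched.
naive = manifest_mtime([200])                  # ignores the deletion
fixed = manifest_mtime([200], deleted=[300])   # includes the deletion

print(old, naive, fixed)
```

Under that assumption the naive value equals the old one even though the
Manifest content changed, so anything comparing by mtime can miss the update.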

Both of them can be worked around if the entire path (all staging nodes,
servers and end users) uses --checksum, but that's even more expensive.
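
For reference, a rough sketch of why --checksum helps and why it costs more.
By default rsync skips files whose size and mtime match; with --checksum it
compares content checksums instead, which means reading every file. The
helper names are hypothetical and sha256 only stands in for rsync's own
checksum:

```python
import hashlib
import os
import tempfile


def quick_check_same(a, b):
    """Approximate rsync's default check: same size and same mtime."""
    sa, sb = os.stat(a), os.stat(b)
    return sa.st_size == sb.st_size and int(sa.st_mtime) == int(sb.st_mtime)


def checksum_same(a, b):
    """Approximate --checksum: compare content digests (reads both files)."""
    def digest(path):
        with open(path, "rb") as fh:
            return hashlib.sha256(fh.read()).hexdigest()
    return digest(a) == digest(b)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src_Manifest")
        dst = os.path.join(tmp, "dst_Manifest")
        # Same length, different content.
        with open(src, "w") as fh:
            fh.write("DIST foo-1.1.tar.gz\n")
        with open(dst, "w") as fh:
            fh.write("DIST foo-1.0.tar.gz\n")
        # Force identical mtimes on both copies.
        os.utime(src, (1450000000, 1450000000))
        os.utime(dst, (1450000000, 1450000000))
        return quick_check_same(src, dst), checksum_same(src, dst)


print(demo())
```

The size-and-mtime check treats the two files as identical, while the
checksum comparison sees the difference, at the price of hashing every file
on every hop.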

I have another work-around idea, and that's simply appending a comment
with the latest commit per directory to the ChangeLog, because that will
force the Manifest to change ;-).

> The fix to that fix (notice a pattern here?) broke rsync for *all* users
> [8]. Almost as if no one ever tests things in a test environment ... but
> hey, we're agile, let's fix stuff in production!
We still never figured out how .git came to be added to the outgoing
data. It was NOT the rsync into the staging directory, because it was only
the directory structure, and none of the files. --exclude=.git WAS used
in most of the rsyncs, but not the final ones from staging to tree
distribution.
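
One way to avoid a hop silently missing the exclude is to build every rsync
command line from a single shared list. This is only a sketch: the paths and
the helper are hypothetical, and nothing here reflects the actual infra
scripts. It builds the argument lists without running rsync:

```python
# Excludes that every hop must carry (only .git is taken from the thread).
COMMON_EXCLUDES = [".git"]


def rsync_argv(src, dst, extra=()):
    """Build an rsync argument list that always carries the shared excludes."""
    argv = ["rsync", "-a"]
    argv += [f"--exclude={pattern}" for pattern in COMMON_EXCLUDES]
    argv += list(extra)
    argv += [src, dst]
    return argv


# Hypothetical hops: checkout -> staging -> tree distribution.
HOPS = [
    ("/hypothetical/git-checkout/", "/hypothetical/staging/"),
    ("/hypothetical/staging/", "/hypothetical/tree-distribution/"),
]

commands = [rsync_argv(src, dst) for src, dst in HOPS]
for argv in commands:
    print(" ".join(argv))
```

With the excludes defined once, the final staging-to-distribution hop gets
--exclude=.git the same way the earlier hops do.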

> - about 3 months of emerge --changelog being broken, just to be broken
> in a different way
This change (order of changelog entries) was made explicitly to address
your earlier complaint about excess traffic. Why Portage's parsing of the
ChangeLogs still does not handle it is an open question.

> - 3.5 days of emerge @system being broken
> - about a day of emerge --sync needing manual interaction to be able to
> update again
You missed one:
- rsync generation now halts if somebody committed some breakage to the
  tree (missing DIST entry, bad eclass).

> 
> So all in all emerge --sync && emerge -uND @system being down for >10%
> of the time.
> 
> Now, I don't know if you use Gentoo, but I do, and I use it at work, so
> having this level of randomization happen is not really useful.
> 
> Tell me then, please - what can I/we do so that this kind of breakage
> stops, and we can actually aim at having a most excellent distro? In the
> long run I am considering just creating my own clone of all
> infrastructure bits so I can fix things, but it's an option that is
> needlessly braindead, wasting effort, and not really useful to users
> that are not me.
Your own infra option would NOT have fixed [2]/[7]/[9].

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
E-Mail     : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
