On Thu, Nov 05, 2015 at 12:54:06PM +0100, Alexis Ballier wrote:
> It's not perfectly clean but I don't see any problem here:
> ChangeLog-2015 : all ChangeLog from CVS
> ChangeLog: autogenerated from git
FYI, this was implemented.

For reference, the old CVS changelogs are now taken from HEAD of this
repo:
https://gitweb.gentoo.org/data/gentoo-changelogs.git/

mgorny and I have been poking at the generation issue. The features I
requested are now implemented, plus one patch I pushed up to
portage-dev.

There are still some issues remaining.

I filed bugs for some of them:
565536 - need to exclude some commits/paths
565538 - need to exclude some lines
565540 - need parallel threads

However, the largest sticking point, even with parallel threads, is that
the base ChangeLog generation is incredibly slow. It currently averages
more than 350ms per package, which adds up over the 19k packages in a
full cycle, and some packages have taken up to 5 seconds so far.
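
As a rough sketch of the arithmetic only: the snippet below multiplies the
19k package count by the 350ms average to get a lower bound on a serial
full cycle. The function name and the `workers` parameter are hypothetical,
and the even split across workers is an idealized assumption, not a
measurement. Since the average is "above" 350ms and some packages take
seconds, a real run will be longer (the full run mentioned later in this
mail is ~6 hours).

```python
def full_cycle_seconds(packages: int, avg_ms: float, workers: int = 1) -> float:
    """Idealized wall-clock estimate for one full ChangeLog cycle.

    Assumes every package costs exactly avg_ms and that work divides
    evenly across workers (an assumption; real scaling will be worse).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return packages * (avg_ms / 1000.0) / workers


if __name__ == "__main__":
    serial = full_cycle_seconds(19_000, 350)
    four = full_cycle_seconds(19_000, 350, workers=4)  # hypothetical worker count
    print(f"serial lower bound: {serial / 60:.0f} minutes")
    print(f"4 workers, ideal:   {four / 60:.0f} minutes")
```

Even at the 350ms floor, a serial pass comes to just under two hours,
which is already several times a 30 minute window.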

Incremental processing does help this hugely, but isn't always
available.
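
To illustrate why incremental processing helps so much, here is a minimal
sketch of the idea, not the actual generation tool. It assumes the list
of changed paths comes from somewhere like `git diff --name-only` between
the last processed commit and HEAD; the function names and the
`have_reference` flag are hypothetical. When the reference for comparison
is missing, it falls back to every package, which is the slow case.

```python
from pathlib import PurePosixPath


def changed_packages(changed_paths, all_packages):
    """Return the sorted category/package dirs touched by changed_paths."""
    known = set(all_packages)
    touched = set()
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        # A file inside a package looks like category/package/<file...>.
        if len(parts) >= 3:
            touched.add(f"{parts[0]}/{parts[1]}")
    # Drop non-package trees (e.g. profiles/, eclass/) by intersecting
    # with the known package list.
    return sorted(touched & known)


def packages_to_regenerate(changed_paths, all_packages, have_reference):
    """Incremental when a reference exists, otherwise a full run."""
    if not have_reference:
        return sorted(set(all_packages))
    return changed_packages(changed_paths, all_packages)


if __name__ == "__main__":
    # Illustrative data only.
    paths = [
        "app-editors/vim/vim-7.4.ebuild",
        "app-editors/vim/metadata.xml",
        "profiles/arch/amd64/make.defaults",
        "eclass/eutils.eclass",
    ]
    pkgs = ["app-editors/vim", "sys-apps/portage"]
    print(packages_to_regenerate(paths, pkgs, have_reference=True))
    print(packages_to_regenerate(paths, pkgs, have_reference=False))
```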

Right now, I'm considering promising 30-minute syncs as a best-case
interval; if changelog generation causes a sync to take longer, then a
push window WILL be missed.

How often might this happen? Since we converted to Git, excluding the
initial large commits, there were three instances where it would have
added more than 10 minutes without the improvements I created bugs for.
In addition, any other change that causes loss of the timestamps/reference
used for comparison will trigger a full run, at ~6 hours of delay.

(Yes, that's why there hasn't been an rsync update in the last 3 hours,
and won't be for another ~3 hours: it's crunching to generate
ChangeLogs.)

-- 
Robin Hugh Johnson
Gentoo Linux: Developer, Infrastructure Lead, Foundation Trustee
E-Mail     : robb...@gentoo.org
GnuPG FP   : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85
