Janus Weil <ja...@gcc.gnu.org>:
> > The bad news is that my last test run overran the memory capacity of
> > the 64GB Great Beast.  I shall have to find some way of reducing the
> > working set, as 128GB DDR4 memory is hideously expensive.
> 
> Or maybe you could use a machine from the GCC compile farm?
> 
> According to https://gcc.gnu.org/wiki/CompileFarm, there are three
> machines with at least 128GB available (gcc111, gcc112, gcc119).

The Great Beast is a semi-custom PC optimized for doing graph theory
on working sets gigabytes wide - its design emphasis is on the best
possible memory caching. If I dropped back to a conventional machine
the test times would go up by 50% (benchmarked, that's not a guess),
and they're already bad enough to make test cycles very painful.
I just saw an elapsed time of 8h30m36.292s for the current test - I had
it down to 6h at one point, but the runtimes scale badly with increasing
repo size; there is intrinsically O(n**2) stuff going on.
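
For the curious, here's a toy illustration (not reposurgeon code, just
back-of-the-envelope Python) of why quadratic passes bite as the
history grows:

    # Hypothetical sketch: any pass that compares each commit against
    # every other one costs n*(n-1)/2 steps, so doubling the history
    # roughly quadruples that part of the runtime.
    for n in (100000, 200000, 400000):
        print("%7d commits -> %15d pairwise comparisons"
              % (n, n * (n - 1) // 2))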

My first evasive maneuver is therefore to run tests with my browser
shut down.  That's working.  I used to do that before I switched from
CPython to PyPy, which runs faster and has a lower per-object
footprint.  Now it's mandatory again.  Tells me I need to get the
conversion finished before the number of commits gets much higher.

More memory would avoid OOM but not run the tests faster.  More cores
wouldn't help due to Python's GIL problem - many of reposurgeon's
central algorithms are intrinsically serial, anyway.  Higher
single-processor speed could help a lot, but there plain isn't
anything in COTS hardware that beats a Xeon 3 cranking 3.5GHz by
much. (The hardware wizard who built the Beast thinks he might be able
to crank me up to 3.7GHz later this year but that hardware hasn't
shipped yet.)
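
To make the GIL point concrete, here's a minimal demonstration (again
not reposurgeon code): a CPU-bound function run on two threads takes
about as long as running it twice serially, because only one thread
executes Python bytecode at a time.

    # Minimal GIL demo: CPU-bound threads don't overlap.
    import threading, time

    def burn(n):
        # Pure-Python busy loop; holds the GIL the whole time.
        total = 0
        for i in range(n):
            total += i * i
        return total

    N = 5000000

    start = time.time()
    burn(N); burn(N)
    print("serial:    %.2fs" % (time.time() - start))

    start = time.time()
    threads = [threading.Thread(target=burn, args=(N,)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print("2 threads: %.2fs" % (time.time() - start))
    # Both timings come out about the same on CPython and PyPy.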

The one technical change that might help is moving reposurgeon from
Python to Go - I might hope for as much as a 10x drop in runtimes from
that and a somewhat smaller decrease in working set. Unfortunately
while the move is theoretically possible (I've scoped the job) that
too would be very hard and take a long time.  It's 14KLOC of the most
algorithmically dense Python you are ever likely to encounter, with
dependencies on Python libraries sans Go equivalents that might
double the LOC; only the fact that I built a *really good* regression-
and unit-test suite in self-defense keeps it anywhere near
practical.

(Before you ask, at the time I started reposurgeon in 2010 there
wasn't any really production-ready language that might have been a
better fit than Python. I did look. OO languages with GC and compiled
speed are still pretty thin on the ground.)

The truth is we're near the bleeding edge of what conventional tools
and hardware can handle gracefully.  Most jobs with working sets as
big as this one's do only comparatively dumb operations that can be
parallelized and thrown on a GPU or supercomputer.  Most jobs with
the algorithmic complexity of repository surgery have *much* smaller
working sets.  The combination of both extrema is hard.
-- 
                <a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

