On 18/09/2023 17:13, Alan McKinnon wrote:


On Mon, Sep 18, 2023 at 6:03 PM Peter Humphrey <pe...@prh.myzen.co.uk> wrote:

    On Monday, 18 September 2023 14:48:46 BST Alan McKinnon wrote:
     > On Mon, Sep 18, 2023 at 3:44 PM Peter Humphrey <pe...@prh.myzen.co.uk>
     > wrote:
     > > It may be less complex than you think, Jack. I envisage a package being
     > > marked as solitary, and when portage reaches that package, it waits until
     > > all current jobs have finished, then it starts the solitary package with
     > > the environment specified for it, and it doesn't start the next one until
     > > that one has finished.
     > > The dependency calculation shouldn't need to be changed.
     > >
     > > It seems simple the way I see it.
     >
     > How does that improve emerge performance overall?

    By allocating all the system resources to huge packages while not flooding
    the system with lesser ones. For example, I can set -j20 for webkit-gtk
    today without overflowing the 64GB RAM, and still have 4 CPU threads
    available to other tasks. The change I've proposed should make the whole
    operation more efficient overall and take less time.

    As things stand today, I have to make do with -j12 or so, wasting time and
    resources. I have load-average set at 32, so if I were to set -j20
    generally I'd run out of RAM in no time. I've had many instances of
    packages failing to compile in a large update, but going just fine on
    their own; and I've had mysterious operational errors resulting, I
    suspect, from otherwise undetected miscompilation.

    Previous threads have more detail of what I've tried already.


I did read all those, but no matter how you move things around you still have only X resources available all the time. Whether you just let emerge do its thing or try to get it to do big packages on their own, everything is still going to use the same number of CPU cycles overall and you will save nothing.

Except a big chunk off your power bill ... a system under stress uses more energy for the same amount of work.

If webkit-gtk is the only big package, have you considered:

emerge -1v webkit-gtk && emerge -avuND @world?


What you have is not a portage problem. It is an orthodox parallelism problem, and I think you are thinking your constraint is unique in the world - it isn't. With parallelism, trying to fiddle single nodes to improve things overall never really works out.

A big problem you are missing is that portage does not have control of the system. It can control its own usage of the system, but if I want emerge to use as much SPARE resource IN THE BACKGROUND as it can without impacting interactive responsiveness, that is HARD.

I would like to be able to tell portage "these programs are resource hogs, don't parallelise them". If portage has loads of little jobs, it can fire them off one after the other as resource becomes available. If it fires a hog (or worse, two) off at the same time, the system can rapidly collapse under load.
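To make the "don't parallelise the hogs" idea concrete, here is a minimal sketch in Python. It is not how portage schedules anything today: plan_batches, the hogs set and the batch model are all made up for illustration, and it ignores dependency ordering entirely. It only shows the rule that ordinary packages share a batch of up to max_jobs while a hog always gets a batch to itself.

```python
def plan_batches(packages, hogs, max_jobs):
    """Group an ordered build list into batches that run one after another.

    Ordinary packages share a batch of up to max_jobs.
    A package listed in hogs always runs in a batch of its own.
    (Hypothetical model, not portage's real scheduler.)
    """
    batches = []
    current = []
    for pkg in packages:
        if pkg in hogs:
            # Let everything already queued finish before the hog starts.
            if current:
                batches.append(current)
                current = []
            batches.append([pkg])
        else:
            current.append(pkg)
            if len(current) == max_jobs:
                batches.append(current)
                current = []
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    build_list = ["a", "b", "webkit-gtk", "c", "d", "e", "llvm"]
    for batch in plan_batches(build_list, {"webkit-gtk", "llvm"}, max_jobs=2):
        print(batch)
```

The cost is visible in the output: a half-empty batch can end up waiting for a hog, which is the "extra 20% time" trade-off rather than a speed-up.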

Even better, if portage knew roughly how much resource each job required, it could (within constraints) start with the jobs that required least resource and run loads of them, and by firing jobs off in order of increasing demandingness, the number of jobs running in parallel would naturally tail off.
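A rough sketch of that "least demanding first" idea, again in Python and again purely hypothetical: portage has no per-package resource estimates that I know of, so the RAM figures and durations below are invented, and simulate is just a toy that admits jobs in order of increasing RAM estimate whenever they fit inside a fixed budget.

```python
import heapq


def simulate(jobs, ram_budget):
    """jobs: list of (name, ram_gb, duration) with made-up estimates.

    Starts jobs in order of increasing RAM estimate, never exceeding
    ram_budget. Returns a list of (start_time, name, jobs_running_at_start).
    """
    pending = sorted(jobs, key=lambda job: job[1])  # least demanding first
    running = []  # min-heap of (finish_time, ram_gb)
    used = 0
    now = 0
    log = []
    for name, ram, duration in pending:
        if ram > ram_budget:
            raise ValueError(f"{name} can never fit in the budget")
        while True:
            # Release the RAM of every job that has finished by now.
            while running and running[0][0] <= now:
                used -= heapq.heappop(running)[1]
            if used + ram <= ram_budget:
                break
            # Not enough room yet: jump ahead to the next job completion.
            now = running[0][0]
        heapq.heappush(running, (now + duration, ram))
        used += ram
        log.append((now, name, len(running)))
    return log


if __name__ == "__main__":
    jobs = [
        ("webkit-gtk", 40, 60),
        ("small-a", 2, 5),
        ("small-b", 2, 5),
        ("llvm", 30, 50),
        ("small-c", 4, 10),
    ]
    for start, name, count in simulate(jobs, ram_budget=64):
        print(start, name, count)
```

With these invented numbers the small jobs and llvm all start at once, and webkit-gtk only starts when it can have the budget nearly to itself, so the parallelism tails off on its own without anyone marking a package as special.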

At the end of the day, if the computer takes an extra 20% time, I'm not bothered. If I'm sat at the computer 20% time extra because the system isn't responding because emerge has bogged it down, then I do care. And when I'm building things like webkit-gtk, llvm, LO, FF and TB, they do hammer my system. If they're running in parallel, my system would be near unusable.

Cheers,
Wol
