On Wed, 2022-06-29 at 16:10 -0400, Aryaman Gupta wrote:
> Stop the scheduler from starting new tasks if the
> current cpu or io pressure is above a certain threshold, specified
> through the "BB_MAX_{CPU|IO}_SOME_PRESSURE" variables in conf/local.conf.
>
> If the thresholds aren't specified, the default values are 100 for both
> CPU and IO, which will have no impact on build times.
> An arbitrary lower limit of 1.0 results in a fatal error to avoid extremely
> long builds. If the percentage limits are higher than 100, then the default
> values are used and warnings are issued to inform users that the
> specified limit is out of bounds.
>
> Signed-off-by: Aryaman Gupta <aryaman.gu...@windriver.com>
> Signed-off-by: Randy Macleod <randy.macl...@windriver.com>
> ---
>  bitbake/lib/bb/runqueue.py | 39 ++++++++++++++++++++++++++++++++++++++
>  1 file changed, 39 insertions(+)
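For reference, the bounds checking described in the commit message could look roughly like the sketch below. This is illustrative only: the helper name clamp_pressure_limit and the exact messages are not taken from the patch, just the rules the commit message states (default 100, fatal below 1.0, warn and fall back above 100).

import bb

def clamp_pressure_limit(name, value, default=100.0):
    # Illustrative helper, not from the patch: apply the bounds the
    # commit message describes to a BB_MAX_*_SOME_PRESSURE setting.
    limit = float(value if value is not None else default)
    if limit < 1.0:
        # Below the arbitrary lower bound: abort rather than risk an
        # effectively stalled, extremely long build.
        bb.fatal("%s is set below the minimum of 1.0" % name)
    if limit > 100.0:
        # Out of range: warn and use the default, which never throttles
        # the scheduler.
        bb.warn("%s is above 100, using the default of %s" % (name, default))
        return default
    return limit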
This looks like a good start, thanks. There are a few things which will
need cleaning up in here as this is pretty performance sensitive code
(try a "time bitbake world -n" to see what I mean).

Firstly, it is a bitbake patch so should really go to the bitbake
mailing list, not OE-Core.

> diff --git a/bitbake/lib/bb/runqueue.py b/bitbake/lib/bb/runqueue.py
> index 1e47fe70ef..9667acc11c 100644
> --- a/bitbake/lib/bb/runqueue.py
> +++ b/bitbake/lib/bb/runqueue.py
> @@ -159,6 +159,27 @@ class RunQueueScheduler(object):
>              self.buildable.append(tid)
>
>          self.rev_prio_map = None
> +        # Some hosts like openSUSE have readable /proc/pressure files
> +        # but throw errors when these files are opened.
> +        try:
> +            subprocess.check_output(["cat", "/proc/pressure/cpu", "/proc/pressure/io"], \
> +                    universal_newlines=True, stderr=subprocess.DEVNULL)
> +            self.readable_pressure_files = True
> +        except:
> +            if self.rq.max_cpu_pressure!=100 or self.rq.max_io_pressure!=100:
> +                bb.warn("The /proc/pressure files can't be read. Continuing build without monitoring pressure")
> +            self.readable_pressure_files = False
> +
> +    def exceeds_max_pressure(self):
> +        if self.readable_pressure_files:
> +            # extract avg10 from /proc/pressure/{cpu|io}
> +            curr_pressure_sample = subprocess.check_output(["cat", "/proc/pressure/cpu", "/proc/pressure/io"], \
> +                    universal_newlines=True, stderr=subprocess.DEVNULL)
> +            curr_cpu_pressure = curr_pressure_sample.split('\n')[0].split()[1].split("=")[1]
> +            curr_io_pressure = curr_pressure_sample.split('\n')[2].split()[1].split("=")[1]

This is horrible, you're adding in a fork() call for every pass through
the scheduler code. You can just open() and read the file instead which
will have *much* lower overhead.

Even then, I'm not particularly thrilled to have this level of overhead
in this codepath, in some ways I'd prefer to rate limit how often we're
looking up this value rather than once per pass through the scheduler
path. I'm curious what the timings say.

> +
> +            return float(curr_cpu_pressure) > self.rq.max_cpu_pressure or float(curr_io_pressure) > self.rq.max_io_pressure
> +        return False
>
>      def next_buildable_task(self):
>          """
> @@ -171,6 +192,8 @@ class RunQueueScheduler(object):
>          buildable.intersection_update(self.rq.tasks_covered | self.rq.tasks_notcovered)
>          if not buildable:
>              return None
> +        if self.exceeds_max_pressure():
> +            return None
>
>          # Filter out tasks that have a max number of threads that have been exceeded
>          skip_buildable = {}
> @@ -1699,6 +1722,8 @@ class RunQueueExecute:
>
>          self.number_tasks = int(self.cfgData.getVar("BB_NUMBER_THREADS") or 1)
>          self.scheduler = self.cfgData.getVar("BB_SCHEDULER") or "speed"
> +        self.max_cpu_pressure = float(self.cfgData.getVar("BB_MAX_CPU_SOME_PRESSURE") or 100.0)
> +        self.max_io_pressure = float(self.cfgData.getVar("BB_MAX_IO_SOME_PRESSURE") or 100.0)

I did wonder if this should be BB_PRESSURE_MAX_SOME_IO as the order of
the information kind of seems backwards to me. That could just be me
though! :)

Cheers,

Richard
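As a rough illustration of the open()/read() plus rate-limiting idea suggested above, something along the following lines would avoid both the fork() and a /proc read on every scheduler pass. The class, attribute names, and the one-second interval are purely illustrative assumptions, not from any committed BitBake change:

import time

class PressureChecker:
    def __init__(self, max_cpu_pressure, max_io_pressure, interval=1.0):
        self.max_cpu_pressure = max_cpu_pressure
        self.max_io_pressure = max_io_pressure
        self.interval = interval      # seconds between /proc/pressure reads
        self.last_check = 0.0
        self.last_result = False

    def _read_avg10(self, path):
        # The first line is the "some" entry: "some avg10=X avg60=Y ..."
        with open(path) as f:
            return float(f.readline().split()[1].split("=")[1])

    def exceeds_max_pressure(self):
        now = time.time()
        # Rate limit: reuse the previous answer until the interval expires,
        # so the scheduler hot path doesn't touch /proc on every pass.
        if now - self.last_check < self.interval:
            return self.last_result
        self.last_check = now
        try:
            cpu = self._read_avg10("/proc/pressure/cpu")
            io = self._read_avg10("/proc/pressure/io")
        except OSError:
            # Hosts without usable PSI files: never throttle.
            self.last_result = False
            return False
        self.last_result = (cpu > self.max_cpu_pressure or
                            io > self.max_io_pressure)
        return self.last_result

Caching the previous answer means the scheduler pays for at most one pair of /proc reads per interval, however often next_buildable_task() is called.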