I don't want to blow up the OS interference discussion too much, so I'll
just refer you to the slides I shared or the classic 2003 paper,
https://ieeexplore.ieee.org/document/1592958.

Short form: 2048 4-CPU Alpha systems on a Quadrics interconnect had lpd
running. lpd woke up every 30 s, realized it had nothing to do, and went
back to sleep.

That 1/30 Hz per-node interference was worth multiple tens of millions of
dollars.

At scale, very little interference can have very large impact.
ron.
p.s. the fix was to turn off lpd.

On Sun, Feb 9, 2025 at 11:44 AM <tlaro...@kergis.com> wrote:

> FWIW, I'm still interested but I'm late with my homework---I have
> just sent the iw9p wip PDF almost on the deadline...
>
> I will start to gather your latest explanations to update the
> documentation that I have started, and will switch to the new git
> repository, in order to resume testing, preferably on real hardware (I'd
> also like to find a cheap multicore x86 machine because I'm limited in
> this area, and I'm mainly interested in RISC-V these days, if only for the
> instruction metrics of the x86 line: 1338 instructions, thousands of
> pages to try to read; I prefer "The RISC-V Reader"...).
>
> On Sun, Feb 09, 2025 at 10:43:25AM -0800, ron minnich wrote:
> > Frank, while NIX is limited to x86, for now, the four key functions are
> > pretty compact, and I'm planning to get it working on other
> > architectures, or help others interested, as interest allows. I've got
> > no love for the x86 and the x86 vendors at this point.
> >
> > In terms of whether NIX goes into 9front upstream, the issue is that
> > there is a small footprint: a new struct member in Mach (NIX); some
> > changes to startup code (mainly in mp). It's not just "enable a device"
> > -- it has a certain impact on core code.
> >
> > Ori sensibly said that unless we can show a real demand for it, putting
> > it in 9front front branch might not make sense. From what I can see so
> > far, keeping it as a small branch of 9front should be easy to keep going.
> >
> > If the interest rises, then, perhaps, it goes into 9front front branch.
> > We have to wait and see.
> >
> > Please take a look if you can, I'm interested in comments (when phrased
> > as code they'll get more attention however :-)
> >
> >
> >
> > On Sun, Feb 9, 2025 at 6:09 AM Frank D. Engel, Jr. <fde...@fjrhome.net>
> > wrote:
> >
> > >
> > > Two questions I am wondering about:
> > >
> > > 1. Is this still limited to x86 only, as I think was suggested on an
> > > earlier thread related to this effort, or has it been generalized to
> > > work across other CPU architectures?
> > >
> > > 2. As this evidently needs to be "enabled" by configuring processors to
> > > act as ACs, once generalized across architectures to whatever extent is
> > > possible and tested as stable, would there be any reason not to roll
> > > this into 9front itself rather than keeping it as a separate project?
> > >
> > >
> > > On 2/9/25 02:07, ron minnich wrote:
> > >
> > > NIX this evening. Test on 4 core laptop.
> > >
> > > fshalt -r does not end well on my laptop; it's the usual issue with
> > > drivers not dealing that well with a warm reset, and hardware being
> > > hard.
> > >
> > > So: I modified bootrc to drop me into rc before it starts mounting file
> > > systems and such. I added execac to the image. fshalt -r 9pc64 and I
> > > can run execac as a test.
> > >
> > > And it works on my T420 :-)
> > >
> > > So, next step: try the fixed time quantum (FTQ) benchmark:
> > > github.com/rminnich/ftq and compare noise results from a TC and an AC.
> > >
> > > ftq measures interference that can cause scaling issues in
> > > supercomputers, more here:
> > > https://www.researchgate.net/publication/4123211_Analysis_of_microbenchmarks_for_performance_tuning_of_clusters
> > >
> > > We developed FTQ at LANL to measure noise. Some results here:
> > > https://docs.google.com/presentation/d/1_BsQaO_0hdz8RSW1PAzMH6rVDQmdkwupfTmr-x5EXq4/edit?usp=sharing
> > >
> > > In 2003 we got far better FTQ results on Plan 9 than on Linux, which is
> > > why in 2003 I started the Plan 9 for supercomputing project. We got
> > > really good results on Blue Gene in 2007.
> > >
> > > In 2011, I got the best results ever measured, and still the best on
> > > just about everything I've ever seen in 25 years, using NIX on an AC.
> > > These would have been great results on a machine in single-user mode,
> > > but they were achieved on a fully booted machine running rio -- since
> > > the AC was left alone to do its thing.
> > >
> > > Here's hoping that still holds; I'll try it tomorrow. Will be pretty
> > > nice to see if it works out.
> > >
> > > But ... NIX is ready enough for you to try. I have no idea if it's any
> > > use for anyone, but thanks to Paul for getting this port off the ground,
> > > and Thierry, who kicked it into gear by asking the right question at the
> > > right time.
> > >
> > > github.com/rminnich/9front, 9front branch
> > >
> > > On Sat, Feb 8, 2025 at 9:16 AM Paul Lalonde <paul.a.lalo...@gmail.com>
> > > wrote:
> > >
> > >> Nice!  Congratulations!
> > >>
> > >>
> > >>
> > >> On Sat, Feb 8, 2025, 9:13 a.m. ron minnich <rminn...@gmail.com> wrote:
> > >>> ok that's fixed and:
> > >>>
> > >>> % ratrace -c execac -c 1 /bin/date
> > >>> 98 execac Running 204326 0x1prepage: base 0x7ffffefff000 top 0x7ffffffff000
> > >>> prepage: base 0x200000 top 0x400000
> > >>> prepage: base 0x400000 top 0x406000
> > >>> prepage: base 0x406000 top 0x406000
> > >>> /bin/date: timezone: file does not exist: '/env/timezone'
> > >>>  = process exited
> > >>> % k
> > >>>
> > >>> I *think* that means it worked.
> > >>>
> > >>> First execac I've run in ... well ... a long time.
> > >>>
> > >>> On Sat, Feb 8, 2025 at 8:17 AM ron minnich <rminn...@gmail.com> wrote:
> > >>>> The new default branch is 9front. https://github.com/rminnich/9front
> > >>>>
> > >>>> I just pushed a commit that:
> > >>>> 1. has the execac command use sysr1 for now
> > >>>> 2. drops bootrc into a shell before root is mounted so you can poke
> > >>>> around and run execac
> > >>>> 3. adds ratrace, execac, and date
> > >>>>
> > >>>> When you build nix, look in systab.h, replace
> > >>>> [SYSR1] sysr1,
> > >>>> with
> > >>>> [SYSR1] sysexecac
> > >>>>
> > >>>> it's just easier to coopt sysr1 for now
> > >>>>
> > >>>> when you boot, make sure you have at least 2 cores; when it drops to
> > >>>> a shell, try this
> > >>>> execac -c 1 /bin/date
> > >>>> That would run /bin/date on core 1.
> > >>>>
> > >>>> In a perfect world.
> > >>>>
> > >>>> well ...
> > >>>> qunlock called with qlock not held, from 0xffffffff8021e5c2
> > >>>> qunlock called with qlock not held, from 0xffffffff8021e5c2
> > >>>>
> > >>>> oops.
> > >>>>
> > >>>> If you know how to debug qemu with gdb, well, here's a place to
> > >>>> start. Or just look at what's at that PC in the kernel, and see what
> > >>>> it might be
> > >>>>
> > >>>>
> > >>>>
> 
> --
> Thierry Laronde <tlaronde +AT+ kergis +dot+ com>
>              http://www.kergis.com/
>             http://kertex.kergis.com/
> Key fingerprint = 0FF7 E906 FBAF FE95 FD89  250D 52B1 AE95 6006 F40C

------------------------------------------
9fans: 9fans
Permalink: 
https://9fans.topicbox.com/groups/9fans/Tc4be07fc1c6ad31c-M878d4237fc17b70c58c31c7c
Delivery options: https://9fans.topicbox.com/groups/9fans/subscription
